Postgres and Artificial Intelligence in the Modern World
Artificial intelligence, machine learning, and deep learning are intertwined capabilities that attempt to solve problems that defy traditional computational solutions, such as fraud detection, voice recognition, and search result recommendations. Although these problems defy simple computation, solving them is computationally expensive, involving the calculation of perhaps millions of probabilities and weights. These computations can be done outside the database, but there are specific advantages to doing machine learning inside the database, close to where the data is stored. This presentation explains how to do machine learning inside the Postgres database.
Slides
Video
Other talks
-
Andrey Lepikhov, Postgres Professional
A Postgres Planner with Memory
Postgres is able to build optimal query plans for most practical cases. However, sometimes, whether for objective reasons, because a query is complex, or because of open issues in the planner itself, it can make mistakes and produce a suboptimal plan. As a result, the execution time of such a query can increase tenfold. If the query is executed frequently, it repeatedly takes longer than it could, and the DBMS as a whole delivers a lower TPS. If the planner can record its mistakes and take them into account when it next plans the same query, the performance of the DBMS improves during operation. We present the results of developing a PostgreSQL extension that stores the query execution history and implements a planner recommendation mechanism. We show how knowledge about previously executed queries can improve the performance of subsequent ones.
-
Tatsuro Yamada, NTT Comware; Julien Rouhaud
Building an Automatic Adviser and Performance Tuning Tools in PostgreSQL
PostgreSQL is a mature and robust RDBMS with 30 years of history. Over the years, its query optimizer has been enhanced, and it usually produces good query plans.
However, can it always come up with good query plans? The optimization process has to rely on some assumptions to produce plans fast enough. Some of those assumptions are relatively easy to check (e.g. statistics are up to date), some are harder (e.g. the correct indexes are created), and some are nearly impossible (e.g. making sure that the statistics samples are representative enough even for a skewed data distribution). Given these caveats, DBAs cannot always easily tell that they are missing a chance to get a meaningful performance improvement.
To help DBAs get a truly good query plan, we present tools that can fix some of those problems by advising on missing indexes, suggesting extended statistics to create, and providing row estimation error correction so that appropriate join orders and join methods are chosen automatically.
- pg_qualstats: suggests new indexes and extended statistics by gathering predicate statistics from the production workload.
- pg_plan_advsr: automatically provides better alternative query plans by analyzing information from iterative query executions to correct row estimation errors.
In this talk, we will explain how those tools work under the hood, what can be done with them, and how they can work together. We will also mention other tools that exist for related problems. The talk will be useful for DBAs who are interested in improving query performance or who want to check whether their current indexes and statistics settings are adequate.
-
Alicja Kucharczyk, Microsoft; Sushant Pandey, Microsoft
The Story of One Migration
In this talk we want to present how a Microsoft team composed of people from two different teams approached the project, solved the migration issues using ora2pg, and proved that Postgres Single Server can perform as well as Oracle Exadata. We will present our ways of working and some of the main technical challenges we faced, including the migration of BULK COLLECTs, hierarchical queries, refcursors, and other, more complicated Oracle constructs.
This is the story of a challenging PoC that proved that Postgres can achieve the same performance as Oracle Exadata. The schema that was migrated wasn't the simplest one you might see. It was quite the opposite. The code was loaded with dynamic queries, BULK COLLECTs, nested loops, CONNECT BY statements, global variables, and a lot of dependencies. Ora2pg did a great job converting the schema but left a lot of work to do manually. Also, the estimates produced by the tool were highly inaccurate, since the logic required not a migration but a total re-architecture of the code. In this talk we want to present how a Microsoft team composed of people from two different teams approached the project, solved the migration issues using ora2pg, and proved that Postgres Single Server can perform as well as Oracle Exadata. We will present our ways of working and some of the main technical challenges we faced, including:
- How estimates do (not) work
- How we handled BULK COLLECTs
- Why we got rid of refcursors
- How we got stuck testing one of the packages and how help from a friend solved the problem
- How we handled hierarchical queries and drilling down the hierarchy
-
Anton Doroshkevich, ИнфоСофт
DBMS-Level Compression in the Realities of 1C
Postgres Pro Enterprise has a great compression engine. We devoted 2020 to studying this mechanism in real-world 1C workloads. We have accumulated some statistical data and, of course, learned the subtleties of 1C's use and behavior compared with other popular DBMSs, which I want to share.