Postgres as the foundation of a BI platform: features and practical experience
I will explain why Postgres is a first-choice product as the foundation for a BI system with a classical OLAP workload, and briefly review existing open source BI solutions.
I will also describe the specifics of our architecture, why we chose a snowflake schema, and how we run our extract, transform, and load procedures. I will cover Postgres tuning for OLAP and massive bulk-load workloads, as well as using Postgres as a column-oriented database with cstore_fdw by Citus and the results we achieved. The drawbacks and problems of our approach are described at the end of the talk.
Other talks
-
Vadim Yatsenko Progress Soft LLC
Very large tables in PostgreSQL, or how to turn 60+ TB into 10+ TB
The talk describes how we implemented storage for large tables (more than 1 billion rows per day). The project has been in production for 2 years. The total amount of data is 300 TB (25 PostgreSQL servers × 2 data centers). I will talk about the mistakes we made in organizing large-table storage in the initial phase of the project and how we corrected them, and about how to organize data rotation and archiving. I will also cover what we were missing in PostgreSQL 9.4 that appeared in 9.5 and 9.6, and which new features we are waiting for in future PostgreSQL releases.
-
Dmitry Melnik ISP RAS
Dynamic compilation of SQL queries in PostgreSQL using LLVM JIT
Currently, PostgreSQL executes SQL queries with an interpreter that implements the Volcano-style iteration model. A significant speedup can be obtained by JIT-compiling a query on the fly: the generated code is specialized for the given SQL query, and compiler optimizations can use the table structure and data types that are already known at run time. This approach is especially important for complex queries whose performance is CPU-bound.
-
Radoslav Glinsky Skype (Microsoft)
Test environment on demand
Do you test your PostgreSQL releases in a dedicated test environment before they reach Production? Are you sure that your test environment (Test for short) matches Production and is in an appropriate state?
At Skype we faced multiple challenges associated with database testing:
- Reducing a complex Production architecture of thousands of PostgreSQL instances, interconnected by RPCs and replication, along with infrastructure servers and external DB scripts, to their Test counterparts.
- Constantly growing hardware requirements and insufficient cleanup of data generated in Test.
- Differences between Test and Production kept appearing and accumulating. Recognizing and fixing them required a lot of effort.
-
Dmitry Beloborodov UIS, CoMagic
Experience of using PostgreSQL in the UIS and CoMagic projects
We have been using PostgreSQL since 2003 and have gone all the way from a database of a couple of GB to a cluster of more than 5 TB. At the moment, we have more than 700 tables and about 1500 stored procedures. We are ready to share the following:
- Problems encountered at different development stages and how we resolved them.
- Best practices in database administration.
- Our own extension for working with several closely related databases.
- Best known methods and tools that enable our several teams to work together without interference.
- How we set up test environments of different types.
And, of course, we will talk about optimization and how we identify bottlenecks and high-load use cases.