Vadim Yatsenko
Vadim Yatsenko ООО Прогресс Софт
16:00 16 March
45 min

Very large tables in PostgreSQL, or how to turn 60+ TB into 10+ TB

This talk describes how we implemented storage for very large tables (1+ billion rows per day). The project has been in production for 2 years. The total amount of data is 300 TB (25 PostgreSQL servers × 2 data centers). I'll cover the mistakes we made in organizing large-table storage in the initial phase of the project and how we corrected them. I'll also talk about how to organize data rotation and archiving. I'll discuss what we were missing in PostgreSQL 9.4 that appeared in 9.5 and 9.6, and which new features we are waiting for in upcoming PostgreSQL releases.

Slides

Video

Other talks

  • Markus Nullmeier
    Markus Nullmeier University of Heidelberg
    45 min

    Accelerating queries of set data types with GIN, GiST, and custom indexing extensions

    Sets are evidently a useful data type for many kinds of applications. While PostgreSQL offers no built-in set data type, sets can be emulated to some degree with its built-in array and JSONB data types. Acceleration of the corresponding containment (subset) queries is also readily available as a built-in feature of the GIN index type.

    Starting from there, we will explore the performance gains enabled by custom set data types, and especially by customisation code in C ("operator classes") for the GIN and GiST index types.

  • Dmitry Ivanov
    Dmitry Ivanov Postgres Professional
    Ildar Musin
    Ildar Musin Postgres Professional
    45 min

    Partitioning with pg_pathman

    Partitioning is a long-awaited feature in PostgreSQL. Although Postgres supports partitioning via inheritance, this approach has some disadvantages, such as the need to manually create partitions and maintain triggers, significant planning overhead, and no query execution optimizations. In this talk, we'll tell you about the pg_pathman extension we are developing. pg_pathman supports HASH and RANGE partitioning, performs planning and execution optimizations, supports fast inserts by using a Custom Node instead of triggers, provides functions for partition management (add, split, merge, etc.), supports FDW, non-blocking data migration, and more. We'll also talk about pg_pathman integration with Postgres Pro Enterprise Edition and Oracle-like syntax support for partitioning. Finally, we'll discuss the new partitioning capabilities in PostgreSQL 10, the features already implemented, and further development plans.

    VIDEO

  • Aleksei Plotnikov
    Aleksei Plotnikov Skype
    45 min

    Database platform architecture and PostgreSQL administration at Skype

    Most of the main Skype services use a database platform based on PostgreSQL and other open-source technologies, such as Skytools, plProxy, and pgBouncer. This platform consists of several hundred servers with thousands of databases, which process hundreds of thousands of transactions per second. At the same time, the platform architecture allows its users (applications and their developers) to work with "logical" databases without worrying about their real "physical" structure.

    Our Skype Database Platform team is responsible for the database platform infrastructure. We develop automation systems for various processes that help us ensure service reliability and facilitate development, testing, and deployment of code. In this presentation, I will outline the database platform architecture, review its main components, and tell you about the methods we use in our everyday work to ensure high availability, scalability, replication, fault tolerance, and more.

  • Andreas Scherbaum
    Andreas Scherbaum Pivotal
    22 min

    Introduction to Greenplum MPP Database

    An overview of the architecture of the Greenplum MPP (Massively Parallel Processing) database. The talk explains the internals of GPDB, shows how to configure and set up GPDB, and covers how to distribute data effectively for MPP.

    VIDEO