Maksim Milyutin, Wildberries
17:00 04 April
45 min

Analytical open-source solutions based on PostgreSQL

Historically, PostgreSQL was designed for transactional OLTP workloads. This is reflected in its row-based storage and in the difficulty (or outright impossibility) of building a distributed query execution engine on MPP principles. However, thanks to the extensibility of the PostgreSQL core (above all, pluggable access methods) and its permissive, BSD-like license, a variety of forks and extensions have appeared that allow efficient analytical processing of big data.

In this talk I will review the PostgreSQL fork Greenplum and the Citus and TimescaleDB extensions from a system developer's perspective, comparing their common analytical-engine features: columnar storage, data compression, distributed query execution and so on. The results of this overview should be helpful to database architects looking for a PostgreSQL-based DBMS for analytical workloads.

Other talks

  • Maksim Afinogenov, JSC "Afrikantov OKBM"
    22 min

    Experience of porting a production management system database from Oracle DBMS to PostgresPro DBMS at a manufacturing enterprise

    The practice of transferring structure, logic and data from Oracle DBMS to PostgresPro DBMS. Features and main difficulties of the migration. Advantages of PostgresPro when porting logic.

  • Alexander Liubushkin, FORS Telecom LLC
    Andrey Chibuk, FORS Telecom LLC
    45 min

    How to transfer 10TB from Oracle to Postgres in 24 hours?

    We present our experience in data migration and Ora2PgCopy, a program written in Java for high-speed data transfer from Oracle to Postgres. It is run after the tables have been created and the application code has been ported. High transfer speed is achieved by using the Postgres COPY command, multithreaded file processing in Java, switching tables between unlogged and logged mode, and support for LOB and CLOB data types. In our tests, Ora2PgCopy ran noticeably faster than comparable tools such as Ispirer (convertum), oracle_fdw, ora2pg and Pentaho Kettle. Ora2PgCopy can run as a module of the LUI4ORA2PG migration automation system or independently of it. The history of the Live Universal Interface (LUI) web application development tool and the LUI4ORA2PG migration tool can be found in previous PGConf presentations: https://pgconf.ru/2019/118109, https://pgconf.ru/201911/264095, https://pgconf.ru/2020/262456, https://pgconf.ru/2021/288310, https://pgconf.ru/2022/316022.

  • Alfred Stolyarov, EvApps LLC
    45 min

    How we switched a client from Oracle to PostgreSQL before it became mainstream

    Import substitution did not begin last year after the well-known events; it dates back to 2014. That was when state and state-affiliated companies started considering a switch to so-called "recommended software". One such company approached us in 2020 with a project to migrate from Oracle to PostgreSQL. The project was meant to solve accumulated architectural problems (an imperfect storage system for telemetry data, the DBMS itself running inside a virtual machine) and to optimize disk usage (free up space in the main storage, fix the archiving of data, ensure correct backups). Because the customer's system had to run uninterrupted 24/7, the switch from one DBMS to the other had to be "seamless", without downtime, with both systems running simultaneously so that subsystems could be moved over step by step and the data checked for correctness. And, of course, the work had to be completed as quickly as possible.

    In this talk we will discuss how we solved this case.

  • Pavel Tolmachev, Postgres Professional
    22 min

    Let's get acquainted with GEQO in 20 minutes

    -----------------------------------------------------------QUERY PLAN--------------------------------------------------------------
    Hash Join
      Hash Cond: (Subject = GEQO)
      -> Hash Join
            Hash Cond: (**Optimizer task = choose the best query execution plan**)
            -> Seq Scan on **The number of potential plans grows exponentially as the number of tables in a query increases**
            -> Hash
                  -> Seq Scan on **PostgreSQL solves this problem by using the genetic optimizer (GEQO)**
      -> Hash
            -> Seq Scan on **Topics of the report:**
                  Filter: (**(What is GEQO)** AND **(Pros and cons)** AND **(How it works)**)
    (10 rows)
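
The GEQO abstract above states that the number of potential plans grows exponentially with the number of tables. As a rough illustration only (not taken from the talk), the sketch below counts join orders for n tables under two simplifying assumptions: left-deep plans only (n! orderings) and join trees of any shape ((2n-2)!/(n-1)! trees). The function names are hypothetical, and the counts ignore join methods, scan choices and the pruning a real planner performs, so they describe the size of the search space rather than PostgreSQL's actual work.

```python
from math import factorial


def left_deep_join_orders(n_tables: int) -> int:
    """Count left-deep join orders: every permutation of the tables, n!."""
    return factorial(n_tables)


def bushy_join_trees(n_tables: int) -> int:
    """Count join trees of any shape over n labeled tables: (2n-2)! / (n-1)!."""
    return factorial(2 * n_tables - 2) // factorial(n_tables - 1)


if __name__ == "__main__":
    # Print how quickly the search space grows with the number of tables.
    for n in (2, 4, 8, 12):
        print(f"{n:>2} tables: {left_deep_join_orders(n):>12} left-deep, "
              f"{bushy_join_trees(n):>22} bushy")
```

In stock PostgreSQL the switch to the genetic optimizer is governed by the `geqo_threshold` setting (default 12): queries with at least that many FROM items are planned with GEQO instead of the exhaustive search.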