Tatsuro Yamada NTT Comware
: December
22 мин

Auto plan tuning using feedback loop

As is often seen in OLAP and batch processing workloads, the more complex a query (containing many joins, filters, aggregates), the more there is a possibility of row count estimation errors, which leads to planner choosing an inefficient execution plan.

To address that problem, I developed a tool called pg_plan_advsr as a PostgreSQL extension, which corrects the estimation errors by repeatedly feeding back the information collected during query execution to the planner.

The tool has three features:

  1. Automatic plan tuning by repeatedly feeding execution information to planner
  2. Preserve all plans generated during plan tuning in a history table
  3. Create and store optimizer hints to be able to reproduce plans generated during tuning process

I verified the effectiveness of pg_plan_advsr by enabling it when running the join order benchmark (JOB) against PG 10.4 and observed its execution time shortening to 50% of the original. Therefore, it is useful for user who would like to do plan tuning for OLAP and batch processing.

I will talk about the following things in this presentation:

  • Principles behind pg_plan_advsr and its architecture
  • Detailed information about the measurements done with JOB
  • Possible future enhancements
  • Using aqo and pg_plan_advsr together (experimental)



Другие доклады

  • Maksim Viharev
    Maksim Viharev Alytics
    45 мин

    GreenHouseSQL as a scalable analytics system for postgresql, greenplum and clickhouse

    At pgconf’17 I talked about our analytics systems based on PostgreSQL. Afterwards we looked at hadoop, s3, presto, vertica, and other frights. Finally we stopped to suffer nonsense and just completed PostgreSQL with ready Greenplum and Clickhouse. As a result, we achieved amazing performance, fast migration, easy maintenance, reliability and horizontal scalability. We enabled to recover the system after fault in two commands, decreased infrastructure costs and expanded functionality due to ANSI SQL, MPP and In-memory. All within the open-source and full SQL paradigm. We called the product GreenHouseSQL, which is our inner whole cycle data platform. In the talk we will show the beauty of solution internals, explain the advantages and flaws, tips and tricks of starting with Greenplum, as well as why do we need Clickhouse, what is left to PostgreSQL, and eventually how does it all work.

  • Alex Lustin
    Alex Lustin SilverBulleters, LLC
    22 мин

    Analysis of troublesome queries as a means of recurrent refactoring of 1C code

    1. Principles of searching for troublesome queries in PostgreSQL.
    2. Evaluation of hypothetical indexes and their impact on query plans.
    3. The most common errors in 1C-programming.
    4. Basic methods of code refactoring, taking into account the features of PostgreSQL.
    5. Storing analytical information from the PostgreSQL log to assess the quality of refactoring

  • Anton Doroshkevich
    Anton Doroshkevich ИнфоСофт
    22 мин

    Russia's first BlockChain on 1C+PostgreSQL

    During the report, I would like to share the experience of implementing BlockChain in a real business task based on 1C+PostgreSQL. Where did this task come from? From whom do we protect data with the help of technology? How to get a chain integrity report of tens of millions of records in seconds?

  • Christopher Travers
    Christopher Travers DeliveryHero SE
    45 мин

    Data Recovery in PostgreSQL on a Damaged Filesystem

    This case study walks participants through a case where we decided to embark on a data recovery effort. This talk is applicable to all users, from novices to advanced PostgreSQL database administrators. Beginners will get an understanding of what data recovery is and is not, what expectations to have going into it, and how to work with contracted experts in order to ensure the best possible outcome, while more advanced users and experts will also get a fair bit out of the technical aspects of the case study.

    While the talk will emphasize non-technical operational aspects of data recovery, it will also include discussions of the internals of PostgreSQL we had to work with, as well as how we went about approaching difficulties so that we could retrieve the data we hoped to.