Alexander Korotkov, Postgres Professional
17:00 04 February
45 min

PostgreSQL extendability: Origins and new horizons

Postgres was designed from the start to support extensible access methods. A well-known quote about access methods in Postgres states: "It is imperative that a user be able to construct new access methods to provide efficient access to instances of nontraditional base types" (Michael Stonebraker, Jeff Anton, Michael Hirohama. Extendability in POSTGRES, IEEE Data Eng. Bull. 10 (2), pp. 16-23, 1987).

Initially, the heap was just one of the access methods. So extensibility of access methods also meant pluggable storage engines, in modern terms. For now, only index access methods are defined in the pg_am table of the system catalog. Those index access methods also have a well-defined interface. Therefore, to meet the initial design, PostgreSQL needs to support two features:

  • Pluggable index access methods, i.e. the ability to implement new index types by adding new tuples to pg_am;
  • Pluggable storage engines, i.e. the ability to implement completely different storage for tables without the traditional heap.

Besides mechanical work like the "CREATE ACCESS METHOD" command, extensible index access methods need to be WAL-logged. For now, the community doesn't want extensions to define their own WAL records, because this could break both recovery and replication, which is not acceptable. Another approach is to define generic WAL records that describe the difference between page versions in a generalized way.
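
The idea of a generic WAL record that only describes how a page changed can be illustrated with a small standalone sketch. This is not PostgreSQL's actual record format or API; the names compute_delta and apply_delta, the (offset, bytes) delta layout, and the 8192-byte page size are assumptions made purely for illustration.

```python
from typing import List, Tuple

# Assumed page size for the illustration (PostgreSQL's default block size).
PAGE_SIZE = 8192

# A delta is a list of (offset, replacement bytes) fragments.
# This layout is hypothetical, not PostgreSQL's on-disk format.
Delta = List[Tuple[int, bytes]]


def compute_delta(old: bytes, new: bytes) -> Delta:
    """Collect the contiguous byte ranges where `new` differs from `old`."""
    if len(old) != len(new):
        raise ValueError("page images must have the same size")
    delta: Delta = []
    i, n = 0, len(old)
    while i < n:
        if old[i] == new[i]:
            i += 1
            continue
        start = i
        # Extend the fragment while the bytes keep differing.
        while i < n and old[i] != new[i]:
            i += 1
        delta.append((start, new[start:i]))
    return delta


def apply_delta(page: bytes, delta: Delta) -> bytes:
    """Replay a delta on an old page image, as recovery or a replica would."""
    buf = bytearray(page)
    for offset, data in delta:
        buf[offset:offset + len(data)] = data
    return bytes(buf)


# Usage: modify two small regions of an empty page and log only the changes.
old_page = bytes(PAGE_SIZE)
new_page = bytearray(old_page)
new_page[100:104] = b"\x01\x02\x03\x04"
new_page[4000:4002] = b"\xff\xff"

record = compute_delta(old_page, bytes(new_page))
assert apply_delta(old_page, record) == bytes(new_page)
```

Because the record carries only offsets and raw bytes, replaying it requires no knowledge of the access method that produced it, which is what makes this approach safe for extensions in principle.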

There are only a few DBMSs that support pluggable storage engines today. MySQL is the most common example. However, dealing with different storage engines in MySQL is like dealing with different DBMSs. In our view, this is not the way PostgreSQL should go.

However, PostgreSQL users now see the benefits of other kinds of storage. Ideas of columnar and in-memory storage for PostgreSQL are very popular. At the same time, the technical possibilities for implementing them are growing. FDWs and custom nodes have arrived. Generic WAL and extensible index access methods are pending for 9.6. Much of the work toward pluggable storage engines has already been done, even if it had different aims.

It's time for PostgreSQL core developers to think about native support for pluggable storage without kludges. Ultimately, we should get a "CREATE STORAGE ENGINE name ..." command as a legitimate extensibility mechanism.

In this talk we will show the current state of pluggable index access methods and the design of pluggable storage engines.

Talk materials

Slides

Video

Other talks

  • Gregory Stark
    45 min

    Sorting Through the Ages

    When new versions of Postgres are released, most of the attention is focused on new features. Inevitably, a release note claiming speed improvements seems relatively mundane and doesn't provide a compelling argument for upgrading. However, the reality is that these speed improvements represent pain points that have been identified and solved.

    Reviewing the changes to the sort code in Postgres over the last 10 years clearly shows the kinds of problems users have run into. As usage patterns changed over the years, databases scaled up, and hardware changed, new problems arose and drove further development to solve them.

    Upcoming changes in 9.5 and 9.6 will change the experience dramatically again, making sorting in UTF8 and other encodings less of a problem and scaling more effectively to larger machines with many processors and memory caches.

  • Илья Космодемьянский, Data Egret
    180 min
  • Andres Freund, Citus Data
    45 min

    Improving Postgres' Buffer Manager

    PostgreSQL's buffer manager has parts that are showing their age. We'll discuss how it currently works, what problems there are, and what attempts are in progress to rectify its weaknesses.

    • Lookups in the buffer cache are expensive
    • The buffer mapping table is organized as a hash table, which makes it hard to implement prefetching, write coalescing, and dropping of cache contents efficiently
    • Relation extension scales badly
    • Cache replacement is inefficient
    • Cache replacement replaces the wrong buffers

  • Heikki Linnakangas, Pivotal

    Index internals

    PostgreSQL includes several index types: GiST, SP-GiST, GIN, and, of course, the regular B-tree. DBAs are familiar with using each of these for specific use cases (GIN for full-text search, GiST for geometrical data, and so on), but how do they work internally? What makes them suitable for the cases they're typically used for?

    In this presentation, I will walk through the internal structure of each of these index types, explaining the strengths and weaknesses of each one.