title

text

Arseniy Sher
Arseniy Sher Postgres Professional
: December
45 мин

Consensus, Postgres, Multimaster

Postgres Pro Multimaster is Postgres extension (and a set of core patches) providing high availability (HA) with strong consistency and read scalability. It forms symmetric shared-nothing cluster synchronously replicating the data and automatically performing disaster recovery. During the last year we've put significant efforts ensuring and proving that consistency is preserved in all scenarios. The new version, which will be released as part of Postgres Pro Enterprise 13 uses Paxos algorithm for determining transaction outcome and custom protocol governing the recovery process; we used TLA+ model checker to verify its correctness. I'll tell how these things work and why in some cases multimaster may be an attractive alternative to the traditional streaming replication based HA deployments.

multimaster is now open source, available at https://github.com/postgrespro/mmts

To make the talk less narrow specialized and more appealing to the wide audience, in the first part I will shed some light on how generally modern DBMSs (mostly so-called NewSQL) handle fault tolerance. In particular,

  • what is a strongly consistent DBMS and the associated overhead;
  • what is distributed consensus, Paxos, Raft;
  • how they help here;

I won't do an attempt to explain any algorithms line-by-line; it would be hardly useful given the time frames and there is a lot of literature available anyway. The goal is rather to waymark the field and get you a bit comfortable there.

Другие доклады

  • Konstantin Evteev
    Konstantin Evteev X5 FoodTech
    45 мин

    Building real-time analytics with PostgreSQL

    In today's world, operational reporting and real-time analytics are becoming a basic need. There are a huge number of tools, practices, and approaches, which in turn require different expertise and resources. In this talk, I will tell you how to transform your analytics using PostgreSQL. We'll discuss pitfalls when using different schemes. We'll talk about data quality and performance. I expect this talk to be of interest to both beginners and seasoned practitioners with many years of experience (Discussions and questions after the talk will be highly appreciated). The talk outline is as follows. 1. The evolution of reporting - migration from OLTP to OLAP. 2. Data delivery to DWH and related challenges. 3. Scaling architecture in response to growing data volumes. 4. Data quality issues. 5. Maintaining stability with a large number of changes. 6. Different approaches to organizing the work of the DWH team. 7. And, finally, we'll list the challenges we have successfully responded to with various solutions (pgAgent, PGWatch, working with the file system, the new reading of postgresql.conf).

  • Pavel Borisov
    Pavel Borisov Postgres Professional
    45 мин

    Speed up your fast text search queries with RUM index

    Fast text search queries can be made even faster with indexing on lexemes inside compound records of tsvector format. RUM index is an open-source PostgreSQL extension. It represents a big improvement of GIN index and it can index lexemes with additional information e.g. tsvector lexeme weight-mark. So it can support tsvector capabilities more.

    Until recently it was needed to recheck results of weight-containing queries by table. My modification (2020) is to make the processing of this kind of queries index-only and therefore much faster.

    Also, I will describe and provide benchmarks for different usage cases for fast text search. We'll see how RUM index can improve the performance and compare it with PostgreSQL internal GIN index.

  • Julien Rouhaud
    Julien Rouhaud
    22 мин

    Don't fear your next glibc upgrade

    PostgreSQL relies on the system collation libraries, such glibc or ICU, for text ordering. One know caveat is that when the library change its sort order for a collation, any index created using the old order is likely to be corrupted when the new version of the library is installed.

    In this talk, we'll see the improvements done in PostgreSQL 14 to keep track of the collation versions, detect and fix possible index corruption due to library upgrades and the work currently being done to further improve this area.

  • Yugo Nagata
    Yugo Nagata SRA OSS, Inc. Japan
    45 мин

    Updating Materialized Views Automatically and Incrementally

    Materialized view is a feature to store the results of view definition queries in DB in order to achieve faster query response. However, the data in the view gets stale after underlying tables are modified. Therefore, view maintenance is needed to keep the contents up to date. PostgreSQL has REFRESH MATERIALIZED VIEW command for updating a materialized view, but this command needs to recompute the contents from scratch, so this is not efficient in cases where only a small part of a base table is modified.

    Incremental View Maintenance (IVM) is a technique to maintain materialized views efficiently, which computes and applies only the incremental changes to the materialized views instead of recomputing. This feature is required for updating materialized views rapidly but not implemented on PostgreSQL yet.

    Therefore, we developed IVM on PostgreSQL and are proposing to implement this as a core feature. The patch is now under discussion on the hackers mailing list. Our implementation allows materialized views to be updated automatically and incrementally when a underlying table is modified. You don't need to write your own trigger function for updating views. As a result of continuous development, the current implementation supports some aggregates, subqueries, self-join, outer joins, and CTEs (WITH clauses) in a view definition query. The result of performance evaluation using TPC-H queries shows that our IVM implementation can update a materialized view more than 200 times faster than re-computation by REFRESH command.

    In this talk, we will describe our IVM implementation and its features.