Talks

  • Nikita Glukhov, Postgres Professional
    Oleg Bartunov, Postgres Professional
    45 min

    Inside JSONB

    JSONB is a popular data type in Postgres, and there is demand from users to improve its performance. In particular, we want to optimize a typical pattern of using jsonb as storage for relatively short metadata and big blobs, which is currently highly inefficient. We will discuss several approaches to jsonb improvement and present the results of our experiments.

  • Anastasia Lubennikova, Postgres Professional
    45 min

    PostgreSQL partitioning. Work In Progress

    Native partitioning was introduced in PostgreSQL 10, and every new release comes with more features and optimizations to this area. Yet, there is still room for improvement.

    This talk briefly compares the built-in PostgreSQL partitioning with third-party extensions (pg_pathman and pg_partman) to understand what we still don't have in the core. It also includes an overview of partitioning-related features that are currently being developed and aimed at PostgreSQL 14.
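
    For readers new to the built-in feature, here is a minimal sketch of declarative range partitioning; the table and column names are invented purely for illustration:

      # Minimal declarative range partitioning, available since PostgreSQL 10.
      import psycopg2

      DDL = """
      CREATE TABLE measurements (
          city_id  int  NOT NULL,
          logdate  date NOT NULL,
          peaktemp int
      ) PARTITION BY RANGE (logdate);

      CREATE TABLE measurements_2021_q1 PARTITION OF measurements
          FOR VALUES FROM ('2021-01-01') TO ('2021-04-01');

      CREATE TABLE measurements_default PARTITION OF measurements DEFAULT;
      """

      with psycopg2.connect("dbname=demo") as conn, conn.cursor() as cur:
          cur.execute(DDL)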

  • Ibrar Ahmed, Percona LLC
    45 min

    All about PostgreSQL Security

    PostgreSQL provides different levels of security. This talk will cover all the available security techniques used in PostgreSQL 13. We'll look at client-side security (libpq, JDBC) through to server-side security. It will cover all supported authentication methods along with their pros and cons. The key topics of the talk are:

    • Introduction to Cryptography
    • SSL, TLS, GSSAPI, and OpenSSL
    • Client-Side Encryption
    • Securing Authentication
    • Securing Data on the disk
    • Securing Backup & Basebackup
    • Securing Replication
    • Database Roles and Privileges

    It's important to be familiar with all the security levels: (1) network-level security, (2) on-disk security, (3) row-level security, and (4) column-level security. The talk will cover all these aspects with some real-life use cases and examples.
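
    As a small, hedged taste of the server-side topics, here is a sketch of row-level security plus column-level privileges on an invented table; the app_reader role is an assumption for the example, not something taken from the talk:

      import psycopg2

      DDL = """
      CREATE TABLE accounts (manager text, company text, contact_email text);

      -- Row-level security: each manager sees only their own rows.
      ALTER TABLE accounts ENABLE ROW LEVEL SECURITY;
      CREATE POLICY account_managers ON accounts
          USING (manager = current_user);

      -- Column-level privilege: app_reader may not read the e-mail column.
      GRANT SELECT (manager, company) ON accounts TO app_reader;
      """

      with psycopg2.connect("dbname=demo") as conn, conn.cursor() as cur:
          cur.execute(DDL)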

  • Arseniy Sher, Postgres Professional
    45 min

    Consensus, Postgres, Multimaster

    Postgres Pro Multimaster is a Postgres extension (and a set of core patches) providing high availability (HA) with strong consistency and read scalability. It forms a symmetric shared-nothing cluster that synchronously replicates the data and automatically performs disaster recovery. During the last year we've put significant effort into ensuring and proving that consistency is preserved in all scenarios. The new version, which will be released as part of Postgres Pro Enterprise 13, uses the Paxos algorithm for determining transaction outcomes and a custom protocol governing the recovery process; we used the TLA+ model checker to verify its correctness. I'll explain how these things work and why in some cases multimaster may be an attractive alternative to traditional HA deployments based on streaming replication.

    multimaster is now open source, available at https://github.com/postgrespro/mmts

    To make the talk less narrowly specialized and more appealing to a wide audience, in the first part I will shed some light on how modern DBMSs (mostly so-called NewSQL systems) generally handle fault tolerance. In particular:

    • what is a strongly consistent DBMS and the associated overhead;
    • what is distributed consensus, Paxos, Raft;
    • how they help here;

    I won't attempt to explain any algorithm line by line; that would hardly be useful given the time frame, and there is a lot of literature available anyway. The goal is rather to waymark the field and make you a bit more comfortable there.

  • Yugo Nagata, SRA OSS, Inc. Japan
    45 min

    Updating Materialized Views Automatically and Incrementally

    A materialized view is a feature that stores the result of a view's defining query in the database in order to achieve faster query responses. However, the data in the view gets stale after the underlying tables are modified, so view maintenance is needed to keep the contents up to date. PostgreSQL has the REFRESH MATERIALIZED VIEW command for updating a materialized view, but this command recomputes the contents from scratch, which is not efficient in cases where only a small part of a base table is modified.

    Incremental View Maintenance (IVM) is a technique to maintain materialized views efficiently: it computes and applies only the incremental changes to the materialized views instead of recomputing them. This feature is needed for updating materialized views rapidly, but it is not implemented in PostgreSQL yet.

    Therefore, we developed IVM on PostgreSQL and are proposing to implement it as a core feature. The patch is now under discussion on the hackers mailing list. Our implementation allows materialized views to be updated automatically and incrementally when an underlying table is modified; you don't need to write your own trigger function for updating views. As a result of continuous development, the current implementation supports some aggregates, subqueries, self-joins, outer joins, and CTEs (WITH clauses) in a view definition query. The results of a performance evaluation using TPC-H queries show that our IVM implementation can update a materialized view more than 200 times faster than recomputation with the REFRESH command.

    In this talk, we will describe our IVM implementation and its features.
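
    To make the difference concrete, here is a sketch contrasting today's full refresh with the CREATE INCREMENTAL MATERIALIZED VIEW syntax proposed in the patch discussion; the sales table is invented, and the proposed syntax may still change before anything is committed:

      import psycopg2

      with psycopg2.connect("dbname=demo") as conn, conn.cursor() as cur:
          # Built-in behaviour today: the whole view is recomputed from scratch.
          cur.execute("CREATE MATERIALIZED VIEW sales_by_day AS "
                      "SELECT sale_date, sum(amount) FROM sales GROUP BY sale_date")
          cur.execute("REFRESH MATERIALIZED VIEW sales_by_day")

          # Proposed IVM syntax (subject to change): the view is then maintained
          # automatically and incrementally after each change to the base table.
          cur.execute("CREATE INCREMENTAL MATERIALIZED VIEW sales_by_day_ivm AS "
                      "SELECT sale_date, sum(amount) FROM sales GROUP BY sale_date")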

  • Evgeniy Dyukov, Yandex
    Andrey Borodin, Yandex
    45 min

    How to manage an open source HA RDBMS in a cloud environment

    High availability solutions have become extremely popular in the past few years. They play a critical role in building reliable systems based on affordable hardware. In this presentation, we will pay attention to some of the subtle aspects of the design and maintenance of such systems. In addition, the issues of capturing changes on an HA cluster will be addressed.

  • Maxim Orlov, Postgres Professional
    22 min

    Hot minor update in Postgres Pro 13

    In this presentation, we'll explain how we perform a minor version update of the Postgres Pro DBMS without an instance restart and without terminating active backends.

  • Vasiliy Puchkov, ООО
    45 min

    Development of an integration database for production data of oil depots based on PostgreSQL

    An architectural approach as the basis for a sustainable solution. Old and new technologies - unity and struggle of opposites. Information security and business requirements - is there a compromise?

  • Egor Rogov, Postgres Professional
    45 min

    What's new in Postgres Professional training courses

    Our company's educational projects are intended to facilitate learning PostgreSQL. Last year, we focused on courses for application developers: we updated the DEV1 basic course and released a brand new DEV2 course. I will talk about what has changed in the presentation of the training materials, how we see the further development of the courses, what else we have besides the courses, whether the courses for DBAs will be updated, and how this will affect certification.

  • Nikolay Samokhvalov, Nombox LLC
    45 min

    Automated database migration testing

    In a heavily loaded project, any change implies non-zero risks of downtime or performance degradation. We constantly see how system complexity, the number of database nodes, and the number of deployments per week keep growing, along with the level of automation of various activities in CI/CD pipelines, containers, and Kubernetes.

    Meanwhile, when we look at the topic of testing database changes–from trivial index creation to complex, almost "surgical" operations like converting an int4 PK to an int8 one in a multi-terabyte database under load–we observe an obvious lag in technology and methodology development. In the best case, the changes are verified and approved visually, and then everything depends on the level of experience and tiredness of the reviewer.

    In this talk, we will look at how we (Postgres.ai) solve this problem using our solution, Database Lab:

    • instant provisioning of independent thin clones of multi-terabyte databases, ready to be used for testing,
    • integration with existing CI/CD solutions and workflow,
    • collection of metrics that are the most useful for decision support on whether or not each database migration has to be approved (and even fully automated rejection of the most dangerous actions).

  • Ivan Panchenko, Postgres Professional
    22 min

    Postgres Pro DBMS: what’s new & what’s on the roadmap

    In this talk, you'll learn about Postgres Pro DBMS from the co-founder of Postgres Professional. Ivan will explain the philosophy behind this enhanced variant of PostgreSQL, reveal the differences between Postgres and Postgres Pro and provide the roadmap for the further evolution of Postgres Pro DBMS.

  • Andrey Lepikhov, Postgres Professional
    22 min

    How query execution history can help with re-execution

    Postgres is able to build optimal query plans for most practical cases. However, sometimes, for objective reasons, for complex queries, or because of open issues in the planner itself, it can make mistakes and produce a suboptimal plan. Because of this, the execution time of such a query can increase tenfold. If the query is executed frequently, then from time to time it takes longer than it could, and the DBMS as a whole delivers a lower TPS. If the planner were able to record its mistakes and take them into account when planning the same query again, this would improve the characteristics of the DBMS during its operation. We present the results of the development of a PostgreSQL extension that stores the query execution history and implements a planner recommendation mechanism. We show how knowledge about previously executed queries can improve the performance of subsequent ones.
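
    A simple way to see the kind of planner mistake this extension targets is to compare estimated and actual row counts from EXPLAIN ANALYZE; a sketch (the query and tables are arbitrary):

      import psycopg2

      QUERY = ("SELECT * FROM orders o JOIN customers c ON c.id = o.customer_id "
               "WHERE o.status = 'new'")

      with psycopg2.connect("dbname=demo") as conn, conn.cursor() as cur:
          cur.execute("EXPLAIN (ANALYZE, FORMAT JSON) " + QUERY)
          plan = cur.fetchone()[0][0]["Plan"]
          # A large gap between the two numbers usually means the planner worked
          # from a wrong cardinality estimate somewhere in the plan tree.
          print("estimated rows:", plan["Plan Rows"])
          print("actual rows:   ", plan["Actual Rows"])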

  • Álvaro Hernández, OnGres
    180 min

    Deconstructing Postgres into a Cloud Native Platform

    Is deploying Postgres in Kubernetes just repackaging it into a container? Can’t Postgres leverage the wide range of Cloud-Native software and integrate well with K8s? Join this journey that will cover and demonstrate, with demos running on StackGres:

    • How to structure Postgres into an init-less container, plus several sidecar containers for connection pooling, backups, agents, etc.
    • Defining high level CRDs as the single API to interact with the Postgres operator.
    • Using K8s RBAC for user authentication of a web UI management interface.
    • Using Prometheus for monitoring; bundling a node, Postgres and PgBouncer exporters together.
    • Proxying Postgres traffic through Envoy, terminating Postgres SSL with an Envoy plugin that also exports wire protocol metrics to Prometheus.
    • Using Fluentbit to capture Postgres logs and forward them to Fluentd, which stores them on a centralized Postgres database.

    You will be able to follow the session on your own Kubernetes cluster and go from zero to Postgres hero on Kubernetes with little effort! Create your own Postgres-as-a-Service on your Kubernetes in minutes.

  • Nikolay Samokhvalov, Nombox LLC
    180 min

    Seamless SQL optimization, v2.0

    There are two types of SQL query analysis:

    1. "Macro": analyzing the workload as a whole (three major approaches: using metrics provided by pg_stat_statements or similar, log analysis with pgBadger or similar, and sampling of pg_stat_activity)

    2. "Micro": diving into details of single query execution (EXPLAIN command being the central tool here)

    And there are huge gaps between them that become noticeable at scale. The main challenges:

    • Switching between "macro" and "micro" without a huge overhead
    • Verifying optimization ideas reliably
    • Deploying changes risk-free

    Solving these tasks at scale requires advanced DBA experience and–sometimes–intuition. Or better tools, which (fortunately!) have very recently started to appear.

    In this tutorial, we will learn how to establish a smooth and seamless SQL optimization process in your organization:

    • What tools should you choose in your particular case?
    • How to close the gaps mentioned above?
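
    On the "macro" side described above, a typical starting point is pg_stat_statements (assuming the extension is installed); a small sketch, with column names as they appear in PostgreSQL 13, where total_time became total_exec_time:

      import psycopg2

      # Top 5 statements by cumulative execution time.
      SQL = """
      SELECT queryid, calls, total_exec_time, mean_exec_time, rows,
             left(query, 60) AS query
      FROM pg_stat_statements
      ORDER BY total_exec_time DESC
      LIMIT 5
      """

      with psycopg2.connect("dbname=demo") as conn, conn.cursor() as cur:
          cur.execute(SQL)
          for row in cur.fetchall():
              print(row)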

  • Christopher Travers, DeliveryHero SE
    45 min

    When it All Goes Wrong: Database Incident Response Dos and Don'ts

    Once at Adjust we faced the problem of an impending xid wraparound in a very central database, due to a long-stalled autovacuum run. Because we spotted the warnings 5 hours before the impending disaster, we were able to minimize the customer impact. Come learn how we used the time to prepare, and what lessons this holds for others facing unusual problems in large databases.
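
    The classic early-warning check for this class of problem is to watch transaction ID age per database; a minimal sketch:

      import psycopg2

      # Databases closest to transaction ID wraparound. autovacuum_freeze_max_age
      # defaults to 200 million; the hard limit is roughly 2 billion.
      SQL = """
      SELECT datname, age(datfrozenxid) AS xid_age
      FROM pg_database
      ORDER BY xid_age DESC
      """

      with psycopg2.connect("dbname=postgres") as conn, conn.cursor() as cur:
          cur.execute(SQL)
          for datname, xid_age in cur.fetchall():
              print(f"{datname}: {xid_age:,}")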

  • David Steele, Crunchy Data
    45 min

    Backup Best Practices with pgBackRest

    Backups are a critical part of any enterprise database solution but they are often done poorly or skipped altogether, which can lead to data loss in the event of hardware failure or some other disaster.

    In this talk we'll cover database backup best practices and how to implement them with pgBackRest, including:

    • WAL archiving and retention
    • Backup frequency and retention
    • How to meet recovery time/point objectives
    • Configuration options
    • Performance considerations

  • Bruce Momjian, EnterpriseDB
    45 min

    Postgres and the Artificial Intelligence Landscape

    Artificial intelligence, machine learning, and deep learning are intertwined capabilities that attempt to solve problems that defy traditional computational solutions — problems include fraud detection, voice recognition, and search result recommendations. While they defy simple computation, they are computationally expensive, involving computation of perhaps millions of probabilities and weights. While these computations can be done outside of the database, there are specific advantages of doing machine learning inside the database, close to where the data is stored. This presentation explains how to do machine learning inside the Postgres database.

  • Bo Peng, SRA OSS, Inc. Japan
    45 min

    Deploying PostgreSQL Cluster with Query Load Balancing and Monitoring Capabilities on Kubernetes

    Kubernetes is an open source container orchestration platform for automating deployment, scaling and management of application containers. Nowadays, more and more applications are being deployed in containers on Kubernetes.

    There are several solutions that can help us to run a PostgreSQL cluster on Kubernetes. However, these solutions don't provide query load balancing capability. In this talk, I will show you how to combine PostgreSQL Operator with Pgpool-II to deploy a PostgreSQL cluster with query load balancing capability on Kubernetes.

    Monitoring is a very important part of production environments. Although Kubernetes provides a basic way to monitor the status of a PostgreSQL cluster, this is not sufficient for managing a PostgreSQL cluster in production. An important improvement in the Pgpool-II 4.2 release is the ability to output more useful statistics about the PostgreSQL cluster. In this talk, I will describe how to monitor and visualize the PostgreSQL cluster statistics in Prometheus for extensive cluster monitoring.

  • Tatsuro Yamada, NTT Comware
    Julien Rouhaud
    22 min

    Building automatic adviser and performance tuning tools in PostgreSQL

    PostgreSQL is a mature and robust RDBMS with 30 years of history. Over the years, its query optimizer has been enhanced and usually produces good query plans.

    However, can it always come up with good query plans? The optimization process has to use some assumptions to produce plans fast enough. Some of those assumptions are relatively easy to check (e.g. statistics are up to date), some are harder (e.g. the correct indexes are created), and some are nearly impossible (e.g. making sure that the statistics samples are representative enough even for skewed data distributions). Given those various caveats, DBAs can't always easily realize that they are missing a chance to get a meaningful performance improvement.

    To help DBAs get a truly good query plan, we'll present some tools that can help fix some of those problems by suggesting missing indexes, looking for extended statistics to create, and providing row estimation error correction information to automatically get appropriate join orders and join methods:

    • pg_qualstats: gathers predicate statistics on the production workload and provides new index and extended statistics suggestions.
    • pg_plan_advsr: analyzes information from repeated query executions to fix row estimation errors and automatically provides better alternative query plans.

    In this talk, we will explain how those tools work under the hood, see what can be done, and show how they can work together. We will also mention other tools that exist for related problems. The talk will be useful for DBAs who are interested in improving query performance or want to check whether their current indexes and statistics are adequate.

  • Alexey Fadeev, Sibedge
    22 min

    Multicorn Foreign Data Wrapper vs plpython

    Multicorn allows you to develop FDWs in Python, which is much easier and faster than creating FDWs in C. However, there is a downside: Multicorn FDWs work well with primitive WHERE conditions, but more complex cases cause difficulties, which I will talk about. The cases will be illustrated with my Multicorn FDW for retrieving OpenStreetMap data. I will also show examples of using the same code in a Multicorn FDW and in plpython functions, including a performance comparison. In conclusion, I will share my findings on when it is better to use plpython and when a Multicorn FDW is preferable.
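
    For readers who have not seen Multicorn before, here is a minimal foreign data wrapper sketch; the returned rows are obviously made up, and the real OpenStreetMap wrapper is far more involved:

      from multicorn import ForeignDataWrapper


      class ToyFDW(ForeignDataWrapper):
          """A toy FDW that returns a couple of hard-coded rows."""

          def __init__(self, options, columns):
              super().__init__(options, columns)
              self.columns = columns

          def execute(self, quals, columns):
              # `quals` carries the WHERE conditions Multicorn can push down; a
              # simple wrapper may ignore them and let PostgreSQL re-filter.
              for i in range(2):
                  yield {name: f"{name}-{i}" for name in self.columns}

    On the SQL side such a class is exposed with CREATE EXTENSION multicorn and a server created with OPTIONS (wrapper 'mymodule.ToyFDW'), where mymodule is whatever Python module holds the class.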

  • Nikolai Ryzhikov, Health Samurai
    45 min

    SQL as data

    Almost every business app is essentially just a SQL generator. How to easily build and compose SQL queries? I will explain the "Clojure way" of representing SQL as data (data DSL) and show how it may help you to dynamically build and compose SQL queries up to macros and query analysis.
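
    The talk is about Clojure, but the underlying idea of composing queries from data rather than strings can be sketched in Python with psycopg2's sql module (the users table is invented):

      from psycopg2 import sql

      # The query is assembled from data (a list of column names, identifiers),
      # not by string concatenation, so it stays composable and injection-safe.
      columns = ["id", "name", "email"]
      query = sql.SQL("SELECT {fields} FROM {table} WHERE {key} = %s").format(
          fields=sql.SQL(", ").join(sql.Identifier(c) for c in columns),
          table=sql.Identifier("users"),
          key=sql.Identifier("id"),
      )
      # cursor.execute(query, (42,)) renders and runs the composed statement.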

  • Anton Doroshkevich, ИнфоСофт
    Teodor Sigaev, Postgres Professional
    45 min

    1С ERP Platform + Postgres = ...

    Postgres Professional CTO Teodor Sigaev and 1C ERP platform expert Anton Doroshkevich will discuss the existing maintenance issues for 1C ERP on Postgres and their potential solutions.

  • Andrey Fefelov, Mastery.pro
    22 min

    How-to obfuscate Postgres database for load testing in web apps

    Postgres is a well-known database for high-load web applications. Such apps require stress/load testing to run properly in production. Besides the obvious difficulties of preparing a test environment identical to production and generating proper traffic, there is another issue: preparing the database for the test environment. In the era of personal data protection (GDPR, HIPAA), it is not acceptable to use the production database in the testing environment. Data obfuscation comes to the rescue.

    There are a few tools for data obfuscation in Postgres. During this session, I will tell you which of them we selected and why, what kinds of issues we faced, and whether our solution was successful. You will learn whether it is possible to get identical response times on a test database without real production data; we will also look at some restrictions on obfuscation, and I'll present our utility that simplifies things.
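
    As a trivial illustration of the idea (not the tool presented in the talk), personal fields can be replaced with deterministic hashes so that joins and cardinalities survive while the real values do not; the table is invented:

      import psycopg2

      OBFUSCATE = """
      UPDATE users
      SET email     = md5(email) || '@example.com',
          full_name = 'user_' || md5(full_name),
          phone     = NULL
      """

      with psycopg2.connect("dbname=staging_copy") as conn, conn.cursor() as cur:
          cur.execute(OBFUSCATE)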

  • Igor Kosenkov, Postgres Professional
    90 min

    High availability cluster using crmsh

    Some OS distributions do not ship the pcs configuration utility used to create a PostgreSQL high availability cluster. In this case, the crm utility from the crmsh package will help us. It is more difficult to use, but powerful and effective.

    In my master class, I will show how to use this utility and how to set up a failover cluster in different configurations.

  • Alexey Lesovsky, Data Egret
    22 min

    Noisia - a PostgreSQL incident generator

    Noisia is the result of my past attempts to gather all the tools for reproducing various incidents with Postgres.

    Noisia is a utility for easily creating artificial incidents in a Postgres database. This talk will describe the reasons for creating artificial incidents, what Noisia does, and its possible uses. I will also share the future development roadmap.

  • Anton Doroshkevich, ИнфоСофт
    45 min

    Postgres Pro Data Compression (CFS) for 1C ERP platform data

    Postgres Pro Enterprise has a great compression engine (CFS). The year 2020 was devoted to studying this mechanism in real 1C workloads. We have accumulated statistical data, as well as insights into the subtleties of its use and behavior with 1C compared to other popular DBMSs, which I want to share.

  • Andrey Zubkov, Postgres Professional
    45 min

    New features of pg_profile/pgpro_pwr - historical workload profiler

    This talk is about the postgres extension pg_profile, a simple historical database workload profiler. I'll describe its architecture, features, and use cases. There is a new branch of pg_profile called pgpro_pwr, designed to run on Postgres Pro Enterprise Edition and Postgres Pro Standard Edition databases. It uses the extended performance statistics of those databases, which provides some valuable benefits.

  • Andreas Scherbaum, Pivotal
    45 min

    Managing PostgreSQL with Ansible

    Ansible is an open-source configuration management and deployment tool, which can be used to manage servers and software installations. This talk will briefly cover Ansible itself, and then explain how Ansible is used to install and configure PostgreSQL on a server. Examples will round up the talk.

  • Simon Riggs, Enterprise DB
    45 min

    PostgreSQL and the SQL Standard

    PostgreSQL is one of the most standards-compliant databases available. This talk discusses the impact that PostgreSQL has made on the SQL Standard and the features PostgreSQL has implemented from the SQL Standard, as well as upcoming features in PG14 and beyond.

  • Ibrar Ahmed, Percona LLC
    90 min

    High-Performance PostgreSQL

    PostgreSQL is one of the leading open-source databases. Out of the box, the default PostgreSQL configuration is not tuned for any particular workload. The default configuration is designed in such a way that PostgreSQL can run on any system using minimum resources; consequently, a default installation does not give optimum performance on a high-performance machine because it is not set up to use all the available resources. PostgreSQL provides mechanisms that allow you to tune your database according to your workload and machine specification. Outside of PostgreSQL, we can also tune the Linux kernel to allow the database load to work optimally. In this talk, we will learn how to tune some of PostgreSQL's parameters and see the effect of that tuning, but we will focus on demonstrating how to tune Linux for better Postgres performance. As there are many Linux kernel parameters that can be tuned to improve the performance of PostgreSQL, I will also share benchmark results obtained when tuning some of them.

  • Darafei Praliaskouski, Juno
    22 min

    PostGIS 3.1 news

    PostGIS is a spatial extension to the PostgreSQL database. This talk will be like reading the changelog aloud, with comments and real-world usage examples from a core developer.

  • Mikhail Tsvetkov, Intel
    45 min

    Intel Technologies for PostgreSQL

    In this presentation, we'll discuss Intel products and solutions intended for the Data Platform Group segment, such as 3rd Gen Xeon (4S Cooper Lake) server CPUs, PMem 200 Series persistent memory, and FPGAs.

  • Julien Rouhaud
    22 min

    Don't fear your next glibc upgrade

    PostgreSQL relies on the system collation libraries, such as glibc or ICU, for text ordering. One known caveat is that when the library changes its sort order for a collation, any index created using the old order is likely to become corrupted once the new version of the library is installed.

    In this talk, we'll see the improvements made in PostgreSQL 14 to keep track of collation versions and to detect and fix possible index corruption due to library upgrades, as well as the work currently being done to further improve this area.
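
    Independent of the new PostgreSQL 14 machinery described in the talk, the long-standing manual remediation after a collation library upgrade looks roughly like this; the index and collation names are placeholders:

      import psycopg2

      with psycopg2.connect("dbname=demo") as conn, conn.cursor() as cur:
          # Rebuild an index whose order may no longer match the new library ...
          cur.execute("REINDEX INDEX users_name_idx")
          # ... and record the collation version the library currently reports,
          # so that Postgres stops warning about a mismatch.
          cur.execute('ALTER COLLATION "en_US" REFRESH VERSION')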

  • Fabrízio Mello, OnGres Inc
    Álvaro Hernández, OnGres
    45 min

    PostgreSQL Network Filter for EnvoyProxy

    How do you monitor Postgres? What information can you get out of it, and to what degree does this information help to troubleshoot operational issues? What if you want or need to log all the queries? That may bring heavily trafficked databases down.

    At OnGres we're obsessed with improving PostgreSQL's observability. So we worked together with Tetrate folks on an Envoy Network Filter extension for PostgreSQL, to provide and extend observability of the traffic in and out of a cluster infrastructure. This extension is public and open source. You can use it anywhere you use Envoy. It allows you to capture automated metrics and to debug network traffic. This talk will be a technical deep dive into PostgreSQL's protocol decoding and Envoy proxy filters, and will cover all the capabilities of the tool and its usage and deployment in any environment.

  • Dmitry Ursegov, Postgres Professional
    45 min

    Shardman - the native approach to sharding in PostgreSQL

    The amount of data handled by enterprises and web companies is constantly growing. At the same time, it becomes increasingly difficult to keep and synchronize several copies of the data in different systems. As a result, there is demand to work with large amounts of data directly in a transactional DBMS. This requirement is often imposed by the logic of applications that need real-time results. In this talk we will consider what a universal distributed transactional DBMS can look like. We will analyze such aspects as the types of load and their prioritization, dynamic resource allocation, and the level of consistency, and discuss what tools in PostgreSQL can be used to build such a system, what we have already done, and what is still missing.

  • Daniele Varrazzo, Codice Lieve
    45 min

    psycopg3: all the love between Python and PostgreSQL

    Python is today one of the most used programming languages in the world: simple to learn and to use and ready to interface to any known service and protocol. psycopg2 is the most used PostgreSQL driver for Python: it offers good performance and makes the communication between the language and the database as smooth as possible.

    Python has evolved enormously in the past years and its first-class support for async programming is changing the way new programs are written. PostgreSQL has evolved too: a new generation of the driver is needed to make the most of all the features it has to offer.

    psycopg3 is the new generation of the most used Python-PostgreSQL adapter: it offers a familiar interface and smooth upgrade path, but behind the scenes it is engineered to obtain the best performance from the database and the language: async programming, prepared statements, binary parameters.

    psycopg3 is also experimenting with innovative JSONB support and query pipelining! Come and discover the forefront of the research between your most loved language and database!
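
    A small taste of what the new driver looks like with asyncio, based on the pre-release documentation (details may still change before the final release):

      import asyncio
      import psycopg  # the psycopg3 project is imported as "psycopg"


      async def main():
          async with await psycopg.AsyncConnection.connect("dbname=demo") as conn:
              async with conn.cursor() as cur:
                  # Parameters are bound server-side; no client-side string mangling.
                  await cur.execute("SELECT %s::int + %s::int", (2, 3))
                  print(await cur.fetchone())


      asyncio.run(main())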

  • Henrietta Dombrovskaya, Braviant Holdings
    45 min

    NORM - No ORM Framework

    It's a well-known fact that although database performance is great and each query is executed in milliseconds, the overall application response time may be slow, making users wait for an extended period of time. We know that the problem is not the database, but the way application developers communicate with the database. Specifically, we are talking about ORMs - Object-Relational Mappers. Database developers hate them, but application developers love them, because they allow developing applications without any knowledge of database internals. As a result, the system performance is often unacceptably slow.

    The only way to change this behavior is to provide application developers with a tool that is as easy to use as an ORM but allows escaping the common ORM pitfalls. That's why we developed NORM - No-ORM Framework. During this presentation, we will go over examples of code from the https://github.com/hettie-d/NORM repo and learn how to build "transport objects" for efficient data transfer between applications and databases.

  • Robert Haas, EnterpriseDB
    45 min

    Avoiding, Detecting, and Recovering From Corruption

    PostgreSQL databases can become corrupted for a variety of reasons, including hardware failure, software failure, and user error. In this talk, I’ll talk about some of my experiences with database corruption. In particular, I’ll mention some of the things which seem to be common causes of database corruption, such as procedural errors taking or restoring backups; some of the ways that database corruption most often manifests when it does occur, such as errors indicating inconsistencies between a table and its indexes or a table and its toast table; and a little bit about techniques that I have seen used to repair databases or recover from corruption, including some experiences with pg_resetxlog. This talk will be based mostly on my experiences working with EnterpriseDB customers; I hope that it will be useful to hackers from the point of view of thinking about possible improvements to PostgreSQL, and to end users from the point of view of helping them avoid, diagnose, and cope with corruption.
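
    One concrete detection technique worth knowing (and not necessarily one the speaker will cover) is the contrib module amcheck, which verifies B-tree index structure; a sketch:

      import psycopg2

      with psycopg2.connect("dbname=demo") as conn, conn.cursor() as cur:
          cur.execute("CREATE EXTENSION IF NOT EXISTS amcheck")
          # List every B-tree index in the public schema, then verify each one;
          # an error raised by bt_index_check() points at a corrupted index.
          cur.execute("""
              SELECT c.relname
              FROM pg_index i
              JOIN pg_class c ON c.oid = i.indexrelid
              JOIN pg_am a ON a.oid = c.relam AND a.amname = 'btree'
              JOIN pg_namespace n ON n.oid = c.relnamespace AND n.nspname = 'public'
          """)
          for (idx,) in cur.fetchall():
              cur.execute("SELECT bt_index_check(%s::regclass)", (idx,))
              print("ok:", idx)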

  • Daria Vilkova, Postgres Professional
    22 min

    Zabbix Agent 2: new features and how to configure the agent for PostgreSQL monitoring

    In Zabbix Server 5.0.1 the PostgreSQL monitoring plugin has become available for Zabbix Agent 2. It was developed by Postgres Professional in collaboration with Zabbix. In the presentation, we will talk about how the plugin works, discuss some options for its configuration, as well as how to add custom metrics to it.

  • Daniele Varrazzo, Codice Lieve
    90 min

    Python for PostgreSQL: how to use it, how to be good at it

    Let's see, with practical examples, how to make Python and PostgreSQL talk to each other seamlessly: how to connect to a server, exchange data, manage transactions, pass parameters in a safe and expressive way, and handle notifications.

    We will cover psycopg2, the most used PostgreSQL adapter for Python, but also look at the up-and-coming psycopg3: what will remain the same, what will change, and how to better organise a Python program to make the most of PostgreSQL.
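
    As a preview of the style of material covered, a compact psycopg2 example (the tasks table and channel name are invented):

      import psycopg2

      with psycopg2.connect("dbname=demo") as conn:
          with conn.cursor() as cur:
              # Parameters are passed separately and adapted safely by the driver;
              # never build SQL by string formatting.
              cur.execute(
                  "INSERT INTO tasks (title, done) VALUES (%s, %s) RETURNING id",
                  ("write abstract", False),
              )
              task_id = cur.fetchone()[0]
              # Notify other sessions that are LISTENing on the channel.
              cur.execute("SELECT pg_notify('tasks_changed', %s)", (str(task_id),))
      # Leaving the outer `with` block commits the transaction (or rolls it back
      # on error); the connection itself stays open until conn.close() is called.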

  • Kohei KaiGai, HeteroDB
    45 min

    GPU version of PostGIS and GiST-Index

    This talk introduces the GPU version of PostGIS and GiST-index that we have developed as a new feature of PG-Strom.

    Nowadays, our devices (like mobile phones) constantly generate geolocation data, which is often utilized for area marketing, push delivery, disaster notification, and so on. People often use GIS technology to select users based on their current location. Even if area definitions are complicated polygons, PostGIS functions can compute the right intersections; however, this is often a highly compute-intensive workload.

    GPUs are designed for massively parallel computing, with thousands of cores per chip, and we have developed the PG-Strom extension to run parts of SQL workloads on GPU devices. The upcoming PG-Strom v3.0 newly supports several PostGIS functions and GiST indexes for these compute-intensive geolocation workloads.

    In this talk, we will introduce the technology background, usage, implementation, and benchmark results of the GPU version of PostGIS and GiST-index.
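
    As a flavour of the workload in question, here is a typical point-in-polygon query over a GiST index in plain PostGIS (the table is invented); PG-Strom's GPU path aims to accelerate exactly this kind of scan:

      import psycopg2

      # Assumes: CREATE INDEX ON service_areas USING gist (geom);
      SQL = """
      SELECT a.area_id
      FROM service_areas a              -- geom geometry(Polygon, 4326)
      WHERE ST_Contains(a.geom, ST_SetSRID(ST_MakePoint(%s, %s), 4326))
      """

      with psycopg2.connect("dbname=gisdemo") as conn, conn.cursor() as cur:
          cur.execute(SQL, (37.62, 55.75))  # longitude, latitude
          print(cur.fetchall())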

  • Amit Kapila, Fujitsu
    45 min

    What's next in Logical Replication?

    Logical replication has been available since PostgreSQL 10, and it keeps getting better with each release. This talk will start with the basic architecture of logical replication in PostgreSQL and then cover the various ways in which it can be helpful to users.

    One of the shortcomings of logical replication as compared to physical replication is that currently a transaction is replicated only once it is committed. This can create a large apply lag on the subscriber side for long-running transactions. We will discuss the solution implemented for this problem in PostgreSQL.

    We will also discuss the other major work being done in logical replication, which is to allow transactions to be replicated at prepare time. This will help us implement conflict-free logical replication. It can be used for scaling out reads as well: because of 2PC, we can ensure that subscribers have all the data committed on the master. We can then design a system where different nodes own certain sets of tables, the data of those tables can always be read reliably from those nodes, and an external process routes the reads accordingly.

    In the end, the talk will cover the new enhancements and improvements related to logical replication in recent PostgreSQL releases.
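
    For readers new to the feature, the basic building blocks look like this (host, table, and credential names are placeholders; CREATE SUBSCRIPTION cannot run inside a transaction block, hence autocommit):

      import psycopg2

      # On the publisher (requires wal_level = logical):
      pub = psycopg2.connect("host=primary dbname=shop")
      pub.autocommit = True
      pub.cursor().execute("CREATE PUBLICATION shop_pub FOR TABLE orders, customers")

      # On the subscriber (the tables must already exist with a matching schema):
      sub = psycopg2.connect("host=replica dbname=shop")
      sub.autocommit = True
      sub.cursor().execute(
          "CREATE SUBSCRIPTION shop_sub "
          "CONNECTION 'host=primary dbname=shop user=repl password=secret' "
          "PUBLICATION shop_pub"
      )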

  • Artem Kartasov, Postgres.ai
    45 min

    The Catcher of WAL-G

    What do we expect from a backup system? What makes a good backup system stand out? And, most importantly, how do you select the right tools for this process? When preparing backups, many pressing questions arise. In my talk, I will tell you a story of building a system for taking and verifying backups in one company. We will discuss the issues of choosing a tool for working with backups, adapting to changing realities, the problems of cloud storage, and the limitless possibilities of open-source collaboration. This talk will comprise the experience I gained in two years. You'll take the same journey, but much faster. Welcome aboard!

  • Dmitry Dolgov, Zalando SE
    45 min

    How many engineers does it take to make subscripting work?

    Recently landed in PostgreSQL, jsonb subscripting support doesn't look as exciting as some other improvements around jsonb. But its user-visible changes are only the tip of the iceberg. How many people were involved in making it, and what design choices were made? How long did it take, and what are good and bad ways to promote a patch? These and a few other questions will be our targets in this talk.
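
    For reference, the user-visible part is that in PostgreSQL 14 jsonb values can be read and written with array-style subscripts; a small sketch:

      import psycopg2

      with psycopg2.connect("dbname=demo") as conn, conn.cursor() as cur:
          cur.execute("CREATE TEMP TABLE profiles (id int, data jsonb)")
          cur.execute("""INSERT INTO profiles
                         VALUES (1, '{"address": {"city": "Moscow"}}')""")
          # Read with a subscript instead of the -> operator ...
          cur.execute("SELECT data['address']['city'] FROM profiles WHERE id = 1")
          print(cur.fetchone()[0])  # -> Moscow
          # ... and write through a subscript as well.
          cur.execute("""UPDATE profiles
                         SET data['address']['zip'] = '"101000"' WHERE id = 1""")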

  • Mahmoud SAKR, université libre de bruxelles
    Esteban Zimányi, ULB
    90 min

    Managing moving objects data with MobilityDB

    MobilityDB is a moving object database extension to PostgreSQL and PostGIS. It has types and functions for storing and querying geospatial trajectories as first-class citizens. The main type is called tgeompoint (temporal geometry point). It represents the complete movement track of a geometry point, such as a car, a bird, or a person. The function speed(tgeompoint) computes the time-varying speed of the object as a tfloat (temporal float). Similar to these examples, MobilityDB has 6 temporal types and over 300 functions. As such, it is a function-rich platform for mobility data management.

    In this tutorial you will:

    • learn about moving object databases
    • write MobilityDB SQL queries and explore a database of geospatial trajectories
    • walk through the different types, indexes, and functions of MobilityDB.

  • Sushant Pandey, Microsoft
    Alicja Kucharczyk, Microsoft
    22 min

    The Story About The Migration

    This is the story of a challenging PoC that proved that Postgres can achieve the same performance as Oracle Exadata. The schema that was migrated wasn't the simplest one you might see - quite the opposite. The code was loaded with dynamic queries, BULK COLLECTs, nested loops, CONNECT BY statements, global variables, and a lot of dependencies. Ora2pg did a great job converting the schema but left a lot of work to be done manually. The estimates produced by the tool were also highly inaccurate, since the logic required not just migration but a total re-architecture of the code. In this talk we want to present how a Microsoft team composed of people from two different teams approached the project, solved the migration issues using ora2pg, and was able to prove that Postgres Single Server can perform as well as Oracle Exadata. We will present our ways of working and also some of the main technical challenges we faced, including:

    • How estimates do (not) work
    • How we handled BULK COLLECT’s
    • Why we got rid of refcursors
    • How we got stuck testing one of the packages and how help from a friend solved the problem
    • How we handled hierarchical queries and drilling down the hierarchy
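
    On the last point, the usual Postgres counterpart of Oracle's CONNECT BY is a recursive CTE; a generic sketch with an invented employee table:

      import psycopg2

      # Oracle:  SELECT ... FROM emp
      #          START WITH manager_id IS NULL
      #          CONNECT BY PRIOR emp_id = manager_id;
      SQL = """
      WITH RECURSIVE subordinates AS (
          SELECT emp_id, manager_id, full_name, 1 AS level
          FROM emp
          WHERE manager_id IS NULL
        UNION ALL
          SELECT e.emp_id, e.manager_id, e.full_name, s.level + 1
          FROM emp e
          JOIN subordinates s ON e.manager_id = s.emp_id
      )
      SELECT * FROM subordinates ORDER BY level
      """

      with psycopg2.connect("dbname=demo") as conn, conn.cursor() as cur:
          cur.execute(SQL)
          print(cur.fetchall())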

  • Dimitri Fontaine
    45 min

    Postgres Architectures in Production

    When using PostgreSQL in production it is important to implement a strategy for High Availability. With a database service, the HA properties apply to both the service itself and of course to the data set.

    In this talk we learn how to think about the specific HA needs of your production environment and how to meet your requirements with open source tooling when it comes to your database of choice, PostgreSQL. In particular, we dive into many options that could be implemented for Postgres to evolve its offering from being a toolset to being "batteries included". What does that mean in the context of HA? How can it be achieved?

  • Ivan Frolkov, Postgres Professional
    22 min

    Trying to gain peace of mind by using constraints

    There's a common delusion that constraints should never be used as they affect performance in a negative way, interfere with regular work, and are, all in all, useless. The database is commonly perceived as just a storage without any logic. I'll explain why it isn't so and what this careless approach may lead to.
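
    A small sketch of the kind of declarative safety net the talk argues for (the schema is invented; btree_gist is needed for the equality part of the exclusion constraint):

      import psycopg2

      DDL = """
      CREATE EXTENSION IF NOT EXISTS btree_gist;

      CREATE TABLE rooms (id int PRIMARY KEY);

      CREATE TABLE bookings (
          id      bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
          room_id int  NOT NULL REFERENCES rooms (id),
          guests  int  NOT NULL CHECK (guests BETWEEN 1 AND 10),
          period  tstzrange NOT NULL,
          -- No two bookings of the same room may overlap in time.
          EXCLUDE USING gist (room_id WITH =, period WITH &&)
      );
      """

      with psycopg2.connect("dbname=demo") as conn, conn.cursor() as cur:
          cur.execute(DDL)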

  • Alexander Liubushkin, ООО "ФОРС Телеком"
    Yulia Golubeva, ООО "ФОРС Телеком"
    22 min

    New development of LUI (Live Universal Interface) - LUI4ORA2PG, migration tool

    This talk will present a new tool for migrating application systems from the Oracle environment to the Postgres environment. The tool is built on top of ora2pg (by Gilles Darold) and the domestically developed LUI application development tool. Talks on LUI were given at past PGConf events in 2019 and 2020.

  • Pavel Stehule, freelancer
    22 min

    How to use pspg

    pspg is a Unix pager developed specifically for use with the psql Postgres client. Today it allows more than the usual browsing of data: it can work in application mode, or as a CSV or TSV viewer. I will try to show the main capabilities of this application.

  • Konstantin Evteev, X5 FoodTech
    45 min

    Building real-time analytics with PostgreSQL

    In today's world, operational reporting and real-time analytics are becoming a basic need. There are a huge number of tools, practices, and approaches, which in turn require different expertise and resources. In this talk, I will tell you how to transform your analytics using PostgreSQL. We'll discuss pitfalls when using different schemes. We'll talk about data quality and performance. I expect this talk to be of interest to both beginners and seasoned practitioners with many years of experience (discussions and questions after the talk will be highly appreciated). The talk outline is as follows:

    1. The evolution of reporting - migration from OLTP to OLAP.
    2. Data delivery to DWH and related challenges.
    3. Scaling architecture in response to growing data volumes.
    4. Data quality issues.
    5. Maintaining stability with a large number of changes.
    6. Different approaches to organizing the work of the DWH team.
    7. And, finally, we'll list the challenges we have successfully responded to with various solutions (pgAgent, PGWatch, working with the file system, the new reading of postgresql.conf).

  • Ivan Chuvashov, ООО Calltouch
    45 min

    Life of DBA in the OKKO online cinema

    Okko is one of the largest legal online cinemas in Russia. Our catalog contains 60,000 movies, cartoons, and TV series. Since its launch, the service has been visited by over 20 million users. The monthly audience is 2.8 million people. All these figures speak of a reliable high-load service.

    As DBA, I will mainly talk about databases (PostgreSQL, Cassandra, Redis) that are used in the company. We'll take a closer look at PostgreSQL on the topics of high loads, monitoring, optimization, backup and recovery.

  • Daniel Westermann, dbi services
    45 min

    How to get data from Oracle to PostgreSQL and vice versa

    PostgreSQL has become a reality in a lot of shops today. In most cases PostgreSQL is established beside the current Oracle deployment, and quite soon one question pops up regularly: how can we push data from Oracle to PostgreSQL and vice versa? Way back, in March 2001, a new extension to the SQL standard made its way in to define common APIs for managing external data: SQL/MED (ISO/IEC 9075-9:2008). The PostgreSQL community picked that up quite fast and implemented a framework for plugging in so-called foreign data wrappers back in 2011 with PostgreSQL 9.1. Since then a wide range of these foreign data wrappers has popped up, and thanks to them, PostgreSQL today is able to integrate data from almost every external source, no matter if it is coming from flat files, other relational database systems, or even unstructured sources. In this talk we will look at the foreign data wrapper for Oracle and how it can be used to get data from Oracle to PostgreSQL. But this is not a one-way game: data can also be pushed from PostgreSQL to Oracle, and this might become important depending on the requirements. This talk is guaranteed to be split in half: slides and a lot of demos.
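
    For reference, the setup with the oracle_fdw foreign data wrapper looks roughly like this; the connection details are placeholders, and the full option list is in the oracle_fdw documentation:

      import psycopg2

      SETUP = """
      CREATE EXTENSION IF NOT EXISTS oracle_fdw;

      CREATE SERVER oradb FOREIGN DATA WRAPPER oracle_fdw
          OPTIONS (dbserver '//ora-host:1521/ORCL');

      CREATE USER MAPPING FOR current_user SERVER oradb
          OPTIONS (user 'scott', password 'tiger');

      CREATE FOREIGN TABLE ora_emp (
          empno integer,
          ename text,
          sal   numeric
      ) SERVER oradb OPTIONS (schema 'SCOTT', table 'EMP');
      """

      with psycopg2.connect("dbname=demo") as conn, conn.cursor() as cur:
          cur.execute(SETUP)
          cur.execute("SELECT count(*) FROM ora_emp")  # data now flows from Oracle
          print(cur.fetchone())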

  • Robert Bernier, Percona
    45 min

    pg_upgrade, Advanced Techniques

    When it comes to performing Postgres upgrades between major versions, the command-line utility pg_upgrade is the most popular method in use today. But as with all things wonderful, there are caveats. One of the more critical issues is what to do when there's a failure. The purpose of this talk is to present those little-covered "tricks" of the trade that allow one to improve and enhance the upgrade experience.

    We'll start out by reviewing its basic mode of operation. We'll then learn what makes it so darn fast when upgrading a multi-terabyte data cluster in a matter of minutes. Finally, you will be introduced to those feared failures and how to deal with them with confidence and certainty.

    Herein is a summary of the topics in this talk:

    • How pg_upgrade works: the big picture
    • About pg_upgrade (command-line invocation)

      • arguments and options

    • The steps in performing an upgrade
    • About the replication ROLE

      • with replication
      • with login

    • Upgrade options: COPY vs Hard Link
    • Post-upgrade

      • about performance
      • about analyze
      • repack
      • reindexing

    • When something goes wrong at the point of no return (or is it?)
    • Upgrading the REPLICA

      • default method: pg_basebackup
      • the cool method: leveraging rsync

        • caveat: about vacuum wraparound

  • Pavel Borisov, Postgres Professional
    45 min

    Speed up your fast text search queries with RUM index

    Fast text search queries can be made even faster by indexing the lexemes inside compound records of the tsvector format. The RUM index is provided by an open-source PostgreSQL extension. It is a big improvement over the GIN index: it can index lexemes together with additional information, e.g. tsvector lexeme weight marks, and therefore supports more of tsvector's capabilities.

    Until recently, the results of weight-containing queries had to be rechecked against the table. My modification (2020) makes the processing of this kind of query index-only and therefore much faster.

    I will also describe and provide benchmarks for different fast text search use cases. We'll see how the RUM index can improve performance, and compare it with PostgreSQL's built-in GIN index.
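
    A minimal sketch of the index and the kind of weight-restricted query involved (operator class name as in the rum extension's documentation; the table is invented):

      import psycopg2

      SETUP = """
      CREATE EXTENSION IF NOT EXISTS rum;
      CREATE TABLE docs (id serial PRIMARY KEY, body text, tsv tsvector);
      CREATE INDEX docs_tsv_rum ON docs USING rum (tsv rum_tsvector_ops);
      """

      # ':A' restricts matches to lexemes carrying weight A; with the 2020 change
      # described in the talk such queries can be answered from the index alone.
      QUERY = "SELECT id FROM docs WHERE tsv @@ to_tsquery('english', 'postgres:A')"

      with psycopg2.connect("dbname=demo") as conn, conn.cursor() as cur:
          cur.execute(SETUP)
          cur.execute(QUERY)
          print(cur.fetchall())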

  • Yana Krasteva, Swarm64
    22 min

    Modern DWH with open-source PostgreSQL

    PostgreSQL has a long history in DWH. Netezza, Redshift, and Greenplum have turned specific PostgreSQL releases into DWH solutions. Nowadays, with the trends in PostgreSQL towards performance improvements (better partitioning, better statistics, JIT Compilation, etc.) and advanced PostgreSQL extensions, like the Swarm64 Data Accelerator, you can create a forward-looking, no lock-in, versatile, and reliable DWH. This talk will cover the PostgreSQL and DWH trends and touch on key arguments for choosing open source PostgreSQL for DWH.

  • Alexander Nikitin, ЗАО ЦФТ
    22 min

    Pitfalls we face when cloning and transferring PostgreSQL databases & clusters

    Cloning and transferring PostgreSQL databases & clusters often looks simple.

    However, you can get confused while performing these simple operations, too. During my presentation, I will explain which pitfalls you may face while cloning and transferring PostgreSQL databases & clusters. We'll see what can be done to improve the performance of these operations and list the unexpected issues that arise while performing these seemingly simple operations.