Talks and tutorials
Postgres monitoring with USE and RED
Talk russianBrendan Gregg’s USE (Utilization, Saturation, Errors) method for monitoring is quite known. There’s also Tom Wilkie’s RED (Rate, Errors, Durations) method, which is suggested to be better suited to monitor services than USE. I want to talk about how we employ these methodologies when we develop our Postgres monitoring in okmeter.io.
Slides
Video
Forward to the past
Talk russianSometimes there is a great desire to return the database to the past, for a day or two or more days. The reasons are diverse, but most often one is to see what has changed. Or to see if the application behaved incorrectly after the update. Or it was just a command from the boss. The classic way everyone knows is to keep full backups and sets of WAL-logs to be able to recover to an arbitrary moment. This method is a real headache for DBAs/administrators, and it will not work quickly. Sure, there are some ways to optimize this process, but downtime is inevitable. PostgresPro offers a new way — database snapshots and the ability to return to them.
Slides
Video
Training and certification: news and roadmap
Talk russianThe first training course "DBA1. PostgreSQL Administration. Basic Course" was introduced by Postgres Professional in September 2015. The course was based on the then up-to-date version of PostgreSQL 9.4. A set of new courses has been introduced since then. In turn, the PostgreSQL database didn't stand still either.
The report will present the current line of courses for administrators and application developers. We will also show the roadmap for the near future, which includes specialist certification.
Slides
Video
Integrating PostgreSQL servers into corporate network
Talk russianMeeting corporate standarts for information security, business continuity and software unification: Kerberos Authentification (Windows and Linux) in Active Directory Environment. 1C Enterprise specifics. Using backup and recovery software (HP Data Protector). Integration with corporate monitoring system (Solarwinds Mointor).
Slides
Video
Troubleshooting in Postgres with pgCenter
Talk russianSometimes problems arise during Postgres operation, and the faster they are identified and resolved, the happier users eventually will be. pgCenter is a set of CLI powerful utilities for troubleshooting in the "here and now" mode. In the talk I will show how to use pgCenter for efficient troubleshooting, the directions in which to search, and how to respond to certain problems, in particular how to:
- check if Postgres is in the normal state;
- identify promptly the faulty clients and stop them;
- reveal too heavy queries;
- and other tips and tricks of pgCenter.
Slides
Video
The Power of Logical Replication
Talk russianOne of the most soft after features of Postgres v10 is logical replication. In this presentation we will cover what Logical Replication is, how it compares to Binary (Streaming Replication), how Logical Replication works, configuring Logical Replication, Logical Replication limitations, gotchas, security and management. We will also discuss potential deployed architectures with Logical and Binary Replication and some neat features of the underlying technology.
At the end of this presentation an attendee with a reasonable understanding of how to manage Postgres will be able to configure Logical replication for use.
Video
MobilityDB: A PostgreSQL extension for mobility data management
Talk russianWe will be presenting MobilityDB, a PostgreSQL extension that extends the type system of PostgreSQL and PostGIS with abstract data types for representing moving object data. These types can represent the evolution on time of values of some element type, called the base type of the temporal type. For instance, temporal integers may be used to represent the evolution on time of the number of employees of a department. In this case, the data type is “temporal integer” and “integer” is the base type. Similarly, a temporal float may be used to represent the evolution on time of the temperature of a room. As another example, a temporal point may be used to represent the evolution on time of the location of a car, as reported by GPS devices. Temporal types are useful because representing values that evolve in time is essential in many applications, for example in mobility applications.
The temporal types in MobilityDB are based on the bool, int, float, and text base types provided by PostgreSQL, and on the geometry and geography base types provided by PostGIS (restricted to 2D or 3D points). MobilityDB follows the ongoing OGC standards on Moving Features (http://www.opengeospatial.org/standards/movingfeatures), and in particular the OGC Moving Features Access, which specifies operations that can be applied to time-varying geometries.
A rich set of functions and operators is available to perform various operations on temporal types. In general there are three classes:
- Lifed functions and operators: the operators on the base types (such as arithmetic operators and for integers and floats, spatial relationships and distance for geometries) are intuitively generalized when the values evolve in time. Spatiotemporal functions in MobilityDB generalize spatial functions provided by PostGIS for both "geometry" and "geography" types, for instance the "ST_Intersection". Basically, MobilityDB takes care of the temporal aspects and delegates the spatial processing to PostGIS.
- Temporal functions and operators: they process the temporal dimension of the value which can be an instant, a range, an array of instant, or an array of ranges. Examples are the atperiods function that restricts a temporal type to a given array of time ranges, and the duration function that extracts the definition time of a value.
- Spatiotemporal functions and operators: all remaining functions fall into this category. Examples are speed(tgeompoint/tgeogpoint), azimuth(tgeompoint/tgeogpoint), maxValue(tfloat/tint), twAvg(tfloat) a time weighted average, etc.
Both GiST and SP-GiST have been extended to support the temporal types. The GiST index implements an R-tree for temporal alphanumeric types and a TB-tree for temporal point types. The SP-GiST index implements a Quad-tree for temporal alphanumeric types and an Oct-tree for temporal point types. The approach used for developing SP-GIST indexes for MobilityDB allowed us to add SP-GIST indexes for 2-dimensional, 3-dimensional and n-dimensional geometries in PostGIS.
Two types of numeric aggregate functions are available. In addition to the traditional functions min, max, count, sum, and avg, there are window (also known as cumulative) versions of them. Given a time interval w, the window aggregate functions compute the value of the function at an instant t by considering the values during the interval [t − w, t]. In contrast to standard aggregation, temporal aggregation may return a result which is of a bigger size than the input. For this reason, the temporal aggregate functions have been extremely optimized in order to perform efficiently.
MobilityDB has a preliminary implementation of the statistic collector functions and the selectivity functions for the temporal types.
In terms of size, the extension is made of 67k lines of C code, 19k lines of SQL code, 67k lines of SQL unit tests. It defines 40 types, 2300 functions, and 1350 operators.
The talk will illustrate the spatiotemporal concepts and the data model of the temporal types. It will briefly describe the components of MobilityDB: indexing, aggregations, functions and operators, and the SQL interface. Query examples and uses cases will be illustrated allover the talk. The current status of MobilityDB and the planned development will also be presented.
The talk shall be given by: - Esteban Zimányi: Professor and Director of the Department of Computer and Decision Engineering of the Universite Libre de Bruxelles. - Mahmoud SAKR: Postdoc researcher in the Universite Libre de Bruxelles.
Slides
Video
PostgreSQL bottlenecks
Talk russianIt's so good when database behaves predictable. When the performance is lacking, you just add CPU cores, terabytes of RAM and millions of IOPS, and everything becomes good again. But it's rather unpleasant, when server have plenty of free resources, while database is still running slow. And it's especially sad if stress testing detects no problems, while real life workload of the same volume makes your database hang.
In this talk I will consider bottlenecks of PostgreSQL, which we met in our practice, and which causes sad behavior described above. I'll also explain what can be done at user level in order to evade these bottlenecks, and what developers are planning to do in order to eliminate those bottlenecks. I'm also planning give some recipes of stress testing, which could have to evade surprises in production.
Slides
Video
How we cook PostgreSQL at 2GIS in the DevOps age
Talk russianThe dark age for PostgreSQL started at 2GIS after transitioning to the microservice architecture. Every team tried to cook database on their own — by installing instances, juggling versions, trying to code deployments with numerous tools or using manual operations. It was the right time to develop a “silver bullet” — a common set of tools to solve all the problems at once. We created our own cluster solution based on well-known PostgreSQL, repmgr, pgbouncer and Barman. Despite of the complexity of our final solution, we developed a repeatable flexible deployment to accelerate postgresql cluster deployment and management. Also we deployed the our own cluster to consolidate all databases. It helped to eliminate team efforts for database management and focus on their main goals. Failover works, we tried it :-)
Slides
Video
High Load Distributed Control Systems for Nuclear Power Plants operation
Talk russianThis talk will represent a new platform of Distributed Control System for Nuclear Power Plant operation. Participants will learn about control system for very complicated automation objects. In a hard real time node more than 150 special subsystems are operating in order to control various technological processes of nuclear power plant (NPP), such as reactor control system for more than 1000 MW power unit with a turbine weighing more than 2000 tons. More than 100K of data gained from sensors are resulting in up to 500K of parameters representing 5 branches of physical processes: neutron kinetics, hydrodynamics, chemistry and radiochemistry, and physics of strength. Deviations may cause the whole system to become a huge DDoS source made of useful diagnostic information which is always much larger than the network and hardware are capable to manage. This may lead to normal operation failure. The talk will reveal the approaches to solve the issue.
You will learn about hardware and software architecture of such systems, about backup and replication, data redundancy and technological diversity. How to manage high loads, what is QoS, and what will happen in case of normal operation system failure, as for example was at Fukushima. But, hey, there should be a talk about coding! So, no SSD and HDDs, only InMemory, data structures from tens of millions of elements, and forget about processor cache as it does not work. Imagine your newest 4-generation Xeon has lost all the advantages and turned into a "pumpkin", so let's roll up your sleeves and examine timings, synchronicity, and try to make the most of your hardware, discovering the weakest link from processor, operating system and a network.
Slides
Video
What should we expect from PostgreSQL 12?
Talk russianPostgreSQL 12 Feature Freeze is scheduled for April 2019, which didn't come yet. But general shapes of upcoming release are already visible. In this talk I'll consider patches already committed to PostgreSQL 12 as well as patches, which would be committed very likely. I'll talk with special passion about SQL/JSON, Merge, pluggable table access methods and zheap.
Slides
Video
PostGIS for Disaster Management
Talk russianStandby in production
Talk russianMy talk will be about different cases of usage and setup of the standby server; examples how to setup standby linked to your archive( to make an opportunity to recreate standby from archive after primary crashing and promoting your old standby); Avito experience of usage of standby server for read-only queries: problems and solutions; monitoring of standby.
Slides
Video
How not to screw up when building HA cluster
Talk russianYou just set up your first PostgreSQL cluster, created a database schema, loaded some data, did some fine tuning of configuration. Now you want to make your cluster highly available. Unfortunately, PostgreSQL doesn't offer built-in automatic failover, but luckily for us, there are plenty of external tools for that. As a next logical step you start choosing a tool, and... you already doing it wrong, because first you have to define SLA, RTO, and RPO. In this talk I am going to cover most of the common mistakes people do when setting up a highly available cluster.
Slides
Video
Towards more efficient query plans: PostgreSQL 11 and beyond
Talk russianA major responsibility of a database engine is to convert a declarative SQL query to an efficient execution plan, employing various methods to scan and join the relations. There is always a development effort to improve this area. What clever execution plans can PostgreSQL generate, what's new in version 11 and what is in development? To name a few things, the joins are optimized by removing unneeded outer and inner joins, and reducing joins from outer and semi to inner. There is work to enable merge joins on inequality and range overlap, and to improve join selectivity estimates with multi-column statistics. When it comes to scanning a single relation, covering indexes allow to use index-only scans more often. Incremental sort and more precise estimation of sorting costs help generate better paths when sorted output is required, e.g. when using GROUP BY and ORDER BY or performing merge joins. This talk aims to give an overview of such optimizations that already exist and that are being developed now.
Slides
Video
Professional Postgres
Talk russianMajor upgrade for PostgreSQL
Tutorial russianIn this master class, I will take you step-by step through a major upgrade of PostgreSQL. Through our practice we see a lot of different PostgreSQL servers in production environment and often, almost too often teams who once made PostgreSQL their database of choice never update it following the initial installation. The are many different reasons for it, the result, however, is the same - they all miss out on the new useful features of the newer releases and reduced database performance.
The goal of my masterclass is to equip attendees with necessary tools for performing PostgreSQL upgrade.
I will take you through each step of the major upgrade and will dive into each executed command. I will also explain the particular order in which I perform an upgrade and explain the consequences of not following this order or missing a particular step. We will perform an upgrade of PostgreSQL 9.0 to 11.
My hope is that following this masterclass number of outdated PostgreSQL database will reduce since participants will then go back to their databases and make sure that they are running the most recent version.
Video
Using JSONB
Tutorial russianJSONB in PostgreSQL has interesting properties for design and development of business systems with heavy domain, helping to fight against complexity and variability. We will discuss the trade-offs of the JSONB approach. Using an open source project for Health IT - fhirbase we will:
- load synthetic medical data into PostgreSQL
- learn how to search and index this data (gin, jsquery, json-knife)
- use JSON aggregation to build complex queries (GraphQL)
- see how this data can be modified and validated
- discuss architecture consequences of JSONB usage
Technical requirements:
- docker
- docker-compose
Video
Postgres + patroni + consul + s3 + walg + ansible = Simple HA cluster
Tutorial russianPatroni is getting art of state standard framework for building HA clusters with postgres now.
During session we will build simple 3 node cluster using mentioned stack.
We will discuss patroni's architecture, and most interesting parameters from it's configuration. We will check how actually failover works and how could you initialise cluster.
After session you will be able to built such cluster from scratch in minutes using given ansible playbooks.
Slides
Video
NoSQL/PL: Programming in non-SQL-like procedural languages
Tutorial russianLive Universal Interface (LUI) - team development tool WEB interface applications for Postgres
Talk russianOur company has developed a software product Live Universal Interface (LUI) is a tool to quickly create and modify standardized on-screen forms to WEB browsers without compiling code, know just enough SQL. LUI is aimed at B2B, B2G, G2C and B2C segments and intended for use in billing systems, financial management, accounting and control of production, where you must decide first and foremost functional tasks, rather than demonstrate unnecessary graphics elements.
Collective elaboration is provided by storing all elements of program code in the total database, which can be located on the corporate server or in the "cloud" on a third-party site.
Slides
Video
Configuring and profiling VMware virtual infrastructure for PostgreSQL intensive input/output
Talk russianVirtualization in companies has become an alternative to the conservative "one task-one server" approach, which allows efficient use of hardware resources, centralized management of server infrastructure, saving energy and cooling resources. The report explains how to configure the VMware environment for intensive input / output PostgreSQL and profiling tools virtual infrastructure to monitor performance and resolve identified problems.
Slides
Video
Manage and Protect your PostgreSQL data with Commvault
Talk russianReliable backup and recovery, at enterprise level for the PostgreSQL environment. No more traditional backups. CBT (Change Block Tracking) technology is the next generation incremental backup. Faster than snapshots, CBT back up blocks that change, not all of your data, reducing server and network traffic and eliminating the need for traditional backups. Benefits: • Data protection mode close to Real-Time • Update with ease
Slides
Video
PostgreSQL inside Docker
Talk russianCTE Queries Usage for Business Logic
Talk russianTechnical features of porting T-SQL code to plpgsql and data from MS SQL to PG on the example of transition ECM "Priority" to Postgres
Talk russianThis report focuses on the continuation of transferring our ECM “Priority” from MS SQL to Postgres. Technical solutions, issues of rewriting from T-SQL to plpgsql, optimization of the effective code and moving data will be covered. Additionally, there will be considered aspects of pgplsql performance testing to find the “bad code” of pgplsql as a candidate for optimization. The main objective of the presentation is to answer the question: "We have it in T-SQL - how to transfer it in PG?". The report is intended for junior Postgres developers and is a continuation of the previous report made at the conference in 2017(https://youtu.be/v6_4Szr8t14).
Slides
Video
The Ultimate Tool: Xobot IDE
Talk russianIIn the world of programming, the creation of source code for databases "procedural extensions" is something solitary. Most DBMS offers procedural languages and "stored procedures" to create procedural extensions. In Postgres the number of supported procedural languages has already exceeded a dozen.
Traditionally, stored procedures have many applications: it is difficult to resist the attraction to perform a data operation directly in the storage, especially in the Enterprise development. This approach quickly leads to the spreading of the business logic and dramatically increases the cost of support and development of the system as a whole.
The life cycle of stored procedures makes it difficult to use standard Change Management tools and practices. It is necessary to adapt the operations upon the stored procedures to the standards of Change Management, yet staying within the comfortable development practices.
We'll look at the typical tasks of the procedural extensions development and discuss the solutions we are implementing in the Xobot IDE.
Slides
Video
LTREE: extending the syntax
Talk russianLess known but worthwile PostgreSQL extensions
Talk russianPostgres is known for it extensibility, which made it the universal database, that means it can meet the requirements of practically any project. Many extensions are well-known and widely used, for example, PostGIS extension - de-facto standard of open source GIS, hstore - an extension for storing and manipulation of arbitrary key/value pairs. I will talk about less known but useful PostgreSQL extensions, which provides a new functionaliy and/or improve the performance of PostgreSQL.
PostgreSQL was designed to be extensible, it provides an API to application developers to extend PostgreSQL functionality and/or improve the performance for specific data and workloads. It is important that there is no need of having expertise of core developers, and these new functionality could be added online without restarting of database server. Application developer can create various database objects, such as functions, data types, operators, indexes, and even new access methods.
I will present my choice of two extensions out of hundreds:
vops - greatly improves the performance of Postgres for OLAP queries using vector operations, pg_variables - provides session variables for storing scalars and relations, useful for generating reports on read-only replicas.
Slides
Video
PostgreSQL + PostGIS + TimescaleDB - storage for transport monitoring systems
Talk russianPostgreSQL + PostGIS + TimescaleDB is a ready-to-use symbiosis from a reliable RDBMS, a powerful set of geographical objects and calculations, and work with time-series data. This bundle perfectly solves the problem of storing telemetry, while leaving the whole PostgreSQL ecosystem in your hands.
Slides
Video
Greenplum: Internal Structure of MPP PostgreSQL for Analytics
Talk russianAs we all know, PostgreSQL is a classic vertically scalable database for OLTP loads. In parallel with PostgreSQL for many years there is its alternative horizontal-scalable MPP version of PostgreSQL, that is called Greenplum, sharpened for big data and OLAP workload. In my pitch I will show the internal architecture of Greenplum (distributed transactions, data sharding, partitioning with hybrid storage in external systems, column storage engines with compression, and much more), a comparison with the internal structure of PostgreSQL and the application areas of each solution are shown.
Slides
Video
Backups with WAL-G. What's new in 2019?
Talk russianDIY database index
Talk russianI'm going to talk about emerging technologies in the area of general purpose RDBMS indexing. I will describe different approaches suitable for different workloads. We will discuss ideas from academic researches and corresponding industrial response from developers, communities, and companies. There will be the short live-coding session on creating DIY index in PostgreSQL.
Slides
Deep Dive into the RDS PostgreSQL Universe
Talk russianIn this session we will deep dive into the exciting features of Amazon RDS for PostgreSQL, including new versions of PostgreSQL releases, new extensions, larger instances. We will also show benchmarks of new RDS instance types, and their value proposition. We will also look at how high availability and read scaling works on RDS PostgreSQL. We will also explore lessons we have learned managing a large fleet of PostgreSQL instances, including important tunables and possible gotchas around pg_upgrade.
Slides
Video
How and what we'd choose between patroni, stolon, repmgr for out HA Postgres
Talk russianWhile doing development of one of our project we were asked to build HA database using Postgres, geographically distributed.
First our choice was obvious, we started to work with big 3 cloud providers, but soon it was quite understand that everything costs big enough for us. Also there were a bunch of incompatibilities with unsupported extensions as well as londiste replication we were heavily used.
I will talk about why we chose patroni, what types of problem we faced with and patroni's special features can dramatically simplify deploy and everyday usage.
Slides
Video
How to break your DBMS with arised-from-nothing high loads?
Talk russianAs any ordinary software developers, we just pursued a goal to develop a system robust for high loads, and even succeeded. The system architecture was fine, but the data volume was keeping increased and revealed the painful issues and errors that nobody had expected. We faced very strange queries seemed to be unbelievable. In my short talk I would like to share sad experience of arised-from-nothing high loads in DBMS and solving the challenge.
Slides
Video
Typical errors in application software working on PostgreSQL
Talk russianSoftware applications working on PostgreSQL is a very typical case in my practice. Some of them manage to work well, some of them do not. In the talk I will focus on errors and problems of the last ones.
Gallery
Slides
GreenHouseSQL as a scalable analytics system for postgresql, greenplum and clickhouse
Talk russianAt pgconf’17 I talked about our analytics systems based on PostgreSQL. Afterwards we looked at hadoop, s3, presto, vertica, and other frights. Finally we stopped to suffer nonsense and just completed PostgreSQL with ready Greenplum and Clickhouse. As a result, we achieved amazing performance, fast migration, easy maintenance, reliability and horizontal scalability. We enabled to recover the system after fault in two commands, decreased infrastructure costs and expanded functionality due to ANSI SQL, MPP and In-memory. All within the open-source and full SQL paradigm. We called the product GreenHouseSQL, which is our inner whole cycle data platform. In the talk we will show the beauty of solution internals, explain the advantages and flaws, tips and tricks of starting with Greenplum, as well as why do we need Clickhouse, what is left to PostgreSQL, and eventually how does it all work.
Slides
Video
Tips and Tricks with Amazon RDS for PostgreSQL
Talk russianManaged database services are gaining in popularity. In this session we look at how best to configure Amazon RDS for PostgreSQL and also look at common user operations of using RDS for PostgreSQL. We will also look beyond common user operations and into some specific optimizations related to upgrade, logical replication, performance, and reducing downtime.
Slides
Video
Postgres Performance and Maintenance
Tutorial russianWhen you are optimizing Postgres it is usually maintenance that goes by the wayside. How do we fix autovacuum? Where did all of this bloat come from? Why am I getting IO spikes? How do I get RDS to behave?! Why are commits so slow on replication? The answer to all of these questions is understanding the relationship between proper Postgres maintenance and performance. Join us for a 3 hour jaunt through the wily world of making Postgres Go!
Video
Practical Postgres Replication
Tutorial russianIn this tutorial we will discuss Binary and Logical replication in a practitioner format. The topics that will be included are native Postgres replication technologies, configuring and managing them. We will also discuss performance and draw backs of various architectures (sync vs async etc...). At the end of this presentation the attendees will be able to configure a basic replication deployment with HOT Standby and well as have an understanding of other technologies such as Point in Time Recovery and cascading replication.
Migration to PostgreSQL/Postgres Pro on multi-core Bull servers. With your own hands-on experience.
Talk russianTo migrate to a PostgreSQL/Postgres Pro we need multi-core servers to be carefully tuned for correct parallelism. What settings make multi-terabyte installations work fast and correctly?
We will share our PostgreSQL/Postgres Pro on BullSequana S and Bullion S servers testing experience.
Differences between Oracle DB and PostgreSQL in terms of administrating
Talk russianAs an experienced Oracle DBA, I've discovered some distinctive features of PostgreSQL which I'd like to share with you. We'll review a set of essential tools for DBAs, their capabilities and usefulness in comparison to their Oracle counterparts. Also I'm going to summarize the core differences between Oracle DB and PostgreSQL in terms of administrating.
Slides
Non-standard 1C cluster
Talk russianAutovacuum: what can we learn if we read the code instead of the documentation
Talk russianIn this talk we will have a look at the details of autovacuum's implementation and see what kind of practical implications they have. The talk will also provide a short overview of patches for autovacuum that are currently being considered by the developer community and that may be included in newer versions of PostgreSQL.
Slides
Video
Russia's first BlockChain on 1C+PostgreSQL
Talk russianDuring the report, I would like to share the experience of implementing BlockChain in a real business task based on 1C+PostgreSQL. Where did this task come from? From whom do we protect data with the help of technology? How to get a chain integrity report of tens of millions of records in seconds?
Slides
Video
Magic tricks followed by exposure (1C+PG)
Talk russian1С:Enterprise and PostgreSQL
Talk russianOptimization of optimized and not quite
Talk russianWhat stops the transition on PostgreSQL and how to overcome it
Talk russianSeveral real cases from those who stopped the migration to PostgreSQL.
Video
Auto plan tuning using feedback loop
Talk russianAs is often seen in OLAP and batch processing workloads, the more complex a query (containing many joins, filters, aggregates), the more there is a possibility of row count estimation errors, which leads to planner choosing an inefficient execution plan.
To address that problem, I developed a tool called pg_plan_advsr as a PostgreSQL extension, which corrects the estimation errors by repeatedly feeding back the information collected during query execution to the planner.
The tool has three features:
- Automatic plan tuning by repeatedly feeding execution information to planner
- Preserve all plans generated during plan tuning in a history table
- Create and store optimizer hints to be able to reproduce plans generated during tuning process
I verified the effectiveness of pg_plan_advsr by enabling it when running the join order benchmark (JOB) against PG 10.4 and observed its execution time shortening to 50% of the original. Therefore, it is useful for user who would like to do plan tuning for OLAP and batch processing.
I will talk about the following things in this presentation:
- Principles behind pg_plan_advsr and its architecture
- Detailed information about the measurements done with JOB
- Possible future enhancements
- Using aqo and pg_plan_advsr together (experimental)
Slides
Video
Analysis of troublesome queries as a means of recurrent refactoring of 1C code
Talk russian- Principles of searching for troublesome queries in PostgreSQL.
- Evaluation of hypothetical indexes and their impact on query plans.
- The most common errors in 1C-programming.
- Basic methods of code refactoring, taking into account the features of PostgreSQL.
- Storing analytical information from the PostgreSQL log to assess the quality of refactoring
Slides
Video
Best Practices and Practical Nuances when working with Postgres using Go
Talk russianIn my talk I'll tell you about practices of working with Postgres in the Go-services. I’ll describe general advantages and disadvantages of the basic tools that are commonly used when working with Postgres using Go. Of course, we will touch on the nuances that need to be taken into account when your services are running inside the Kubernetes. I will also talk about Avito’s experience in providing a database of product’s developers. This presentation will be of interest to developers who want to avoid problems when working with Postgres, and will be useful to DBA who want to know what difficulties customers face in their database.
Slides
Video
Using pg_variables as temporary tables
Talk russianPostgreSQL provides possibility to create temporary tables. Though a temporary table is accessible only to a single session and is removed at the end of the session, all information about it is stored in the system catalogs of PostgreSQL. This is related to several issues, which make it difficult or impossible to use temporary tables in some cases. There are attempts to solve this feature, including in our company. But they have not yet succeeded, mainly because of the PostgreSQL engine. In the talk I want to tell about simple and small pg_variables extension. It allows you to create table variables along with scalar ones. I will tell how it can replace temporary tables, what advantages and disadvantages it has.
Slides
Video
Enterprise-level approach to PostgreSQL tuning: database experiments
Talk russianShared_buffers = 25% – is it too much or not enough? Or it's the right value?
How can we ensure that this – pretty much outdated – recommendation suit well our needs?
It is time to start apply enterprise-level approach to tuning postgresql.conf. Not using various blind auto-tuners or advices from old articles and blog posts, but based on the following two aspects:
- comprehensive database experiments, conducted in automated fashion, repeated multiple times in conditions as close to production as possible, and
- deep understanding of DBMS and OS internals.
Using Nancy CLI (https://gitlab.com/postgres.ai/nancy) we will consider a concrete example: infamous shared_buffers, under various circumstances, in various projects. We will try to figure out, how to optimize this settings for given infrastructure, database, and workload.
Slides
Video
Bitemporality: Tracking Reproducible Revisions in PostgreSQL Using RANGE Types
Talk russianSo you finally have your database model for your application and you fill it in with current data. How do you keep it up to date? While INSERT may still be transparent, UPDATE and DELETE will overwrite your previous data, so you won't be able to reproduce them. Cloning the whole huge content for each minor update is not an option. For rich and complex data about hundreds of thousands of power generators in Germany and worldwide, I built a model using range data types in recent PostgreSQL which allows me to insert, update and delete data while granting the full access to the whole state of the database at any historical moment. I'll present a very simplified version of the database so the audience will be immediately able to apply it for their cases. I'll also show a few tricks in Python and Psycopg2 that will allow a whole team to prepare, review, and deploy all revisions to this database without merge conflicts. And I'll give a few ideas on how to retrieve this data efficiently.
Slides
Video
HypoPG 2: Hypothetical Partitioning support for PostgreSQL
Talk russianDeclarative partitioning was a long-awaited feature and has been enhanced since its introduction in PostgreSQL 10. However, for many users, finding optimal partitioning schemes to have the best benefits from partitioning is not an easy task. Therefore, we added in HypoPG a new hypothetical partitioning feature which helps users to design partitioning. In this presentation, I will provide a brief introduction of HypoPG and explain declarative partitioning, and then I'll show the usage of hypothetical partitioning feature and explain how the extension is working.
Slides
Video
Hacking with Postgres 11 - pg_threads
Talk russianMy presentation is about writing extensions in Postgres. I have written pg_threads that implements simplified POSIX thread API inside Postgres database. It adds a new powerful abstraction giving database developers new opportunities for writing parallel code thus taking advantage of multicore CPUs. There is an extra API for transactional and non-transactional IPC between threads. I also have an example application that takes advantage of this new API that scales linearly even across 2 nodes. The presentation is with live working demo using vagrant project with 2 VMs running Ubuntu and 2 Postgres 11 databases.
Slides
Video
Data Recovery in PostgreSQL on a Damaged Filesystem
Talk russianThis case study walks participants through a case where we decided to embark on a data recovery effort. This talk is applicable to all users, from novices to advanced PostgreSQL database administrators. Beginners will get an understanding of what data recovery is and is not, what expectations to have going into it, and how to work with contracted experts in order to ensure the best possible outcome, while more advanced users and experts will also get a fair bit out of the technical aspects of the case study.
While the talk will emphasize non-technical operational aspects of data recovery, it will also include discussions of the internals of PostgreSQL we had to work with, as well as how we went about approaching difficulties so that we could retrieve the data we hoped to.
Slides
Video
ORM: how to write queries and not to drive database crazy
Talk russianMany DBMS specialists do not like these three letters - ORM because they have repeatedly seen the enormous queries ORM-generated for simplest operations. However practice shows that the origin of the problem is not ORM itself but rather those developers who are not able to use ORM properly. In this report I will tell you the basic principles of how to write code for ORM which generates "good" queries and also show you "bad" code samples and what you get out of them. The main idea is we have to think in SQL-style when writing the code, and so to learn to foresee what kind of query will be generated. But even having mastered that you must always check the output SQL for complex queries. I will show an example when a slight change in ORM-logic increases the volume of output SQL by dozens of times(!). I will tell you about additional tools and tricks. Namely - disabling tracking, INCLUDE construction, alternative syntax for JOIN, how to get more data using a smaller number of queries, how to effectively write queries with grouping, and what do we need mappings for. I will not bypass the cases when it is not possible to effectively solve the problem by means of ORM (for example, queries with recursion). In addition to SELECT requests, there are some Batch-Update/Delete tools that allow you to update and delete data using ORM tools without downloading data to the client side. We'll also talk on how to force the ORM to insert large volumes of data quickly via Multi-Insert and COPY. I will also discuss how ORM supports PostgreSQL-specific data types i.g. arrays, hstore and jsonb. But does it make sense to use ORM at all, since there is so much to learn? Sure it does. There are advantages of using ORM, and we will discuss them as well. All examples are based on Entity Framework technology for .Net Core and .Net Framework in C#. There are some subtle differences in ORM usage in Hibernate/NHibernate, but the basic principles remain the same, so the report will be useful for developers using various technologies.
Slides
Video
Asynchronous Python and PostgreSQL Using asyncpg
Tutorial russianPython may not be the fastest programming language on the CPU, but its fast and easy development saves a lot of costs between the keyboard and the chair. Since database clients spend most of their time waiting for a response from the database server, Python's asynchronous functionality available in the recent versions (3.5+) may help to optimize the application's runtime considerably by working on something else while server's response is being prepared. The asynchronous interface between Python and PostgreSQL is called "asyncpg". In the workshop we'll explore this library and write a short application using some of its useful features.
Video
Python and PostgreSQL Using psycopg2
Tutorial russianPython may not be the fastest programming language on the CPU, but its fast and easy development saves a lot of costs between the keyboard and the chair. In this workshop we'll explore the most popular library for the access to the PostgreSQL server named "psycopg2" and write a short application using some of its useful features.
Video
Oracle to PostgreSQL Migration Tutorial
Tutorial russianStop paying for Larry’s boat!
PostgreSQL is a fully-featured, enterprise-grade and open source database, named two years in a row database of the year! And it’s also the best candidate to migrate off of Oracle, supporting very advanced SQL, easy administration and pl/pgsql, a replacement for Oracle’s PL/SQL. Stop spending hundreds of thousands or millions of $ on database licenses and reinvest them in your team, infrastructure and software.
Join this half-day tutorial to learn the best practices, tricks and tools to perform a successful Oracle to PostgreSQL migration. Learn from the team that performed 150M users Blackberry migration to PostgreSQL on GCP with near-zero downtime, and the key author behind orafce, a set of Oracle compatible functions on top of PostgreSQL.