
Ivan Panchenko, Postgres Professional
11:45 05 February
90 min

Full text search from A to Ω

A tutorial on full text search in PostgreSQL, covering all recent improvements. It gives all the recipes needed to build an application: dictionary and parser configuration, faceted search, fuzzy search, multilanguage search, ranking, etc. Participants will be provided with a test database for the exercises.

Slides

Video

Other talks

  • Alexey Klyukin, Zalando SE
    Alexander Kukushkin, Zalando SE
    180 min

    Tutorial: Management of High-Availability PostgreSQL clusters with Patroni

    Patroni is a Python application for creating high-availability PostgreSQL clusters based on streaming replication. It is used by Red Hat, IBM Compose, Zalando and many other companies. This tutorial will highlight Patroni's architecture, give attendees hands-on experience of configuring high-availability PostgreSQL clusters with Patroni, describe how to take advantage of its numerous additional features, and offer an opportunity to learn more about common mistakes in running Patroni and how to troubleshoot them.

    To get the most out of the Patroni tutorial, one needs a laptop with git, Vagrant and VirtualBox installed.

    Vagrant can be obtained from https://www.vagrantup.com and VirtualBox from https://www.virtualbox.org

    Alternatively, one can install them from the Linux distribution's packages (or use Homebrew on a Mac).

    Once Vagrant and VirtualBox are installed, one can run the Patroni VM by issuing the following commands:

    $ git clone https://github.com/alexeyklyukin/patroni-training
    $ cd patroni-training
    $ vagrant up
    

    When the setup concludes, the Patroni box can be accessed over SSH using the vagrant ssh command.

  • Andrei Salnikov, Data Egret
    45 min

    PostgreSQL upgrade is not as painful as it sounds

    For the majority of system administrators and DBAs, performing an RDBMS upgrade, let alone a major one, is a pain. One of the key factors in deciding if and when to upgrade is the downtime the process might cause. This is true for any database, but it is especially important for those that are in production or under high load.

    Often a major upgrade gets cancelled and the DBA has to go back to the older version because of a lack of experience or some basic errors that could easily have been avoided at the planning stage.

    In our consultancy we perform upgrades for our clients regularly, which has allowed us to streamline the process and take preventative measures that help us perform it quickly, efficiently and with minimal or no downtime.

    In this talk, I will share some key steps and tools that will help any DBA become better at performing major upgrades. I will answer the following questions:

    How to prepare for an upgrade of PostgreSQL? What does one need to do at the planning stage? How to plan your actions during the actual upgrade process? How to perform an upgrade successfully without going back to the older version? What actions must one perform following an upgrade?

    I will also go through the two most popular upgrade methods, pg_upgrade and pg_dump/pg_restore, and compare the benefits and drawbacks of each. I will also discuss some of the main issues one might face throughout the process and ways to avoid them.

    This talk would be of interest to those who are new to PostgreSQL, as well as experienced DBAs who would like to learn more about upgrades or those who, in general, would like to understand why major upgrades should NOT be avoided like the plague.
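
As a rough illustration of the two upgrade paths named in this abstract, the following dry-run sketch prints the commands for each approach rather than executing them. The binary and data directory paths, the version numbers and the database name appdb are placeholders assumed for the example; none of them come from the talk.

```shell
#!/bin/sh
# Dry-run sketch: prints upgrade commands instead of running them.
# All paths and names below are hypothetical placeholders.
OLD_BIN=/usr/lib/postgresql/10/bin
NEW_BIN=/usr/lib/postgresql/11/bin
OLD_DATA=/var/lib/postgresql/10/main
NEW_DATA=/var/lib/postgresql/11/main

# pg_upgrade path. Pass --check to verify compatibility without changing
# anything, or --link to hard-link data files instead of copying them.
pg_upgrade_cmd() {
    echo "$NEW_BIN/pg_upgrade --old-bindir=$OLD_BIN --new-bindir=$NEW_BIN --old-datadir=$OLD_DATA --new-datadir=$NEW_DATA $1"
}

# Dump/restore path: a custom-format dump taken with the new version's pg_dump,
# then restored into a database that is assumed to exist on the new server.
dump_restore_cmds() {
    echo "$NEW_BIN/pg_dump --format=custom --file=$1.dump $1"
    echo "$NEW_BIN/pg_restore --dbname=$1 $1.dump"
}

pg_upgrade_cmd --check
pg_upgrade_cmd --link
dump_restore_cmds appdb
```

Printing the commands first makes it possible to review the plan before anything touches the cluster; connection settings for the old and new servers (for example, port numbers) are left out here and would have to be added for a real run.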

  • Olivier Courtin, DataPink
    45 min

    Advanced spatial analysis with PostgreSQL, PostGIS and Python

    PostGIS has been well known and widely used for two decades as the best open-source database solution for spatial analysis. This talk will focus on: spatial and advanced spatial analysis with pure PostGIS (including the cutting-edge PostGIS functions available); how to go further through GeoDataScience, with Python libraries and frameworks tied to PostgreSQL/PostGIS (including machine and deep learning).

  • Nikita Glukhov, Postgres Professional
    Oleg Bartunov, Postgres Professional
    45 min

    Jsonb flexible indexing. Parameterized access methods operator classes.

    Jsonb is a popular data type in PostgreSQL. It gives web developers the ability to work with ubiquitous json inside the database and use all the power of a proven relational database. Fast querying of jsonb data is a challenge for a database, and PostgreSQL provides several options for indexing jsonb. We present a new way of indexing jsonb efficiently, based on an improvement of the indexing infrastructure.

    It is known that json is a greedy data type: it may contain much auxiliary data that is of no interest for searching, and that affects the size of the index. A partial index will not help, since it filters rows before indexing, while we are interested in extracting parts of jsonb. Functional indexes on specific keys could introduce too much overhead. We present an improvement of the indexing infrastructure that allows controlling index behaviour by passing parameters to the operator class at index creation. For example, to index a user-defined subset of jsonb, it is possible to pass to the operator class a powerful path expression (either jsonpath from the upcoming SQL/JSON or jspath from the jsquery extension), which can be used to extract parts of the jsonb tree. That makes the index more efficient and reduces the overhead of maintaining it.

    Another use of parameterized operator classes is to let the user specify parameters instead of hard-coding them. For example, the GiST signature size is currently hard-coded inside the implementations of several opclasses (tsvector, hstore, intarray, pg_trgm, ltree), while it is natural to use different signature lengths for different data in order to get optimal index size and performance.

    Full text search on parts of a document can be improved by passing labels to the operator class and letting it index only the specified parts of the document, which avoids the currently required recheck of the rows returned by the index.
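
For illustration, operator class parameters of the kind this abstract describes can be written directly in the index definition: PostgreSQL 13 and later accept a siglen parameter (signature length in bytes) for the GiST tsvector_ops operator class. The sketch below only prints such statements. The table docs, the column tsv, the index naming scheme and the helper siglen_index_sql are hypothetical, and the syntax proposed in the talk itself may differ.

```shell
#!/bin/sh
# Prints a CREATE INDEX statement that passes a signature length (in bytes)
# to the GiST tsvector operator class. Requires PostgreSQL 13+ to execute.
# Table, column and index names are hypothetical.
siglen_index_sql() {
    table=$1
    column=$2
    siglen=$3
    printf 'CREATE INDEX %s_%s_gist_%s ON %s USING gist (%s tsvector_ops (siglen = %s));\n' \
        "$table" "$column" "$siglen" "$table" "$column" "$siglen"
}

# Two signature lengths on the same column, so that index size and
# query speed can be compared on the same data.
siglen_index_sql docs tsv 64
siglen_index_sql docs tsv 512
```

The printed statements can be fed to psql against a test database; a shorter signature generally gives a smaller but less precise index, which is the trade-off the abstract refers to.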