
Boris Neiman, Mellanox
Andrei Nikolayenko, Скала-Р
Arthur Zakirov, Postgres Professional
14:00 06 February
45 min

Networking acceleration in Skala-SR / Postgres Pro Database Appliance: Modernity and Future

Last year we announced the Skala-SR / Postgres Pro database machine, with hardware and software support for remote direct memory access (RDMA) as a key feature. The first appliances have already been installed at customer sites, and even with the first version customers have been able to build configurations that would be impossible without RDMA and CPU offload (provided by Mellanox networking). However, the capabilities of this equipment are much wider, and this talk is dedicated to current work and prospective developments in this area.

Slides

Video

Other talks

  • Olivier Courtin, DataPink
    180 min

    Tutorial: Advanced spatial analysis with PostgreSQL, PostGIS and Python

    • Spatial and advanced spatial analysis with pure PostGIS (including the cutting-edge PostGIS functions available)
    • How to efficiently combine PostgreSQL and Python data types (such as NumPy ndarray and Pandas DataFrames)
    • Tools to improve our data manipulation environment (Jupyter tricks, easy dataviz...)
    • How to go further towards GeoDataScience with Python libraries and frameworks tied to PostgreSQL/PostGIS (including machine learning and deep learning)

  • Konstantin Evteev, X5 FoodTech
    Mikhail Tyurin, IT entrepreneur
    45 min

    Recovery use cases for Logical Replication in PostgreSQL 10

    Avito is the biggest classified ads site in Russia and the third largest in the world (after Craigslist in the USA and 58.com in China). At Avito, ads are stored in PostgreSQL databases, and logical replication has been in active use for many years. It helps us handle the growth of data volume and of the number of requests, scale and distribute the load, deliver data to the DWH and the search subsystems, synchronize data between databases and across networks, and so on. But nothing comes for free: the result is a complex distributed system. Hardware failures are natural and will happen, so you always need to be ready for them. There are plenty of sample logical replication configurations and lots of success stories about using it. But none of this documentation covers recovery after crashes and data corruption, and there are no ready-made tools for it. Over years of constant use of PgQ replication, we have gained extensive experience, rethought a lot, and implemented our own add-ins and extensions to restore and synchronize data after crashes in distributed data processing systems. In this talk, we would like to show how our experience can be transferred to the new logical replication subsystem in PostgreSQL 10. In the current implementation, only non-trivial solutions are available, and there are a number of issues for the community to address, which come down to implementing recovery mechanisms as simple as configuring replication in version 10.

  • Olivier Courtin, DataPink
    45 min

    Advanced spatial analysis with PostgreSQL, PostGIS and Python

    PostGIS has been well known and widely used for two decades as the best open-source database solution for spatial analysis. This talk will focus on: spatial and advanced spatial analysis with pure PostGIS (including the cutting-edge PostGIS functions available); how to go further towards GeoDataScience with Python libraries and frameworks tied to PostgreSQL/PostGIS (including machine learning and deep learning).

  • Nikita Glukhov, Postgres Professional
    Oleg Bartunov, Postgres Professional
    45 min

    Jsonb flexible indexing. Parameterized access methods operator classes.

    Jsonb is a popular data type in PostgreSQL. It gives web developers the ability to work with the ubiquitous json format inside the database and to use all the power of a proven relational database. Fast querying of jsonb data is a challenge for a database, and PostgreSQL provides several options for indexing jsonb. We present a new way of indexing jsonb efficiently, based on an improvement of the indexing infrastructure.

    It is known that json is a greedy data type: a document may contain a lot of auxiliary data that is of no interest for searching but still affects the size of the index. A partial index will not help, since it filters rows before indexing, while we are interested in extracting parts of jsonb. Functional indexes on specific keys could introduce too much overhead. We present an improvement of the indexing infrastructure that makes it possible to control index behaviour by passing parameters to the operator class at index creation. For example, to index a user-defined subset of jsonb, one can pass the operator class a powerful path expression (either jsonpath from the upcoming SQL/JSON or jspath from the jsquery extension), which is used to extract parts of the jsonb tree. This makes the index more efficient and reduces the overhead of maintaining it.

    Another use of parameterized operator classes is to let a user specify parameters instead of hard-coding them. For example, the GiST signature size is currently hard-coded inside the implementations of several opclasses (tsvector, hstore, intarray, pg_trgm, ltree), while it is natural to use different signature lengths for different data to get an optimal index size and performance.

    Full text search on parts of a document can be improved by passing labels to the operator class and letting it index only the specified parts of the document, which makes it possible to avoid the recheck of rows returned by the index that is currently required.
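
To make the jsonb part of the abstract above more concrete, here is a minimal Python sketch of the underlying idea: indexing only a user-selected subset of a document produces far fewer index entries than indexing every key. It is an illustration only and does not use PostgreSQL. The function names and the `only_prefix` parameter are hypothetical stand-ins for the path expression passed to an operator class, not the syntax proposed in the talk.

```python
import json


def extract_paths(doc, path=()):
    """Yield (path, value) pairs for every scalar leaf of a parsed JSON document."""
    if isinstance(doc, dict):
        for key, value in doc.items():
            yield from extract_paths(value, path + (key,))
    elif isinstance(doc, list):
        for item in doc:
            # Array positions are collapsed into a wildcard step.
            yield from extract_paths(item, path + ("*",))
    else:
        yield path, doc


def index_entries(doc, only_prefix=None):
    """Return the set of entries a toy index would store for one document.

    only_prefix is a hypothetical parameter: when given, only leaves whose
    path starts with it are indexed, mimicking a path-restricted index.
    """
    entries = set()
    for path, value in extract_paths(doc):
        if only_prefix is None or path[:len(only_prefix)] == only_prefix:
            entries.add((path, value))
    return entries


doc = json.loads("""
{
  "id": 1,
  "tags": ["pg", "json"],
  "meta": {"created": "2018-02-06", "trace": {"a": 1, "b": 2, "c": 3}}
}
""")

full = index_entries(doc)                          # every leaf is indexed
subset = index_entries(doc, only_prefix=("tags",))  # only the "tags" subtree

print(len(full), len(subset))
```

In this toy model the restricted index stores only the entries under `tags`, while the auxiliary `meta` subtree never reaches the index. That is the size and maintenance saving the abstract refers to, shown under the simplifying assumption that index size is proportional to the number of indexed leaves.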