title

text

Alexey Fadeev
Alexey Fadeev Sibedge
: December
22 мин

Multicorn Foreign Data Wrapper vs plpython

Multicorn technology allows you to develop FDWs in Python, which is much easier and faster than creating FDWs in C. However, there is a downside, Multicorn FDWs work well with primitive WHERE conditions, but more complex cases cause difficulties, which I will talk about. Cases will be considered on the example of my Multicorn FDW for getting OpenStreetMap data. I will also show examples of using the same code in Multicorn FDW and plpython functions, including performance comparison. In conclusion, I will share my findings on when it is better to use plpython, and when Multicorn FDW is more preferable.

Видео

Другие доклады

  • Konstantin Evteev
    Konstantin Evteev X5 FoodTech
    45 мин

    Building real-time analytics with PostgreSQL

    In today's world, operational reporting and real-time analytics are becoming a basic need. There are a huge number of tools, practices, and approaches, which in turn require different expertise and resources. In this talk, I will tell you how to transform your analytics using PostgreSQL. We'll discuss pitfalls when using different schemes. We'll talk about data quality and performance. I expect this talk to be of interest to both beginners and seasoned practitioners with many years of experience (Discussions and questions after the talk will be highly appreciated). The talk outline is as follows. 1. The evolution of reporting - migration from OLTP to OLAP. 2. Data delivery to DWH and related challenges. 3. Scaling architecture in response to growing data volumes. 4. Data quality issues. 5. Maintaining stability with a large number of changes. 6. Different approaches to organizing the work of the DWH team. 7. And, finally, we'll list the challenges we have successfully responded to with various solutions (pgAgent, PGWatch, working with the file system, the new reading of postgresql.conf).

  • Dmitry Ursegov
    Dmitry Ursegov Postgres Professional
    45 мин

    Shardman - the native approach to sharding in PostgreSQL

    The amount of data that is handled today by Enterprises and Web companies is constantly growing. At the same time, it becomes increasingly difficult to have and synchronize several copies of data in different systems. As a result there is a demand to work with large amounts of data directly in a transactional DBMS. This requirement is often imposed by the logic of applications that need real-time results. In this talk we will consider what a universal distributed transactional DBMS can be. We will analyze such aspects as the types of load and their prioritization, dynamic resource allocation and the level of consistency. What tools in PostgreSQL can be used to build such system, what we have already done and what is still missing.

  • Sushant Pandey
    Sushant Pandey Microsoft
    Alicja Kucharczyk
    Alicja Kucharczyk Microsoft
    22 мин

    The Story About The Migration

    In this talk we want to present how Microsoft team composed of people from two different teams approached the project and solved the migration issues using ora2pg and was able to prove that Postgres Single Server can perform equally well as Oracle Exadata. We will present our ways of working and also some main technical challenges that we faced including migration of BULK COLLECT’s, hierarchical queries, refcursors and others more complicated Oracle constructs.

    The story about a challenging PoC that proved that Postgres can achieve the same performance as Oracle Exadata. The schema that was migrated wasn’t the simplest one you might see. It was quite the opposite. The code was loaded with dynamic queries, BULK COLLECT’s, nested loops, CONNECT BY statements, global variables and lot of dependencies. Ora2pg did a great job converting the schema but left a lot of work to do manually. Also estimates produced by the tool were highly inaccurate since the logic required not the migration but total re-architecture of the code. In this talk we want to present how Microsoft team composed of people from two different teams approached the project and solved the migration issues using ora2pg and was able to prove that Postgres Single Server can perform equally well as Oracle Exadata. We will present our ways of working and also some main technical challenges that we faced including:

    • How estimates do (not) work
    • How we handled BULK COLLECT’s
    • Why we got rid of refcursors
    • How we got stuck with testing of one the packages and how the help from a friend solved the problem
    • How we handled hierarchical queries and drilling down the hierarchy

  • Pavel Borisov
    Pavel Borisov Postgres Professional
    45 мин

    Speed up your fast text search queries with RUM index

    Fast text search queries can be made even faster with indexing on lexemes inside compound records of tsvector format. RUM index is an open-source PostgreSQL extension. It represents a big improvement of GIN index and it can index lexemes with additional information e.g. tsvector lexeme weight-mark. So it can support tsvector capabilities more.

    Until recently it was needed to recheck results of weight-containing queries by table. My modification (2020) is to make the processing of this kind of queries index-only and therefore much faster.

    Also, I will describe and provide benchmarks for different usage cases for fast text search. We'll see how RUM index can improve the performance and compare it with PostgreSQL internal GIN index.