Yana Krasteva, Swarm64
16:00 03 March
22 min

Modern DWH with open-source PostgreSQL

PostgreSQL has a long history in data warehousing (DWH). Netezza, Redshift, and Greenplum each turned a specific PostgreSQL release into a DWH solution. Today, with PostgreSQL's steady performance improvements (better partitioning, better statistics, JIT compilation, etc.) and advanced PostgreSQL extensions such as the Swarm64 Data Accelerator, you can build a forward-looking, versatile, and reliable DWH with no lock-in. This talk will cover PostgreSQL and DWH trends and touch on the key arguments for choosing open-source PostgreSQL for DWH.

Other talks

  • Yugo Nagata, SRA OSS, Inc. Japan
    45 min

    Updating Materialized Views Automatically and Incrementally

    A materialized view stores the results of a view definition query in the database in order to achieve faster query responses. However, the data in the view becomes stale after the underlying tables are modified, so view maintenance is needed to keep the contents up to date. PostgreSQL has the REFRESH MATERIALIZED VIEW command for updating a materialized view, but this command recomputes the contents from scratch, which is inefficient when only a small part of a base table has been modified.

    Incremental View Maintenance (IVM) is a technique for maintaining materialized views efficiently: it computes and applies only the incremental changes to the materialized view instead of recomputing it. This feature is required for updating materialized views rapidly, but it is not yet implemented in PostgreSQL.

    Therefore, we developed IVM for PostgreSQL and are proposing it as a core feature. The patch is now under discussion on the hackers mailing list. Our implementation allows materialized views to be updated automatically and incrementally when an underlying table is modified, so you don't need to write your own trigger functions for updating views. As a result of continuous development, the current implementation supports some aggregates, subqueries, self-joins, outer joins, and CTEs (WITH clauses) in a view definition query. A performance evaluation using TPC-H queries shows that our IVM implementation can update a materialized view more than 200 times faster than recomputation by the REFRESH command.

    In this talk, we will describe our IVM implementation and its features.
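
    The difference between a full refresh and incremental maintenance can be illustrated with a small toy model. The sketch below is not the patch described in this talk and does not use PostgreSQL at all; it is a minimal Python illustration, assuming a view equivalent to a grouped SUM over an in-memory list of rows. The names SumByKeyView, apply_insert, apply_delete, and recompute are hypothetical and exist only for this example.

```python
from collections import defaultdict


class SumByKeyView:
    """Toy stand-in for a materialized view over
    SELECT key, SUM(amount) ... GROUP BY key (hypothetical, in-memory only)."""

    def __init__(self, rows):
        self.totals = defaultdict(int)  # key -> running SUM(amount)
        self.counts = defaultdict(int)  # key -> row count, to detect empty groups
        for key, amount in rows:
            self.apply_insert(key, amount)

    def apply_insert(self, key, amount):
        # Incremental step: touch only the group affected by the new row.
        self.totals[key] += amount
        self.counts[key] += 1

    def apply_delete(self, key, amount):
        # Incremental step for a deleted row; assumes the row was in the view.
        self.totals[key] -= amount
        self.counts[key] -= 1
        if self.counts[key] == 0:
            # The group has no rows left, so it disappears from the view.
            del self.totals[key]
            del self.counts[key]

    def contents(self):
        return dict(self.totals)


def recompute(rows):
    """Full refresh: rebuild the whole result by scanning every base row."""
    result = defaultdict(int)
    for key, amount in rows:
        result[key] += amount
    return dict(result)


base_table = [("a", 10), ("b", 5), ("a", 7)]
view = SumByKeyView(base_table)

# Modify the base table and apply only the matching delta to the view.
base_table.append(("b", 3))
view.apply_insert("b", 3)
base_table.remove(("a", 10))
view.apply_delete("a", 10)

# The incrementally maintained view matches a from-scratch recomputation.
assert view.contents() == recompute(base_table)
```

    Each change costs work proportional to the size of the change, while recompute scans the whole base table. A real implementation also has to handle joins, concurrent transactions, and more aggregate types, none of which this sketch attempts.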

  • Andrey Borodin, Yandex
    Evgeniy Dyukov, Yandex
    45 min

    How to manage an open source HA RDBMS in a cloud environment

    High availability (HA) solutions have become extremely popular in the past few years. They play a critical role in building reliable systems on affordable hardware. In this presentation, we will look at some of the subtle aspects of designing and maintaining such systems. We will also address the issues of capturing changes on an HA cluster.

  • Bruce Momjian, EnterpriseDB
    45 min

    Postgres and the Artificial Intelligence Landscape

    Artificial intelligence, machine learning, and deep learning are intertwined capabilities that attempt to solve problems that defy traditional computational solutions, such as fraud detection, voice recognition, and search result recommendations. These problems defy simple computation and are computationally expensive, involving the calculation of perhaps millions of probabilities and weights. While these computations can be done outside of the database, there are specific advantages to doing machine learning inside the database, close to where the data is stored. This presentation explains how to do machine learning inside the Postgres database.

  • Henrietta Dombrovskaya, Braviant Holdings
    45 min

    NORM - No ORM Framework

    It's a well-known fact that, although database performance may be great and each query executes in milliseconds, the overall application response time can still be slow, making users wait a long time for a response. We know that the problem is not the database but the way application developers communicate with the database. Specifically, we are talking about ORMs (Object-Relational Mappers). Database developers hate them, but application developers love them because they allow developing applications without any knowledge of database internals. As a result, system performance is often unacceptably slow.

    The only way to change this behavior is to give application developers a tool that is as easy to use as an ORM but avoids the common ORM pitfalls. That's why we developed NORM - No-ORM Framework. During this presentation, we will go over code examples from the https://github.com/hettie-d/NORM repo and learn how to build "transport objects" for efficient data transfer between applications and databases.