Will Leinweber Heroku
11:15 05 February

Heroku Postgres: architecture of a cloud database service

In addition to providing a general purpose web platform, Heroku has a large, supporting Postgres service. Over the years, we've learned a lot about running Postgres at scale.
In this talk, we'll cover:

  • why Postgres is attractive to run as a cloud service
  • how to provision, manage, and monitor a Postgres fleet
  • tradeoffs needed to make Postgres work in this environment
  • automating failure recovery
  • and more

Slides

Video

Other talks

  • Alexander Korotkov
    Alexander Korotkov Postgres Professional
    45 min

    PostgreSQL extendability: Origins and new horizons

    Postgres was designed from the start to support extensible access methods. A well-known quotation on the subject reads: "It is imperative that a user be able to construct new access methods to provide efficient access to instances of nontraditional base types" (Michael Stonebraker, Jeff Anton, Michael Hirohama. Extendability in POSTGRES, IEEE Data Eng. Bull. 10 (2), pp. 16-23, 1987).

    Initially, the heap was just one of the access methods, so extensible access methods would also have meant pluggable storage engines in modern terms. For now, only index access methods are defined in the pg_am table of the system catalog. Those index access methods also have a well-defined interface. Therefore, to meet the initial design, PostgreSQL needs to support two features:

    • Pluggable index access methods, i.e. the ability to implement new index types by adding new tuples to pg_am;
    • Pluggable storage engines, i.e. the ability to implement completely different storage for tables, without the traditional heap.

    Besides mechanical work such as a "CREATE ACCESS METHOD" command, extensible index access methods need to be WAL-logged. For now, the community doesn't want extensions to define their own WAL records, because doing so could break both recovery and replication, which is not acceptable. Another approach is to define generic WAL records that describe the difference between page versions in a generalized way.
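
    To make the idea of a generic, page-level WAL record more concrete, here is a minimal sketch in Python. It is only an illustration of "log the difference between two versions of a page", not PostgreSQL's actual record format or API; the names `page_delta` and `apply_delta` are hypothetical, and the 8192-byte page size is simply PostgreSQL's default block size.

```python
from typing import List, Tuple

PAGE_SIZE = 8192  # PostgreSQL's default block size; an assumption of this sketch

# A delta is a list of (offset, replacement bytes) fragments.
Delta = List[Tuple[int, bytes]]


def page_delta(old: bytes, new: bytes) -> Delta:
    """Collect the runs of bytes that differ between two same-sized page images."""
    if len(old) != len(new):
        raise ValueError("page images must have the same size")
    delta: Delta = []
    i, n = 0, len(old)
    while i < n:
        if old[i] == new[i]:
            i += 1
            continue
        start = i
        # Extend the fragment while the bytes keep differing.
        while i < n and old[i] != new[i]:
            i += 1
        delta.append((start, new[start:i]))
    return delta


def apply_delta(page: bytes, delta: Delta) -> bytes:
    """Replay a delta on the old page image, as recovery or a replica would."""
    buf = bytearray(page)
    for offset, chunk in delta:
        buf[offset:offset + len(chunk)] = chunk
    return bytes(buf)


# Usage: modify two regions of an empty page and replay the logged delta.
old_page = bytes(PAGE_SIZE)
scratch = bytearray(old_page)
scratch[100:104] = b"abcd"
scratch[4000] = 0xFF
new_page = bytes(scratch)

delta = page_delta(old_page, new_page)
restored = apply_delta(old_page, delta)
```

    Because the record describes only byte ranges within a page, replaying it requires no knowledge of the extension that produced the change, which is the property that makes this approach attractive for recovery and replication.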

    Only a few DBMSs support pluggable storage engines today. MySQL is the most common example. However, dealing with different storage engines in MySQL is like dealing with different DBMSs. In our view, this is not the way PostgreSQL should go.

    However, PostgreSQL users now see the benefits of other storage types. Ideas of columnar and in-memory storage for PostgreSQL are very popular. At the same time, the technical possibilities for implementing them are growing. FDWs and custom nodes have arrived. Generic WAL and extensible index access methods are pending for 9.6. Much of the work toward pluggable storage engines is already done, even if it had different aims.

    It's time for PostgreSQL core developers to think about native support for pluggable storage without kludges. Ultimately, we should get a "CREATE STORAGE ENGINE name ..." command as a legitimate extensibility mechanism.

    In this talk we will show the current state of pluggable index access methods and the design of pluggable storage engines.

  • Valentine Gogichashvili
    Valentine Gogichashvili Zalando

    Data Integration in the World of Microservices

    Since its launch in 2008, Zalando has grown with tremendous speed. The road from startup to multinational corporation has been full of challenges, especially for Zalando's technology team. Distributed across Berlin, Helsinki, Dublin and Dortmund, and nearly 900 professionals strong, Zalando Technology still plans to expand by adding 1,000 more developers through the end of 2016. This rapid growth has shown us that we need to be very flexible about developing processes and organizational structures, so we can scale and experiment. In March 2015, our team adopted Radical Agility: a tech management strategy that emphasizes Autonomy, Purpose, and Mastery, with trust as the glue holding it all together. To make autonomy possible, teams can now choose their own technology stacks for the products they own. Microservices, speaking with each other using RESTful APIs, promise to minimize the costs of integration between autonomous teams. Isolated AWS accounts, run on top of our own open-source Platform as a Service (called STUPS.io), give each autonomous team enough hardware to experiment and introduce new features without breaking our entire system.

    One small issue with having microservices isolated in their individual AWS accounts: Our teams keep local data for themselves. In this environment, building an ETL process for data analyses, or integrating data from different services, becomes quite challenging. PostgreSQL's new logical replication features, however, now make it possible to stream all the data changes from the isolated databases to the data integration system so that it can collect this data, represent it in different forms, and prepare it for analysis.

    In this talk, I will discuss Zalando's open-source data collection prototype, which uses PostgreSQL's logical replication streaming capabilities to collect data from various PostgreSQL databases and recreate it for different formats and systems (Data Lake, Operational Data Store, KPI calculation systems, automatic process monitoring). The audience will come away with new ideas for how to use Postgres streaming replication in a microservices environment.

  • Ronan Dunklau
    Ronan Dunklau Dalibo
    45 min

    Multicorn: writing FDWs in Python

    Multicorn is a generic Foreign Data Wrapper whose goal is to simplify the development of FDWs by letting you write them in Python.

    We will see:

    • what an FDW is, what Multicorn is trying to solve, and how to use it, with a brief tour of the FDWs shipping with Multicorn.
    • how to write your own FDW in Python, including the new 9.5 IMPORT FOREIGN SCHEMA API.
    • the internals: what Multicorn is doing for you behind the scenes, and what it doesn't.

    After a presentation of FDWs in general, and what the Multicorn extension really is, we will take a look at some of the FDWs bundled with Multicorn.

    Then, a complete tour of the Multicorn API will teach you how to write an FDW in Python, including the following features:

    • using the table definition
    • WHERE clauses push-down
    • output columns restrictions
    • influencing the planner
    • writing to a foreign table
    • IMPORT FOREIGN SCHEMA
    • ORDER BY clauses pushdown
    • transaction management

    This will be a hands-on explanation, with code snippets allowing you to build your own FDW in Python from scratch.
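
    As a rough idea of what such an FDW looks like, here is a minimal sketch and not material from the talk itself. It subclasses Multicorn's `ForeignDataWrapper` and implements `execute(quals, columns)`, handling only equality quals and the requested output columns. The `ROWS` data, the `InMemoryFdw` class, and the `FakeQual` helper are hypothetical names invented for illustration; the fallback base class exists only so the sketch can be run outside PostgreSQL.

```python
from collections import namedtuple

try:
    # Available when the code runs inside PostgreSQL with Multicorn installed.
    from multicorn import ForeignDataWrapper
except ImportError:
    # Stand-in base class (an assumption of this sketch) for running it locally.
    class ForeignDataWrapper(object):
        def __init__(self, options, columns):
            pass

# Hypothetical stand-in for the qual objects Multicorn passes to execute();
# they expose field_name, operator and value attributes.
FakeQual = namedtuple("FakeQual", ["field_name", "operator", "value"])

# Hypothetical in-memory "remote" data source.
ROWS = [
    {"id": 1, "name": "alpha"},
    {"id": 2, "name": "beta"},
    {"id": 3, "name": "gamma"},
]


class InMemoryFdw(ForeignDataWrapper):
    """Expose ROWS as a foreign table."""

    def __init__(self, options, columns):
        super(InMemoryFdw, self).__init__(options, columns)
        self.rows = ROWS

    def execute(self, quals, columns):
        # WHERE push-down: apply the equality quals we understand and ignore
        # the rest; PostgreSQL rechecks every qual on the returned rows.
        equalities = [q for q in quals if q.operator == "="]
        for row in self.rows:
            if all(row.get(q.field_name) == q.value for q in equalities):
                # Output column restriction: return only what was requested.
                yield {name: row.get(name) for name in columns}


# Local usage, outside PostgreSQL:
fdw = InMemoryFdw({}, {"id": None, "name": None})
all_rows = list(fdw.execute([], ["id", "name"]))
filtered = list(fdw.execute([FakeQual("id", "=", 2)], ["name"]))
```

    Inside PostgreSQL, a class like this is normally referenced through the `wrapper` option of a server created with the multicorn foreign data wrapper, and Multicorn then calls `execute` for each scan of the foreign table.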

  • Eugeniy Tyumentcev
    Eugeniy Tyumentcev ООО "Здравствуй мир! Технологии"
    22 min

    Using JSONB in Real Projects

    We will consider the advantages and disadvantages of JSONB-based solutions compared to the traditional relational approach in real projects, including:

    1. Performance
    2. Data versioning
    3. Scalability
    4. Reliability
    5. Report building