Sangwook (Shawn) Kim, Apposha
15:45 04 February
45 min

Make Your PostgreSQL 10x Faster on Cloud in Minutes

Cloud storage behaves differently from traditional storage, mainly because it is virtualized and controlled by software. For example, AWS EBS delivers higher throughput as the I/O size grows, up to 256 KiB, without hurting latency. With a 1,000 IOPS EBS volume, a user gets only about 4 MiB/sec if the I/O request size is 4 KiB, but about 250 MiB/sec if the request size is 256 KiB. This is because EBS charges one I/O against the IOPS budget for every request, regardless of its size (up to 256 KiB). Unfortunately, PostgreSQL cannot exploit the full potential of cloud storage because it was designed without these characteristics in mind.
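The arithmetic behind those two figures can be checked with a short sketch. It assumes only what the abstract states: each request of up to 256 KiB costs one I/O from the budget. The function name and constants are illustrative and not part of any AWS or AppOS API.

```python
KIB = 1024
MIB = 1024 * KIB
# Largest request size that still counts as a single I/O, per the abstract.
MAX_IO_BYTES = 256 * KIB


def max_throughput_mib_per_sec(iops_budget: int, io_size_bytes: int) -> float:
    """Upper bound on throughput when every request costs exactly one I/O."""
    if io_size_bytes <= 0 or io_size_bytes > MAX_IO_BYTES:
        raise ValueError("I/O size must be between 1 byte and 256 KiB")
    return iops_budget * io_size_bytes / MIB


small = max_throughput_mib_per_sec(1000, 4 * KIB)    # 4 KiB requests
large = max_throughput_mib_per_sec(1000, 256 * KIB)  # 256 KiB requests
print(f"4 KiB requests:   {small:.2f} MiB/sec")
print(f"256 KiB requests: {large:.2f} MiB/sec")
```

With the same 1,000 IOPS budget, the 256 KiB case moves 64 times as much data per second, which is the gap the abstract describes as roughly 4 MiB/sec versus 250 MiB/sec.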

In this talk, I will introduce the AppOS extension, which improves the throughput of a write-intensive workload by 10x by transparently making PostgreSQL cloud storage-native. AppOS works like a storage driver that efficiently exploits the characteristics of cloud storage, such as the dependence of storage throughput and latency on I/O size, atomic write support in cloud block storage, and fast but non-durable local SSDs. To do this, AppOS comprises a Linux-compatible file I/O stack that includes a virtual file system, a page cache, a block I/O layer, and a cloud storage driver. On top of the file I/O stack, a syscall module supports registering pre- and post-handlers for file I/O-related system calls, so that AppOS works transparently without modifying PostgreSQL code.
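The idea of pre- and post-handlers around a file I/O call can be illustrated with a small sketch. This is not the AppOS interface, which the abstract does not describe. The `SyscallHooks` class, its method names, and the in-memory `fake_write` stand-in are all hypothetical and serve only to show the general pattern of wrapping a call without changing the caller's logic.

```python
from collections import defaultdict
from typing import Any, Callable, Dict, List, Tuple

# A pre-handler receives the call's arguments and returns (possibly changed) arguments.
PreHandler = Callable[..., Tuple[Any, ...]]
# A post-handler receives the result followed by the arguments and returns the result.
PostHandler = Callable[..., Any]


class SyscallHooks:
    """Hypothetical registry of pre- and post-handlers keyed by call name."""

    def __init__(self) -> None:
        self._pre: Dict[str, List[PreHandler]] = defaultdict(list)
        self._post: Dict[str, List[PostHandler]] = defaultdict(list)

    def register_pre(self, name: str, handler: PreHandler) -> None:
        self._pre[name].append(handler)

    def register_post(self, name: str, handler: PostHandler) -> None:
        self._post[name].append(handler)

    def invoke(self, name: str, real_call: Callable[..., Any], *args: Any) -> Any:
        # Run pre-handlers, then the real call, then post-handlers.
        for handler in self._pre[name]:
            args = handler(*args)
        result = real_call(*args)
        for handler in self._post[name]:
            result = handler(result, *args)
        return result


# Stand-in for a real write call: appends to an in-memory "file".
storage = bytearray()


def fake_write(fd: int, data: bytes) -> int:
    storage.extend(data)
    return len(data)


seen_sizes: List[int] = []
bytes_written = {"total": 0}


def record_size(fd: int, data: bytes) -> Tuple[int, bytes]:
    # Pre-handler: observe the request size before the call runs.
    seen_sizes.append(len(data))
    return fd, data


def count_bytes(result: int, fd: int, data: bytes) -> int:
    # Post-handler: account for the bytes the call reported as written.
    bytes_written["total"] += result
    return result


hooks = SyscallHooks()
hooks.register_pre("write", record_size)
hooks.register_post("write", count_bytes)

written = hooks.invoke("write", fake_write, 3, b"x" * 4096)
```

In a design like this, a pre-handler is the natural place to redirect or merge requests before they reach storage, and a post-handler is the natural place for accounting or cleanup. Whether AppOS divides the work this way is an assumption.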

I will focus on presenting key use cases and performance results of the AppOS extension after explaining the internals. Specifically, I will show the performance results of OLTP and some batch workloads using standard benchmarking tools like pgbench and sysbench. I will also present performance results and implications on multiple clouds including AWS, GCP, and Azure.

Talk materials

Slides

Video

Other talks

  • Bruce Momjian, EnterpriseDB
    45 min

    Non-Relational Postgres

    Postgres has always had strong support for relational storage. However, there are many cases where relational storage is either inefficient or overly restrictive. This talk shows the many ways that Postgres has expanded to support non-relational storage, specifically the ability to store and index multiple values, even unrelated ones, in a single database field. Such storage allows for greater efficiency and simpler access, and can also avoid the drawbacks of entity-attribute-value (EAV) storage. The talk will cover many examples of multiple-value-per-field storage, including arrays, range types, geometry, full text search, XML, JSON, and records.

  • Ivan Frolkov, Postgres Professional
    45 min

    Transaction Isolation Levels in PostgreSQL

    Everyone has heard something about transaction isolation levels, but oddly enough, almost no one can clearly explain what they are and why they matter. Yet for many operations, it is critical to understand isolation levels and how they can affect the result. Indeed, if a customer has been paid twice and the developer has to cover the losses, the topic will not seem unimportant. We'll discuss how to avoid such unpleasant situations.

  • Heikki Linnakangas, Pivotal
    45 min

    Writing a User-Defined Datatype

    A walk-through of extending PostgreSQL with a user-defined type. The journey begins with the basics, creating simple domain types over existing types, and continues to implementing a full-blown datatype from scratch in C.

    PostgreSQL's advanced index types, GiST, GIN, and SP-GiST, are covered in enough detail to give an understanding of what each of them is good for. Support functions for each of them are shown for the example 'color' datatype.

  • Алексей Лесовский, PostgreSQL Consulting LLC
    45 min

    PostgreSQL Scaling Use Cases

    Today no one is surprised by cloud infrastructure anymore, but not all of its components are easy to deploy in the cloud. The database, for example, is always very demanding in terms of performance and resources. Scaling and fault tolerance are the most acute problems, which is why we have seen rapid development of alternative DBMSs in recent years. However, traditional relational DBMSs have already accumulated a wide range of features, so they often remain the first choice. Besides, they are constantly evolving and offer a wide variety of scaling tools. I will mainly talk about PostgreSQL: when you should consider scaling it and how to do it right.

    We will touch upon the following topics:
    - Streaming replication and balancing read/write workloads
    - Logical replication and data sharding
    - High availability and fault tolerance

    This talk should be of interest to DBAs, system administrators, team leads, and infrastructure architects, as well as a wider audience working with PostgreSQL.