Linux VMM for database developers

We'll discuss how Linux manages virtual memory. The following topics will be covered:
* x86-64 page tables, context switches and page faults;
* internals of virtual memory management (VMM) in Linux;
* page eviction methods in Linux, the page cache and anonymous pages;
* huge and gigantic pages, transparent huge pages;
* how mmap(2) works and what madvise(2), msync(2) etc. provide;
* why large databases don't use mmap(2), but rather implement their own buffer pool;
* and, of course, how to tune the Linux VMM using sysctl.
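To make the mmap(2), madvise(2) and msync(2) topic concrete, here is a minimal sketch using Python's standard `mmap` module, which wraps those system calls on Linux. The helper names `update_through_mapping` and `demo` are hypothetical and exist only for this illustration; the sketch assumes a Unix-like system and Python 3.8 or later for `madvise`.

```python
import mmap
import os
import tempfile


def update_through_mapping(path, offset, payload):
    """Overwrite bytes of a file through a memory mapping, then sync to disk."""
    with open(path, "r+b") as f:
        # Length 0 maps the whole file; ACCESS_WRITE writes through to the file.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_WRITE) as m:
            # Optional access-pattern hint (madvise(2)); guarded because the
            # method and constant are platform- and version-dependent.
            if hasattr(m, "madvise") and hasattr(mmap, "MADV_SEQUENTIAL"):
                m.madvise(mmap.MADV_SEQUENTIAL)
            # A plain memory store: the kernel marks the page dirty.
            m[offset:offset + len(payload)] = payload
            # flush() asks the kernel to write dirty pages back (msync(2)).
            m.flush()


def demo():
    """Create a two-page scratch file, update it via the mapping, read it back."""
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"\0" * (2 * mmap.PAGESIZE))
        path = tmp.name
    try:
        update_through_mapping(path, mmap.PAGESIZE, b"hello")
        with open(path, "rb") as f:
            f.seek(mmap.PAGESIZE)
            return f.read(5)
    finally:
        os.unlink(path)
```

The write itself involves no explicit I/O call. The application only decides when to call `flush()`, while the kernel remains free to write dirty pages back earlier on its own schedule, which is one reason a database may prefer to manage its own buffer pool.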



    Speeding up query execution in PostgreSQL using LLVM JIT compiler

    Currently, PostgreSQL uses an interpreter to execute SQL queries. This incurs overhead from indirect calls to handler functions and from runtime checks, which could be avoided if the query were compiled into native code on the fly (i.e. JIT-compiled): at run time, the specific table structure is known, as are the data types used in the query. This is especially important for complex queries whose performance is CPU-bound. At the moment there are two major projects that implement JIT compilation in PostgreSQL: the commercial database Vitesse DB and the open-source project PG-Strom. The former uses LLVM JIT to achieve up to an 8x speedup on selected TPC-H benchmarks, while the latter JIT-compiles the query using CUDA and executes it on a GPU, which speeds up execution of specific query types by an order of magnitude.
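The difference between interpreting an expression and specializing it once its structure is known can be sketched in a few lines. This is only a Python analogy under loud assumptions: the tuple-based expression tree and the names `interpret`, `to_source` and `specialize` are hypothetical, and a real JIT would emit LLVM IR rather than Python source. It is not PostgreSQL's executor.

```python
import operator

# Dispatch table: each operator is reached through an indirect call.
OPS = {"+": operator.add, "*": operator.mul, "<": operator.lt}

# Hypothetical filter expression: price * qty < 100
EXPR = ("op", "<",
        ("op", "*", ("col", "price"), ("col", "qty")),
        ("const", 100))


def interpret(node, row):
    """Walk the tree for every row: per-node dispatch plus runtime type checks."""
    kind = node[0]
    if kind == "col":
        return row[node[1]]
    if kind == "const":
        return node[1]
    if kind == "op":
        left = interpret(node[2], row)
        right = interpret(node[3], row)
        # Runtime check repeated for every row, even though types never change.
        if not isinstance(left, (int, float)) or not isinstance(right, (int, float)):
            raise TypeError("numeric operands expected")
        return OPS[node[1]](left, right)
    raise ValueError(f"unknown node kind: {kind}")


def to_source(node):
    """Translate the tree into a single Python expression, once per query."""
    kind = node[0]
    if kind == "col":
        return f"row[{node[1]!r}]"
    if kind == "const":
        return repr(node[1])
    if kind == "op":
        return f"({to_source(node[2])} {node[1]} {to_source(node[3])})"
    raise ValueError(f"unknown node kind: {kind}")


def specialize(node):
    """Compile the expression into one function with no tree walk or dispatch."""
    return eval(f"lambda row: {to_source(node)}")
```

Calling `specialize(EXPR)` once and applying the resulting function to each row gives the same answers as `interpret(EXPR, row)`, but the tree traversal, operator lookup and type checks are paid once per query rather than once per row.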

    Our work adds support for JIT compilation of SQL queries to PostgreSQL using the LLVM compiler infrastructure. In the presentation we'll discuss how JIT compilation can be used to speed up various stages of query execution in PostgreSQL, and the specifics of translating an SQL query into LLVM bitcode so that the resulting native code is efficient. We'll also present preliminary results for our JIT compiler on the TPC-H benchmark.

    My Five Slides About Postgres

    My experience of working with PostgreSQL has given me a clear understanding of its main advantages, which are the reasons we choose it and recommend it to others.
    1. Beginning
    2. Documentation
    3. Community
    4.1 Transactional DDL
    4.2 WAL and True Physical Replication
    4.3 Transactional Snapshot and True Logical Replication and PGQ
    4.4 Exciting extensibility
    5. Success

