PgConf.Russia 2017 talks

Dynamic Compilation of SQL Queries in PostgreSQL Using LLVM JIT

Dmitry Melnik
ISP RAS, lead developer

Dmitry has been working on development and optimization of compilers at the Institute for System Programming of Russian Academy of Sciences for more than 10 years. Before practising PostgreSQL, he had made an impact in open source projects such as GCC and LLVM compilers, JIT-compilers of JavaScript (WebKit/JSC и V8). Currently he is working on JIT-compiler for PostgreSQL.

Currently, to execute SQL queries PostgreSQL uses interpreter, which implements Volcano-style iteration model. At the same time it’s possible to get significant speedup by dynamically JIT-compiling query “on-the-fly”. In this case it’s possible to generate code that is specialized for given SQL query, and perform compiler optimizations using the information about table structure and data types that is already known at run time. This approach is especially important for complex queries, which performance is CPU-bound. In our talk we’ll discuss how dynamic compilation with LLVM compiler infrastructure can be used to speed up various stages of SQL query execution: - compiling expressions; - compiling Scan, Aggregation, Sort and Join methods of Executor tree nodes; - compiling indexing method; - saving compiled native code for PREPARED statements to speedup OLTP queries.

Also we’ll show the necessary architectural changes in PostgreSQL for dynamic compilation to be effective. E.g. we had to switch from PostgreSQL original “pull” iteration model to “push” model, because the former model didn’t allow JIT compiler to optimize the code effectively (due to virtual calls and saving internal state in Executor tree nodes).

We’ll also discuss the technique for automatic translation of PostgreSQL backend functions used in queries. This method allows to reuse the same backend source code both for JIT compiler and the interpreter.

As the result we have achieved significant speedup on TPC-H benchmark. For expressions JIT the speedup is 20% (the source code is publicly available at github.com/ispras/postgres). Also we have developed PostgreSQL extension, which implements all described optimization techniques at different query execution stages, for which on the same TPC-H benchmark we have achieved up to 5x speedup.

VIDEO

Slides