Dynamic Compilation of SQL Queries in PostgreSQL Using LLVM JIT
ISP RAS, lead developer
Currently, PostgreSQL executes SQL queries with an interpreter that implements the Volcano-style iteration model. A significant speedup is possible, however, by JIT-compiling a query on the fly. The generated code can be specialized for the given SQL query, and compiler optimizations can use the information about table structure and data types that is already known at run time. This approach is especially important for complex queries whose performance is CPU-bound.
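To make the contrast concrete, here is a minimal sketch in Python rather than LLVM. It is only an analogy, not the implementation described in this talk: the expression tree, the column names, and the helper functions are all hypothetical. The interpreter dispatches on the node kind for every row, while the "compiled" variant generates source code once for the given expression and then runs that specialized code for each row.

```python
import operator

# Hypothetical expression tree for the filter: price * qty > 100
EXPR = (">", ("*", ("col", "price"), ("col", "qty")), ("const", 100))

OPS = {"+": operator.add, "*": operator.mul, ">": operator.gt, "<": operator.lt}


def interpret(node, row):
    """Interpreter: walk the tree and dispatch on the node kind for every row."""
    kind = node[0]
    if kind == "col":
        return row[node[1]]
    if kind == "const":
        return node[1]
    return OPS[kind](interpret(node[1], row), interpret(node[2], row))


def to_source(node):
    """Code generator: turn the tree into a Python expression string, once."""
    kind = node[0]
    if kind == "col":
        return f"row[{node[1]!r}]"
    if kind == "const":
        return repr(node[1])
    return f"({to_source(node[1])} {kind} {to_source(node[2])})"


def compile_expr(node):
    """Build a function specialized for this one expression (stand-in for JIT)."""
    return eval(f"lambda row: {to_source(node)}")


rows = [
    {"price": 30, "qty": 2},
    {"price": 60, "qty": 2},
    {"price": 10, "qty": 11},
]

predicate = compile_expr(EXPR)
interpreted = [interpret(EXPR, r) for r in rows]
compiled = [predicate(r) for r in rows]
```

Both paths return the same results; the difference is that the specialized function no longer inspects the tree at run time, which is the property a real JIT compiler exploits for optimization.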
We’ll also show the architectural changes that PostgreSQL needs for dynamic compilation to be effective. For example, we had to switch from PostgreSQL’s original “pull” iteration model to a “push” model, because the pull model did not allow the JIT compiler to optimize the code effectively (due to virtual calls and internal state being saved in Executor tree nodes).
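The difference between the two models can be sketched as follows. This is an illustrative Python toy under stated assumptions, not PostgreSQL's Executor: the `Scan` and `Filter` classes and the `push_*` functions are hypothetical names. In the pull version each node keeps its position as internal state and is driven through a `next()` call; in the push version the scan owns the loop and hands each row to its consumer, so the whole pipeline is a chain of plain calls inside a single loop.

```python
# Pull (Volcano-style): the consumer repeatedly asks its child for the next row.
class Scan:
    def __init__(self, rows):
        self.rows = rows
        self.pos = 0  # iteration state lives inside the plan node

    def next(self):
        if self.pos >= len(self.rows):
            return None
        row = self.rows[self.pos]
        self.pos += 1
        return row


class Filter:
    def __init__(self, child, pred):
        self.child = child
        self.pred = pred

    def next(self):
        # One dynamically dispatched call into the child per candidate row.
        while (row := self.child.next()) is not None:
            if self.pred(row):
                return row
        return None


def pull_sum(root, col):
    total = 0
    while (row := root.next()) is not None:
        total += row[col]
    return total


# Push: the producer drives the loop and passes each row to its consumer.
def push_scan(rows, consume):
    for row in rows:
        consume(row)


def push_filter(pred, consume):
    def on_row(row):
        if pred(row):
            consume(row)
    return on_row


def push_sum(rows, pred, col):
    total = 0

    def add(row):
        nonlocal total
        total += row[col]

    push_scan(rows, push_filter(pred, add))
    return total


rows = [{"a": 1, "b": 10}, {"a": 5, "b": 20}, {"a": 7, "b": 30}]
keep = lambda r: r["a"] > 2

pull_result = pull_sum(Filter(Scan(rows), keep), "b")
push_result = push_sum(rows, keep, "b")
```

Both variants compute the same sum. In the push form the per-row work sits in one loop with no state to save and restore between calls, which is the shape that a compiler can inline and optimize more readily.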
We’ll also discuss a technique for automatically translating the PostgreSQL backend functions used in queries. This method makes it possible to reuse the same backend source code for both the JIT compiler and the interpreter.
As a result, we have achieved a significant speedup on the TPC-H benchmark. For JIT compilation of expressions, the speedup is 20% (the source code is publicly available at github.com/ispras/postgres). We have also developed a PostgreSQL extension that implements all of the described optimization techniques at different stages of query execution; with it we have achieved up to a 5x speedup on the same TPC-H benchmark.