
Managing Petabytes of Logs in PostgreSQL
Where I used to work, we had pushed Elasticsearch to its breaking point. We needed a more scalable replacement for a write-heavy, read-seldom workload, so we built one on PostgreSQL. Now many of us are building its successor as an open source project.
This talk covers the design of Bagger (named after the giant mining machines), which can manage log volumes in the tens or hundreds of petabytes. More than a review of the architecture, it focuses on the reasons behind the design and the tradeoffs made along the way.
The talk is intended both to showcase how programmable and powerful PostgreSQL is and to illustrate the fundamental tradeoffs that must be faced when pushing any technology into the big data space.