Data analysis for all: developing a BI system on limited and even shared hardware
This talk covers the experience we gained over five years of developing, deploying, and improving the BI system http://colibri365.ru, which is used in government. I will talk about the reality of government IT and how we dealt with it: Postgres performance improvements, use of the latest features, rewriting user-generated queries to help the query optimizer, and other tweaks and hacks to tackle the problems of limited hardware. This work led us to a number of computer science papers and (now committed) patches to Postgres (see Andrey Borodin's talks for details).
VIDEO
Slides
Other talks
-
Hans-Jürgen Schönig Cybertec Schönig & Schönig GmbH
Processing 1 BILLION rows per second with PostgreSQL
Database systems are increasing in size, and so is the need to process huge amounts of data in real time. While commercial database vendors brag about their capabilities, we decided to push PostgreSQL to the next level and exceed 1 billion rows per second to show what can be done with open source. For those who need even more: 1 billion rows per second is far from the limit, and a lot more is possible. Watch and see how we did it.
VIDEO
-
Marco Slot Citus Data
Towards 1M writes/sec: Scaling PostgreSQL using Citus MX
Citus allows you to distribute Postgres tables across many servers. It extends Postgres to transparently delegate or parallelise work across a set of worker nodes, enabling you to scale out the CPU and memory available for queries.
One year ago, we began a long journey to allow Citus to scale out along another dimension: write throughput. With writes being routed through a single Postgres node, write throughput in Citus was ultimately bottlenecked on the CPUs of that node. Citus MX is a new edition of Citus which allows distributed tables to be used from any of the nodes, enabling NoSQL-like write scalability.
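To make the idea of distributing a table across workers more concrete, here is a minimal Python sketch of hash-based routing on a distribution column. It is an illustration only and not Citus code: the names `WORKERS`, `SHARD_COUNT`, `shard_for`, `worker_for`, and `route_writes`, the use of CRC32 as the hash, and the round-robin shard placement are all assumptions made for this example.

```python
import zlib

# Hypothetical cluster layout (an assumption, not taken from the talk).
WORKERS = ["worker-1", "worker-2", "worker-3", "worker-4"]
SHARD_COUNT = 32


def shard_for(key: str) -> int:
    """Map a distribution-column value to a shard id using a stable hash."""
    return zlib.crc32(key.encode("utf-8")) % SHARD_COUNT


def worker_for(key: str) -> str:
    """Pick the worker holding the shard (round-robin placement in this sketch)."""
    return WORKERS[shard_for(key) % len(WORKERS)]


def route_writes(keys):
    """Group incoming keys into one batch per worker."""
    batches = {worker: [] for worker in WORKERS}
    for key in keys:
        batches[worker_for(key)].append(key)
    return batches
```

Because the mapping depends only on the key and the shared layout, any node that knows the layout can compute the target worker itself, which is the property that lets writes bypass a single coordinating node in a design of this kind.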
-
Aleksei Plotnikov Skype
Database platform architecture and administering PostgreSQL at Skype
Most of the main Skype services use a database platform based on PostgreSQL and other open-source technologies, such as Skytools, PL/Proxy, PgBouncer, etc. This platform consists of several hundred servers with thousands of databases, which process hundreds of thousands of transactions per second. At the same time, the platform architecture allows its users (applications and their developers) to work with "logical" databases without worrying about their real "physical" structure.
Our Skype Database Platform team is responsible for the database platform infrastructure. We develop automation systems for various processes that help us ensure service reliability and facilitate development, testing, and deployment of code. In this presentation, I will outline the database platform architecture, review its main components, and tell you about the methods we use in our everyday work to ensure high availability, scalability, replication, fault tolerance, and more.
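The separation between "logical" and "physical" databases can be pictured with a small Python sketch. It is loosely modeled on the hash-and-mask partition selection commonly used in PL/Proxy-style setups, and it is not a description of Skype's actual platform: the `LogicalDatabase` class, the connection strings, and the use of CRC32 as the hash are all hypothetical.

```python
import zlib
from dataclasses import dataclass


@dataclass(frozen=True)
class LogicalDatabase:
    """A logical database name backed by a set of physical partitions."""
    name: str
    partitions: tuple  # physical connection strings (hypothetical values)

    def __post_init__(self):
        # Hash-and-mask selection needs a power-of-two partition count.
        n = len(self.partitions)
        if n == 0 or n & (n - 1):
            raise ValueError("partition count must be a power of two")

    def resolve(self, key: str) -> str:
        """Return the physical partition that stores the given key."""
        mask = len(self.partitions) - 1
        return self.partitions[zlib.crc32(key.encode("utf-8")) & mask]


# The application only ever refers to the logical name "userdb".
userdb = LogicalDatabase(
    "userdb",
    ("host=db0", "host=db1", "host=db2", "host=db3"),
)
target = userdb.resolve("alice")
```

In a sketch like this, moving a partition to a different server only changes the `partitions` tuple, while application code that calls `resolve` stays the same.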