Bagger: How we migrated 1 PB of data from Elasticsearch into PostgreSQL
In this talk, I will tell you the story of how a bunch of sysadmins got sick of having to resuscitate their petabyte-sized Elasticsearch cluster and decided to replace it with some tried-and-tested technologies: PostgreSQL, Kafka, a bit of Redis, lots of glue, and the typical sysadmin stubbornness. The result is Bagger: the sysadmin answer to Big Data. It is a fast, fairly reliable, fault-tolerant store, used mostly for logging timestamped events for some amount of time. Bagger is named after the Bagger series of bucket-wheel excavators, feats of German engineering and some of the largest land vehicles ever produced by man. Just like the excavators that dig through tons of material, our Bagger digs through tons of data.
Slides
Video
Other talks
-
Valery Kosarev -
Pluggable storage for large objects
Storing binary data in database tables is sometimes a good solution for a particular project. But sometimes, because conditions change or the decision was not thought through, such storage becomes a real nightmare. Even when it is understood how and where to place this data, the transition to a new solution is often very hard: it frequently requires changes to the application code and system downtime for the migration. This presentation describes one particular solution to such problems. Our extension allows binary data to be moved from the database to Ceph storage (and not only Ceph), and does so seamlessly for applications.
-
Egor Rogov Postgres Professional
Tutorial: More indexes, good and various
"And telling GIN from SP-GiST was quite beyond his wit, we found", said the classic. Can you? This masterclass is about index types that are used less often than the conventional B-tree but can nevertheless do a great job for you. We will look into the internal mechanics of these indexes and discuss cases where they can be successfully applied. We will also talk about some peculiarities of PostgreSQL index access. To spend the time efficiently, listeners are required to have basic knowledge of PostgreSQL and should be used to reading the plans of simple queries.
Materials of the master class
A backup copy of the database with demo data can be downloaded here:
- Recovery with pg_restore (338 MB)
-
Dmitriy Pavlov Arenadata
How to train your Greenplum
In this talk I will cover the most important nuances of deploying and operating Greenplum, the distributed analytical open-source database based on PostgreSQL. I will analyze typical mistakes in its use, share best practices, and warn about bottlenecks.
-
Dmitry Shitov, OOO "TsTP"
How I Met Your Linux
What is the real cost of not paying for Windows for a 1C user? Is there life without COM? Addressing and other issues of the PostgreSQL bundle. Scheduling disk resources. How to survive a CentOS crash.