Talks
Talks archive
-
Nikolai Shaplov, PostgresPro
Fuzzing means feeding random input data to a program, or to a part of it, and seeing what comes out (in practice, the randomness is heavily constrained). This is repeated many times on many processors.
Fuzzing a large monolithic software system is never simple and calls for unconventional solutions. In this talk, I will describe what we looked for with fuzzing, how we did it, and what came of it.
- Fuzzing data type parsing functions (input functions): a warm-up;
- Fuzzing functions that implement operations between types (op functions): here it pays to take the input structure into account;
- Fuzzing the network subsystem: let's pretend to be the POSIX calls, it's cheaper that way;
- Restoring the on-disk context: we need Groundhog Day.
A story about funny bugs and ridiculous hand gestures will be included.
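As a rough illustration of the basic loop described above (generate input, run the target, record anything that is not a clean rejection), here is a minimal Python sketch. It is not the tooling used in the talk: `parse_point` is a hypothetical toy stand-in for a data type input function, with a deliberately planted bug, and `fuzz` is an assumed helper name.

```python
import random


def parse_point(text: str) -> tuple[float, float]:
    """Hypothetical toy input function: parses "(x,y)" into two floats.

    Planted bug for the demo: indexing text[0] on an empty string
    raises IndexError instead of a clean ValueError.
    """
    if text[0] != "(" or text[-1] != ")":
        raise ValueError("missing parentheses")
    x, y = text[1:-1].split(",")  # ValueError unless exactly two parts
    return float(x), float(y)


def fuzz(target, iterations: int, seed: int = 0) -> list[tuple[str, Exception]]:
    """Feed pseudo-random strings to target and collect unexpected failures."""
    rng = random.Random(seed)
    # Restricted alphabet: the "randomness" is deliberately constrained
    # so that inputs resemble what the parser expects.
    alphabet = "()0123456789.,-e "
    findings = []
    for _ in range(iterations):
        length = rng.randint(0, 12)
        data = "".join(rng.choice(alphabet) for _ in range(length))
        try:
            target(data)
        except ValueError:
            pass  # a clean rejection of bad input is the expected outcome
        except Exception as exc:  # anything else counts as a finding
            findings.append((data, exc))
    return findings


if __name__ == "__main__":
    for data, exc in fuzz(parse_point, 1000)[:3]:
        print(repr(data), type(exc).__name__)
```

The fixed seed makes a run reproducible, which helps when a finding has to be replayed. A real campaign against a large C code base would typically rely on a coverage-guided fuzzer rather than blind generation like this.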
-
Artem Sergienko, PostgresPro
Hardening is the process of strengthening a system's security to reduce the risk from possible threats. In my presentation, I will explain how to protect the cluster's service communications with TLS connections, so as to prevent accidental or unauthorized access to Patroni's REST API and the etcd store.
-
Anton Doroshkevich, InfoSoft
Backup is still a stumbling block when migrating to PostgreSQL from other DBMSs. How big an obstacle it is depends directly on your experience with, and knowledge of, the types of backups available in PostgreSQL. In this talk, I will cover the different types of backups, their pros and cons, and the scenarios each type suits.
-
Vladimir Komarov, SberTech
There are a lot of different databases, and we need formal criteria to compare them with each other. The first idea is to separate SQL from NoSQL. NoSQL is a popular class of platforms developed in the 2000s. Rejecting SQL is not a new idea, though: the relational model had predecessors, such as the network and hierarchical models. The recent "NoSQL" wave consists of the graph, object, and key-value models. Time-series, wide-column, and "document-oriented" models are just extensions of the key-value model; their advantage is that the database server can parse the key or the value. SQL offers far more than the key-value interface does, so the simplified interface is simply the price paid for the ability to build a distributed database.
So the data model is the first axis, and distribution is the second. Building a distributed relational database is not trivial, because distributed transactions are among the most complex problems in IT, and a single SQL statement can involve all the nodes in one transaction. There are interesting attempts to create a distributed relational database, and Cockroach and Yugabyte deserve attention, but these platforms have not become widespread.
At some point, someone invented the in-memory cache. As random access memory got cheaper, in-memory technologies came to databases. Every class of platforms considered so far has at least one in-memory member. TimesTen and SolidDB are relational and monolithic; Tarantool, Ignite, etc. are key-value and distributed; VoltDB is relational and distributed. The storage medium thus becomes the third axis.
You may recall Teradata, Greenplum, MS PDW, and a few other distributed relational platforms. They are very successful commercial products, true, but they are not intended for transaction processing. So the fourth axis is the load type: OLTP vs. OLAP.
I would like to draw a 4-dimensional cube on the blackboard, but I can’t :) There are no clear borders between the classes described. Relational databases acquire non-relational facilities, while non-relational platforms implement SQL. Disk-based systems gain in-memory features, while in-memory databases learn to store data on disk. Monolithic platforms get distributed versions.
The main idea of this presentation is the following: first define the class of platforms that fits your solution, and only then choose a platform within that class. Not all the classes are equal. Monolithic platforms are much more robust than distributed ones. The relational model is universal, in contrast to NoSQL. On-disk storage is cheaper than in-memory. That’s why a relational, monolithic, on-disk platform is almost always the right choice. So, choose PostgreSQL! This platform really covers more than 90% of problems.