International PostgreSQL conference
PgConf.Russia 2016 was held in Moscow at 3-5 February 2016
PgConf.Russia 2016 talks
- Panel discussion
PostgreSQL Consulting LLC
Занимаюсь вопросами внедрения решений на платформе 1С:Предприятие с 2005 года, решениями с использованием PostgreSQL с 2008 года. Имею большой практический опыт запуска проектов внедрения 1С в среде linux
Bruce Momjian is co-founder and core team member of the PostgreSQL Global Development Group, and has worked on PostgreSQL since 1996. Bruce is employed by EnterpriseDB. Previously, he was employed by SRA Japan and other PostgreSQL support companies. He has spoken at many international open-source conferences and is the author of PostgreSQL: Introduction and Concepts, published by Addison-Wesley. Prior to his involvement with PostgreSQL, Bruce worked as a consultant, developing custom database applications for some of the world's largest law firms. As an academic, Bruce holds a Masters in Education, was a high school computer science teacher, and is currently an adjunct professor at Drexel University.
Postgres 9.5 adds many features designed to enhance the productivity of developers: UPSERT, CUBE, ROLLUP, JSONB functions, and PostGIS improvements. For administrators, it has row-level security, a new index type, and performance enhancements for large servers. This talk covers the top ten new features that appeared in the Postgres 9.5 release. It will also cover some of the major focuses for post-9.5 releases.
The talk will be articulated around all the traditional arguments to "how chose PostgreSQL over other choices in the database domain"... But also, and that's quite new in the comunity, what are the consequences of this choice. Because the PostgreSQL adoption brings adoption of other things like Linux, but also, Open Source thinking, the fast pace of PostgreSQL will command new methods of validation the company must adapt to... etc.
Recent Update about Postgres Adoption in China. Postgres is getting its momentum in China, especially in 2015, one of the biggest insurance company is adopting Postgres, and Alibaba is providing Postgres service in their public cloud, also there are a lot of significant progress about the adoption. This talk will give an overview about the Postgres adoption in 2015 in China.
Postgres and Oracle DBA since 2002. Great experience with critical and large environments, development, security and performance in databases. Made dozens of lectures in Brazil and publish hundreds of articles about databases on his blog http://savepoint.blog.br (only portuguese) in the last 10 years. Participated in the regional PSL (Free Software Projects organization), brazilian Debian comunity, and brazilian postgres comunity. Loves good beers and cycling in every city he could.
Challenges and solutions found in documents dematerialization and bank cheque processing system used in the Bank of Brazil.
A year passed after birth of Postgres Pro, the Russian PostgreSQL company. The talk will describe the main achievements of the year are the future plans, including development, certification, russian documentation translate, education program.
PostgreSQL Global Development Group
Магнус является членом PostgreSQL Core Team, разработчиком и коммиттером PostgreSQL Global Development Group, президентом совета PostgreSQL Europe. Был одним из первоначальных разработчиков Windows-порта PostgreSQL. Сейчас работа Магнуса сосредоточена на функциональных возможностях движка PostgreSQL, с особым упором на вопросы безопасности, средства для мониторинга и контроля, инструментарий и программные интерфейсы для решения задач резервирования и репликации данных. Помимо этого, Магнус является одной из ключевых фигур инфраструктурной команды postgresql.org, которая отвечает за сервера, обслуживающие проект. Также, Магнус поддерживает актуальность Web-сайта postgresql.org и вносит свой вклад в иные проекты, такие как pgAdmin.
Unlike most other databases, PostgreSQL is developed by a community, and not by a company or even a foundation. Those who have been members of this community for a long time generally consider this a strength, but it can often be confusing to outsiders who are more used to dealing with traditional organization. For those who are not already on the inside, this talk will give an introduction to how the PostgreSQL community works and how the different parties interact, as well as how this has evolved over the years.
I'm working for Bitnine that provides the solution, called Agens SQL. I'm a member of Korean Community and want to lead the Korea Community.
The Korean PostgreSQL User Group has been relatively small and inactive for many years. However, recently things are changing in Korea. Companies are seeking to alternatives for their expensive proprietary RDBMS in order to cut their TCO. And the government institutes also participate in this trend. We, Bitnine, are leading these changes in Korea. We launched the first version of our PostgreSQL solution, Agens SQL in 2015. We are translating the PostgreSQL documentation into Korean and operating the PostgreSQL User Group. And we are trying to contribute the PostgreSQL Global Development Group. Also, the first Korean PostgreSQL Conference will be hold in 2016. We will lead the organization of this conference. In this talk, we will present the current status of the Korean PostgreSQL User Group and the PostgreSQL DBMS market in Korea. And we also present our activities in Korea and introduce our successful migration cases of the proprietary RDBMS into PostgreSQL.
Postgresql's buffer manager has parts where it's showing its age. We'll discuss how it currently works, what problems there are, and what attempts are in progress to rectify its weaknesses.
- Lookups in the buffer cache are expensive
- The buffer mapping table is organized as a hash table, which makes efficient implementations of prefetching, write coalescing, dropping of cache contents hard
- Relation extension scales badly
- Cache replacement is inefficient
- Cache replacement replaces the wrong buffers
Postgres was initially designed to support access methods extendability. Well known citation about access method in Postgres claims: "It is imperative that a user be able to construct new access methods to provide efficient access to instances of nontraditional base types" Michael Stonebraker, Jeff Anton, Michael Hirohama. Extendability in POSTGRES, IEEE Data Eng. Bull. 10 (2) pp.16-23, 1987
Initially, heap was just one for access methods. So, extendability of access methods would also mean pluggable storage engines in modern terms. For now, only index access methods are defined in pg_am table of system catalog. Those index access methods also have well-defined interface. Therefore in order to meet initial design PostgreSQL need to support two features:
- Pluggable index access methods, i.e. ability to implement new index types by adding new tuples to pg_am;
- Pluggable storage engines, i.e. ability to implement completely different storages for tables without traditional heap.
Besides mechanical work like "CREATE ACCESS METHOD" command, extensible index access methods needs to be WAL-logged. For now, community doesn't want extensions to define their own WAL-records, because there is a chance to break both recovery and replication, which is not acceptable. Another approach is to define generic WAL-records, that specify a difference between pages in generalized way.
There are only few DBMS which support pluggable storage engines now. MySQL is the most common example here. However, dealing with different storage engines in MySQL is like dealing with different DBMS. This is not the way PostgreSQL should go from our view.
However, now PostgreSQL users realize benefits from other storages. Ideas of columnar storages and in-memory storages for PostgreSQL are very popular. Simultaneously, technical possibilities to implement them are growing. FDW and custom nodes are arrived. Generic WAL and extensible index access methods are pending for 9.6. Much work in the direction of pluggable storage engines is already done even if it had different aims.
It's time for PostgreSQL core developers to think about native support of pluggable storages without kludges. Finally, we should get "CREATE STORAGE ENGINE name ..." command as legal extendability mechanism.
In this talk we will show current state on pluggable index access method and design of pluggable storage engines.
B-tree is the most widely used index type in PostgreSQL. This data structure and concerned algorithms are developed about forty years ago. But there is still an area for optimisations. In this presentation I'm going to talk about B-tree data structure, and its features important for the optimal index usage. Furthermore, I'll present a couple of new features which are expected to be included in PostgreSQL 9.6 release.
Whenever multiple users, processes, or threads are concurrently modifying data which is shared among them, problems can occur if race conditions are not handled somehow. These problems are particularly acute in a database which provide ACID semantics. A set of changes grouped into a database transaction must appear atomically, both to concurrent transactions and in terms of crash recovery. Each transaction must move the database from one consistent state (with regard to business rules) to another. For programming efficiency, each transaction must be able to be coded independently of what other transactions may happen to be running at the same time. In the event of a crash, all modifications made by transactions for which the application was notified of successful completion, and all modifications which had become visible to other transactions, must still be completed upon crash recovery. Over the years, various strategies have been employed to provide these guarantees, and sometimes the guarantees have been compromised in one way or another. This talk will cover the approaches taken to provide these guarantees or compromised variations of them, with an emphasis on the Serializable Snapshot Isolation (SSI) technique available in PostgreSQL (and so far not in any other production product). While SSI already performs faster and with higher concurrency than any other technique for managing race conditions with most common workloads, there are many opportunities for further enhancing performance, some of which would require the assistance of people expert in the various index access methods; these issues will be discussed. The talk will also present some rough ideas about how SSI techniques might be used with XTM in a distributed system.
Time will be reserved at the end of the talk for group discussion of optimizations and possible application in distributed environments.
Heikki is a long-time PostgreSQL developer and committer. He has worked on many parts of PostgreSQL backend, authoring features such as Two-Phase Commit and the new Free Space Map implementation in PostgreSQL 8.4.
PostgreSQL includes several index types: GiST, SP-GiST, GIN, and of course, the regular B-tree. DBAs are familiar with using each of these for specific use cases, GIN for full-text search, GiST for geometrical data, and so on, but how do they work internally? What makes them suitable for the cases they're typically used for?
In this presentation, I will walk through the internal structure of each of these index types, explaining what strengths and weaknesses each one of them have.
Gregory Stark has been hacking on Postgres since 2006 when he submitted CREATE INDEX CONCURRENTLY. Subsequent contributions included denser data storage, MergeAppend support for partitioned tables, and
When new versions of Postgres are released most of the attention is focused on new features. Inevitably a release note claiming speed improvements seems relatively mundane and doesn't provide the compelling argument for upgrading. However the reality is that these speed improvements represent pain points that have been identified and solved.
Reviewing the changes to the sort code in Postgres over the last 10 years clearly shows the kinds of problems users have run into. As usage patterns changed over years, databases scaled up, and hardware changed new problems arose and drove further development to solve them.
Upcoming changes in 9.5 and 9.6 will dramatically change the experience further. Making sorting UTF8 and other encodings less of a problem and handling scaling to larger machines with many processors and memory cache more effectively.
Author of several open source OODBMSes, member of cluster team in Postgres Professional
Enterprises need enterprise-level databases. The existing Postgres clustering solutions are not supported by the community. Postgres needs a community-supported cluster solution. There have been multiple attempts like Postgres-XC/XL, but they are still being developed separately and have low chance to be accepted by the community. Other solutions, like pg_shard, plproxy, FDW-based, etc. lack the notion of global transactions. We developed a Distributed Transaction Manager (DTM) as a Postgres extension to achieve global consistency over a number of Postgres instances. To demonstrate the capabilities of the DTM we present examples of distributed transaction processing using pg_shard and postgres_fdw. We hope that the proposed approach will be included into Postgres 9.6. This will make the development of the clustering solutions easier for all interested parties.
Andreas Scherbaum is working with PostgreSQL since 1997. He is involved in several PostgreSQL related community projects, member of the Board of Directors of the European PostgreSQL User Group and also wrote a PostgreSQL book (in German). Since 2011 he is working for EMC/Greenplum/Pivotal and tackles very big databases.
Greenplum is a PostgreSQL fork, optimized for Analytics and Data Warehouse use cases. Pivotal announced in early 2015 that a number of products will go Open Source, one of them is Greenplum Database. This talk provides an overview over the history of Greenplum, the entire process of bringing the product into Open Source, all the stumbling blocks we ran into, and explains how contributors can participate.
The talk is about PostgreSQL clusters using streaming replication and pgpool-II, which are quite popular in Japan. Plus, the next version of pgpool-II will be released this winter, so the talk will be about what's new in the version.
CitusDB is an extension for PostgreSQL that can distribute tables across a cluster of PostgreSQL servers. Data is stored in shards that can use append-partitioning for bulk-loading of time series data or hash-partitioning for real-time data ingestion. SELECT queries on distributed tables are transparently parallelised across the cluster, using all available cores. Distributed tables can also be joined in parallel, even if they are not partitioned along the same column. CitusDB is especially suitable for real-time analytics use-cases such as dashboards which require fast analytical queries over live data, and can simultaneously act as a scalable operational database. This talk will describe the internals of CitusDB and give a live demo of a large-scale CitusDB cluster.
Database engineer. Started professional career as a developer and later worked as a devops. Participated in the development of video platforms for Saint Petersburg Economics Forum 2013-2015, and the United State Exam in 2015. Now employed at Postgres Professional.
The talk describes performance benchmarking results of PostgreSQL on modern Hi-End servers. The main attention was paid to the locks for shared data access and associated bottlenecks. The testing propose was to test the linear read scalability limits with an increase of cores number allocated for PostgreSQL. Testing was performed for different postgres versions (9.4, 9.5, 9.6) to check new features designed to increase performance on multiprocessing architectures.
Architecture features of IBM Power 8 allowing to gain high performance in comparison with x86_64. CPU, memory, IO etc. Experience of real benchmarking.
Currently, PostgreSQL uses the interpreter to execute SQL-queries. This yields an overhead caused by indirect calls to handler functions and runtime checks, which could be avoided if the query were compiled into the native code "on-the-fly" (i.e. JIT-compiled): at a run time the specific table structure is known as well as data types used in the query. This is especially important for complex queries, which performance is CPU-bound. At the moment there are two major projects that implement JIT-compilation in PostgreSQL: a commercial database Vitesse DB and an open-source project PGStorm. The former uses LLVM JIT to achieve up to 8x speedup on selected TPC-H benchmarks, while the latter JIT-compiles the query using CUDA and executes it on GPU, which allows to speed up execution of specific query types by an order.
Our work is dedicated to adding support for SQL query JIT-compilation to PostgreSQL using LLVM compiler infrastructure. In the presentation we'll discuss how JIT-compilation can be used to speed up various stages of query execution in PostgreSQL, and the specifics of translating an SQL query into LLVM bitcode to achieve good performing native code. Also we'll present preliminary results for our JIT-compiler on TPC-H benchmark.
I am a student of machine learning department in the Lomonosov Moscow State University, a winner of a lot of olympiads in informatics. Currently I work in Postgres Professional. My responsibility is applying machine learning methods for obtaining better query execution plans.
In the speech we consider the current PostgreSQL planner model, then the possibilities of applying machine learning methods for planner improvement and the obtained results.
Git in Sky
Alex Chistyakov is a principal engineer in Git in Sky, a small consulting company focused on web operations, IT automation and performance engineering. He was working for a number of big software companies including Motorola, Mail.Ru, DataArt and Borland. Can read, understand and manually execute query plans.
We love to stress test software, since we are a performance engineering company. Our friends from a hosting company servers.com provided us with a modern dedicated server so we immediately started to test PostgreSQL in different environments, including SmartOS, DragonFly and Windows. We would like to present our results (and all the gory details) to community.
Alexander is CEO at Tempesta Technologies Inc. and lead developer of Tempesta FW. He's also founder of NatSys Lab., a company providing consultancy in high performance computing in Linux/x86-64 environment. Alexander has more than 10 years of experience in Linux kernel and 5 years in MySQL, InnoDB and Galera Cluster.
We'll discuss how does Linux work with virtual memory. The following topics will be covered: * x86-64 page table, context switch and page fault; * internals of virtual memory management (VMM) in Linux; * page eviction methods in Linux, page cache and anonymous pages; * huge and gigantic pages, transparent huge pages; * how mmap(2) works and what madvise(2), msync(2) etc. provide; * why large databases don't use mmap(2), but rather implement buffer pool on their own; * ans surely how to tune Linux VMM using sysctl.
Web developer, PhD student, author of the jsonbx extension and co-author of new functionality for the jsonb in PostgreSQL 9.5
Schema-less is definitely a trend in the data storage nowadays, and it's not only about NoSQL, but also about traditional RDBMS. Many relational databases (e.g. PostgreSQL, Oracle, db2, Mysql) allow to storing data in the schema-less json format and use their own more or less unique way to do that.
This talk contains two parts:
- Comparison of the json support in PostgreSQL and different relational databases, namely Mysql, Oracle, db2, MSSql in terms of supported features, functions and so on.
- Performance benchmarks for databases with the advanced json support, namely PostgreSQL and Mysql, and the MongoDB on different workload types and configurations.
- the architecture
- the design and implementation of PL/pgSQL
- the difference between PL/SQL and PL/pgSQL
- the advantage and issues of PL/pgSQL
Nikita Volkov is a software engineer with 15 years of experience, who lately specializes in functional programming languages like Haskell, Clojure, Scala. He is the author of a multitude of open-source projects for Haskell, including ["stm-containers"](http://nikita-volkov.github.io/stm-containers/), ["record"](http://nikita-volkov.github.io/record/) and [the "hasql" library](https://github.com/nikita-volkov/hasql), which is the subject of this talk. He is also the author of [the "SORM" project](http://sorm-framework.org), which is an ORM-library for Scala. Nikita [blogs about functional programming](http://nikita-volkov.github.io/), and is employed by a Norwegian company [Sannsyn AS](http://sannsyn.com/), which provides a recommendation engine as a service as well as the general consulting services in IT.
This talk will cover "hasql", a highly-efficient library for integration of Haskell and PostgreSQL. The library provides an API for declarative programming, which is also quite flexible and terse. The talk will cover the benefits of declarative programming as well as the architectural and technical solutions behind the library, including the implementation of the PostgreSQL binary protocol. Hasql is used in PostgREST, a popular modern restful API for Postgres.
Действующий программист, люблю open source, postgresql и функциональное программирование
Около 10 лет занимаюсь разработкой софта под различные СУБД(Oracle, Firebird, MSSQL, PostgreSQL), постгрес около 2-х лет. В данный момент в НИПК Электрон СПб разрабатываю медицинские информационные системы с использованием plpgsql и plv8, дорабытываю pllua как интересный мне проект.
Álvaro is a 36 year-old IT entrepreneur, based in Madrid, Spain. Founder and CTO at 8Kdata (www.8kdata.com), a database R&D company, he spends most of his time working on the ToroDB (www.torodb.com) project, the first NoSQL-on-SQL database, a MongoDB-compatible database that runs on top of PostgreSQL. He is a passionate software developer and open source advocate. Álvaro is a Java software developer, member of JavaSpecialists.eu, but also a DBA, trainer and frequent lecturer at international conferences. He also founded the PostgreSQL Spanish User Group (www.postgrespaña.es), one of the largest PUGs in the world, with more than 500 members.
Java is the most used programming language in the world. Yet how is it supported in PostgreSQL? What are the gotchas and the best practices? Now that Java is evolving significantly, how will PostgreSQL follow?
Despite Java's age, language is stronger than ever. It's the de facto programming language in the enterprise world. And since Java 8, it is having a come back in the startup and open source world. PostgreSQL is accessed more from Java than any other interface but, how's Java supported in PostgreSQL?
This talk will analyze how it has been in the past, but more importantly how can you use it and what can you do today. JDBC drivers, best practices, pl/java and other less frequently used tools will be presented and discussed.
And then we will look into the future, to see what is currently under development. Like Phoebe, a new Java Reactive Driver for PostgreSQL that targets clusters, pipelined queries and non-JDBC interface for fully asynchronous operation. And also what needs to be done in areas like server-side Java, to bring Java to a fully advanced first-level language within PostgreSQL.
Vladimir Sitnikov is currently working on performance and scalability of NetCracker OSS. Telecommunication companies world wide use NetCracker OSS for enterprise and network automation. Vladimir is keen on Java and database performance: Oracle, PostgreSQL. He’s one of the committers in the PostgreSQL JDBC driver project, and architect of many performance improvements.
Common Java wisdom is to use PreparedStatements and Batch DML in order to achieve top performance. It turns out one cannot just blindly follow the best practices. In order to get high throughput, you need to understand the specifics of the database in question, and the content of the data.
In the talk we will see how proper usage of PostgreSQL protocol enables high performance operation while fetching and storing the data. We will see how trivial application and/or JDBC driver code changes can result in dramatic performance improvements. We will examine how server-side prepared statements should be activated, and discuss pitfalls of using server-prepared statements.
Multicorn is a generic Foreign Data Wrapper which goal is to simplify development of FDWs by writing them in Python.
We will see:
- what is an FDW what Multicorn is trying to solve how to use it, with a brief tour of the FDWs shipping with Multicorn.
- how to write your own FDW in python, including the new 9.5 IMPORT FOREIGN SCHEMA api.
- the internals: what Multicorn is doing for you behind the scenes, and what it doesn't
After a presentation of FDWs in general, and what the Multicorn extension really is, we will take a look at some of the FDWs bundled with Multicorn.
Then, a complete tour of the Multicorn API will teach you how to write a FDW in python, including the following features:
- using the table definition
- WHERE clauses push-down
- output columns restrictions
- influencing the planner
- writing to a foreign table
- IMPORT FOREIGN SCHEMA
- ORDER BY clauses pushdown
- transaction management
This will be a hands-on explanation, with code snippets allowing you to build your own FDW in python from scratch.
I am 10+ years experience software developer. My specialization is multithreaded and distributed server-side applications.
We will consider the advantages and disadvantages of solutions based on JSONB compared to traditional relational approach on real projects, including: 1. Performance 2. Data Versioning 3. Scalability 4. Reliability 5. Report building
Michael Paquier is a hacker and blogger of PostgreSQL community, currently working at VMware on PostgreSQL related technologies. He has many areas of focus and a large panel of activity in community, be they implementation of new features, bug fixes, or patch reviews.
A backup is something that no Postgres deployments should go without as it gives the insurance to get back a deployment on its feet should a disaster strike.
In this talk we will discuss why backups are essential in any sane PostgreSQL deployments (this seems obvious) and what are the different options available to define and set up a good backup strategy. On top of that is discussed how the future of backups would need to be handled, particularly regarding differential backups that gain in popularity among users with large deployments.
АО "БАРС Груп"
First working on Oracle, we could not ignore appearance and growth of PostgreSQL. I will describe how we came to PostgreSQL and share some experience of migrating a large medical system.
- developing a code converter;
- packages migration;
- our patches solving some of the migration problems.
Diasoft Platform Ltd
Head of development department in Diasoft Platform Ltd (http://www.diasoft-platform.ru/). Chief system architect of the company's core products. Currently I am engaged in projects of migration of information systems to open source or Russian DBMS.
Diasoft Platform company's technology solutions for migration of applications from proprietary DBMS (on the examples of Oracle and Microsoft SQL Server) to PostgreSQL will be presented in the talk. These solutions are implemented in Diasoft Database Adapter software.
Our technology solutions allow to automate: 1) migration of database schema (including translation of stored procedure and function code); 2) migration of data; 3) migration of client applications without any change in their source code.
ОАО "НПО РусБИТех"
Закончил НИЯУ МИФИ в 2014г по специальности "математик, системный программист". Опыт работы с PostgreSQL около 4 лет. Сейчас занимаюсь разработкой защищенной версии СУБД PostgreSQL.
The base version of DBMS PostgreSQL can register events bellow: - connect and disconnect - denial of access with date, time and user's name.
Requirements guidelines for audit subsystem is much wider possibilities basic version of database PostgreSQL.
The RusBITech company holds the necessary improvements PostgreSQL database to enhance it's functionality.
A result of improvements audit subsystem of PostgreSQL database as part of the operating system «Astra Linux Special Edition» further enables registration:
- The creation and destruction of database objects;
- Changes to the rules of access control;
- Both failures and successful attempts to access the database objects;
- Changes to the powers and status of subjects access objects access.
For all events are specified: - date and time; - The user performing the action of the registrant; - The object on which the action is carried out; - Type of event; - The result of the operation.
Audit subsystem of modified PostgreSQL is integrated into a centralized audit system OS «Astra Linux Special Edition». Provided the rules setting without stopping the logging (restarting) database.
1C:Enterprise: the most popular in Russia/CIS ERP level development platform that supports PostgreSQL
16+ years in software development and management, mostly in ERP area: iScala, Epicor, MS Dynamics, 1C. Specialization: ERP platforms, development tools and frameworks.
More than 300.000 developers use technology platform "1C:Enterprise" as a main development tool. I'll tell you about architecture and features that made "1C:Enterprise" one of the most popular development environment in Russia and CIS and about growing popularity of PostgreSQL amongst 1C users.
Компания gilev.ru - ведущая российская компания по оптимизации производительности информационных систем 1С:Предприятие 8.
Занимаюсь вопросами внедрения решений на платформе 1С:Предприятие с 2005 года, решениями с использованием PostgreSQL с 2008 года. Имею большой практический опыт запуска проектов внедрения 1С в среде linux
In late 2006, the 1C company has implemented to work 1C:Enterprise platform with DBMS PostgreSQL, which can operate under the operating systems Windows or Linux. The talks will attempt to summarize the experience of sharing the platform 1C:Enterprise with PostgreSQL database since 2008. Consideration will be given a few success stories, technical features of the work are examples of specific tasks, offers advice on selecting and cons. The talk may be of interest to employees of companies considering the option of using PostgreSQL for the 1C:Enterprise, DBA, professionals interested in the possibility of extensibility PostgreSQL.
Технический директор проекта. Архитектор БД. Создал всю структуру и логику проекта, используя Postgresql.
Since joining Zalando in 2010, one of my primary activities has been to help our growing team of engineers to migrate from MySQL to PostgreSQL. I have been working with databases and data for more than 17 years, and began my career as an Oracle Developer and DBA. I am a frequent speaker at PostgreSQL conferences and love to share my experiences about the ways how developers can retain high speed of development while entrusting their data to a proven, powerful and open-source relational database.
Since its launch in 2008, Zalando has grown with tremendous speed. The road from startup to multinational corporation has been full of challenges, especially for Zalando's technology team. Distributed across Berlin, Helsinki, Dublin and Dortmund — and nearly 900 professionals strong — Zalando Technology still plans to expand by adding 1,000 more developers through the end of 2016. This rapid growth has showed us that we need to be very flexible about developing processes and organizational structures, so we can scale and experiment. In March 2015, our team adopted Radical Agility: a tech management strategy that emphasizes Autonomy, Purpose, and Mastery, with trust as the glue holding it all together. To make autonomy possible, teams can now choose their own technology stacks for the products they own. Microservices, speaking with each other using RESTful APIs, promise to minimize the costs of integration between autonomous teams. Isolated AWS accounts, run on top of our own open-source Platform as a Service (called STUPS.io), give each autonomous team enough hardware to experiment and introduce new features without breaking our entire system.
One small issue with having microservices isolated in their individual AWS accounts: Our teams keep local data for themselves. In this environment, building an ETL process for data analyses, or integrating data from different services, becomes quite challenging. PostgreSQL's new logical replication features, however, now make it possible to stream all the data changes from the isolated databases to the data integration system so that it can collect this data, represent it in different forms, and prepare it for analysis.
In this talk, I will discuss Zalando's open-source data collection prototype, which uses PostgreSQL's logical replication streaming capabilities to collect data from various PostgreSQL databases and recreate it for different formats and systems (Data Lake, Operational Data Store, KPI calculation systems, automatic process monitoring). The audience will come away with new ideas for how to use Postgres streaming replication in a microservices environment.
Web RDBMS-based systems developer with more than 10 years experience. Presently one of the authors of Avito.ru - the largest european classified ads site.
My experience of working with PostgreSQL has provided clear understanding of its main advantages, making us choose and recommend choosing it.
4.1 Transactional DDL
4.2 WAL and True Physical Replication
4.3 Transactional Snapshot and True Logical Replication and PGQ
4.4 Exciting extensibility
Константин Евтеев, Avito.ru • Разработчик баз данных • С Postgres c 2009 года: миграции с MS SQL, администрирование обеих СУБД • В Авито с 2014 года: разработка распределенных систем обработки данных
10 years of development experience. Has worked as a lead developer, architect, IT manager and deputy CIO. A polyglott programmer, loves brief and quick code.
- First aquaintance
- Fight with replication
- Partitioning and migration
- Cross data-center use
- v8, json, jsonb, jsquery
- Version upgrade
Postgres, Oracle Developer. Oracle based Billing system development, integration, migration, optimization. Postgres based architecture development for wireless divices monitoring systems, mobile services systems.
Method of automated refresh of preprocessed results of analytis reports is provided. Preprocessing and caching of reports allows ability for fast response for big data reports. Author describes the way of reports cache refreshing with minimum server loads and tuned actualization rate.
Education: Moscow State University (Department of Physics), 1994 Edition. Founder and CEO of the group of companies Softpoint since 2005. Since then, all efforts directed at creating innovative solutions to the problems of performance and scalability of information systems. Such solutions are already more than 10 and most of the patented.
You have a distributed IT system, it has many nodes, BUT it has: - Lack of efficiency of exchange. Delay timing - hours or days? - Interference to users block during the exchange? - Poor handling - all nodes exchange status is not clear; - Low exchange stability, the need for manual control?
DBReplicaton - the technology of high-speed data exchange between PostgreSQL databases. This report presents a solution that runs in the tens of medium and large companies in Russia (> 2500 active users,> 20 nodes exchange), which has: - Own transport subsystem; - Centralized unified interface and control the exchange; - Bilateral exchanges: the ability to work with data on changes in all nodes involved in the exchange; - High data rate (2 seconds). Additionally, you will learn about the unusual use of replication in a variety of business systems.
Peter van Hardenberg was a founding member of the Heroku Postgres database product, and has spoken around the world about PostgreSQL. He's answered untold numbers of user questions and spent time working with users at all stages of development and expertise.
Heroku Postgres is a cloud database service and the largest provider of PostgreSQL as a service anywhere. We operate more than 1,000,000 PostgreSQL databases with a team of about 10 people. We may be the most efficient DBAs in history, with approximately 100,000 databases per person on our team! This talk will introduce the opportunity and challenges of building and operating a cloud database service, as well as discussing the strategies we use to build, operate, and scale this product and team for the last six years now. We will include details about * a brief introduction to the service to provide context * strategies to design and build such a data service * operational war stories like how to recover from losing thousands of servers at once, * common challenges users have with Postgres * and a basic overview of the technical architecture
This is a complementary talk to Will Leinweber's talk, which will go into much more depth on the architecture of the software we have written.
Will has helped to build Heroku Postgres for nearly 5 years.
In addition to providing a general purpose web platform, Heroku has a large, supporting Postgres service. Over the years, we've learned a lot about running Postgres at scale.
In this talk, we'll cover:
- why Postgres is attractive to run as a cloud service
- how to provision, manage, and monitor a Postgres fleet
- tradeoffs needed to make Postgres work in this environment
- automating failure recovery
- and more
Alibaba has provided a relational database service (RDS) for postgres in our public cloud platform (aliyun.com, the currently biggest public cloud in China). We are also enabling internal applications to use postgres in our other internet business and we can share our experience