Different Replication Solutions in PostgreSQL
Today we are going to discuss the different replication solutions available in PostgreSQL 9.x.
Shared Disk Failover:
- It avoids synchronization overhead by having only one copy of the database.
- It uses a single disk array which is shared by multiple database servers.
- If the main database server fails, the standby server is able to mount and start the database as though it were recovering from a database crash. This allows rapid failover with no data loss.
- Shared hardware functionality is common in network storage devices.
- One significant limitation of this method is that if the shared disk array fails or becomes corrupt, the primary and standby servers are both nonfunctional.
- Another issue is that the standby server should never access the shared storage while the primary server is running.
File System (Block-Device) Replication:
A modified version of shared hardware functionality is file system replication, where all changes to a file system are mirrored to a file system residing on another computer. The only restriction is that the mirroring must be done in a way that ensures the standby server has a consistent copy of the file system — specifically, writes to the standby must be done in the same order as those on the master. DRBD is a popular file system replication solution for Linux.
Transaction Log Shipping:
Warm and hot standby servers can be kept current by reading a stream of write-ahead log (WAL) records. If the main server fails, the standby server contains almost all of the data of the main server and can be quickly made the new master database server. This can be synchronous or asynchronous and can only be done for the entire database server. A standby server can be implemented using file-based log shipping, streaming replication, or a combination of both.
Directly moving WAL records from one database server to another is typically described as log shipping. PostgreSQL implements file-based log shipping by transferring WAL records one file (WAL segment) at a time. WAL files (16MB by default) can be shipped easily and cheaply over any distance, whether it be to an adjacent system, another system at the same site, or another system on the far side of the globe.
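To make file-based log shipping a little more concrete, here is a minimal Python sketch of the "ship one completed segment" step. The function name archive_wal_segment and the directory layout are assumptions made for illustration, not part of PostgreSQL. A script built around it could be hooked in through archive_command, which PostgreSQL runs with %p replaced by the segment's path and %f by its file name.

```python
import shutil
from pathlib import Path


def archive_wal_segment(segment: Path, archive_dir: Path) -> bool:
    """Copy one completed WAL segment into archive_dir (hypothetical helper).

    Returns False without copying if a file with the same name is already
    archived, so an existing segment is never overwritten.
    """
    archive_dir.mkdir(parents=True, exist_ok=True)
    target = archive_dir / segment.name
    if target.exists():
        return False
    # Copy under a temporary name first, then rename, so a reader of the
    # archive never picks up a half-written segment.
    tmp = target.with_name(target.name + ".tmp")
    shutil.copyfile(segment, tmp)
    tmp.rename(target)
    return True
```

Refusing to overwrite an existing file is a deliberate choice here: it protects the archive if two servers are accidentally pointed at the same directory.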
Trigger-Based Master-Standby Replication:
- This setup sends all data modification queries to the master server.
- The master server asynchronously sends data changes to the standby server.
- The standby can answer read-only queries while the master server is running.
- The standby server is ideal for data warehouse queries.
- Slony-I is an example of this type of replication, with per-table granularity, and support for multiple standby servers.
- Because it updates the standby server asynchronously (in batches), data loss is possible during failover.
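The mechanism described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses in-memory SQLite databases from the standard library as stand-ins for the master and standby, and the accounts and change_log tables, the trigger names, and the ship_batch function are all hypothetical. It is not how Slony-I is implemented internally.

```python
import sqlite3


def make_master() -> sqlite3.Connection:
    """Master: triggers record every row change in a change_log table."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER);
        CREATE TABLE change_log (seq INTEGER PRIMARY KEY AUTOINCREMENT,
                                 op TEXT, id INTEGER, balance INTEGER);
        CREATE TRIGGER log_ins AFTER INSERT ON accounts BEGIN
            INSERT INTO change_log (op, id, balance)
            VALUES ('I', NEW.id, NEW.balance);
        END;
        CREATE TRIGGER log_upd AFTER UPDATE ON accounts BEGIN
            INSERT INTO change_log (op, id, balance)
            VALUES ('U', NEW.id, NEW.balance);
        END;
        CREATE TRIGGER log_del AFTER DELETE ON accounts BEGIN
            INSERT INTO change_log (op, id, balance)
            VALUES ('D', OLD.id, NULL);
        END;
    """)
    return db


def make_standby() -> sqlite3.Connection:
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER)")
    return db


def ship_batch(master, standby, last_seq: int) -> int:
    """Apply logged changes newer than last_seq to the standby, in order.

    Returns the sequence number of the last change applied.
    """
    rows = master.execute(
        "SELECT seq, op, id, balance FROM change_log WHERE seq > ? ORDER BY seq",
        (last_seq,),
    ).fetchall()
    for seq, op, row_id, balance in rows:
        if op == "D":
            standby.execute("DELETE FROM accounts WHERE id = ?", (row_id,))
        else:
            standby.execute(
                "INSERT OR REPLACE INTO accounts (id, balance) VALUES (?, ?)",
                (row_id, balance),
            )
        last_seq = seq
    standby.commit()
    return last_seq
```

Any change written to the master after the most recent ship_batch call exists only in change_log until the next batch runs, which is exactly the window in which a master failure loses data.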
Statement-Based Replication Middleware:
- With statement-based replication middleware, a program intercepts every SQL query and sends it to one or all servers.
- Each server operates independently. Read-write queries must be sent to all servers, so that every server receives any changes.
- But read-only queries can be sent to just one server, allowing the read workload to be distributed among them.
- If queries are simply broadcast unmodified, functions like random(), CURRENT_TIMESTAMP, and sequences can have different values on different servers. This is because each server operates independently, and because SQL queries are broadcast (and not the actual modified rows).
- If this is unacceptable, either the middleware or the application must query such values from a single server and then use those values in write queries. Another option is to combine this approach with a traditional master-standby setup, i.e. data modification queries are sent only to the master and are propagated to the standby servers via master-standby replication, not by the replication middleware.
- Care must also be taken that all transactions either commit or abort on all servers, perhaps using two-phase commit (PREPARE TRANSACTION and COMMIT PREPARED).
- Pgpool-II and Continuent Tungsten are examples of this type of replication.
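The random() problem and the "query the value from a single server" fix can be shown with a small Python sketch. It is an illustration only: in-memory SQLite databases stand in for independent PostgreSQL servers, and the tokens table and every function name below are hypothetical. It does not reflect how Pgpool-II or Continuent Tungsten work internally.

```python
import sqlite3


def make_servers(n: int = 2):
    """Create n independent servers, each with the same empty table."""
    servers = [sqlite3.connect(":memory:") for _ in range(n)]
    for s in servers:
        s.execute("CREATE TABLE tokens (id INTEGER PRIMARY KEY, value INTEGER)")
    return servers


def broadcast(servers, sql: str, params=()):
    """Naive middleware: send the same write statement to every server."""
    for s in servers:
        s.execute(sql, params)


def broadcast_with_pinned_random(servers):
    """Evaluate random() on one server, then broadcast the resulting value."""
    (value,) = servers[0].execute("SELECT random()").fetchone()
    broadcast(servers, "INSERT INTO tokens (value) VALUES (?)", (value,))


def stored_values(servers):
    """Return the rows of tokens.value as seen by each server."""
    return [
        s.execute("SELECT value FROM tokens ORDER BY id").fetchall()
        for s in servers
    ]


# Unmodified broadcast: each server evaluates random() on its own,
# so the stored values will almost certainly differ between servers.
naive = make_servers()
broadcast(naive, "INSERT INTO tokens (value) VALUES (random())")

# Pinned value: every server stores the same number.
pinned = make_servers()
broadcast_with_pinned_random(pinned)
```

The sketch leaves out atomicity across servers; a real middleware would still need something like two-phase commit so that a write cannot succeed on one server and fail on another.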
Asynchronous Multimaster Replication:
For servers that are not regularly connected, like laptops or remote servers, keeping data consistent among servers is a challenge. Using asynchronous multimaster replication, each server works independently, and periodically communicates with the other servers to identify conflicting transactions. The conflicts can be resolved by users or conflict resolution rules. Bucardo is an example of this type of replication.
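As a rough illustration of the periodic "identify conflicting transactions" step, the Python sketch below compares the changes two servers made since their last sync and applies one possible conflict resolution rule, last writer wins. The Change record, the reconcile function, and the rule itself are assumptions chosen for the example; they are not taken from Bucardo, and the timestamps assume the servers' clocks are roughly synchronized.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Change:
    key: str            # primary key of the modified row
    value: str          # new row contents (simplified to one string)
    modified_at: float  # time of the write on the originating server


def reconcile(changes_a: dict, changes_b: dict):
    """Merge the changes two servers made since their last sync.

    A key is a conflict when both servers changed it to different values.
    Conflicts are resolved by keeping the most recent write (ties go to A).
    Returns (merged, conflicts): merged maps key -> winning Change and
    conflicts lists the conflicting keys in sorted order.
    """
    merged = {}
    conflicts = []
    for key in sorted(changes_a.keys() | changes_b.keys()):
        a, b = changes_a.get(key), changes_b.get(key)
        if a is not None and b is not None and a.value != b.value:
            conflicts.append(key)
            merged[key] = a if a.modified_at >= b.modified_at else b
        else:
            # Changed on one side only, or both sides agree.
            merged[key] = a if a is not None else b
    return merged, conflicts
```

Returning the list of conflicting keys alongside the merged result leaves room for the other option mentioned above, handing the conflicts to a user instead of resolving them by rule.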