Migrating PostgreSQL from one version to another has become pretty easy. Using pg_upgrade it takes a few seconds to upgrade even a 100GiB database. It becomes interesting when you switch platforms, say from 32bit to 64bit, as we are facing it with our switch from Gentoo to NixOS. Our NixOS-based platform is stable enough for customers to use it. Some larger databases benefit especially from larger RAM sizes.
So the question is: How to migrate from 32bit to 64bit with as little downtime as possible?
pg_upgrade modifies the data files in-place. There is no SQL dump/restore involved, hence it’s very fast. Unfortunately, but for obvious reasons, the raw disk format of PostgreSQL on 32bit is not compatible with the format on a 64bit machine. So, no pg_upgrade in this case.
This is the classic way of migrating any database to another version. Dump the entire contents into SQL and feed it to the new server. It works. It’s slow.
The test database is about 150GiB on disk. A first approach might look like this.
newdb# pg_dumpall -h olddb.fcio.net | psql
Took 5h 42m. Ouch.
Maybe changing some of the PostgreSQL settings helps.
So we don’t sync writes to disk anymore. Also do not force to write full pages to the WAL. This should improve performance at the cost that the database is probably broken after system failure. This is not a problem during restore – if the system crashes we have to start over anyway.
Guess what. Didn’t change much. Actually it got a bit slower (5h 50m) but that’s probably due to other factors (virtualization, network storage).
Let’s try the custom format and only the main database of this particular application. The other databases make take 100MiB. So the diffence should be marginal.
newdb# pg_dump -Fc -h olddb.fcio.net dbname | pg_restore -d dbname
7h 19m. WAT. Turns out the custom format compresses the dump. This is quite pointless in our case and just consume a lot of CPU. Turn compression off then!
newdb# pg_dump -Fc -Z0 -h olddb.fcio.net dbname | pg_restore -d dbname
Now it’s a bit faster than the naïve approach: 5h 21m. Still nothing to be proud of! It turns out that pg_dump both requires quite some CPU and gets slowed down by network latency. When we create the dump on the old database VM, things start to improve:
newdb# ssh olddb.fcio.net sudo -iu postgres pg_dumpall | sudo -iu postgres psql
5h 4m. We saved about 40 minutes compared to the original approach. As seen above the custom format without compression performes better than plain pg_dump. Another thing is network latencies and buffering. Adding a small (88MiB) buffer might detach both processes enough so one doesn’t have to wait for the other:
newdb# ssh olddb.fcio.net sudo -iu postgres pg_dump -Fc -Z0 dbname "|" \ mbuffer | sudo -iu postgres pg_restore -d dbname
3h 33m. Now we are talking! Further increasing the buffer to 500MiB saved another half hour: 3h 4m.
So the final command would be:
newdb# source=OLDDBHOST newdb# database=DBNAME newdb# ssh $source sudo -iu postgres pg_dump -Fc -Z0 -C $database "|" \ mbuffer -m 500m | sudo -iu postgres pg_restore -C -d postgres
Don’t forget to re-set the fsync/full_page_write parameters, if you have changed them.
All in all, measuring helps a lot to find the right set of tools. Your situation might be different: measure!
Image credit: Ganesh Raghunathan.