Hi. I'm jon.404, a Unix/Linux/Database/Openstack/Kubernetes Administrator, AWS/GCP/Azure Engineer, mathematics enthusiast, and amateur philosopher. This is where I rant about that which upsets me, laugh about that which amuses me, and jabber about that which holds my interest most: *nix.
PostgreSQL - installation/configuration/benchmarking - on Dell servers Part II
(Continued from this blog post)
Now test the setup:
Looks good. Now let's format the partition:
I like xfs. I use xfs. You might not...so run "man mkfs.<your_favorite_filesystem>" to read up on how to convert this to your filesystem:
Ok, there's a lot going on here. First, give the filesystem a label with -L <some_label> (completely optional). Next, set the block size: xfs can create block sizes larger than 4096 bytes, but Linux can't mount the filesystem if you do, because the block size can't exceed the kernel page size (typically 4096 bytes). Next, define a few data section options. The first is su, the RAID "stripe size", which you can get with:
Next is the stripe width ("sw"), which should equal the number of data-bearing disks in the array. If you created a RAID 5 of "n" disks, this number would be n - 1. For RAID 6, n - 2. For RAID 10 it would be n / 2, since each mirrored pair effectively acts as a single disk.
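If you'd rather not do the arithmetic in your head, here's the same logic as a quick shell sketch (a hypothetical 6-disk array; adjust n to match yours):

```shell
# Data-bearing disks ("sw") per RAID level, for a hypothetical 6-disk array.
n=6
raid5_sw=$(( n - 1 ))   # RAID 5: one disk's worth of parity
raid6_sw=$(( n - 2 ))   # RAID 6: two disks' worth of parity
raid10_sw=$(( n / 2 ))  # RAID 10: each mirrored pair acts as one disk
echo "RAID5 sw=$raid5_sw  RAID6 sw=$raid6_sw  RAID10 sw=$raid10_sw"
```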
ext3 example:
For ext3, the stride is the chunk size (defaults to 64k on PERC cards) divided by the block size (4096 bytes), and stripe-width is the number of data-bearing disks (see above) times the stride.
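Here's that arithmetic spelled out, using the values from my array (64k chunks, 4096-byte blocks, 5 data-bearing disks):

```shell
# Reproduce the ext3 numbers used below: 64k chunk size, 4096-byte blocks,
# 5 data-bearing disks (a 6-disk RAID 5).
chunk_bytes=$(( 64 * 1024 ))
block_bytes=4096
data_disks=5
stride=$(( chunk_bytes / block_bytes ))        # chunk size / block size
stripe_width=$(( data_disks * stride ))        # data disks * stride
echo "-E stride=$stride,stripe-width=$stripe_width"
```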
Once you've created the filesystem, mount it with relevant options (noatime, data=writeback, etc...). I've mounted this one with just "noatime" since data=writeback doesn't apply to xfs. Next, let's perform a little test:
Looks good. Now let's measure read speed. First, you need to unmount the filesystem to clear the cache! If you don't umount/mount the array, your numbers will be skewed!
You'll have to play around with the numbers, but you can use "blockdev --setra <some_value> /dev/<pg_data_partition>" to set the readahead to an appropriate value for max performance. I set mine to 4096 since it seems to be the "sweet spot". YMMV depending on the type of work your database is doing.
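If you want to hunt for your own sweet spot, a little loop like this (just a sketch; /dev/sdc is my data array, and the candidate values are arbitrary) prints the settings to try — time the dd read test after applying each one:

```shell
# Sweep a few candidate readahead values (in 512-byte sectors) and print
# the commands; drop the leading 'echo' to actually apply them.
dev=/dev/sdc   # hypothetical data array from this series
for ra in 256 1024 4096 8192 16384; do
    echo blockdev --setra $ra $dev
done
```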
You could also do a round of bonnie tests at this point to get a second opinion on your setup, if you'd like.
Next up is the actual PostgreSQL installation. I prefer to build from source since it gives me a great deal of fine-tuning control over the final product, but if you'd prefer to install from the repositories, that's fine. I'll build mine with the following commands (after downloading and unpacking the archive):
At this point, you can add in whatever else you'd like (contrib projects, pg_top, etc...). Just configure them with --prefix and --exec-prefix same as above. Once you're satisfied with the packages you've installed to the fake directory, tar it up:
and copy a backup somewhere (especially useful if you manage multiple machines). Then install PostgreSQL using the tarball:
Once that's done, you need to add the postgres superuser and initialize the database:
When the initialization finishes, you need to tweak a few settings in the postgresql.conf file. Perhaps the most important are the following (these values are for a reporting database...please look up appropriate values for OLTP and web site backends! A good starting point is the whack-a-mole):
If you've read whack-a-mole, you'll notice these values are the same as given there for data warehouse applications. I've also added a few tweaks, including turning off fsync since this database operates in a read-only environment. Dangerous, I know...do so at your own risk.
Mount your disk arrays in temporary locations and copy the contents of /var/lib/postgresql/data/base to the data array and /var/lib/postgresql/data/pg_xlog to the log array. Next, remount the arrays over the places you just copied data from (i.e. umount /dev/sdc && mount -t xfs /dev/sdc1 /var/lib/postgresql/data/base).
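The shuffle looks something like this (a dry-run sketch; /dev/sdc1 and /dev/sdd1 are hypothetical device names for the data and xlog arrays — swap in your own, and remove the echo in run() to actually execute):

```shell
# Dry-run sketch of the copy-and-remount dance: print each step instead
# of executing it. Device names are hypothetical.
run() { echo "$@"; }
DATA_DEV=/dev/sdc1
XLOG_DEV=/dev/sdd1
run mount -t xfs $DATA_DEV /mnt/data_tmp
run mount -t xfs $XLOG_DEV /mnt/xlog_tmp
run cp -a /var/lib/postgresql/data/base/. /mnt/data_tmp/
run cp -a /var/lib/postgresql/data/pg_xlog/. /mnt/xlog_tmp/
run umount /mnt/data_tmp /mnt/xlog_tmp
run mount -t xfs $DATA_DEV /var/lib/postgresql/data/base
run mount -t xfs $XLOG_DEV /var/lib/postgresql/data/pg_xlog
```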
Ok, we should be ready to fire up the server and start tweaking settings! Save the following as /etc/init.d/postgresql:
(This is a VERY basic init script. Suggestions are more than welcome for improving it!)
Then run:
If you get shared memory errors, you likely need to increase shmmax (though sometimes shmmni on machines with large amounts of RAM). You can run "sysctl kernel.shmmax" to see what your current shmmax is, or run "sysctl -w kernel.shmmax=<new_value>" to set a new value, then try to start the server again.
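For a rough idea of what shmmax needs to be: the whole shared memory segment (shared_buffers plus assorted overhead) has to fit under it. A sketch with hypothetical sizes:

```shell
# shared_buffers must fit in one shared memory segment, so shmmax needs
# to be at least that big plus some overhead. Sizes here are hypothetical.
shared_buffers_mb=4096
overhead_mb=512    # rough allowance for locks, wal_buffers, etc.
shmmax=$(( (shared_buffers_mb + overhead_mb) * 1024 * 1024 ))
echo "sysctl -w kernel.shmmax=$shmmax"
```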
Adjust the postgres user's password with:
Once you've tweaked postgresql.conf to your satisfaction, benchmark it using something like pgbench, or BenchmarkSQL or the like. Make sure you write performance numbers down, since this will serve as your baseline for future evaluations! Here's an example pgbench run:
Once this is done, load any data backups you may have from previous machines, or create your new databases. Alter pg_hba.conf to "go live" with the database (to allow external connections), and restart the server.
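For reference, "going live" might look something like this (the subnet and auth method are hypothetical — adapt to your own network and security policy):

```
# pg_hba.conf: allow password (md5) logins from the app subnet
host    all    all    192.168.1.0/24    md5
```

Note that the init script above already passes -o -i to pg_ctl, which tells the postmaster to listen for TCP/IP connections; otherwise you'd set listen_addresses in postgresql.conf.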
Now test the setup:
Code:
linux ~ # hdparm -tT /dev/sdc

/dev/sdc:
 Timing cached reads:   11464 MB in 2.00 seconds = 5739.66 MB/sec
 Timing buffered disk reads:  1816 MB in 3.00 seconds = 604.39 MB/sec
I like xfs. I use xfs. You might not...so run "man mkfs.<your_favorite_filesystem>" to read up on how to convert this to your filesystem:
Code:
linux ~ # mkfs.xfs -L pg_data -b size=4096 -d su=64k,sw=5 -l version=2 /dev/sdc1
Code:
linux ~ # MegaCli -LDInfo -LALL -aALL | grep "Stripe Size:"
Stripe Size: 64kB
Stripe Size: 64kB
ext3 example:
Code:
linux ~ # mkfs.ext3 -b 4096 -E stride=16,stripe-width=80 /dev/sdc1
Once you've created the filesystem, mount it with relevant options (noatime, data=writeback, etc...). I've mounted this one with just "noatime" since data=writeback doesn't apply to xfs. Next, let's perform a little test:
Code:
linux ~ # dd if=/dev/zero of=/var/lib/postgresql/data/base/test bs=8k count=1000000
1000000+0 records in
1000000+0 records out
8192000000 bytes (8.2 GB) copied, 11.6311 seconds, 704 MB/s
Code:
linux ~ # umount /dev/sdc1
linux ~ # mount -a
linux ~ # dd if=/var/lib/postgresql/data/base/test of=/dev/null bs=8k count=1000000
1000000+0 records in
1000000+0 records out
8192000000 bytes (8.2 GB) copied, 11.4865 seconds, 713 MB/s
You could also do a round of bonnie tests at this point to get a second opinion on your setup, if you'd like.
Next up is the actual PostgreSQL installation. I prefer to build from source since it gives me a great deal of fine-tuning control over the final product, but if you'd prefer to install from the repositories, that's fine. I'll build mine with the following commands (after downloading and unpacking the archive):
Code:
linux ~ # apt-get -y install gcc make openssl ssh libssl-dev bzip2 libc6-dev libreadline5-dev bison flex zlib1g-dev
linux ~ # cd postgresql-8.3.6/
linux ~ # export CFLAGS="-pipe -O2"
linux ~ # export CXXFLAGS=$CFLAGS
linux ~ # mkdir -p /root/debian/postgresql
linux ~ # ./configure --prefix=/root/debian/postgresql/usr --exec-prefix=/root/debian/postgresql/usr --with-gnu-ld --without-docdir --with-openssl --enable-thread-safety
linux ~ # make -e -j `grep -c "model name" /proc/cpuinfo`
linux ~ # make install   # will install to /root/debian/postgresql/usr!
Code:
cd /root/debian/postgresql/ && tar cjpf postgresql-8.3.6-COMPILED.tar.bz2 ./usr
Code:
linux ~ # tar xjpf postgresql-8.3.6-COMPILED.tar.bz2 -C /
Code:
linux ~ # mkdir -p /var/lib/postgresql/data
linux ~ # useradd -d /var/lib/postgresql postgres
linux ~ # chown -R postgres:postgres /var/lib/postgresql
linux ~ # su --command="/usr/bin/initdb -D /var/lib/postgresql/data" postgres
Quote:
shared_buffers = MAX_RAM * 0.25
work_mem = MAX_RAM / max_connections
maintenance_work_mem = 128MB
wal_buffers = 1MB
checkpoint_segments = 16-64
cpu_tuple_cost = 0.002
cpu_index_tuple_cost = 0.0001
cpu_operator_cost = 0.0005
effective_cache_size = MAX_RAM * 0.67
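To make the MAX_RAM formulas concrete, here's a worked example for a hypothetical box with 16GB of RAM and max_connections = 100:

```shell
# Worked example of the formulas above; adjust to your hardware.
max_ram_mb=16384
max_connections=100
echo "shared_buffers = $(( max_ram_mb / 4 ))MB"             # MAX_RAM * 0.25
echo "work_mem = $(( max_ram_mb / max_connections ))MB"     # MAX_RAM / max_connections
echo "effective_cache_size = $(( max_ram_mb * 2 / 3 ))MB"   # roughly MAX_RAM * 0.67
```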
Mount your disk arrays in temporary locations and copy the contents of /var/lib/postgresql/data/base to the data array and /var/lib/postgresql/data/pg_xlog to the log array. Next, remount the arrays over the places you just copied data from (i.e. umount /dev/sdc && mount -t xfs /dev/sdc1 /var/lib/postgresql/data/base).
Ok, we should be ready to fire up the server and start tweaking settings! Save the following as /etc/init.d/postgresql:
Code:
#!/bin/bash

export PGDATA=/var/lib/postgresql/data

case "$1" in
    start)
        echo "Starting the PostgreSQL Server"
        su --command="/usr/bin/pg_ctl start -D $PGDATA -o -i" postgres
        ;;
    stop)
        echo "Stopping the PostgreSQL Server"
        su --command="/usr/bin/pg_ctl stop -D $PGDATA" postgres
        ;;
    restart)
        echo "Restarting the PostgreSQL Server"
        su --command="/usr/bin/pg_ctl restart -D $PGDATA -o -i" postgres
        ;;
    reload)
        echo "Reloading the PostgreSQL config"
        su --command="/usr/bin/pg_ctl reload" postgres
        ;;
    status)
        echo "Status of PostgreSQL Server:"
        su --command="/usr/bin/pg_ctl status" postgres
        ;;
    *)
        echo "Usage: /etc/init.d/postgresql {start|stop|restart|reload|status}"
        ;;
esac

exit 0
Then run:
Code:
linux ~ # chmod 755 /etc/init.d/postgresql
linux ~ # /etc/init.d/postgresql start
Adjust the postgres user's password with:
Code:
linux ~ # /usr/bin/psql -U postgres -d postgres -c "ALTER USER postgres WITH PASSWORD '<some_password>'"
Code:
linux ~ # psql -U postgres -c "CREATE DATABASE test"
CREATE DATABASE
linux ~ # pgbench -i -c 32 -t 10000 -U postgres -d test
NOTICE: table "branches" does not exist, skipping
NOTICE: table "tellers" does not exist, skipping
NOTICE: table "accounts" does not exist, skipping
NOTICE: table "history" does not exist, skipping
creating tables...
10000 tuples done.
<snip output>
100000 tuples done.
set primary key...
NOTICE: ALTER TABLE / ADD PRIMARY KEY will create implicit index "branches_pkey" for table "branches"
NOTICE: ALTER TABLE / ADD PRIMARY KEY will create implicit index "tellers_pkey" for table "tellers"
NOTICE: ALTER TABLE / ADD PRIMARY KEY will create implicit index "accounts_pkey" for table "accounts"
vacuum...done.
linux ~ # pgbench -c 32 -t 10000 -U postgres -d test 2>/dev/null
pghost:  pgport:  nclients: 32  nxacts: 10000  dbName: test
transaction type: TPC-B (sort of)
scaling factor: 1
number of clients: 32
number of transactions per client: 10000
number of transactions actually processed: 320000/320000
tps = 2741.001109 (including connections establishing)
tps = 2742.227042 (excluding connections establishing)
linux ~ # pgbench -S -c 32 -t 10000 -U postgres -d test 2>/dev/null
pghost:  pgport:  nclients: 32  nxacts: 10000  dbName: test
transaction type: SELECT only
scaling factor: 1
number of clients: 32
number of transactions per client: 10000
number of transactions actually processed: 320000/320000
tps = 27919.776014 (including connections establishing)
tps = 28047.437332 (excluding connections establishing)