Welcome to the most active Linux Forum on the web.
Go Back > Blogs > Musings on technology, philosophy, and life in the corporate world
User Name


Hi. I'm a Unix Administrator, mathematics enthusiast, and amateur philosopher. This is where I rant about that which upsets me, laugh about that which amuses me, and jabber about that which holds my interest most: Unix.
Rate this Entry

PostgreSQL and Indexes: a formula for...disaster?

Posted 01-27-2011 at 12:03 PM by rocket357
Updated 01-29-2011 at 03:41 PM by rocket357

I've noticed a ton of complaining about PostgreSQL's apparent lack of sanity when it comes to index usage, so I'd like to throw a few things out that explain the "idiotic" way PostgreSQL handles indexes.

First things first. If the table fits in a single page, YOU ARE NOT GOING TO SEE INDEX ACTIVITY. That gets more people than any other issue. If the table has 17 rows, don't expect PostgreSQL to bother with indexes. Using an index on a table that fits in a single page is like asking someone to fly from San Diego to Los Angeles. The cities are 116 miles by car you can get there in roughly 2 hours. Sure, the flight might only take 45 minutes, but you still have to drive to the airport, get through security, check in, load, wait for your baggage, drive to your destination, etc...In these cases, it's usually faster to just drive straight there. PostgreSQL is smart enough to figure this out and it "just drives straight there" (i.e. use a sequential scan, rather than read the index, filter out unwanted rows, then read the rows it needs (i.e. random seeks) out of the table file).

Ok, so let's say the table has 17 million rows, not 17. Chances are pretty good that the table is going to incur some disk i/o. In this instance, it makes sense that an index scan (given that an index exists that is a match for the query) would greatly reduce the disk i/o required...or does it? If you have a query like such:


And you have an index that covers the entire table (i.e. the index is roughly the same size as the table), the planner isn't going to bother reading the index. Why read a 10 GB index off the disk, just so you can read another 1 GB of the table off the disk? Why not just read the 10 GB table once and scan through it? (Aside from the disk i/o, doing a single table read can preserve the OS's filesystem cache instead of potentially nuking it by reading multiple files). PostgreSQL will, however, read a 500 MB index so it can read 1 GB off of the disk...

But wait! I have a 500 MB index on my 10 GB table, and PostgreSQL is *STILL* performing sequential scans! WHY?!

Here's where the fun begins. In postgresql.conf, there are a lot of settings that clue the planner in to your environment. Are you running with a single IDE drive from 1995? Well, then you probably want to do everything you can to reduce disk i/o. Tons of RAM? The planner needs to know that the filesystem cache is large, so it can take that into account. PostgreSQL comes with some relatively sane defaults, but the further you get from the "norm", the less efficient your installation will be. If you EXPLAIN ANALYZE and see big errors (bad row-count estimates, skipping indexes, etc...), chances are good you can tweak some planners hints in postgresql.conf and get better plans (or you need better statistics...but that's an entirely different issue). That shiny new SSD, for example, has much better random access time than conventional disks...drop your random_page_cost down so the planner has an accurate "map" of it's environment.

A good paper on the planner hints in postgresql.conf can be found here:

Once you get a realistic environment set up, you'll find that the planner does make decent decisions and efficient use of the resources it has at it's disposal. So remember, test on realistic data, keep your indexes small, and by all means tune postgresql.conf **to match your environment**!
Posted in Uncategorized
Views 1337 Comments 0
« Prev     Main     Next »
Total Comments 0




All times are GMT -5. The time now is 04:41 AM.

Main Menu
Write for LQ is looking for people interested in writing Editorials, Articles, Reviews, and more. If you'd like to contribute content, let us know.
Main Menu
RSS1  Latest Threads
RSS1  LQ News
Twitter: @linuxquestions
Facebook: linuxquestions Google+: linuxquestions
Open Source Consulting | Domain Registration