Go Back   LinuxQuestions.org > Forums > Linux Forums > Linux - General
Old 09-15-2011, 12:36 PM   #1
EnderX
Member
 
Registered: Nov 2006
Posts: 66

Rep: Reputation: 15
PostgreSQL database reload taking excessively long time on SuSE 11 box - seeking cause


My office maintains several remote servers for our clients. One of our operating patterns in the past has been to keep a few spare emergency backup servers on-site with us so that they can pick up the replication from our PostgreSQL database - the idea being that this way, the remote site won't have to be down for an extended length of time as the DB rebuilds onsite.

The database we're copying is from our main server, running SuSE 9.3 with PostgreSQL 8.1.3 installed. The servers we're copying to are SuSE 11 boxes with PostgreSQL 8.4.4 on them. The copy on the main box is made with the pg_dump command and output as a pair of SQL files - one for the schema, and one for the data. These are loaded against an empty copy of the database on the backup boxes by running them as scripts from the command line.
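In rough outline, the commands involved look something like the following (the database and file names here are placeholders, not our actual ones):

```shell
# On the main (8.1.3) server: dump schema and data separately.
pg_dump -s mydb > schema.sql   # -s = schema only
pg_dump -a mydb > data.sql     # -a = data only

# On the backup (8.4.4) box, against an empty copy of the database:
psql -d mydb -f schema.sql
psql -d mydb -f data.sql
```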

It has normally taken a while to load the database, but recently things have gone seriously screwy. While I admit the database has grown somewhat large (the data input SQL script shows as 2.7 GB in an ls -lah listing), until fairly recently a backup could be started at the end of the business day, left running overnight, and would be finished by the time the next day started. Now, however, we have had several machines that ran for more than a day without finishing - in two cases, almost 48 hours without finishing the process.

I'm currently trying to figure out where the likeliest problems are, but I'm not all that familiar with Linux's 'performance/maintenance' commands - I had to request help from a local user group just to learn about the existence of the iostat command.

Speaking of iostat, the output from my own run of it is below. I let it run for about five minutes, with a 15-second spacing between the reports.
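For reference, an invocation along these lines produces the extended per-device columns shown in the output (I didn't record the exact flags, so treat this as an approximation):

```shell
# Extended device statistics (-x), one report every 15 seconds,
# 20 reports (~5 minutes); note the first report shows since-boot averages.
iostat -x 15 20
```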


Code:
Linux 2.6.34-12-desktop (datatrac) 	09/15/11 	_x86_64_	(4 CPU)

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           1.94    0.03    0.71   17.60    0.00   79.72

Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0.98   862.43    9.04  131.63  1291.16  7960.83    65.77    69.35  492.16   4.64  65.30

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.45    0.00    0.41   25.20    0.00   73.94

Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0.00   180.27    0.00  160.00     0.00  2781.87    17.39   142.78  877.30   6.25 100.00

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           1.26    0.00    0.63   28.36    0.00   69.75

Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0.00  1145.33    0.60  180.93   187.73 10328.53    57.93   117.61  694.73   5.35  97.12

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.45    0.00    0.30   43.75    0.00   55.50

Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0.00   315.53    0.00  171.60     0.00  4207.47    24.52   155.57  891.52   5.83 100.00

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.38    0.00    0.25   46.66    0.00   52.71

Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0.00   213.07    0.07  180.40     0.53  3179.73    17.62   154.46  865.49   5.54 100.00

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           1.18    0.00    0.68   30.22    0.00   67.92

Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0.00  1255.73    0.53  221.47   170.67 11820.80    54.02   121.42  519.80   4.39  97.56

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.37    0.00    0.27   33.87    0.00   65.49

Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0.00   161.67    0.00  144.53     0.00  2511.47    17.38   143.70 1020.23   6.92 100.00

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           1.30    0.00    0.73   34.29    0.00   63.68

Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0.07  1344.07    0.73  225.20   188.27 12568.00    56.46   109.85  485.40   4.28  96.78

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.37    0.00    0.24   36.51    0.00   62.88

Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0.00   178.47    0.00  159.73     0.00  2760.53    17.28   143.19  890.51   6.26 100.00

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.43    0.00    0.31   28.71    0.00   70.55

Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0.00   233.47    0.00  165.40     0.00  3322.13    20.09   124.27  792.68   6.04  99.96

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.50    0.00    0.33   27.73    0.00   71.43

Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0.00   349.33    0.13  185.60    51.20  4258.13    23.20   124.83  644.99   5.36  99.60

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.98    0.00    0.48   31.18    0.00   67.36

Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0.00   938.80    0.47  192.33   119.47  9028.80    47.45   115.30  601.93   5.09  98.04

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.47    0.00    0.33   47.37    0.00   51.83

Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0.00   284.67    0.07  173.47    17.07  3721.60    21.54   131.72  746.72   5.76  99.98

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.52    0.00    0.33   31.28    0.00   67.87

Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0.00   311.47    0.13  174.33    17.60  3925.33    22.60   122.64  713.80   5.72  99.72

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           1.02    0.00    0.62   27.73    0.00   70.64

Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0.00  1041.67    0.40  207.67   136.53  9997.87    48.71   121.31  575.26   4.72  98.13

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.37    0.00    0.25   25.26    0.00   74.12

Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0.00   180.47    0.00  151.33     0.00  2665.60    17.61   143.23  953.02   6.61 100.00

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           1.30    0.05    0.65   28.87    0.00   69.13

Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0.00  1281.53    0.40  215.27   119.47 12002.67    56.21   112.80  517.17   4.53  97.62

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.35    0.00    0.27   36.36    0.00   63.02

Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0.00   189.93    0.00  166.73     0.00  2903.47    17.41   143.69  868.46   6.00 100.01

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.52    0.00    0.48   45.01    0.00   53.99

Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0.07   283.20    0.13  165.87    34.67  3640.00    22.14   117.66  696.96   6.02  99.99

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           1.14    0.00    0.51   30.89    0.00   67.45

Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0.00  1124.60    0.47  197.53   136.53 10590.40    54.18   116.74  589.24   4.95  98.08

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.33    0.00    0.27   29.16    0.00   70.24

Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0.00   198.93    0.00  160.87     0.00  2905.07    18.06   143.10  899.60   6.22 100.00
I'm a bit nervous about those numbers - the avgqu-sz and await values are both a lot higher than in the sample output I've seen online. Then again, the samples I've seen online presumably weren't taken while a database reload was running on the system. I'm not sure if what I'm seeing is evidence of the real problem, or simply an artifact of the conditions the problem occurs under.

Would someone please suggest some other avenues of exploration I could use to follow up on this and determine the true root problem?
 
Old 09-15-2011, 04:46 PM   #2
tronayne
Senior Member
 
Registered: Oct 2003
Location: Northeastern Michigan, where Carhartt is a Designer Label
Distribution: Slackware 32- & 64-bit Stable
Posts: 3,541

Rep: Reputation: 1065
There are a number of things that can cause long load times, but the most likely is triggers and stored procedures: if there are triggers and stored procedures and you're loading tables from a backup, those will be executed for every row - and you really do not want that to happen.

If you take a look at the manual page for pg_dump, read about the --disable-triggers option and decide if that will work for you (if you do a data-only plain-text dump).

You may want to change your dump format to archive format then use pg_restore with its --disable-triggers option:
Code:
       --disable-triggers
           This option is only relevant when performing a data-only restore. It instructs
           pg_restore to execute commands to temporarily disable triggers on the target
           tables while the data is reloaded. Use this if you have referential integrity
           checks or other triggers on the tables that you do not want to invoke during
           data reload.
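Put end to end, that approach might look like this (database and file names are placeholders; run a test before relying on it):

```shell
# Dump in custom (archive) format on the source server:
pg_dump -Fc mydb > db.dump

# Data-only restore with triggers temporarily disabled on the target
# (disabling triggers requires superuser rights on the target cluster):
pg_restore -a --disable-triggers -d mydb db.dump
```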
You may want to fiddle with it a little (you know, run a test), but the example in the pg_restore manual page might be a good starting point:
Code:
       Assume we have dumped a database called mydb into a custom-format dump file:

           $ pg_dump -Fc mydb > db.dump

       To drop the database and recreate it from the dump:

           $ dropdb mydb
           $ pg_restore -C -d postgres db.dump

       The database named in the -d switch can be any database existing in the cluster;
       pg_restore only uses it to issue the CREATE DATABASE command for mydb. With -C,
       data is always restored into the database name that appears in the dump file.

       To reload the dump into a new database called newdb:

           $ createdb -T template0 newdb
           $ pg_restore -d newdb db.dump

       Notice we don't use -C, and instead connect directly to the database to be restored
       into. Also note that we clone the new database from template0 not template1, to
       ensure it is initially empty.

       <there is more to the example that should be noted -- see the manual page>
Also, depending upon your site, you may want to look into pg_dumpall (used for clusters).
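For the cluster-wide case, the pattern is roughly this (a sketch; the target database name is a placeholder):

```shell
# Dump every database in the cluster, plus global objects
# (roles, tablespaces), into one plain-text SQL script:
pg_dumpall > cluster.sql

# Reload by feeding the script to psql, connected to any database:
psql -d postgres -f cluster.sql
```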

I'm pretty sure, from your description, that your databases have triggers and stored procedures in them, and you should explore the methods for temporarily disabling them during loading. This assumes, of course, that the target machine has sufficient RAM and disk space available, eh?
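A couple of quick sanity checks on the target box (the data directory is typically /var/lib/pgsql on SuSE, but verify for your install):

```shell
# How much memory is free and how much is sitting in cache:
free -m

# Free space per filesystem; look at the one holding the
# PostgreSQL data directory (often /var/lib/pgsql on SuSE):
df -h
```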

Hope this helps some.
 
  

