LinuxQuestions.org
Review your favorite Linux distribution.
Home Forums Tutorials Articles Register
Go Back   LinuxQuestions.org > Forums > Non-*NIX Forums > Programming
User Name
Password
Programming This forum is for all programming questions.
The question does not have to be directly related to Linux and any language is fair game.

Notices


Reply
  Search this Thread
Old 03-20-2019, 01:20 PM   #1
hedron
Member
 
Registered: Jul 2009
Location: NYC
Distribution: Slackware64-multilib 15.0, SARPI, artix
Posts: 401

Rep: Reputation: 32
postgresql update slow`


Ok, I wasn't really sure where to post this, but I'm using pygresql in a python script, so it's sort of programming.

I have to update 10s of thousands of entries and it just takes forever. Think days. I started the script last night and it's about 15% of the way through. It's just a simple for loop with a one line update command.

The postgresql server is a virtual machine in which the VDI file is located on a USB hard drive.

Is there a particular reason it's slow? Maybe the print statements slow it down? It prints the command once a loop so I know how far along it is.

Last edited by hedron; 03-20-2019 at 01:22 PM.
 
Old 03-20-2019, 01:29 PM   #2
dc.901
Senior Member
 
Registered: Aug 2018
Location: Atlanta, GA - USA
Distribution: CentOS/RHEL, openSuSE/SLES, Ubuntu
Posts: 1,005

Rep: Reputation: 370Reputation: 370Reputation: 370Reputation: 370
Quote:
Originally Posted by hedron View Post
Ok, I wasn't really sure where to post this, but I'm using pygresql in a python script, so it's sort of programming.

I have to update 10s of thousands of entries and it just takes forever. Think days. I started the script last night and it's about 15% of the way through. It's just a simple for loop with a one line update command.

The postgresql server is a virtual machine in which the VDI file is located on a USB hard drive.

Is there a particular reason it's slow? Maybe the print statements slow it down? It prints the command once a loop so I know how far along it is.
You say "postgresql server is a virtual machine in which the VDI file is located on a USB hard drive." Is your script making remote connection to DB?
USB 2.0 or 3.0?
Do you have to do update one row at a time?
How is the load on VM and the host (memory/CPU usage, and disk IO)?
 
Old 03-20-2019, 01:38 PM   #3
astrogeek
Moderator
 
Registered: Oct 2008
Distribution: Slackware [64]-X.{0|1|2|37|-current} ::12<=X<=15, FreeBSD_12{.0|.1}
Posts: 6,264
Blog Entries: 24

Rep: Reputation: 4194Reputation: 4194Reputation: 4194Reputation: 4194Reputation: 4194Reputation: 4194Reputation: 4194Reputation: 4194Reputation: 4194Reputation: 4194Reputation: 4194
The print is potentially a bottle-neck, but the query you are using is also the first place to look for improving the performance.

Tens of thousands of records should not be a problem for any RDBMS if the data model and queries are well considered.

You don't provide any clues as to how complex the update may be, how many tables are invloved or how records are identified, does it require subqueries, etc... That would be helpful information.

My first question would be whether it is actually necessary to update each record or row in a scripted loop at all? Doing one query per row is not very efficient unless it is really necessary.

Can you not rewrite the query to update all or multiple records and allow the RDBMS to optimize it once and do what it does best?
 
Old 03-20-2019, 02:15 PM   #4
hedron
Member
 
Registered: Jul 2009
Location: NYC
Distribution: Slackware64-multilib 15.0, SARPI, artix
Posts: 401

Original Poster
Rep: Reputation: 32
Quote:
Originally Posted by astrogeek View Post
The print is potentially a bottle-neck, but the query you are using is also the first place to look for improving the performance.

Tens of thousands of records should not be a problem for any RDBMS if the data model and queries are well considered.

You don't provide any clues as to how complex the update may be, how many tables are invloved or how records are identified, does it require subqueries, etc... That would be helpful information.

My first question would be whether it is actually necessary to update each record or row in a scripted loop at all? Doing one query per row is not very efficient unless it is really necessary.

Can you not rewrite the query to update all or multiple records and allow the RDBMS to optimize it once and do what it does best?
Code:
cmd = '''update "{}" set close = {} where time = '{}';'''.format( t, i['avg'], i['time'] )
print( cmd )
db.query( cmd )
print()
That's what's in the for loop that's taking days. It's doing what it's supposed to do, otherwise.


EDIT: Yes, it was the print statement.

Last edited by hedron; 03-20-2019 at 03:12 PM.
 
1 members found this post helpful.
Old 03-21-2019, 06:20 AM   #5
NevemTeve
Senior Member
 
Registered: Oct 2011
Location: Budapest
Distribution: Debian/GNU/Linux, AIX
Posts: 4,863
Blog Entries: 1

Rep: Reputation: 1869Reputation: 1869Reputation: 1869Reputation: 1869Reputation: 1869Reputation: 1869Reputation: 1869Reputation: 1869Reputation: 1869Reputation: 1869Reputation: 1869
Let's hope you have an index on field 'time'
Also you could try to use bind variables.

Last edited by NevemTeve; 03-21-2019 at 06:24 AM.
 
  


Reply



Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is Off
HTML code is Off



Similar Threads
Thread Thread Starter Forum Replies Last Post
LXer: Difference Between PostgreSQL And MySQL And How To Migrate From MySQL To PostgreSQL LXer Syndicated Linux News 0 05-27-2017 08:26 PM
LXer: The PostgreSQL Global Development Group's PostgreSQL LXer Syndicated Linux News 0 03-31-2016 07:35 PM
RH8 vs. W2k - Slow, slow, slow zerojosh Linux - Software 2 06-30-2003 07:19 PM
postgresql -odbc & postgresql-jdbc installation kjsubbu Linux - Software 0 06-19-2003 02:50 AM
p4 1.8 ghz SLOW SLOW SLOW Acmeshells Slackware 7 03-12-2003 07:11 PM

LinuxQuestions.org > Forums > Non-*NIX Forums > Programming

All times are GMT -5. The time now is 10:03 AM.

Main Menu
Advertisement
My LQ
Write for LQ
LinuxQuestions.org is looking for people interested in writing Editorials, Articles, Reviews, and more. If you'd like to contribute content, let us know.
Main Menu
Syndicate
RSS1  Latest Threads
RSS1  LQ News
Twitter: @linuxquestions
Open Source Consulting | Domain Registration