LinuxQuestions.org


ray63 12-28-2011 11:59 PM

/etc/cron.daily/slocate.cron - multiple /usr/bin/updatedb degrades performance
 
I run Openfiler as a NAS and recently noticed significant performance degradation in my rsync backups.

I have 2x1.5TB drives in RAID-1, backed up to a 2TB drive using rsync on the same server. The 2TB drive is 96% full, and the RAID is 83% full.

Whilst I have been able to get 40MB/s throughput (Samba to Windows) to both the RAID and the backup drive, I recently noticed that my backups were taking longer and longer each day, and that although a reboot restores the backup speed, performance starts to degrade again day by day. Samba throughput has now dropped to 1MB/s on the 2TB drive.

After trying to umount the 2TB drive, I discovered that there are multiple 'updatedb' processes running; I assume this is because cron starts another run before the previous one has finished. At the moment I have 11 updatedb jobs running.
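For reference, something like this lists the stacked jobs and clears them out without a reboot (assuming pgrep/pkill are available on the box):

Code:

# list the stacked updatedb jobs
pgrep -fl updatedb

# kill the lot off (as root) to free the drive without rebooting
pkill updatedb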


Here's the slocate.cron file
Code:

#!/bin/sh
renice +19 -p $$ >/dev/null 2>&1
/usr/bin/updatedb -f "nfs,smbfs,ncpfs,proc,devpts" -e "/tmp,/var/tmp,/usr/tmp,/afs,/net"


Has anyone got any suggestions as to how to solve the problem (other than making more space available on the 2TB drive)?

Should I run updatedb weekly instead? Or not at all? I have never really bothered using 'locate' before, so does it matter if I run updatedb?
Should I test whether an updatedb process is already running before starting another one? My shell scripting ability is limited, so hints are welcome - I've put a rough sketch of what I had in mind below.
How long does updatedb take to update 2TB of files?
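One idea I'm considering (just a rough, untested sketch - the lock file path is my own invention) is to wrap the cron file in a lock, so a new run simply exits if the previous updatedb is still going:

Code:

#!/bin/sh
# bail out if the previous run is still going (lock file holds its PID)
LOCK=/var/run/slocate.cron.lock
if [ -e "$LOCK" ] && kill -0 "$(cat "$LOCK")" 2>/dev/null; then
    exit 0
fi
echo $$ > "$LOCK"
trap 'rm -f "$LOCK"' EXIT

renice +19 -p $$ >/dev/null 2>&1
/usr/bin/updatedb -f "nfs,smbfs,ncpfs,proc,devpts" -e "/tmp,/var/tmp,/usr/tmp,/afs,/net"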

TenTenths 12-29-2011 03:24 AM

Check to see if any other scripts make use of "locate"; if there are none, then you can remove that script from cron or, as you suggest, run it once a week.
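Moving it to weekly, or parking it entirely, is just a case of something like this (assuming your cron.daily/cron.weekly are driven by run-parts, which skips non-executable files):

Code:

# run the index weekly instead of daily
mv /etc/cron.daily/slocate.cron /etc/cron.weekly/

# or take away the execute bit so run-parts skips it altogether
chmod -x /etc/cron.daily/slocate.cron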

Think about timings and see if you can schedule it to run when there is the least load on the system, and preferably when you don't have an rsync running.
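If you'd rather keep it daily but pick the exact time yourself, you could pull it out of cron.daily and give it its own line in /etc/crontab instead - something along these lines (the time and destination path are just examples, pick a slot clear of your backup window):

Code:

# move it out of cron.daily so run-parts no longer fires it
mv /etc/cron.daily/slocate.cron /usr/local/sbin/slocate-updatedb

# then add an entry like this to /etc/crontab (13:30, run as root)
# 30 13 * * * root /usr/local/sbin/slocate-updatedb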

Of course you could always try something as simple as turning it off and seeing what breaks!

ray63 01-05-2012 05:04 PM

Thanks TenTenths

Just to close this one out, I ended up re-arranging my rsync backups and removing numerous old ones. Rather than keep every daily backup (over 18 months' worth to date), I switched to keeping monthly and weekly copies and deleted all the daily backups except for the last month. This got rid of around 150 GB of data but, more importantly, millions of links. I started the deletion script in the morning, and files/directories were still being deleted during the night, so this took a long time.
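In case it helps anyone doing the same, the clean-up was essentially along these lines (the directory layout here is only an example - mine is a bit different):

Code:

# delete daily backup directories older than a month, keep everything newer
# (assumes a layout like /backup/daily/<one directory per day>)
find /backup/daily -mindepth 1 -maxdepth 1 -type d -mtime +31 -exec rm -rf {} +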

I then recreated the slocate database (480 MB), and this still took about six hours, so I've put the cron job back into cron.daily to see how it goes. The job is indexing three volumes (including one RAID array) - a total of 2.7 TB, with around 30 million files/links.

My backups run at 02:00, so they should normally be finished by 04:00, when cron.daily jobs are run.

