[SOLVED] Lubuntu 12.10 crashing every night possibly by BackInTime program
Lubuntu 12.10 32-bit with 6 GB of RAM in the system. Two VirtualBox machines with 1 GB of RAM each run inside it, and BackInTime runs each night at 10:00 pm.
It seems each night around 1:30 (give or take an hour), the server crashes with a ton of oom-killer messages in the kern.log (I check after rebooting).
Just recently I added over one million files to my directory that backintime backs up. So I think it could be that.
I also run CrashPlan and rsync all the files to another server each night, but I disabled those two about a week ago while trying to find out what is crashing the server. It still crashes with those disabled.
Attached is a portion of my kern.log file. The file is about 1 MB on the server and kept logging oom-killer errors every few minutes until I manually powered the server off.
Large-memory management (yes, even 6 GB) on 32-bit systems is murky at best.
Those don't look like large allocation requests.
I'd be inclined to save /proc/zoneinfo and the output of "slabtop -o -s c" just after a reboot, and again when you get bitten by the oom-killer. Comparing the two might give you some ideas.
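As a rough sketch of the capture suggested above, the helper below writes both outputs into a timestamped directory so that the post-reboot and post-OOM states can be diffed. The function name `snapshot_mem` and the default path `/var/tmp/oom-snapshots` are made up for illustration; only `/proc/zoneinfo` and `slabtop -o -s c` come from the suggestion itself.

```shell
#!/bin/sh
# Hypothetical helper: save memory-zone and slab statistics to a
# timestamped directory so two captures can be compared later.
snapshot_mem() {
    out_root=${1:-/var/tmp/oom-snapshots}   # assumed location, change as needed
    dir="$out_root/$(date +%Y%m%d-%H%M%S)"
    mkdir -p "$dir" || return 1

    # Per-zone page counters (zones such as DMA, Normal, HighMem on 32-bit x86).
    [ -r /proc/zoneinfo ] && cat /proc/zoneinfo > "$dir/zoneinfo"

    # One-shot slab listing sorted by cache size; this usually needs root.
    if command -v slabtop >/dev/null 2>&1; then
        slabtop -o -s c > "$dir/slabtop" 2> "$dir/slabtop.err" || true
    fi

    # Print the directory that was written so the caller can find it.
    echo "$dir"
}
```

Run `snapshot_mem` once after a clean boot and once more as soon as oom-killer messages appear, then compare the two directories with `diff -ru`.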
OK, looking at a recent syslog, I see this regarding backintime:
Dec 20 23:34:53 lubuntu-server backintime (root): INFO: Command "find "/media/truebackup/backintime/server/root/daily/20131219-220002-954/backup/" -type d -exec chmod a-w {} \;" returns 0
Then a few minutes later in the same log file:
Dec 20 23:57:04 lubuntu-server kernel: [52119.257761] nxagent invoked oom-killer: gfp_mask=0xd0, order=0, oom_adj=0, oom_score_adj=0
Dec 20 23:57:04 lubuntu-server kernel: [52119.257766] nxagent cpuset=/ mems_allowed=0
Dec 20 23:57:04 lubuntu-server kernel: [52119.257769] Pid: 2854, comm: nxagent Tainted: G O 3.5.0-17-generic #28-Ubuntu
Dec 20 23:57:04 lubuntu-server kernel: [52119.257771] Call Trace:
Dec 20 23:57:04 lubuntu-server kernel: [52119.257779] [<c15c01c4>] dump_header.isra.10+0x86/0x1b4
Dec 20 23:57:04 lubuntu-server kernel: [52119.257784] [<c1104a1a>] oom_kill_process+0x23a/0x270
Dec 20 23:57:04 lubuntu-server kernel: [52119.257788] [<c1104ae1>] ? select_bad_process.constprop.15+0x91/0x170
Dec 20 23:57:04 lubuntu-server kernel: [52119.257791] [<c1104f53>] out_of_memory+0x163/0x1c0
Dec 20 23:57:04 lubuntu-server kernel: [52119.257794] [<c1108abf>] __alloc_pages_nodemask+0x68f/0x750
Dec 20 23:57:04 lubuntu-server kernel: [52119.257798] [<c1108bfc>] __get_free_pages+0x1c/0x40
And so on. I have since disabled backintime and the system has not crashed or had an oom-killer entry in any of the logs. So something is going on with backintime. This all started when I added about a million files to one of the current directories that backintime backs up. Also, I rsync the files off this server onto another server and that completes with no problems.
After that line the next task would be 'chmod -R a+w <new_snapshot_folder>'. A very simple task. It would be quite weird if that caused an oom-killer...
I doubt the "find ..." command string caused the issue directly unless there are a large number of directories. A quick strace showed a new child being clone'd for each directory, which could be an issue with a very large number of directories. I'd be looking at that python task; from the PID, I presume it's from BIT. Lots of small memory allocations might cause fragmentation in the slab allocator, and there were issues with this reasonably recently (a couple of years ago). Even if the task gets killed, that may not fix the fragmentation.
Not doing all that would seem a (much) better option.
I always believe in trying to help anyone prepared to develop open-source.
As you have a real problem, and a potential solution, I'm sure they would appreciate you trying the new version. Whether it solves the issue or not, the feedback will be beneficial.
I would hope it does help.
OK, I'll install it tonight (I have to find out how first) and then will run the backup tonight. Hopefully my server does not freeze up. Still, I'll give it 3 days to be sure since a few times it took two nights before it froze the server.
I agree with syg00 that it might be the 'find ...' command.
I'm pretty sure the new version will fix it because there are two changes regarding this. First we now use
Code:
find [...] -exec chmod [...] {} +
instead of
Code:
find [...] -exec chmod [...] {} \;
which will drastically reduce the number of new chmod instances (it works like xargs).
Second, there is the new 'Full rsync mode', which delegates all the work to rsync (it must be selected in the options to be used).
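To make the difference between the two `find` forms concrete, here is a small throwaway demonstration. It does not use Back In Time's real commands: a trivial `sh -c 'echo run'` stands in for `chmod`, so each line of output corresponds to one spawned process. The temporary tree and variable names are illustrative only.

```shell
#!/bin/sh
# Build a small throwaway tree: the root plus five subdirectories.
demo_root=$(mktemp -d)
for i in 1 2 3 4 5; do mkdir "$demo_root/dir$i"; done

# `\;` form: the command is started once per directory found.
per_dir=$(find "$demo_root" -type d -exec sh -c 'echo run' sh {} \; | wc -l | tr -d ' ')

# `+` form: paths are appended to as few command lines as possible.
batched=$(find "$demo_root" -type d -exec sh -c 'echo run' sh {} + | wc -l | tr -d ' ')

echo "per-directory invocations: $per_dir"   # 6 here: the root and five children
echo "batched invocations:       $batched"   # 1 here: all six paths fit in one call

rm -rf "$demo_root"
```

With a million files spread over many directories, the `\;` form would start one process per directory, while the `+` form only starts a new process when the argument list fills up.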
Ok I got it installed and it ran fine last night. I'll run it a few more times and report back. I checked on the rsync option in the options area of the program.
The hard drive I back up to ran out of space. I believe switching to rsync made the program recreate all the files on the backup drive. In other words, it was not an incremental backup but a new full backup. I'll have to delete my old backups to run this one, as the original folder being backed up takes up more than half the space on the backup drive.
Quote:
it was not an incremental backup but a new full backup.
To check whether they are still incremental, please take a look at FAQ 2403.
Normally they should be incremental, even after switching to 'Full rsync mode'. But if deleting all previous snapshots is an option for you, that would be better anyway.
Which filesystem do you use on source and destination? I'd recommend ext2|3|4 for dst.
I use ext4. The files all said 1 whereas my old backups would have much higher numbers. So basically, this was not incremental. I am going to delete all my backups and then run this again and see what happens.
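The numbers mentioned above are presumably hard-link counts; that is an assumption, since the FAQ text is not quoted in this thread. Snapshot tools that work incrementally typically hard-link unchanged files to the previous snapshot, so such a file shows a link count above 1, while a freshly written copy shows 1. A minimal sketch with made-up directory names:

```shell
#!/bin/sh
# Print the hard-link count of a file (second column of `ls -l`).
link_count() {
    ls -ld -- "$1" | awk '{print $2}'
}

# Throwaway demonstration; snapshot1/snapshot2 are hypothetical names.
snap_root=$(mktemp -d)
mkdir "$snap_root/snapshot1" "$snap_root/snapshot2"
echo "data" > "$snap_root/snapshot1/file.txt"

# An unchanged file carried over incrementally is a hard link, not a copy.
ln "$snap_root/snapshot1/file.txt" "$snap_root/snapshot2/file.txt"
# A file written from scratch (as in a full backup) gets its own inode.
echo "data" > "$snap_root/snapshot2/copy.txt"

linked=$(link_count "$snap_root/snapshot2/file.txt")
copied=$(link_count "$snap_root/snapshot2/copy.txt")
echo "hard-linked file: $linked link(s)"   # 2
echo "fresh copy:       $copied link(s)"   # 1

rm -rf "$snap_root"
```

Running `link_count` on a few files that did not change between two real snapshots gives a quick indication of whether the newer snapshot reused the older one.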