LinuxQuestions.org
Help answer threads with 0 replies.
Go Back   LinuxQuestions.org > Forums > Linux Forums > Linux - General
User Name
Password
Linux - General This Linux forum is for general Linux questions and discussion.
If it is Linux Related and doesn't seem to fit in any other forum then this is the place.

Notices

Reply
 
LinkBack Search this Thread
Old 08-24-2010, 08:52 AM   #1
benroberta
LQ Newbie
 
Registered: Aug 2010
Posts: 4

Rep: Reputation: 0
cript interrupted at same time every night


I'm writing a python script for my TS-7550 (running Debian) that reads data from a serial port every 10 seconds. The script works perfectly, getting data in every second ending in 5 (e.g. 05, 15, 25, etc.), until it reaches 12:25am CST (GMT-6). At this time almost every other night, our program is interrupted for 7-11 seconds, which means we miss one of these ten second readings, or it is off by almost 10 seconds. Here is the error in my log file:
Code:
2010-08-12 00:25:25,017 1879 (DEBUG)	: Wrote Q
2010-08-12 00:25:26,271 1879 (DEBUG)	: line: 5FF	8DA	17DCA	8DB	0	0	0	0
2010-08-12 00:25:26,306 1879 (DEBUG)	: Calculated sleep time:8.69449496269
2010-08-12 00:25:35,007 1879 (DEBUG)	: Wrote Q
2010-08-12 00:25:35,843 1879 (DEBUG)	: line: 601	8DD	17DCA	8DB	0	0	0	0
2010-08-12 00:25:35,878 1879 (DEBUG)	: Calculated sleep time:9.12337088585
2010-08-12 00:25:45,017 1879 (DEBUG)	: Wrote Q
2010-08-12 00:25:56,217 1879 (DEBUG)	: line: 5FF	8DD	17DC9	8DB	0	0	0	0
2010-08-12 00:25:56,232 1879 (WARNING)	: Gap in data between 1281594335 and 1281594356 (21 seconds)
2010-08-12 00:25:56,565 1879 (DEBUG)	: Calculated sleep time:8.43716692924
2010-08-12 00:26:05,007 1879 (DEBUG)	: Wrote Q
2010-08-12 00:26:05,963 1879 (DEBUG)	: line: 602	8DD	17DC6	8DB	0	0	0	0
2010-08-12 00:26:05,997 1879 (DEBUG)	: Calculated sleep time:9.00356197357
(Sending the 'Q' over the serial port tells our PIC chip to send us data)
In this particular case, the reading is off by 11 seconds, meaning we missed an entire 10 second reading that should have been at 12:25:45. The delay above occurred between sending the Q and receiving data, but it has also appeared after receiving the data and before calculating the sleep time.

I looked at /var/log/syslog and found the following:
Code:
ts7500 kernel: imklog 3.18.6, log source = /proc/kmsg started.
ts7500 rsyslogd: [origin software="rsyslogd" swVersion="3.18.6" x-pid="745" x-info="http://www.rsyslog.com"] restart
at 12:25am CST (the system lists it as 06:25 because it is set to GMT). This was the only thing logged at about that time.

/etc/crontab reveals that the daily cron jobs are run at 6:25am GMT:
Code:
# m h dom mon dow user	command
17 *	* * *	root    cd / && run-parts --report /etc/cron.hourly
25 6	* * *	root	test -x /usr/sbin/anacron || ( cd / && run-parts --report /etc/cron.daily )
47 6	* * 7	root	test -x /usr/sbin/anacron || ( cd / && run-parts --report /etc/cron.weekly )
52 6	1 * *	root	test -x /usr/sbin/anacron || ( cd / && run-parts --report /etc/cron.monthly )
The following files are in /etc/cron.daily:
Code:
apache2
apt
aptitude
bsdmainutils
logrotate
man-db
standard
But as far as I can tell only logrotate actually executes. I decided to modify logrotate to use the nice command in hopes that lowering the priority would help
Code:
#!/bin/sh

test -x /usr/sbin/logrotate || exit 0
nice -n 19 /usr/sbin/logrotate /etc/logrotate.conf
and then executed my script by running
Code:
nice -n -20 python Main.py
but this doesn't seem to have alleviated the problem.

Does anyone know what might be causing this? It's very important to us that we don't miss a 10-second reading, but it's acceptable if the reading is off by a few seconds.

Thanks!

EDIT: Is there a way to change the title? It should say 'script', not 'cript'

Last edited by benroberta; 08-24-2010 at 08:56 AM.
 
Old 08-24-2010, 10:14 AM   #2
lylemwood
Member
 
Registered: Jan 2008
Location: Toronto, Canada
Distribution: Slackware, CentOS
Posts: 45

Rep: Reputation: 18
Try disabling all other tasks for a night or two

Greetings from industrial control nightmare land!

1. I'd try disabling all other jobs (just chmod -x the scripts in daily, weekly, etc or comment out the lines in crontab). Your case is strong for it likely being the logrotate job or something else in daily. Once you can confirm that it is definitely the logrotate job, you can fix the issue.

2. logrotate runs scripts that may disable services. You should examine the logrotate.d directory to see what is rotating and whether or not the services are being restarted/stopped/etc. I've run aground of this kind of issue before.

3. Also, try nice -20 to launch your query job. However, if the logrotate job is killing services you need then priority isn't going to count.

IF you find that logrotate is killing your services, just clean up the job to either allow for a blank period or, better, disable the non-error logging of your services so you don't even need logrotate. That way you can schedule it to happen once a year and just toss this issue.

Good luck!

Edit: Ignore point 3, I somehow read past that without registering...

Last edited by lylemwood; 08-24-2010 at 10:15 AM. Reason: Stupidity...
 
Old 08-24-2010, 10:20 AM   #3
lylemwood
Member
 
Registered: Jan 2008
Location: Toronto, Canada
Distribution: Slackware, CentOS
Posts: 45

Rep: Reputation: 18
It just hit me.... Hard.

Howdy again, sorry to spam...

Part of the logrotate is to kill HUP the syslog service.

If you're using it, there's your issue. That's the service that is dead when your system attempts to write. Have your system create a manual log instead of relying on syslog if you can't disable the logrotate for syslog (unadvisable as it is VERY chatty).

HTH.
 
Old 08-24-2010, 12:01 PM   #4
benroberta
LQ Newbie
 
Registered: Aug 2010
Posts: 4

Original Poster
Rep: Reputation: 0
Thanks, Lyle. I've made changes to crontab to run the daily cron jobs at a different time, but I won't know until tomorrow morning if that had any effect. I'm also going to disable all of the other cron scripts as you recommended (as soon as we're done with some slight hardware modifications).

Looking at logrotate.d, only syslog appears to be rotating daily. apache2 and ppp are weekly, and the others are all monthly. In rsyslog:
Code:
/var/log/syslog
{
	rotate 7
	daily
	missingok
	notifempty
	delaycompress
	compress
	postrotate
		invoke-rc.d rsyslog reload > /dev/null
	endscript
}
I'm not familiar with what these files are supposed to look like, but it appears that rsyslog is being restarted (reloaded?), which is what the log files suggested.

The logging for my program is handled with the python logging module. Specifically, I use logging.getlogger() and add a RotatingFileHandler to that. I'm not familiar with the details of logging, so I'm not sure how I would know if this is using whatever it is that logrotate kills. Replacing this logging would be very time consuming, so I'd rather consider other options.
 
Old 08-24-2010, 10:37 PM   #5
lylemwood
Member
 
Registered: Jan 2008
Location: Toronto, Canada
Distribution: Slackware, CentOS
Posts: 45

Rep: Reputation: 18
With options reduced... Go for better options...

Hello again.

Just a thought, but comment out the compress option in the syslog rotate. It DOES only keep 7 rotates of the log, so compress (which is the slowest portion of the operation) would be the thing killing your operation, likely... Syslog's restart shouldn't take more than a couple of seconds at most. Unless, of course, you're running this on a 486 rig. :-D

Probably the best option, then, provided you are still having the issue at the new time and without compress, is to consider disabling the logrotate for the syslog.

As I mentioned before, this is definitely non-optimal. However, if you disable it for a week or two and check the log file size vs. your free space you may find that you're ok.

Often people over-provision their space (the best option for maintaining quality in MOST linux filesystems), if you have done so and your /var/log containing partition is of sufficient size to handle your syslog not rotating as often, you may wish to move the syslog rotation to annual or monthly at a time where you will not be bothered by the skippage. Depending on your operations you may find that this is suitable for even weekly (Sunday night or whatever).

Again, I hope this helps.
 
Old 08-25-2010, 04:06 PM   #6
benroberta
LQ Newbie
 
Registered: Aug 2010
Posts: 4

Original Poster
Rep: Reputation: 0
I've commented out compress and delaycompress and hopefully that will make enough of a difference that we won't have to make any more drastic changes. Since this machine is really only responsible for running our program and running ntpdate, I think we could easily set it to rotate weekly as well.

For now, I'm leaving for a week of vacation. I'll let you know what I find when I get back.

Thanks again.

-Ben
 
Old 09-03-2010, 07:32 AM   #7
lylemwood
Member
 
Registered: Jan 2008
Location: Toronto, Canada
Distribution: Slackware, CentOS
Posts: 45

Rep: Reputation: 18
NUDGE

Ben, how is it working now?

I'm guessing we go with the IT motto that no news is good news, but I'd like to hear for certain if it's not too much trouble...

Thanks,

Lyle
 
Old 09-04-2010, 09:00 AM   #8
benroberta
LQ Newbie
 
Registered: Aug 2010
Posts: 4

Original Poster
Rep: Reputation: 0
Your guess is mostly correct. Commenting out those lines didn't make a significant difference, so we decided to go a slightly different route, but this requires a little explanation:

The reason that we needed every ten second reading is because our old website and scripts (which we don't have time to update right now) expect to find a file every minute that contains six readings. We wanted the six readings to be every ten seconds, but there's no requirement that they are, only that they represent a minute. So in our main loop we wrote a line that checks to see if we have as many ten second readings as we should based on how far into the minute we are. If we don't have enough readings, we'll take more readings until we catch up to the expected value. This way every minute will have six readings, even when the daily cron jobs run, and the vast majority of these readings will be evenly spaced.

Here's the basic code behind it:
Code:
if (int(time.time())%60/5%11+1)/2 != self.counter:
	timeLeft = 0
else:
	timeLeft = (Config.UPDATE_FREQUENCY/2-(time.time()%Config.UPDATE_FREQUENCY))%Config.UPDATE_FREQUENCY
time.sleep(timeLeft)
self.getData()
So, we didn't really fix the original error, we just accommodated it. I go back to school in a week, but I'll probably be working on this project again over winter break and may take another go at it then. Thanks again for your help.
 
  


Reply

Tags
crondaily, debian, python, rsyslog


Thread Tools Search this Thread
Search this Thread:

Advanced Search

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are Off
Pingbacks are On
Refbacks are Off


Similar Threads
Thread Thread Starter Forum Replies Last Post
E: dpkg was interrupted vikramtheone Linux - Newbie 2 04-27-2010 09:08 AM
running a cript (two ways) tostay2003 Programming 6 08-13-2008 01:45 AM
being billed for free time usage at dataone night limited plan redhatanirban Linux - Networking 1 06-01-2008 07:57 PM
LXer: The Curious Incident of Sun in the Night-Time LXer Syndicated Linux News 0 05-25-2006 06:54 PM
Download Interrupted Gerardoj Linux - Software 4 05-02-2004 09:47 AM


All times are GMT -5. The time now is 03:09 PM.

Main Menu
My LQ
Write for LQ
LinuxQuestions.org is looking for people interested in writing Editorials, Articles, Reviews, and more. If you'd like to contribute content, let us know.
Main Menu
Syndicate
RSS1  Latest Threads
RSS1  LQ News
Twitter: @linuxquestions
identi.ca: @linuxquestions
Facebook: linuxquestions Google+: linuxquestions
Open Source Consulting | Domain Registration