LinuxQuestions.org

Tenover 11-24-2003 10:11 AM

View/Stop any and all cron jobs?
 
I have a server that seems to hang or die at some point every week between Friday afternoon and early Monday morning. I'm having a hard time troubleshooting what's causing it, but I imagine it's something in cron.weekly (?). How can I find out if there's anything scheduled to run between those times? Do I have to check the crontab for each and every user, or what? Thanks.

Tinkster 11-24-2003 12:57 PM

Re: View/Stop any and all cron jobs?
 
Quote:

Originally posted by Tenover
How can I find out if there's anything scheduled to run between those times? Do I have to check crontab for each and every user or what? Thanks....

Yep ... look at everything in /var/spool/cron, even though an unprivileged user-space process (anything that isn't run by root) shouldn't be able to crash the box.

Cheers,
Tink
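
A minimal sketch of the search Tink describes, widened to the system-wide cron locations as well as the per-user spool. The path list is an assumption based on a Red Hat-style layout and differs between distributions, and `list_cron_sources` is a hypothetical helper name. Reading /var/spool/cron normally requires root.

```python
import os

# Assumed locations; they vary by distribution and cron implementation.
DEFAULT_CRON_PATHS = [
    "/var/spool/cron",    # per-user crontabs
    "/etc/crontab",       # system-wide crontab
    "/etc/cron.d",
    "/etc/cron.hourly",
    "/etc/cron.daily",
    "/etc/cron.weekly",
    "/etc/cron.monthly",
]


def list_cron_sources(paths=DEFAULT_CRON_PATHS):
    """Return a sorted list of regular files found at or under the given paths."""
    found = []
    for path in paths:
        if os.path.isfile(path):
            found.append(path)
        elif os.path.isdir(path):
            for root, _dirs, files in os.walk(path):
                for name in files:
                    found.append(os.path.join(root, name))
        # Paths that do not exist are skipped.
    return sorted(found)


if __name__ == "__main__":
    for source in list_cron_sources():
        print(source)
```

Each file it prints is a place where a scheduled job can hide, so an empty /var/spool/cron on its own does not mean nothing is scheduled.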

Tenover 11-24-2003 03:51 PM

Thanks. There's NOTHING in there. However, when I look in /var/spool/anacron, there are two entries (one in cron.daily and one in cron.weekly) that are just the last two dates on which this thing crashed. I tried to do a "more cron.daily", but all it gives me is that date. Any ideas?

Tinkster 11-24-2003 03:54 PM

I'm not using anacron (I start "at" jobs in rc.local instead ;)) ... maybe there's an anacron log directory? :)

Cheers,
Tink

Tenover 11-25-2003 09:36 AM

OK, after checking my "lastlog", I see this in there, followed by "last message repeated..." for about 1000 lines, clogging up my syslog while gpm runs at 99.9% of the CPU:

Nov 25 08:28:41 praesto1 gpm[938]: Error in read()ing first: No such file or directory
Nov 25 08:28:49 praesto1 last message repeated 202659 times
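
To see how fast a message like this is really being logged, the "last message repeated N times" lines can be added back onto the message they follow. The sketch below assumes the classic syslog layout shown above (month, day, time, host, message); `count_repeats` is a hypothetical helper, not part of any syslog tool.

```python
import re

# syslogd collapses identical consecutive messages into a summary line.
REPEAT_RE = re.compile(r"last message repeated (\d+) times")


def count_repeats(lines):
    """Map each 'host message' string to its total number of occurrences."""
    totals = {}
    previous = None
    for line in lines:
        # Assumed layout: "Mon DD HH:MM:SS host message..."
        parts = line.split(None, 3)
        if len(parts) < 4:
            continue
        rest = parts[3].strip()
        match = REPEAT_RE.search(rest)
        if match and previous is not None:
            # Credit the repeats to the message that preceded the summary.
            totals[previous] += int(match.group(1))
        elif not match:
            previous = rest
            totals[previous] = totals.get(previous, 0) + 1
    return totals


if __name__ == "__main__":
    sample = [
        "Nov 25 08:28:41 praesto1 gpm[938]: Error in read()ing first: "
        "No such file or directory",
        "Nov 25 08:28:49 praesto1 last message repeated 202659 times",
    ]
    for message, total in count_repeats(sample).items():
        print(total, message)
```

On the two lines quoted above this gives 202660 occurrences of the gpm error within about eight seconds, which is consistent with gpm spinning in a tight loop.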

teval 11-25-2003 09:50 AM

Try running strace gpm &> file
Then look at the file and see what's failing. It could be that certain devices or something like that are missing. It could also be a bad config file.
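
Once the trace is in a file, the calls worth reading first are the ones that returned -1. A rough filter for them, assuming strace's usual "name(args) = -1 ERRNO (text)" line format with an optional "[pid N]" prefix; `failed_syscalls` is a hypothetical helper name.

```python
import re

# Matches a failed call as strace prints it, for example:
#   open("/etc/ld.so.preload", O_RDONLY) = -1 ENOENT (No such file or directory)
FAILED_RE = re.compile(
    r"^(?:\[pid\s+(\d+)\]\s+)?(\w+)\((.*)\)\s+= -1 (E[A-Z0-9]+)"
)


def failed_syscalls(lines):
    """Yield (pid, syscall, args, errno_name) for each call that returned -1."""
    for line in lines:
        match = FAILED_RE.match(line.strip())
        if match:
            pid, name, args, errno_name = match.groups()
            yield (int(pid) if pid else None, name, args, errno_name)


if __name__ == "__main__":
    sample = [
        'open("/etc/ld.so.preload", O_RDONLY) = -1 ENOENT (No such file or directory)',
        'open("/etc/ld.so.cache", O_RDONLY) = 3',
    ]
    for failure in failed_syscalls(sample):
        print(failure)
```

Not every failure is a problem: a missing /etc/ld.so.preload, for instance, is normal on most systems. Lines that were cut off or wrapped by the terminal will not match, so capture the trace at full width.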

Tenover 11-25-2003 10:01 AM

Here's what the file tells me....

execve("/usr/sbin/gpm", ["gpm"], [/* 51 vars */]) = 0
uname({sys="Linux", node="praesto1", ...}) = 0
brk(0) = 0x805a500
old_mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0
x40017000
open("/etc/ld.so.preload", O_RDONLY) = -1 ENOENT (No such file or direct
open("/etc/ld.so.cache", O_RDONLY) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=69846, ...}) = 0
old_mmap(NULL, 69846, PROT_READ, MAP_PRIVATE, 3, 0) = 0x40018000
close(3) = 0
open("/lib/i686/libc.so.6", O_RDONLY) = 3
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0@\307\1"..., 1024)
4
fstat64(3, {st_mode=S_IFREG|0755, st_size=5779542, ...}) = 0
old_mmap(NULL, 1291464, PROT_READ|PROT_EXEC, MAP_PRIVATE, 3, 0) = 0x4002a00
mprotect(0x4015c000, 38088, PROT_NONE) = 0
old_mmap(0x4015c000, 24576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 3,
1000) = 0x4015c000
old_mmap(0x40162000, 13512, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP
YMOUS, -1, 0) = 0x40162000
close(3) = 0
munmap(0x40018000, 69846) = 0
brk(0) = 0x805a500
brk(0x805a680) = 0x805a680
brk(0x805b000) = 0x805b000
open("/var/run/gpm.pid", O_RDONLY) = 3
fstat64(3, {st_mode=S_IFREG|0600, st_size=5, ...}) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) =
018000
read(3, "1411\n", 4096) = 5
close(3) = 0
munmap(0x40018000, 4096) = 0
kill(1411, SIG_0) = 0
open("/dev/tty0", O_WRONLY) = 3
ioctl(3, 0x541c, 0x8056fa0) = 0
close(3) = 0
fork() = 1416
--- SIGCHLD (Child exited) ---
_exit(0) = ?

teval 11-25-2003 11:02 AM

Do the same, but with strace -f
That way it will follow child processes. The problem here isn't in the main process; it's a forked child that's causing the problems.

Tenover 11-25-2003 11:14 AM

Ok, here's the output...

execve("/usr/sbin/gpm", ["gpm"], [/* 51 vars */]) = 0
uname({sys="Linux", node="praesto1", ...}) = 0
brk(0) = 0x805a500
old_mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0
x40017000
open("/etc/ld.so.preload", O_RDONLY) = -1 ENOENT (No such file or direct
open("/etc/ld.so.cache", O_RDONLY) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=69846, ...}) = 0
old_mmap(NULL, 69846, PROT_READ, MAP_PRIVATE, 3, 0) = 0x40018000
close(3) = 0
open("/lib/i686/libc.so.6", O_RDONLY) = 3
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0@\307\1"..., 1024)
4
fstat64(3, {st_mode=S_IFREG|0755, st_size=5779542, ...}) = 0
old_mmap(NULL, 1291464, PROT_READ|PROT_EXEC, MAP_PRIVATE, 3, 0) = 0x4002a00
mprotect(0x4015c000, 38088, PROT_NONE) = 0
old_mmap(0x4015c000, 24576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 3,
1000) = 0x4015c000
old_mmap(0x40162000, 13512, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP
YMOUS, -1, 0) = 0x40162000
close(3) = 0
munmap(0x40018000, 69846) = 0
brk(0) = 0x805a500
brk(0x805a680) = 0x805a680
brk(0x805b000) = 0x805b000
open("/var/run/gpm.pid", O_RDONLY) = 3
fstat64(3, {st_mode=S_IFREG|0600, st_size=5, ...}) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) =
018000
read(3, "1411\n", 4096) = 5
close(3) = 0
munmap(0x40018000, 4096) = 0
kill(1411, SIG_0) = 0
open("/dev/tty0", O_WRONLY) = 3
ioctl(3, 0x541c, 0x8056fa0) = 0
close(3) = 0
fork() = 1416
--- SIGCHLD (Child exited) ---
_exit(0) = ?

teval 11-25-2003 11:48 AM

Weird... my output looks nothing like that.
Are you using the newest gpm?

Tenover 11-25-2003 11:51 AM

Oops, my bad. I had killed gpm when it started taking up 99.9% of the CPU! Here's the output after starting it again:

execve("/usr/sbin/gpm", ["gpm"], [/* 51 vars */]) = 0
uname({sys="Linux", node="praesto1", ...}) = 0
brk(0) = 0x805a500
old_mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) =
x40017000
open("/etc/ld.so.preload", O_RDONLY) = -1 ENOENT (No such file or directory
open("/etc/ld.so.cache", O_RDONLY) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=69846, ...}) = 0
old_mmap(NULL, 69846, PROT_READ, MAP_PRIVATE, 3, 0) = 0x40018000
close(3) = 0
open("/lib/i686/libc.so.6", O_RDONLY) = 3
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0@\307\1"..., 1024) = 1
4
fstat64(3, {st_mode=S_IFREG|0755, st_size=5779542, ...}) = 0
old_mmap(NULL, 1291464, PROT_READ|PROT_EXEC, MAP_PRIVATE, 3, 0) = 0x4002a000
mprotect(0x4015c000, 38088, PROT_NONE) = 0
old_mmap(0x4015c000, 24576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 3, 0x
1000) = 0x4015c000
old_mmap(0x40162000, 13512, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_AN
YMOUS, -1, 0) = 0x40162000
close(3) = 0
munmap(0x40018000, 69846) = 0
brk(0) = 0x805a500
brk(0x805a680) = 0x805a680
brk(0x805b000) = 0x805b000
open("/var/run/gpm.pid", O_RDONLY) = 3
fstat64(3, {st_mode=S_IFREG|0600, st_size=5, ...}) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x
018000
read(3, "1411\n", 4096) = 5
close(3) = 0
munmap(0x40018000, 4096) = 0
kill(1411, SIG_0) = 0
open("/dev/tty0", O_WRONLY) = 3
ioctl(3, 0x541c, 0x8056fa0) = 0
close(3) = 0
fork() = 1539
[pid 1539] close(0) = 0
[pid 1539] close(1) = 0
[pid 1539] close(2) = 0
[pid 1539] open("/dev/console", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 0
[pid 1539] setsid() = 1539
[pid 1539] chdir("/") = 0
[pid 1539] umask(022) = 022
[pid 1539] gettimeofday({1069786274, 130636}, NULL) = 0
[pid 1539] getpid() = 1539
[pid 1539] open("/var/run//gpmiF88HQ", O_RDWR|O_CREAT|O_EXCL, 0600) = 1
[pid 1539] open("/var/run//gpmiF88HQ", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 2
[pid 1539] getpid() = 1539
[pid 1539] fstat64(2, {st_mode=S_IFREG|0600, st_size=0, ...}) = 0
[pid 1539] mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS,
1, 0) = 0x40018000
[pid 1539] write(2, "1539\n", 5) = 5
[pid 1539] close(2) = 0
[pid 1539] munmap(0x40018000, 4096) = 0
[pid 1539] link("/var/run//gpmiF88HQ", "/var/run/gpm.pid") = -1 EEXIST (File exists)
[pid 1539] open("/var/run/gpm.pid", O_RDONLY) = 2
[pid 1539] fstat64(2, {st_mode=S_IFREG|0600, st_size=5, ...}) = 0
[pid 1539] mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS,
1, 0) = 0x40018000
[pid 1539] read(2, "1411\n", 4096) = 5
[pid 1539] unlink("/var/run//gpmiF88HQ") = 0
[pid 1539] brk(0x805e000) = 0x805e000
[pid 1539] time([1069786274]) = 1069786274
[pid 1539] open("/etc/localtime", O_RDONLY) = 3
[pid 1539] fstat64(3, {st_mode=S_IFREG|0644, st_size=1017, ...}) = 0
[pid 1539] mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS,
1, 0) = 0x40019000
[pid 1539] read(3, "TZif\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\4\0\0\0\4\0"..
4096) = 1017
[pid 1539] close(3) = 0
[pid 1539] munmap(0x40019000, 4096) = 0
[pid 1539] getpid() = 1539
[pid 1539] rt_sigaction(SIGPIPE, {0x401143c0, [], 0x4000000}, {SIG_DFL}, 8) =
[pid 1539] socket(PF_UNIX, SOCK_DGRAM, 0) = 3
[pid 1539] fcntl64(0x3, 0x2, 0x1, 0x40114190) = 0
[pid 1539] connect(3, {sin_family=AF_UNIX, path="/dev/log"}, 16) = -1 ENOENT (No such file or directory)
[pid 1539] close(3) = 0
[pid 1539] rt_sigaction(SIGPIPE, {SIG_DFL}, NULL, 8) = 0
[pid 1539] time([1069786274]) = 1069786274
[pid 1539] getpid() = 1539
[pid 1539] rt_sigaction(SIGPIPE, {0x401143c0, [], 0x4000000}, {SIG_DFL}, 8) =
[pid 1539] socket(PF_UNIX, SOCK_DGRAM, 0) = 3
[pid 1539] fcntl64(0x3, 0x2, 0x1, 0x40114190) = 0
[pid 1539] connect(3, {sin_family=AF_UNIX, path="/dev/log"}, 16) = -1 ENOENT (No such file or directory)
[pid 1539] close(3) = 0
[pid 1539] rt_sigaction(SIGPIPE, {SIG_DFL}, NULL, 8) = 0
[pid 1539] fstat64(0, {st_mode=S_IFCHR|0600, st_rdev=makedev(5, 1), ...}) = 0
[pid 1539] ioctl(0, 0x5401, {B38400 opost isig icanon echo ...}) = 0
[pid 1539] mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS,
1, 0) = 0x40019000
[pid 1539] write(0, "gpm: oops() invoked from gpn.c(2"..., 36) = 36
[pid 1539] write(0, "gpm already running as pid 1411:"..., 59) = 59
[pid 1539] munmap(0x40019000, 4096) = 0
[pid 1539] _exit(1) = ?
--- SIGCHLD (Child exited) ---
_exit(0) = ?

unSpawn 11-25-2003 12:23 PM

What are the contents of /etc/crontab (grep /etc/crontab -e "^[\*,0-9]") and of the dirs containing cronjobs (grep /etc/crontab -e run-parts)?

Tenover 11-25-2003 12:25 PM

First one:

01 * * * * root run-parts /etc/cron.hourly
02 1 * * * root run-parts /etc/cron.daily
22 4 * * 0 root run-parts /etc/cron.weekly
42 4 1 * * root run-parts /etc/cron.monthly

Tenover 11-25-2003 12:26 PM

Second one:

# run-parts
01 * * * * root run-parts /etc/cron.hourly
02 1 * * * root run-parts /etc/cron.daily
22 4 * * 0 root run-parts /etc/cron.weekly
42 4 1 * * root run-parts /etc/cron.monthly

Same thing??

unSpawn 11-25-2003 12:47 PM

Ahhh, OK. Could you list the contents of those four dirs please?

Tenover 11-25-2003 12:54 PM

cron.hourly:

[root@praesto1 cron.hourly]# ls -la | more
total 12
drwxr-xr-x 2 root root 4096 Jan 3 2002 .
drwxr-xr-x 59 root root 8192 Nov 25 08:16 ..


cron.daily:

[root@praesto1 cron.daily]# ls -la | more
total 44
drwxr-xr-x 2 root root 4096 Nov 6 13:12 .
drwxr-xr-x 59 root root 8192 Nov 25 08:16 ..
-rwxr-xr-x 1 root root 276 Jun 24 2001 0anacron
-rwxr-xr-x 1 root root 51 Sep 4 2001 logrotate
-rwxr-xr-x 1 root root 402 Aug 31 2001 makewhatis.cron
-rwxr-xr-x 1 root root 104 Sep 6 2001 rpm
-rwxr-xr-x 1 root root 132 Oct 24 2002 run_nohup.sh
-rwxr-xr-x 1 root root 132 Jun 24 2001 slocate.cron
-rwxr-xr-x 1 root root 91 Aug 13 2001 sysstat
-rwxr-xr-x 1 root root 193 Nov 28 2001 tmpwatch

cron.weekly:

[root@praesto1 cron.weekly]# ls -la | more
total 20
drwxr-xr-x 2 root root 4096 Nov 6 13:12 .
drwxr-xr-x 59 root root 8192 Nov 25 08:16 ..
-rwxr-xr-x 1 root root 277 Jun 24 2001 0anacron
-rwxr-xr-x 1 root root 399 Aug 31 2001 makewhatis.cron

cron.monthly:

[root@praesto1 cron.monthly]# ls -la | more
total 16
drwxr-xr-x 2 root root 4096 Nov 6 13:12 .
drwxr-xr-x 59 root root 8192 Nov 25 08:16 ..
-rwxr-xr-x 1 root root 278 Jun 24 2001 0anacron

teval 11-25-2003 01:12 PM

Last time you did that, gpm was already running. Do:
killall -9 gpm
rm /var/run/gpm.pid
Then try the strace again.
This time it should show what it's missing.
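
The earlier trace shows why the PID file matters: gpm reads /var/run/gpm.pid, sends signal 0 to the PID it finds (the `kill(1411, SIG_0) = 0` line), and exits with "gpm already running" when that succeeds. A minimal sketch of the same liveness check, assuming a Unix system; `pidfile_is_live` is a hypothetical helper and not gpm's own code.

```python
import os


def pidfile_is_live(path):
    """Return True if the PID file names a process that currently exists."""
    try:
        with open(path) as handle:
            pid = int(handle.read().strip())
    except (OSError, ValueError):
        return False  # no PID file, or contents that are not a number
    if pid <= 0:
        return False  # never signal a process group or "all processes"
    try:
        os.kill(pid, 0)  # signal 0 only checks existence; nothing is delivered
    except ProcessLookupError:
        return False  # stale PID file: no such process
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


if __name__ == "__main__":
    print(pidfile_is_live("/var/run/gpm.pid"))
```

If this returns False while the PID file still exists, the file is stale and can be removed before restarting the daemon.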

Tenover 11-25-2003 01:18 PM

New Output....

execve("/usr/sbin/gpm", ["gpm"], [/* 51 vars */]) = 0
uname({sys="Linux", node="praesto1", ...}) = 0
brk(0) = 0x805a500
old_mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1,
x40017000
open("/etc/ld.so.preload", O_RDONLY) = -1 ENOENT (No such file or dire
open("/etc/ld.so.cache", O_RDONLY) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=69846, ...}) = 0
old_mmap(NULL, 69846, PROT_READ, MAP_PRIVATE, 3, 0) = 0x40018000
close(3) = 0
open("/lib/i686/libc.so.6", O_RDONLY) = 3
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0@\307\1"..., 1024
4
fstat64(3, {st_mode=S_IFREG|0755, st_size=5779542, ...}) = 0
old_mmap(NULL, 1291464, PROT_READ|PROT_EXEC, MAP_PRIVATE, 3, 0) = 0x4002a
mprotect(0x4015c000, 38088, PROT_NONE) = 0
old_mmap(0x4015c000, 24576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED,
1000) = 0x4015c000
old_mmap(0x40162000, 13512, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|M
YMOUS, -1, 0) = 0x40162000
close(3) = 0
munmap(0x40018000, 69846) = 0
brk(0) = 0x805a500
brk(0x805a680) = 0x805a680
brk(0x805b000) = 0x805b000
open("/var/run/gpm.pid", O_RDONLY) = -1 ENOENT (No such file or dire
open("/dev/tty0", O_WRONLY) = 3
ioctl(3, 0x541c, 0x8056fa0) = 0
close(3) = 0
fork() = 1830
_exit(0) = ?

teval 11-25-2003 01:26 PM

That's weird... well, it dies, but it didn't write to the log file at all.
My only suggestion is to recompile the newest gpm, or to get the rpm for it.

unSpawn 11-25-2003 01:53 PM

# run-parts
01 * * * * root run-parts /etc/cron.hourly
(etc)

Is it me or do the times look a bit wonky? I mean (running Vixie-cron) I always thought the fields were: "minutes, hours, day of month, month, day of week"? Not that it will help you solve your problem, but the times they're run at look weird.

If you can verify the contents of the other cronjobs are "sane" by your standards, then at least you can say your problem probably hasn't got anything to do with one of these. The only thing I don't recognize is /etc/cron.daily/run_nohup.sh, but it's run at such a time that it couldn't be part of killing the box.

Definitely a weird problem, especially because of the period you mention.
Doesn't sysstat show weird fluctuations in CPU or memory usage over that period?
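
The field order unSpawn recalls (minute, hour, day of month, month, day of week) can be applied mechanically to the /etc/crontab lines posted earlier, which in the system crontab carry a user name before the command. This is a deliberately small sketch: `describe_entry` is a hypothetical helper that only understands plain numbers and "*", not lists, ranges, steps or names.

```python
DAY_NAMES = ["Sunday", "Monday", "Tuesday", "Wednesday",
             "Thursday", "Friday", "Saturday"]


def describe_entry(line):
    """Describe one /etc/crontab line: five time fields, user, command."""
    minute, hour, dom, month, dow, user, command = line.split(None, 6)
    if hour == "*":
        when = "every hour at minute " + minute
    else:
        when = "at %02d:%02d" % (int(hour), int(minute))
    if dow != "*":
        when += " on " + DAY_NAMES[int(dow) % 7]  # 0 and 7 both mean Sunday
    if dom != "*":
        when += " on day %s of the month" % dom
    if month != "*":
        when += " in month " + month
    return "%s: %s (as %s)" % (when, command, user)


if __name__ == "__main__":
    entries = [
        "01 * * * * root run-parts /etc/cron.hourly",
        "02 1 * * * root run-parts /etc/cron.daily",
        "22 4 * * 0 root run-parts /etc/cron.weekly",
        "42 4 1 * * root run-parts /etc/cron.monthly",
    ]
    for entry in entries:
        print(describe_entry(entry))
```

Read this way, the weekly jobs run at 04:22 on Sunday, which falls inside the Friday-afternoon-to-Monday-morning window from the first post. That narrows where to look; it does not show that a weekly job is the cause.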


All times are GMT -5.