Gethyn |
03-23-2011 12:57 PM |
There are a couple of interesting log snippets:
Code:
/var/log/clamav/clamd.log
...
Wed Mar 23 04:07:15 2011 -> WARNING: Directory recursion limit reached, skipping XXXXX
Wed Mar 23 04:08:26 2011 -> XXXXX: Can't open file or directory ERROR
Wed Mar 23 04:11:34 2011 -> XXXXX: Can't open file or directory ERROR
Wed Mar 23 04:14:35 2011 -> SelfCheck: Database status OK.
Wed Mar 23 04:15:14 2011 -> XXXXX: Can't open file or directory ERROR
Wed Mar 23 04:15:14 2011 -> XXXXX: Can't open file or directory ERROR
Wed Mar 23 04:20:58 2011 -> Reading databases from /var/clamav
Wed Mar 23 04:21:01 2011 -> Database correctly reloaded (921596 signatures)
...
I've edited out the actual file names for privacy reasons, but they're all user data rather than system files. The "Can't open file or directory" ones are files over ClamAV's size limit (>2GB). Unfortunately it doesn't seem be possible to get ClamAV to quietly ignore oversize files. The ClamAV database is getting rebuilt around this time too.
Here are the relevant bits of the maillog (apologies for length):
Code:
/var/log/maillog
...
Mar 23 04:08:07 localhost postfix/smtpd[10498]: connect from 141-42.202-68.tampabay.res.rr.com[68.202.42.141]
Mar 23 04:08:19 localhost postfix/trivial-rewrite[12713]: warning: dict_ldap_lookup: Search error -5: Timed out
Mar 23 04:08:19 localhost postfix/trivial-rewrite[12713]: fatal: ldap:ldap_virtual_alias(0,lock|fold_fix): table lookup problem
Mar 23 04:08:22 localhost postfix/smtpd[10498]: warning: premature end-of-input on private/rewrite socket while reading input attribute name
Mar 23 04:08:22 localhost postfix/smtpd[10498]: warning: problem talking to service rewrite: Success
Mar 23 04:08:22 localhost postfix/master[18132]: warning: process /usr/libexec/postfix/trivial-rewrite pid 12713 exit status 1
Mar 23 04:08:26 localhost postfix/smtpd[10498]: 64A64217810: client=141-42.202-68.tampabay.res.rr.com[68.202.42.141]
Mar 23 04:08:27 localhost postfix/cleanup[12717]: 64A64217810: message-id=<4d897f3c.c0efcd62@ff.com>
Mar 23 04:08:27 localhost postfix/qmgr[18135]: 64A64217810: from=<klianags@ff.com>, size=869, nrcpt=1 (queue active)
Mar 23 04:08:27 localhost postfix/smtpd[10498]: disconnect from 141-42.202-68.tampabay.res.rr.com[68.202.42.141]
Mar 23 04:08:28 localhost amavis[11219]: (11219-15) LMTP::10024 /var/amavis/tmp/amavis-20110323T040154-11219: <klianags@ff.com> -> <user@example.org> SIZE=869 Received: from mail.example.org ([127.0.0.1]) by localhost (localhost.example.org [127.0.0.1]) (amavisd-new, port 10024) with LMTP for <user@example.org>; Wed, 23 Mar 2011 04:08:27 +0000 (GMT)
Mar 23 04:08:28 localhost amavis[11219]: (11219-15) Checking: 3ufmr+7F1HLH [68.202.42.141] <klianags@ff.com> -> <user@example.org>
Mar 23 04:08:43 localhost amavis[11219]: (11219-15) SA info: pyzor: [12722] error: TERMINATED, signal 15 (000f)
Mar 23 04:08:58 localhost postfix/smtpd[10498]: connect from localhost.localdomain[127.0.0.1]
Mar 23 04:09:04 localhost postfix/smtpd[10498]: lost connection after CONNECT from localhost.localdomain[127.0.0.1]
Mar 23 04:09:06 localhost postfix/smtpd[10498]: disconnect from localhost.localdomain[127.0.0.1]
Mar 23 04:09:07 localhost postfix/smtpd[12723]: connect from localhost.localdomain[127.0.0.1]
Mar 23 04:09:08 localhost postfix/smtpd[12723]: 705BB217812: client=localhost.localdomain[127.0.0.1]
Mar 23 04:09:08 localhost postfix/cleanup[12717]: 705BB217812: message-id=<4d897f3c.c0efcd62@ff.com>
Mar 23 04:09:08 localhost postfix/smtpd[12723]: disconnect from localhost.localdomain[127.0.0.1]
Mar 23 04:09:08 localhost amavis[11219]: (11219-15) FWD via SMTP: <klianags@ff.com> -> <user@example.org>,BODY=7BIT 250 2.0.0 Ok, id=11219-15, from MTA([127.0.0.1]:10025): 250 2.0.0 Ok: queued as 705BB217812
Mar 23 04:09:08 localhost postfix/qmgr[18135]: 705BB217812: from=<klianags@ff.com>, size=1994, nrcpt=1 (queue active)
Mar 23 04:09:08 localhost amavis[11219]: (11219-15) Passed SPAMMY, [68.202.42.141] [68.202.42.141] <klianags@ff.com> -> <user@example.org>, Message-ID: <4d897f3c.c0efcd62@ff.com>, mail_id: 3ufmr+7F1HLH, Hits: 42.456, size: 868, queued_as: 705BB217812, 40869 ms
Mar 23 04:09:08 localhost postfix/lmtp[12718]: 64A64217810: to=<user@example.org>, relay=127.0.0.1[127.0.0.1]:10024, delay=45, delays=3.7/0.39/0/41, dsn=2.0.0, status=sent (250 2.0.0 Ok, id=11219-15, from MTA([127.0.0.1]:10025): 250 2.0.0 Ok: queued as 705BB217812)
Mar 23 04:09:08 localhost postfix/qmgr[18135]: 64A64217810: removed
Mar 23 04:09:09 localhost postfix/smtpd[12725]: connect from localhost.localdomain[127.0.0.1]
Mar 23 04:09:09 localhost postfix/smtpd[12725]: lost connection after CONNECT from localhost.localdomain[127.0.0.1]
Mar 23 04:09:09 localhost postfix/smtpd[12725]: disconnect from localhost.localdomain[127.0.0.1]
Mar 23 04:09:12 localhost postfix/postfix-script: stopping the Postfix mail system
Mar 23 04:09:12 localhost postfix/lmtp[12727]: 705BB217812: to=<user@example.org>, relay=127.0.0.1[127.0.0.1]:2003, delay=4.9, delays=0.62/0.44/0.77/3.1, dsn=2.1.5, status=sent (250 2.1.5 user@example.org Ok)
Mar 23 04:09:12 localhost postfix/qmgr[18135]: 705BB217812: removed
Mar 23 04:09:13 localhost postfix/master[18132]: terminating on signal 15
Mar 23 04:09:16 localhost sendmail[12744]: alias database /etc/aliases rebuilt by root
Mar 23 04:09:16 localhost sendmail[12744]: /etc/aliases: 77 aliases, longest 10 bytes, 779 bytes total
Mar 23 04:09:28 localhost postfix/postfix-script: starting the Postfix mail system
...
...
...
Mar 23 04:20:36 localhost postfix/smtpd[12813]: connect from localhost.localdomain[127.0.0.1]
Mar 23 04:20:44 localhost postfix/smtpd[12813]: lost connection after CONNECT from localhost.localdomain[127.0.0.1]
Mar 23 04:20:44 localhost postfix/smtpd[12813]: disconnect from localhost.localdomain[127.0.0.1]
Mar 23 04:20:44 localhost postfix/smtpd[12813]: connect from localhost.localdomain[127.0.0.1]
Mar 23 04:20:48 localhost postfix/smtpd[12813]: DD22B217810: client=localhost.localdomain[127.0.0.1]
Mar 23 04:20:48 localhost postfix/cleanup[13003]: DD22B217810: message-id=<1300854048.1978973175@localhost.example.org>
Mar 23 04:20:48 localhost postfix/qmgr[12795]: DD22B217810: from=<monit@example.org>, size=776, nrcpt=1 (queue active)
Mar 23 04:20:48 localhost amavis[12945]: (12945-03) LMTP::10024 /var/amavis/tmp/amavis-20110323T042048-12945: <monit@example.org> -> <support@example.com> SIZE=776 BODY=8BITMIME Received: from mail.example.org ([127.0.0.1]) by localhost (localhost.example.org [127.0.0.1]) (amavisd-new, port 10024) with LMTP for <support@example.com>; Wed, 23 Mar 2011 04:20:48 +0000 (GMT)
Mar 23 04:20:48 localhost amavis[12945]: (12945-03) Checking: jowVanhCjxUE MYNETS [127.0.0.1] <monit@example.org> -> <support@example.com>
Mar 23 04:20:49 localhost postfix/smtpd[12813]: 2B849217812: client=localhost.localdomain[127.0.0.1]
Mar 23 04:20:49 localhost postfix/cleanup[13003]: 2B849217812: message-id=<1300854049.1531217627@localhost.example.org>
Mar 23 04:20:49 localhost postfix/qmgr[12795]: 2B849217812: from=<monit@example.org>, size=766, nrcpt=1 (queue active)
Mar 23 04:20:49 localhost postfix/smtpd[12813]: disconnect from localhost.localdomain[127.0.0.1]
Mar 23 04:20:49 localhost amavis[11435]: (11435-15) LMTP::10024 /var/amavis/tmp/amavis-20110323T041031-11435: <monit@example.org> -> <support@example.net> SIZE=766 BODY=8BITMIME Received: from mail.example.org ([127.0.0.1]) by localhost (localhost.example.org [127.0.0.1]) (amavisd-new, port 10024) with LMTP for <support@example.net>; Wed, 23 Mar 2011 04:20:49 +0000 (GMT)
Mar 23 04:20:49 localhost amavis[11435]: (11435-15) Checking: pqkSHT7EbonO MYNETS [127.0.0.1] <monit@example.org> -> <support@example.net>
Mar 23 04:20:50 localhost postfix/postfix-script: stopping the Postfix mail system
Mar 23 04:20:50 localhost postfix/master[12792]: terminating on signal 15
Mar 23 04:20:51 localhost sendmail[13025]: alias database /etc/aliases rebuilt by root
Mar 23 04:20:51 localhost sendmail[13025]: /etc/aliases: 77 aliases, longest 10 bytes, 779 bytes total
Mar 23 04:20:56 localhost amavis[11435]: (11435-15) (!)FWD via SMTP: <monit@example.org> -> <support@example.net>, 450 4.4.1 Can't connect to INET4 socket 127.0.0.1: Connection refused, MTA([127.0.0.1]:10025), id=11435-15
Mar 23 04:20:56 localhost amavis[11435]: (11435-15) Blocked MTA-BLOCKED, MYNETS LOCAL [127.0.0.1] [127.0.0.1] <monit@example.org> -> <support@example.net>, Message-ID: <1300854049.1531217627@localhost.example.org>, mail_id: pqkSHT7EbonO, Hits: -2.9, size: 766, 7495 ms
Mar 23 04:20:56 localhost amavis[11435]: (11435-15) (!!)TROUBLE in process_request: Error writing a SMTP response to the socket: Broken pipe at (eval 67) line 987, <GEN50> line 31.
Mar 23 04:20:57 localhost postfix/postfix-script: starting the Postfix mail system
Mar 23 04:20:57 localhost amavis[11435]: (11435-15) (!)Requesting process rundown after fatal error
Mar 23 04:20:57 localhost amavis[12945]: (12945-03) (!)FWD via SMTP: <monit@example.org> -> <support@example.com>, 450 4.4.1 Can't connect to INET4 socket 127.0.0.1: Connection refused, MTA([127.0.0.1]:10025), id=12945-03
Mar 23 04:20:57 localhost amavis[11435]: (11435-15) SMTP shutdown: Error writing a SMTP response to the socket: at (eval 67) line
987, <GEN50> line 31.\n
Mar 23 04:20:57 localhost amavis[12945]: (12945-03) Blocked MTA-BLOCKED, MYNETS LOCAL [127.0.0.1] [127.0.0.1] <monit@example.org> -> <support@example.com>, Message-ID: <1300854048.1978973175@localhost.example.org>, mail_id: jowVanhCjxUE, Hits: -2.9, size: 776, 9254 ms
Mar 23 04:20:57 localhost postfix/master[13076]: daemon started -- version 2.3.3, configuration /etc/postfix
Mar 23 04:20:57 localhost amavis[12945]: (12945-03) (!!)TROUBLE in process_request: Error writing a SMTP response to the socket: Broken pipe at (eval 67) line 987, <GEN21> line 31.
Mar 23 04:20:57 localhost amavis[12945]: (12945-03) (!)Requesting process rundown after fatal error
Mar 23 04:20:58 localhost amavis[12945]: (12945-03) SMTP shutdown: Error writing a SMTP response to the socket: at (eval 67) line 987, <GEN21> line 31.\n
Mar 23 04:20:58 localhost postfix/qmgr[13079]: DD22B217810: from=<monit@example.org>, size=776, nrcpt=1 (queue active)
Mar 23 04:20:58 localhost postfix/qmgr[13079]: 2B849217812: from=<monit@example.org>, size=766, nrcpt=1 (queue active)
Mar 23 04:20:58 localhost amavis[13096]: (13096-01) LMTP::10024 /var/amavis/tmp/amavis-20110323T042058-13096: <monit@example.org> -> <support@example.net> SIZE=766 BODY=8BITMIME Received: from mail.example.org ([127.0.0.1]) by localhost (localhost.example.org [127.0.0.1]) (amavisd-new, port 10024) with LMTP for <support@example.net>; Wed, 23 Mar 2011 04:20:58 +0000 (GMT)
Mar 23 04:20:58 localhost amavis[13097]: (13097-01) LMTP::10024 /var/amavis/tmp/amavis-20110323T042058-13097: <monit@example.org> -> <support@example.com> SIZE=776 BODY=8BITMIME Received: from mail.example.org ([127.0.0.1]) by localhost (localhost.example.org [127.0.0.1]) (amavisd-new, port 10024) with LMTP for <support@example.com>; Wed, 23 Mar 2011 04:20:58 +0000 (GMT)
Mar 23 04:20:58 localhost amavis[13097]: (13097-01) Checking: P8WXuWzzphZ0 MYNETS [127.0.0.1] <monit@example.org> -> <support@example.com>
Mar 23 04:20:58 localhost amavis[13096]: (13096-01) Checking: WoSGo1iYwCAl MYNETS [127.0.0.1] <monit@example.org> -> <support@example.net>
Mar 23 04:20:58 localhost amavis[13096]: (13096-01) cached 3bae5ab78a88d7af26bf16055ca3bd5c from <monit@example.org> (1,1)
Mar 23 04:20:58 localhost amavis[13097]: (13097-01) cached 3bae5ab78a88d7af26bf16055ca3bd5c from <monit@example.org> (1,1)
Mar 23 04:20:59 localhost postfix/smtpd[13100]: connect from localhost.localdomain[127.0.0.1]
Mar 23 04:20:59 localhost postfix/smtpd[13102]: connect from localhost.localdomain[127.0.0.1]
Mar 23 04:20:59 localhost postfix/smtpd[13100]: 0C963217817: client=localhost.localdomain[127.0.0.1]
Mar 23 04:20:59 localhost postfix/cleanup[13104]: 0C963217817: message-id=<1300854048.1978973175@localhost.example.org>
Mar 23 04:20:59 localhost postfix/qmgr[13079]: 0C963217817: from=<monit@example.org>, size=1016, nrcpt=1 (queue active)
Mar 23 04:20:59 localhost amavis[13097]: (13097-01) FWD via SMTP: <monit@example.org> -> <support@example.com>,BODY=7BIT 250 2.0.0 Ok, id=13097-01, from MTA([127.0.0.1]:10025): 250 2.0.0 Ok: queued as 0C963217817
Mar 23 04:20:59 localhost postfix/smtpd[13102]: 0F78C21781C: client=localhost.localdomain[127.0.0.1]
Mar 23 04:20:59 localhost postfix/cleanup[13104]: 0F78C21781C: message-id=<1300854049.1531217627@localhost.example.org>
Mar 23 04:20:59 localhost amavis[13097]: (13097-01) Passed CLEAN, MYNETS LOCAL [127.0.0.1] [127.0.0.1] <monit@example.org> -> <support@example.com>, Message-ID: <1300854048.1978973175@localhost.example.org>, mail_id: P8WXuWzzphZ0, Hits: -2.9, size: 776, queued_as: 0C963217817, 173 ms
...
It's looking as though ClamAV might be overloading the system, at which point postfix chokes. In this instance Amavis appears to be processing a spam email at the same time as the clamdscan, perhaps the two have to happen at the same time for the system to reach critical load. That could explain the intermittent nature of the problem - this server gets enough mail that it's likely to be scanning a message at the time of the monit connectivity check, but it's not guaranteed.
I've enabled monit's -v option in the initscript, increased the postfix connect timeout in monit, and increased clamdscan's niceness. Will see what happens over the next few days and report back.
Thanks for the feedback so far.
|