Strange random problem with sendmail (damaged mailboxes)
Hello,
I have a strange problem on my Ubuntu mail server that happens at random. Every so often one mailbox file (/var/spool/mail/*) gets corrupted: when I open it in an editor, I see that a random number of null characters (00) has been written at the beginning of the file, partially overwriting the real content. I'm not sure whether this is linked to the fact that these mailboxes are checked from both a computer and a mobile device. Sometimes it happens when I restart the sendmail service. To fix a mailbox, I delete all characters (nulls included) up to the first line starting with "From " (the mbox separator line that begins a new message). After that, the mailbox works correctly again. I have searched the web a lot, but I seem to be the only one with this particular problem. I'm using sendmail + Dovecot + SquirrelMail. Could you please help me? What could I check? Thank you!
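The manual repair described above can be sketched as a small shell helper. This is only an illustration under stated assumptions: the function names `leading_nuls` and `repair_mbox` are made up for this example, the "From" line is taken to mean the mbox separator line (which begins with `From ` followed by a space), and the cleaned data is written to a separate file so the damaged original stays untouched. Note that it strips every NUL byte in the file, not only the leading run.

```shell
#!/bin/sh
# Hypothetical helpers, not part of sendmail or Dovecot.

# Print how many NUL bytes the file starts with (0 if none).
leading_nuls() {
    od -An -v -tx1 "$1" | tr -s ' ' '\n' |
        awk 'NF { if ($1 == "00") n++; else exit } END { print n + 0 }'
}

# Write a cleaned copy of mailbox $1 to $2: drop all NUL bytes, then keep
# everything from the first line that starts with "From " onwards.
repair_mbox() {
    tr -d '\000' < "$1" | sed -n '/^From /,$p' > "$2"
}
```

For example, `leading_nuls /var/spool/mail/someuser` prints the number of leading NUL bytes, and `repair_mbox /var/spool/mail/someuser /tmp/someuser.fixed` writes the cleaned copy (`someuser` is a placeholder). Mail delivery and Dovecot access to that mailbox should probably be stopped while doing this, and the damaged original is worth keeping for later inspection.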
Quote:
- Does this release provide the latest versions of Dovecot, Squirrel Mail and any dependencies and what are their exact versions?
- Is the size of these mail boxes large, huge, humongous or plain ludicrous? (What does 'ls -lh /var/spool/mail/*' return wrt size?)
- Does it only happen with just-delivered mail in spool files or also with mailboxes in ~/?
- When did this mailbox corruption first manifest itself and can you trace back any system (re-)configuration, mailbox or directory permission changes or SW upgrading?
- Do you run any Squirrel Mail plugins we should know about?
- Does it happen with any mobile or only some, and with any application they use or only some?
- What do the system and daemon logs show?
- And what do system and daemon logs show when a user tries to access a corrupt mailbox?
- Is the system low on memory when mailbox corruption happens and do you collect and have SAR data?
*Please note these are fifteen questions that should be answered, and as verbosely as possible.
First of all thanks for your interest in my problem!
- What distribution release are you using?
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=10.10
DISTRIB_CODENAME=maverick
DISTRIB_DESCRIPTION="Ubuntu 10.10"

- Does this release provide the latest versions of Dovecot, Squirrel Mail and any dependencies and what are their exact versions?
I don't think these are the latest versions. Anyway:
Dovecot v1.1.11
SquirrelMail v1.4.21
OpenSSL 0.9.8o 01 Jun 2010
SpamAssassin version 3.3.1 running on Perl version 5.10.0

- Is the size of these mail boxes large, huge, humongous or plain ludicrous? (What does 'ls -lh /var/spool/mail/*' return wrt size?)
There are about 30 mailboxes; the biggest ones are around 500 MB. The total is 7.8 GB.

- Does it only happen with just-delivered mail in spool files or also with mailboxes in ~/?
Sorry, I'm not sure I understood your question. Anyway, the problem seems to occur only in the /var/spool/mail folder.

- When did this mailbox corruption first manifest itself and can you trace back any system (re-)configuration, mailbox or directory permission changes or SW upgrading?
The first time it happened was a long time ago; the problem has probably been there since the server was first set up. Since then I've only made some updates to SquirrelMail (I also have auto-updates configured, but I think they don't work anymore because the repositories were taken offline).

- Do you run any Squirrel Mail plugins we should know about?
I use the standard plugins, plus "local_autorespond_forward" and "vlogin" (to manage multiple domains).

- Does it happen with any mobile or only some, and with any application they use or only some?
The mobiles in use are of various types. In my opinion the problem is related to BlackBerries (the mailbox is checked with their standard client). No problems so far with mailboxes checked from iPads or Galaxy Tabs.

- What do the system and daemon logs show?
They are quite huge and I don't know what string to search for. I'm trying to trace back the point at which the problem happened; when I find something I'll post it. Sorry that I have nothing to post now.

- And what do system and daemon logs show when a user tries to access a corrupt mailbox?
In mail.log and syslog I have only this:
Sep 18 08:44:14 mail dovecot: imap-login: Login: user=<p.xxxxxxxx>, method=PLAIN, rip=178.239.87.70, lip=xx.xx.xx.67
Sep 18 08:44:14 mail dovecot: IMAP(p.xxxxxxxx): Disconnected: Logged out bytes=74/357
...
Sep 18 09:45:00 mail dovecot: POP3(p.xxxxxxxx): Couldn't init INBOX: Mailbox isn't a valid mbox file
Sep 18 09:45:01 mail dovecot: POP3(p.xxxxxxxx): Mailbox init failed top=0/0, retr=0/0, del=0/0, size=0
In mail.warn:
Sep 18 09:45:00 mail dovecot: POP3(p.xxxxxxxx): Couldn't init INBOX: Mailbox isn't a valid mbox file
After repairing:
Sep 18 11:16:07 mail dovecot: pop3-login: Login: user=<p.xxxxxxxx>, method=PLAIN, rip=212.91.93.67, lip=xx.xx.xx.67
Sep 18 11:16:07 mail dovecot: POP3(p.xxxxxxxx): Disconnected: Logged out top=0/0, retr=0/0, del=0/0, size=0

- Is the system low on memory when mailbox corruption happens and do you collect and have SAR data?
The server has 2 GB of memory. I cannot say whether memory is low when the problem happens, but I think it should be enough for 30 mailboxes. I tried to run the "sar" command, but a message says that the package is not installed.

I'm sorry I cannot provide more info. I'm not an expert :-( so I think I'll have to live with this problem. My only hope would be to find someone else who has or had this same problem. Thanks!
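Since the logs are large and it is unclear what to search for, the Dovecot error quoted above is itself a usable search string. The following is a rough sketch, not a definitive tool: the function name `invalid_mbox_events` is hypothetical, and it assumes log lines in the classic syslog layout shown in this post (a 15-character timestamp, then `dovecot: POP3(user):` or `dovecot: IMAP(user):`).

```shell
#!/bin/sh
# Hypothetical helper: print "timestamp user" for every Dovecot
# "Mailbox isn't a valid mbox file" error found in the given log file.
invalid_mbox_events() {
    grep "Mailbox isn't a valid mbox file" "$1" |
        sed -n 's/^\(.\{15\}\) .*dovecot: [A-Z0-9]*(\([^)]*\)).*/\1 \2/p'
}
```

Running it as `invalid_mbox_events /var/log/mail.log` (the path is an assumption based on the mail.log mentioned above) gives the earliest time each mailbox was rejected, which narrows down the part of the log worth reading for sendmail or Dovecot activity just before the corruption.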
Quote:
http://hg.dovecot.org/dovecot-1.2/log (http://hg.dovecot.org/dovecot-1.2/log?rev=corruption)
http://sourceforge.net/tracker/?group_id=311 (http://sourceforge.net/search/?group...rds=corruption)
Code:
inotifywait -m /var/spool/mail/ -e modify --format "%w%f" 2>&- | while read ITEM; do
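The loop above is cut off after `do`, so what it ran for each modified file is not known. One possibility, offered purely as an assumption, is to check on every modification whether the mailbox still begins with the mbox `From ` separator, so that the moment of corruption can be matched against the logs. The helper name `mbox_head_ok` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the file is empty or its first five
# bytes are "From " (hex 46 72 6f 6d 20); fail otherwise.
mbox_head_ok() {
    [ -s "$1" ] || return 0    # an empty mailbox is still valid
    [ "$(od -An -tx1 -N5 "$1" | tr -d ' \n')" = "46726f6d20" ]
}
```

Inside a watch loop, a line such as `mbox_head_ok "$ITEM" || echo "$(date) $ITEM no longer starts with From" >> /tmp/mbox-watch.log` would record the time of the first bad write (the log path is a placeholder). A check made while a delivery is still in progress could in principle give a false alarm, so any hit is best confirmed by looking at the file again afterwards.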