LinuxQuestions.org


websissy 10-27-2009 09:55 PM

Why did SquirrelMail, Dovecot Imap and Outgoing Mail all break at once?
 
Help! I need some seasoned advice please.

We've been running Debian Etch (now oldstable) since August 2008. Since then, SquirrelMail has connected through Dovecot's IMAP and POP3 servers to give users access to Postfix mail on our server over either SSH or TLS/SSL. Although the SSL capability is installed, we're really not using it -- choosing to use SSH with strong passwords instead.

This configuration has given us NO problems since we started... until today. For unexplained reasons this morning the IMAP interface suddenly began refusing or failing connections to everyone trying to connect through SquirrelMail (or that's the way it looks from the outside). It also fails to send out any emails. I've tried rebooting the server but that made almost no difference.

The problem was first reported by a user. I then verified it. What I was seeing BEFORE the reboot when I tried to login was an error from SquirrelMail that said:

Error connecting to IMAP server: myserver.com.
11074 :

but what I'm seeing since the server reboot is:

Error connecting to IMAP server: myserver.com.
11087 :

The IMAP server connect problem seems to be isolated to SquirrelMail. At least I ran 2 tests and found I CAN connect to the IMAP server using both Microsoft Outlook 2003 and Outlook Express and can see the contents of all folders on the server. So the IMAP problem only shows up in SquirrelMail. But it DOES prevent ANY users from logging in through SquirrelMail.

However, the inability to send mail OUT from the server shows up everywhere. Mail sent internally between accounts on the server -- either within a single domain or between domains -- and even from remote users connected to the server through Outlook or Outlook Express gets delivered fine. But email addressed to anyone outside the server at any domain -- whether Yahoo or MSN or Google or wherever -- is all bouncing back with a "relay request denied" error.

For instance, I sent an email from one of my server accounts to my yahoo inbox and it bounced back.

Other things I've tried are:

checked port status with

nmap -qT

Both ports 143 and 993 are reported as open -- one as imap and the other as imaps.
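
For anyone who wants to repeat this kind of port check without nmap, here is a minimal Python sketch. The function names are made up for illustration, and the port numbers in the usage note are only the standard assignments (25 SMTP, 143 IMAP, 993 IMAPS), not something confirmed for this particular server.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


def check_mail_ports(host, ports):
    """Map each service name in `ports` to whether its TCP port accepts connections."""
    return {name: port_is_open(host, port) for name, port in ports.items()}
```

Something like check_mail_ports("myserver.com", {"smtp": 25, "imap": 143, "imaps": 993}) would give a quick yes/no per service. Note that a successful TCP connect only shows that something is listening on the port; it says nothing about whether an IMAP login will actually succeed, which fits the symptom of open ports but failing SquirrelMail logins.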

I restarted inetd. It had no effect.

I restarted postfix. It had no effect.

I restarted the whole server. The IMAP login error code when attempting to login via SquirrelMail changed from 11074 to 11087. That's all. All other behaviors remain the same.

I also confirmed the SquirrelMail login failure problem occurs in IE 7, IE 8 and Firefox from 3 different machines in multiple geo-locations and networks AND both with and without the user's local firewall running. So the issue is definitely ON the server and seems to be isolated to squirrelmail even though no changes have been made to squirrelmail or any of its components in months.

When I checked the mail.err log, I found the following series of seemingly useless error messages:

Code:

Oct 25 08:40:33 axx012503 dovecot: POP3(sarah): unlink(/var/mail/sarah.lock) failed: Permission denied
Oct 25 09:31:48 axx012503 dovecot: POP3(mymailname): UIDs broken with partial sync in mbox file /var/mail/mymailname
Oct 25 10:02:28 axx012503 dovecot: POP3(mymailname): UIDs broken with partial sync in mbox file /var/mail/mymailname
Oct 26 01:01:06 axx012503 postfix/sendmail[30523]: fatal: root(0): queue file write error
Oct 26 15:06:53 axx012503 dovecot: POP3(sarah): unlink(/var/mail/sarah.lock) failed: Permission denied
Oct 27 01:01:04 axx012503 postfix/sendmail[6715]: fatal: root(0): queue file write error
Oct 27 08:56:15 axx012503 dovecot: POP3(bigdork): unlink(/var/mail/bigdork.lock) failed: Permission denied
Oct 27 08:56:16 axx012503 dovecot: POP3(bigdork): file_dotlock_delete() failed with mbox file /var/mail/bigdork: No such file or directory
Oct 27 14:57:56 axx012503 dovecot: POP3(mymailname): UIDs broken with partial sync in mbox file /var/mail/mymailname
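
To make a log like this easier to read at a glance, the entries can be tallied by service and by kind of error. This is only a rough sketch: it assumes the classic syslog line layout shown above, and the category labels and function names are invented for illustration.

```python
import re
from collections import Counter

# Matches classic syslog lines: "Mon DD HH:MM:SS host service[pid]: message".
LINE_RE = re.compile(
    r"^\w{3}\s+\d{1,2}\s\d{2}:\d{2}:\d{2}\s"   # timestamp
    r"\S+\s"                                     # hostname
    r"(?P<service>[^\s:\[]+)(?:\[\d+\])?:\s"     # service name, optional [pid]
    r"(?P<message>.*)$"
)

# (substring to look for, label). The labels are hypothetical, chosen for this sketch.
CATEGORIES = [
    ("queue file write error", "queue write error"),
    ("Permission denied", "permission denied"),
    ("No such file or directory", "missing file"),
    ("UIDs broken", "mbox UIDs broken"),
]


def classify(message):
    """Return the label of the first known substring found in message."""
    for needle, label in CATEGORIES:
        if needle in message:
            return label
    return "other"


def summarize(lines):
    """Count log lines per (service, category); lines that do not parse are skipped."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue
        counts[(match["service"], classify(match["message"]))] += 1
    return counts
```

Feeding it the open log file (for example summarize(open("/var/log/mail.err")), assuming that is where the file lives) groups the Dovecot lock and mbox complaints separately from the Postfix queue write failures, which makes it easier to see whether more than one subsystem is having trouble writing to disk.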

That's about it. I've been racking my brains on this all day and I'm no closer to a resolution now than I was when the problem was reported 12 hours ago. At this point, any hints, helpful suggestions or questions would be appreciated! :rolleyes:

Thanks!

MensaWater 10-28-2009 07:57 AM

It sounds like someone started blocking ports. Did you check iptables? Is there a firewall device between you and the places you're trying to go that maybe the network team made changes on? Did someone introduce some new tool like Websense? Did your ISP start blocking mail suddenly?

Your issue may be with port 25 SMTP rather than the rest since everything broke at the same time.

websissy 10-28-2009 08:44 AM

Quote:

Originally Posted by jlightner (Post 3735272)
It sounds like someone started blocking ports. Did you check iptables? Is there a firewall device between you and the places you're trying to go that maybe the network team made changes on? Did someone introduce some new tool like Websense? Did your ISP start blocking mail suddenly?

Your issue may be with port 25 SMTP rather than the rest since everything broke at the same time.

You know, I went to bed last night puzzling over the same things here. And woke up this morning realizing one could explain the outgoing mail failures if mail was being blocked somewhere upstream from this server. But that alone wouldn't also explain the sudden change in IMAP behavior would it?

I'm the one and only server admin on this dedicated server. Indeed, I pay the annual server lease, and I designed and administer all but one of the sites on the server. I've made no changes to iptables, but I'll double-check to confirm that's NOT the issue. The server does not run Websense. It could be that the hosting center has started blocking outgoing mail, but you'd think that if they had done that and targeted my server they would have notified me of any problem first, and I've received no notices or warnings whatsoever. So for the moment I assume they are NOT blocking our outgoing mail. Once I've eliminated other potential on-server issues, I'll check that. There's no firewall device I know of in their dedicated server hosting center that could produce this result, but I'll ask about that too.

The way I see it, it seems more likely a single interruption or change somewhere in the server's mail loop has caused all these problems than that a series of coincidental events has. Therefore I'm convinced I'm looking for a single smoking gun somewhere.

So, the question is what single event could possibly cause all the behaviors I'm seeing here?

Thanks for the feedback!

websissy 10-28-2009 06:03 PM

The Smoking Gun?
 
It dawned on me a while ago that I may have known the cause of this problem all along but have been overlooking it because it was so obvious.

We had an issue involving email on the server on Monday the 26th in which users complained they were unable to login to check mail through squirrelmail. At the time I could not identify an obvious cause for the problem. Furthermore, except for squirrelmail logins, the server responded normally and did not seem to be under stress. So, after several wasted hours trying to isolate a cause for the problem, I ruled out the possibility of a DDOS attack and decided to try a remote server restart.

So without considering the impact, I logged in as the admin and did a "shutdown -r now". I realized within seconds I should have done a more orderly shutdown; but by the time that dawned on me the server was already rebooting.

To my surprise and disappointment that reboot failed and the system did not come back up again as expected. After waiting an hour with no reply or recovery from the system, I contacted the hosting center and requested a manual reboot of the server. It came back up right away and from all the tests I ran at the time, it seemed to be fine. Email logins worked and everything else I tried seemed to work too.

That was, until the server's email went down again yesterday morning -- this time refusing to allow IMAP logins and throwing the strange error messages you see in my first post above into /var/log/mail.err. I've been chasing the cause of that problem ever since.

But now I'm wondering if the issue can't all be traced back to that uncontrolled shutdown on Monday.

So, my question is: "If I take the server offline in a KVM mode, aren't there some disk integrity checking utilities I can run to make sure the mail queue or other postfix datafiles on the hard drive weren't damaged by the shutdown?"

Can anyone tell me what those utilities are or point me to a procedure somewhere that will help? I've looked. But so far, I'm not having much luck.

Thanks!

AlucardZero 10-28-2009 06:46 PM

"shutdown -r now" is clean and orderly.

websissy 10-28-2009 11:36 PM

Quote:

Originally Posted by AlucardZero (Post 3735921)
"shutdown -r now" is clean and orderly.

Yes but the forced *manual* reboot when the system did not come back up after that *orderly shutdown* wasn't necessarily "orderly" at all. I have no idea what the server center operators did when that happened.

So, to eliminate the possibility of undetected file system damage as the cause of my email problem, what I'm proposing to do is restart the system from the secondary hard drive (that drive should be bootable) and run a manual fsck on the unmounted primary drive to confirm all files are intact and there is no undetected damage to the main file system as a result of that forced reboot.

Are you saying that's unnecessary because the journaling file system should have recovered from such glitches?
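
Since the plan above depends on the primary drive really being unmounted before fsck touches it, here is a small Python sketch of that sanity check. It parses text in the /proc/mounts layout (device, mount point, type, options, two numbers). The function names and the device names in the usage note are hypothetical, and the check is naive: a filesystem mounted under another name (a UUID path, /dev/root, a device-mapper name) would not be caught.

```python
def mounted_devices(mounts_text):
    """Return {device: first mount point} parsed from /proc/mounts-style text."""
    result = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 2:
            # Keep the first mount point seen for each device.
            result.setdefault(fields[0], fields[1])
    return result


def appears_unmounted(device, mounts_text):
    """True if device is not listed as a mounted source in mounts_text."""
    return device not in mounted_devices(mounts_text)
```

On the rescue boot, one would read /proc/mounts and test the device to be checked, e.g. appears_unmounted("/dev/sda1", open("/proc/mounts").read()), and only go on to run fsck by hand if that comes back True.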

Thanks!

websissy 10-29-2009 02:49 AM

Quote:

Originally Posted by websissy (Post 3735884)
It dawned on me a while ago that I may have known the cause of this problem all along but have been overlooking it because it was so obvious.

... blah... blah... blah...

But now I'm wondering if the issue can't be all traced back to that uncontrolled shutdown on Monday.

This may have been a lame-brain idea; but it seems to have worked. To eliminate the possibility of file system damage lingering since that uncontrolled reboot by the server center ops guys on Monday, I requested KVM access to the server. Then I rebooted the system and forced a fsck on the boot drive. When that was done, I rebooted again and the mail system came back up and appears to be working fine now! :-)

Time will tell whether I really solved the underlying problem or not. But at the moment it's running smoothly and mail IS going out again.

Hurray!

