LinuxQuestions.org

Chowroc 12-28-2005 07:53 PM

Postfix: deferred mails pile up?
 
I found that when the number of deferred messages grows, the Postfix qmgr tends to "pile up" the mails and activate them all at once, even though I have limited the delivery rate and the connection reuse like this:
Code:

initial_destination_concurrency = 1
default_destination_concurrency_limit = 1
smtp_destination_concurrency_limit = 1
in_flow_delay = 3s
default_process_limit = 80
maximal_queue_lifetime = 10d

# smtp_connection_cache_reuse_limit = 5s
smtp_connection_cache_reuse_limit = 5
smtp_connection_cache_time_limit = 1s
smtp_connection_cache_on_demand = no
smtp_connection_cache_destinations = sina.com

The first section is meant to limit delivery to 20 mails/min per destination, because most large sites such as 163, yahoo, sina, sohu ... ask for this; otherwise many mails get bounced. The mails are generated by a PHP program that sends confirmation mails to registered users. The second section is meant to stop the smtp process from reusing connections; otherwise some sites bounce the mail with "too many letters during this connections".

But when delivery to a site such as sina.com is slow, these mails start to be deferred, and eventually I found in the maillog that qmgr activates all of these mails within just a few seconds.
Code:

Nov 22 09:43:17 PTZXMAIL postfix/qmgr[12331]: 7FEE6A2A034: to=<xxxx@sina.com>, relay=none, delay=474849, status=deferred (delivery temporarily suspended: lost connection with sinamx.sina.com.cn[202.108.3.187] while sending DATA command)
Within this single second I found 225 such records; the most I have seen is about 4700 records within maybe 4~5s.
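
Counts like these can be reproduced from the log with a short helper. The following is only a sketch: the function name deferred_per_second is made up for illustration, and it assumes the syslog layout of the record above, where month, day and HH:MM:SS are the first three fields.

```shell
#!/bin/sh
# Hypothetical helper: count "status=deferred" records per second.
# Reads the log files given as arguments, or stdin if there are none.
# Assumes fields 1-3 are month, day and HH:MM:SS, as in the record above.
deferred_per_second() {
    awk '/status=deferred/ { count[$1 " " $2 " " $3]++ }
         END { for (t in count) print count[t], t }' "$@" \
    | sort -rn
}

# Example call (the log path is an assumption and differs between systems):
#   deferred_per_second /var/log/maillog | head
```

The busiest seconds come first, so a burst shows up at the top of the output.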

Since I have limited the rate to a destination, why does Postfix still do this?

And what about the deferrals between smtp and qmgr? Since qmgr just manages the queue, why does it report "lost connection"?

---------------------------------

So far I can only work around this problem in a temporary way: put all these mails in the hold queue, then release them to be requeued one by one. Below is a short script that does so:
Code:

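#!/bin/sh
# Arguments: the destination domains to throttle, e.g. sina.com

_basetime=120   # minimum wait between two requeues, in seconds
_offset=120     # additional random wait, 0 up to _offset seconds

# start with an empty list of held queue IDs
>sites.txt

for site in "$@"; do
        echo "$site"

        # queue IDs for this site that are not on hold yet ('!' marks held mail);
        # strip the trailing '*' that marks mail in the active queue
        mailq \
        | awk "BEGIN{RS=\"\"; FS=\"(\n| *)\"} {if(\$NF~/@$site/) print \$1}" \
        | grep -v '!' \
        | sed 's/\*$//' >tmp

        # put them on hold
        while read msg_id; do postsuper -h "$msg_id"; done <tmp

        # collect the held queue IDs for this site, without the '!' marker
        mailq \
        | awk "BEGIN{RS=\"\"; FS=\"(\n| *)\"} {if(\$NF~/@$site/) print \$1}" \
        | grep '!' \
        | sed 's/!$//' \
        | sort >>sites.txt
        # append (>>) so the IDs collected for earlier sites are not overwritten
done

sort sites.txt >tmp
cp -f tmp sites.txt

# requeue one message at a time, then wait a random 120 to 239 seconds
while read msg_id; do
        postsuper -r "$msg_id"
        r=$(awk "BEGIN{srand(); print int(rand()*$_offset+$_basetime)}")
        sleep "$r"
done <sites.txt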
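
The queue-ID extraction can also be tried on saved mailq output without touching the queue. This is only a sketch: the function name queue_ids_for_site is made up, and it assumes that the header and footer lines of mailq start with "-", that entries are separated by blank lines, and that the last field of an entry is its last recipient.

```shell
#!/bin/sh
# Hypothetical helper: print the queue IDs of entries whose last recipient
# is at the given site. Reads mailq-style text on stdin and changes nothing.
# The site is used as a regular expression, so a "." matches any character.
queue_ids_for_site() {
    grep -v '^-' \
    | awk -v site="$1" 'BEGIN { RS = "" }
        $NF ~ ("@" site "$") {
            id = $1
            sub(/[*!]$/, "", id)   # drop the active (*) or hold (!) marker
            print id
        }'
}

# Example call:
#   mailq | queue_ids_for_site sina.com
```

Unlike the script, this strips both markers, so it lists held and non-held mail together.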

But I don't think this workaround is a good solution; I want to find a way to make Postfix handle this problem itself. Can anyone help me?

Thanks.


All times are GMT -5.