Quote:
Originally Posted by countrydj
That shouldn't be possible -- that is, I cannot reproduce that with any input. (The preg_split() will match that ###BF### part every single time.) Did you check the right log file? Does the time match your trial?
However, I'm often wrong. So let's redo the // extract user ID changes with a better set:
Code:
$email = @quoted_printable_decode($email);
$parts = @preg_split('/###[\t\n\v\f\r ]*BF[\t\n\v\f\r ]*###/', $email, 3, PREG_SPLIT_NO_EMPTY);
$email = preg_replace('/^[^@]*###/', "", @$parts[1]);
$email = preg_replace('/###.*$/s', "", $email);
$email = trim($email, "\t\n\v\f\r ");
- The first line fixes the embedded (aka soft) newlines and escapes. Not all messages use quoted-printable encoding, but the decoding should do no harm even if the message was not quoted-printable encoded. Specifically, it's unlikely for e-mail addresses to have a = followed by a two-digit hexadecimal number in them. Even if they do, they're affected only if the e-mail server decides to use non-quoted-printable encoding. The other parts of the message do not matter, since we discard them anyway.
- The second line splits the e-mail into no more than three parts, since the second one should contain the e-mail address, and the other parts are discarded anyway. It should work correctly even if the e-mail server added extra newlines and whitespace in the separator. The function eats the separators; they're not included in the split parts. (The last part may contain "unused" separators, since we split into at most three parts.)
- If there is still a ### in $email, and there is no @ before it, the third line removes the ### and everything before it. This should remove all garbage before the actual address.
- If there is still a ### in $email, the fourth line removes it and everything after it. This should remove all garbage after the actual address.
- The fifth and last line trims whitespace from the beginning and end of the e-mail address. This is just good practice.
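To see the five lines working together, here is a minimal sketch that wraps them in a function and feeds it two made-up message bodies. The function name extract_address and both sample messages are just assumptions for illustration; your real messages will look different.

```php
<?php
// Hypothetical wrapper around the five lines above (the name is made up).
function extract_address($email)
{
    $email = @quoted_printable_decode($email);
    $parts = @preg_split('/###[\t\n\v\f\r ]*BF[\t\n\v\f\r ]*###/', $email, 3, PREG_SPLIT_NO_EMPTY);
    $email = preg_replace('/^[^@]*###/', "", @$parts[1]);
    $email = preg_replace('/###.*$/s', "", $email);
    return trim($email, "\t\n\v\f\r ");
}

// Sample 1: quoted-printable body with a soft line break ("=\r\n")
// in the middle of the address.
$qp = "Hello,=20\r\nplease update: ###BF###user=\r\n@example.com###BF### thanks\r\n";

// Sample 2: the mail server inserted a newline and spaces inside the separator.
$mangled = "Intro ###\r\nBF ### user@example.com \r\n###BF### tail";

echo extract_address($qp), "\n";       // prints user@example.com
echo extract_address($mangled), "\n";  // prints user@example.com
```

Both samples have some text before the first separator, like a normal message would. If a message has no separators at all, $parts[1] does not exist and the result is an empty string.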
For the PHP preg_ family of functions, the special characters and the backslash escape sequences are both covered in the PCRE pattern syntax section of the PHP manual.
Escape sequences for single-quoted strings in PHP (there are only two, \\ and \') and for double-quoted strings are described on the Strings page of the PHP manual.
When using, for example, the preg_ functions, I use single-quoted strings for the pattern. If you use double-quoted strings in the pattern, PHP will first de-escape the double-quoted escapes, then the preg_ function will do its own de-escaping on top of those. That's a lot of backslashes.
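Here is a small sketch of that double de-escaping, in case it helps. The patterns and test strings are made up for this example. In a double-quoted string PHP turns \1 into the character with code 1 before the preg_ function ever sees it, so the backreference is silently lost.

```php
<?php
// Single-quoted: PHP keeps the backslash, so the pattern still has two characters.
$len_single = strlen('\t');   // 2: a backslash and a "t"
// Double-quoted: PHP already converted the escape to one tab character.
$len_double = strlen("\t");   // 1: a tab

// Single-quoted pattern: PCRE receives \1 and treats it as a backreference.
$match_single = preg_match('/(a)\1/', 'aa');   // 1 (matches "aa")

// Double-quoted pattern: PHP turns \1 into chr(1) first, so PCRE looks for
// "a" followed by that control character and finds nothing.
$match_double = preg_match("/(a)\1/", 'aa');   // 0 (no match)

echo "$len_single $len_double $match_single $match_double\n";   // prints 2 1 1 0
```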
I'm used to working with non-English locales, so I usually write "any newline or space or tab or such" out explicitly as "[\t\n\v\f\r ]". Others use "[\s\v]" or "[[:space:]]". I tend not to, because their interpretation may depend on the locale. And with functions like trim or strcspn, you always need to use the first form, without the square brackets [] and in a double-quoted string, since those functions take a plain list of characters and do not interpret backslash escapes themselves.
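As a rough illustration of the trim and strcspn case, with made-up strings: the character list has to be double-quoted, so that PHP itself converts the escapes into the actual whitespace characters.

```php
<?php
// Double-quoted, no brackets: PHP converts each escape into the real character.
$ws = "\t\n\v\f\r ";

// Strip any of those characters from both ends of the string.
$clean = trim(" \r\n\tuser@example.com\v\f \n", $ws);   // "user@example.com"

// Length of the leading part that contains none of those characters.
$len = strcspn("user@example.com \r\n rest", $ws);      // 16

// Pitfall: in single quotes the list is literally a backslash, "t", and "n",
// so trim() strips those letters instead of whitespace.
$oops = trim('tent', '\t\n');                           // "e"

echo $clean, " ", $len, " ", $oops, "\n";   // prints user@example.com 16 e
```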
Hope this helps,
Nominal Animal