LinuxQuestions.org


dstu 02-21-2014 03:06 AM

PHP: MIME header encoding
 
Hi all,

I'm struggling with an issue and I need assistance:

I need to convert HTML-entity-encoded text to MIME header encoding using PHP, but it's not working for me.

Original text in Hebrew:
הודעה בעברית

HTML entities text (I replaced each & with _ so that it displays properly here):
_#1492;_#1493;_#1491;_#1506;_#1492; _#1489;_#1506;_#1489;_#1512;_#1497;_#1514;

Mime encoded text by Thunderbird (correct):
=?UTF-8?B?15TXldeT16LXlCDXkdei15HXqNeZ16o=?=

Mime encoded text by the following PHP function (incorrect):
PHP Code:

mb_encode_mimeheader(html_entity_decode($name, ENT_NOQUOTES, 'UTF-8'), "UTF-8", "B");

=?UTF-8?B?w5fClMOXwpXDl8KTw5fCosOXwpQgw5fCkcOXwqLDl8KRw5fCqMOXwpnDl8Kq?=


Can anyone identify why my function doesn't generate the correct result?

Thanks a lot,

David

NevemTeve 02-21-2014 03:28 AM

Debugging is what you should do

Code:

#!/usr/local/bin/php
<?php
    $in= '&'.'#1492;';
    printf ("in='%s' [hex %s]\n", $in, bin2hex ($in));

    $step1= html_entity_decode ($in, ENT_NOQUOTES, 'UTF-8');
    printf ("step1='%s' [hex %s]\n", $step1, bin2hex ($step1));

    $step2= mb_encode_mimeheader ($step1, 'UTF-8', 'B');
    printf ("step2='%s' [hex %s]\n", $step2, bin2hex ($step2));
?>

Examine the output step by step and find out when it goes awry.
Code:

in='&#1492;' [hex 2623313439323b]
step1='×' [hex d794]
step2='=?UTF-8?B?w5fClA==?=' [hex 3d3f5554462d383f423f773566436c413d3d3f3d]


dstu 02-21-2014 03:53 AM

I did what you suggested (note that I'm using php-cli and not web).

These are the results I get:
Code:

root@server:/usr/local/sbin/phpfilters# ./test_mime.php
in='_#1492;' [hex 2623313439323b]
step1='ה' [hex d794]
step2='=?UTF-8?B?w5fClA==?=' [hex 3d3f5554462d383f423f773566436c413d3d3f3d]

In Thunderbird, for the single letter _#1492; I get the following result:
=?UTF-8?B?15Q=?=

I'm not sure what to do with these results.

Thanks.

NevemTeve 02-21-2014 04:42 AM

If the Unicode code point is 1492 (U+05D4), then its UTF-8 encoding is D7 94, which is indeed '15Q=' in base64,
so methinks it's 'step2' that does the wrong thing.
The documentation of mb_encode_mimeheader suggests setting mb_internal_encoding.

Add this to the beginning of the script:
Code:

    mb_internal_encoding ('UTF-8');
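
To see what 'step2' actually produced, it can help to decode both base64 payloads back to raw bytes. The sketch below does only that; `encoded_word_hex` is a hypothetical helper name, not a PHP built-in, and it handles a single B-encoded word only.

```php
<?php
// Hypothetical helper: return the raw bytes inside one B-encoded MIME word as hex.
function encoded_word_hex(string $word): string
{
    // Match =?charset?B?payload?= and capture the base64 payload.
    if (!preg_match('~^=\?[^?]+\?B\?([A-Za-z0-9+/=]*)\?=$~', $word, $m)) {
        throw new InvalidArgumentException("not a B-encoded word: $word");
    }
    return bin2hex(base64_decode($m[1]));
}

$thunderbird = encoded_word_hex('=?UTF-8?B?15Q=?=');      // Thunderbird's result
$php         = encoded_word_hex('=?UTF-8?B?w5fClA==?=');  // step2 from the test script

printf("Thunderbird: %s\n", $thunderbird); // d794
printf("PHP step2:   %s\n", $php);         // c397c294
```

C3 97 is the UTF-8 encoding of U+00D7 and C2 94 is the UTF-8 encoding of U+0094, so the two bytes D7 and 94 appear to have been treated as two single-byte characters and converted to UTF-8 a second time. That would be consistent with mbstring assuming a single-byte internal encoding instead of UTF-8.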

dstu 02-21-2014 08:26 AM

solved, using another function
 
Hi,

I need to check your suggestion, but in the meantime, I found another function that works (more complex, though):

PHP Code:

    $preferences = array(
        "input-charset"    => "UTF-8",
        "output-charset"   => "UTF-8",
        "scheme"           => "B",
        "line-length"      => 76,
        "line-break-chars" => "\n"
    );

    $from_name = substr(iconv_mime_encode('', html_entity_decode($name, ENT_NOQUOTES, 'UTF-8'), $preferences), 2);

The substr is required because the function puts a ": " at the beginning of the result.
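
The ": " is there because iconv_mime_encode builds a whole header line of the form "field name: value", and here the field name is empty. A minimal sketch of both variants follows, assuming the iconv extension is loaded. The 'Subject' field name is only an illustration, and the sample input is the entity string from the first post with the _ characters turned back into &.

```php
<?php
// Same preferences as in the post.
$preferences = array(
    "input-charset"    => "UTF-8",
    "output-charset"   => "UTF-8",
    "scheme"           => "B",
    "line-length"      => 76,
    "line-break-chars" => "\n",
);

// Sample input (assumption): the entities from the first post, with _ restored to &.
$name = str_replace('_', '&', '_#1492;_#1493;_#1491;_#1506;_#1492; _#1489;_#1506;_#1489;_#1512;_#1497;_#1514;');
$text = html_entity_decode($name, ENT_NOQUOTES, 'UTF-8');

// Empty field name: the result starts with ": ", so the first two bytes are cut off.
$bare = substr(iconv_mime_encode('', $text, $preferences), 2);

// Real field name: the prefix becomes part of a complete header line.
$header = iconv_mime_encode('Subject', $text, $preferences);

echo $bare, "\n";
echo $header, "\n";
```

Passing the field name lets the function take it into account when it applies the line-length limit. The substr variant is still the one to use when only the encoded value is wanted, for example for a display name inside a From header.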

Thank you for your help!!

David

dstu 02-21-2014 08:32 AM

Quote:

Originally Posted by NevemTeve (Post 5122170)
Add this to the beginning of the script:
Code:

    mb_internal_encoding ('UTF-8');

I checked your solution now. It also works (and it's cleaner).

Thanks again!
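
For reference, here is a sketch of the complete mbstring version with the fix applied, assuming the mbstring extension is loaded. The sample input is again the entity string from the first post with _ restored to &, and the round-trip check at the end is an addition for illustration, not something from the thread.

```php
<?php
// Tell mbstring that its input strings are UTF-8. Depending on the PHP version
// and php.ini settings, the internal encoding may otherwise be something else.
mb_internal_encoding('UTF-8');

// Sample input (assumption): the entities from the first post, with _ restored to &.
$name = str_replace('_', '&', '_#1492;_#1493;_#1491;_#1506;_#1492; _#1489;_#1506;_#1489;_#1512;_#1497;_#1514;');

// Step 1: entities to UTF-8 bytes.
$text = html_entity_decode($name, ENT_NOQUOTES, 'UTF-8');

// Step 2: UTF-8 bytes to a base64 ("B") encoded MIME header value.
$encoded = mb_encode_mimeheader($text, 'UTF-8', 'B');
echo $encoded, "\n";

// Round-trip check: decoding should give back the original UTF-8 text.
$decoded = mb_decode_mimeheader($encoded);
var_dump($decoded === $text);
```

If the round trip fails, printing bin2hex of each step, as in the debugging script earlier in the thread, should show where the bytes change.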


All times are GMT -5.