Perl - MIME/HTML mail
Hi,
I've been programming in Perl for a while now but still haven't found the best way to solve this problem. I am writing a POP3 client and fetch program that retrieves e-mail with Mail::POP3Client and inserts it into a MySQL database. The main problem is that the e-mail is not readable when it is pulled back out of the database. The e-mails initially displayed like this: Code:
I then tried a very simple regular expression, s/<(.*?)>//sg, on the string containing the mail before it is inserted. This worked to a point, but still left most of the mail unreadable. I then added the regular expression recommended by perldoc -q remove.html, and that works fine :D

Really, the aim of the whole exercise is to _completely_ obliterate any other formatting left within the e-mail :| If you look here: http://www.unixshak.org.uk:8080/hlpdsk/ you can see that the system works with all plain-text e-mail: http://www.unixshak.org.uk:8080/hlpd....php?ticket=43 It then struggles with the HTML and MIME-encoded parts of other e-mails :(

Does anyone have the ultimate solution for getting rid of all of the formatting text? I played with MIME::Parser and MIME::Body last night, and it didn't really get me what I needed :/ Any help would be much appreciated.

Thanks in advance,
Shak
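To illustrate why the simple tag-stripping substitution described above only works "to a point", here is a minimal sketch using core Perl only. The helper name strip_tags_naive and the sample strings are made up for illustration; they are not from the original program.

```perl
use strict;
use warnings;

# Hypothetical helper: remove anything that looks like <...>.
# The non-greedy .*? stops at the first '>' after each '<'.
sub strip_tags_naive {
    my ($html) = @_;
    (my $text = $html) =~ s/<.*?>//sg;
    return $text;
}

# Simple markup is handled as expected.
my $simple = '<p>Hello <b>world</b></p>';

# A '>' inside an attribute value ends the match too early,
# so part of the tag leaks into the output.
my $tricky = '<img src="a.png" alt="a > b">caption';

print strip_tags_naive($simple), "\n";
print strip_tags_naive($tricky), "\n";
```

The second case is one reason a single regular expression tends not to be enough for real mail, quite apart from MIME boundaries and encoded parts, which are not HTML at all.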
Sorry to interrupt, but I am working on a program that needs to parse through HTML files, delete all the HTML markup, and leave the text. I see that you may have something I need. If I am reading your post correctly, you have a regular expression to remove all HTML?
Run perldoc -q remove.html and you will find not only a regular expression but also a link to a Perl script that will remove _all_ HTML from a document. Unfortunately that does not suffice for my problem.
Shak
OK, I solved the problem. I did some research and found that the Outlook mail is really just souped-up two-part MIME messages. Perl has an array of modules (available from CPAN) for MIME; if you're interested, check out MIME::Tools. I used MIME::Parser, MIME::Entity and MIME::Body. Below is the code I used to solve the problem (I added some extra comments as it's out of context):
Code:
# This is part of Mail::POP3Client to get the headers and body of the POP3 mail in question
Shak
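The code from the post did not survive here, so the following is only a rough sketch of the general approach it describes (parse the message with MIME::Parser and keep the text/plain part), not the poster's actual solution. It assumes MIME::Parser from CPAN is installed; the helper name plain_text_parts and the sample message are invented for illustration.

```perl
use strict;
use warnings;
use MIME::Parser;

# Hypothetical helper: return the decoded bodies of all text/plain
# parts found in a raw RFC 822 message string.
sub plain_text_parts {
    my ($raw) = @_;

    my $parser = MIME::Parser->new;
    $parser->output_to_core(1);    # keep decoded bodies in memory, not in files

    my $entity = $parser->parse_data($raw);

    my @texts;
    my @queue = ($entity);
    while (my $e = shift @queue) {
        my @parts = $e->parts;
        if (@parts) {              # multipart container: descend into it
            push @queue, @parts;
            next;
        }
        next unless $e->effective_type eq 'text/plain';
        my $body = $e->bodyhandle or next;
        push @texts, $body->as_string;   # transfer-decoded content
    }
    return @texts;
}

# Made-up two-part message of the kind described in the post.
my $sample = <<'EOM';
From: a@example.com
To: b@example.com
Subject: Test
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="XYZ"

--XYZ
Content-Type: text/plain; charset="us-ascii"

Hello from the plain part.
--XYZ
Content-Type: text/html; charset="us-ascii"

<html><body><b>Hello</b> from the HTML part.</body></html>
--XYZ--
EOM

print "$_\n" for plain_text_parts($sample);
```

In a fetch program the raw string would presumably come from Mail::POP3Client (headers and body joined together) instead of a heredoc, and the returned text is what would be inserted into the database. Messages that carry only an HTML part would still need an HTML stripper as a fallback.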