OK, this is quite a tricky request, but I'm trying to convert the mail files from a legacy reader called Ameol2 to work with Thunderbird. I'm 80% there but have a snag I'm trying to resolve.
I can convert the mail folder files and import them into Thunderbird and 90% of the messages work AOK (even displaying the HTML version of a message!), but messages which had attachments don't display.
I have traced this down to what happens when Ameol2 decodes the attachment.
When you decode a message in A2, it saves the attachment to a directory and then strips ONLY the trailing MIME part from the message, leaving the "There's a mime attachment to this message" note in the header. Thunderbird then reads this but can't find the attachment (because Ameol2 has stripped it out), assumes it's a malformed message, and doesn't display it.
If I use grep -v to strip out all of the "There's a mime attachment to this message" header info, then the HTML messages stop displaying and it looks a bit of a mess.
Is there a program that can process my half-converted mbox files and strip out all of the MIME and HTML portions, leaving only the plain text message versions (which I can live with)?
I'm not aware of a program to do this - from the sound of it, the MIME remaining in the mbox files is corrupt, which would make it difficult.
But you might be able to salvage quite a lot if you understand the MIME layout. Read through the RFC first of all: RFC 1521 (since superseded by RFC 2045-2049). Basically the text parts will be topped and tailed by a line starting "--", then the sub-headers within those parts will include a Content-Type of "text/plain", possibly with other options following.
You should be able to use awk to locate such sections of text and spit them out without the surrounding MIMEery, but it will be tricky to do a perfect job.
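As an alternative to awk, here is a minimal Python sketch of the same idea using the standard library's email package: walk the MIME parts and keep only the ones whose Content-Type is text/plain. The function name plain_text_parts and the SAMPLE message are made up for illustration, and this assumes the message is well formed enough for the parser to find the boundaries, which may not hold for the mangled messages.

```python
import email
from email import policy


def plain_text_parts(raw: str) -> list[str]:
    """Return the decoded body of every text/plain part in one message."""
    msg = email.message_from_string(raw, policy=policy.default)
    parts = []
    for part in msg.walk():
        # walk() also yields the multipart containers; skip those.
        if part.is_multipart():
            continue
        if part.get_content_type() == "text/plain":
            parts.append(part.get_content())
    return parts


# Hypothetical sample: a two-part multipart/alternative message.
SAMPLE = (
    "From: a@example.com\n"
    "Subject: demo\n"
    "MIME-Version: 1.0\n"
    'Content-Type: multipart/alternative; boundary="XYZ"\n'
    "\n"
    "--XYZ\n"
    'Content-Type: text/plain; charset="us-ascii"\n'
    "\n"
    "Hello plain\n"
    "--XYZ\n"
    'Content-Type: text/html; charset="us-ascii"\n'
    "\n"
    "<p>Hello html</p>\n"
    "--XYZ--\n"
)

if __name__ == "__main__":
    for text in plain_text_parts(SAMPLE):
        print(text.strip())
```

For a whole mbox you would run this per message and write the text back out under a text/plain header, which is where it gets tricky.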
Can you post a (short!) complete message that has had an attachment removed? I'm thinking it might be easier to inject a dummy attachment than to extract the text... but it depends on exactly how the MIME is mangled.
Thanks for taking the time to think about my mail problem. I appreciate it.
I'll try and find some suitable messages to post as examples.
It's quite tricky to isolate it down to a uniform set of if...then expressions.
The idea I had was to find a way to completely remove all the HTML & MIME sections from the mbox file (hopefully leaving only the plain text messages) and then perhaps replace the mime note in the header with one that specified plain text rather than "look there's an attachment"
e.g. replace Content-Type: multipart/alternative; with Content-Type: text/plain; charset="us-ascii"
I'm thinking this is quite a tricky suck-and-see problem/solution.
OK. That example is perfectly well formed, apart from the Content-Type, which as you say should be text/plain. If you changed that, the message should be acceptable to TB, and would include the text substituted for the PDF.
But presumably a naive substitution of all the Multipart/Mixed headers for text/plain would also zap perfectly good multipart/mixed messages that you have in there too?
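One way round that, sketched in Python under a guessed heuristic: only rewrite the header when the message claims to be multipart but its declared boundary never actually appears as a delimiter line. Genuine multipart messages are returned untouched. The function name fix_stripped_multipart and the test messages are hypothetical, and whether the heuristic matches every message Ameol2 has touched is an assumption that would need checking against the real mailbox.

```python
import email
from email import policy


def fix_stripped_multipart(raw: str) -> str:
    """Relabel a message as text/plain if its multipart boundary is missing.

    Assumption: a stripped message still declares multipart/* in its
    header but contains no "--boundary" delimiter line at all.
    """
    msg = email.message_from_string(raw, policy=policy.compat32)
    if msg.get_content_maintype() != "multipart":
        return raw  # already a single-part message

    boundary = msg.get_boundary()
    delimiter_seen = boundary is not None and any(
        line.rstrip() == "--" + boundary for line in raw.splitlines()
    )
    if delimiter_seen:
        return raw  # looks like a genuine multipart message; leave it alone

    # No delimiter found: the body is effectively plain text.
    msg.replace_header("Content-Type", 'text/plain; charset="us-ascii"')
    return msg.as_string()


# Hypothetical stripped message: multipart header, plain body.
MANGLED = (
    "From: a@example.com\n"
    "Subject: stripped\n"
    "MIME-Version: 1.0\n"
    'Content-Type: multipart/mixed; boundary="XYZ"\n'
    "\n"
    "Just the plain text that was left behind.\n"
)

# Hypothetical intact message: the boundary is really there.
GOOD = (
    "From: a@example.com\n"
    "Subject: intact\n"
    "MIME-Version: 1.0\n"
    'Content-Type: multipart/mixed; boundary="XYZ"\n'
    "\n"
    "--XYZ\n"
    "Content-Type: text/plain\n"
    "\n"
    "Body\n"
    "--XYZ--\n"
)

if __name__ == "__main__":
    print(fix_stripped_multipart(MANGLED))
```

A message where only some of the parts were stripped would still have delimiter lines, so it would pass through unchanged and need a closer look by hand.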
That's the problem. If I have messages that have HTML parts or valid attachments that haven't been decoded, then it screws up all messages after the first one with a valid attachment, if I remember my tests.
OK. I need to see what one of these more complicated messages looks like. Can you post another, or mail me one directly (nick.battle@gmail.com)? These were presumably a mixture of decoded/removed attachments and other MIME parts which weren't mangled?
Andrew, I looked at the mbox you sent. I've modified it and mailed it back to you.
For some reason, the file had short groups of headers, each with a valid "From" prefix. So firstly, these were being interpreted as separate messages with no content (and few headers!). Then, some of the messages that had had attachments removed still had Content-Type headers for a multi-part message, even though they were actually text/plain. Fixing that for the few cases that remained seemed to produce a working mailbox.
The mbox you sent only had 15 messages (not 17 as I said in my mail!), so manual repair was easy. If you need to perform this on a larger number of messages, it will be trickier, but awk should be able to cope.
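For a larger mailbox, the first step (spotting the header-only fragments) could also be done with Python's standard mailbox module instead of awk. This is only a sketch: the function name find_empty_fragments is made up, and it assumes a fragment shows up as a message whose body is empty or whitespace, which is how the ones in that file looked.

```python
import mailbox


def find_empty_fragments(path: str) -> list[int]:
    """Return the positions of mbox entries that have headers but no body."""
    box = mailbox.mbox(path, create=False)
    try:
        empties = []
        for index, msg in enumerate(box):
            if msg.is_multipart():
                continue  # has real MIME parts, so not a bare fragment
            payload = msg.get_payload() or ""
            if not payload.strip():
                empties.append(index)
        return empties
    finally:
        box.close()


if __name__ == "__main__":
    import sys

    # Usage: python find_fragments.py /path/to/file.mbox
    for position in find_empty_fragments(sys.argv[1]):
        print("message", position, "has no body")
```

It only reports positions and does not modify the file, so you can eyeball the suspects before deleting or merging anything.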
HTH,
-nick
Last edited by Nick_Battle; 03-28-2007 at 03:31 AM.