LinuxQuestions.org


Draciron 05-02-2008 11:35 PM

Email consolidation
 
For years I've been collecting email under various clients, and the sheer size of it now makes it impossible to find anything. I'm using up 100+ gigs just collecting and backing it all up. What I want to do is collect it all into one email client, then sort out the many duplicates.

I have email stored in Kmail, Thunderbird, Evolution and older unsupported formats like Netscape mail, Eudora and even 100 megs or so of Outlook PAB files.

I'm happy with Kmail or Thunderbird but have not been able to successfully move email even between them. The dir structures changed over various versions, and getting the clients to look in the right place fails. Anybody have suggestions on how to import any of this into modern Linux email clients? If I can get it all into one client, I figure I can export the sorted and trimmed-down email and just maintain it in a non-proprietary format. So any tool that can import these would be greatly appreciated.

I am using FC on all my desktops and Ubuntu (really Kubuntu, the way I have it configured) on my laptop. I have a new machine going into service shortly and can put any necessary distro on it to do this.

Thanks

Draciron 05-03-2008 09:28 AM

Shameless bump. You mean nobody knows of any good email conversion tools? If I can get some of them converted at least I've started down the road.

TIA...

unSpawn 05-03-2008 10:19 AM

Quote:

Originally Posted by Draciron (Post 3141132)
Shameless bump.

Per the LQ Rules, please do not bump your own thread until at least 24 hours have elapsed without a reply. Because the LQ membership is global, people in other time zones may not have seen this post yet, and thus it may take some time before a response is received.
http://www.linuxquestions.org/rules.php

* Besides that Linuxquestions.org uses a 0-reply list to show your fellow members threads that haven't been replied to, so by replying to your own OP you only throw away chances for more eyeballs.


Quote:

Originally Posted by Draciron (Post 3140734)
What I want to do is collect it all into one email client then sort out the many duplicates. I have email stored in Kmail, Thunderbird, Evolution and older unsupported formats like Netscape mail, Eudora and even a 100 megs or so of Outlook PAB files.

E-mail clients are distro-independent, so use the distro you're comfortable with. As far as the "unsupported" formats go, IIRC at least ye aulde Netscape and Eudora use the mbox (plaintext) file format.

The two e-mail retrieval protocols are POP and IMAP. Only real crappy clients support only POP3. With POP3, the default choice for most SOHO situations, you download e-mail from the mailserver to your e-mail client and, by default, the messages are deleted from the mailserver while doing so. With IMAP your e-mail stays on the server and you use your e-mail client just to read and organise e-mail. Setting up an IMAP server is real easy since it doesn't have exotic dependencies, and IMHO the easiest conversion would be to set up an IMAP mailserver, move mailboxes over that are plaintext and copy over e-mail from within clients that use a proprietary format like the ClippyOS one.

Sorting e-mail will require a bit more work because you can't rely on date (skew) or subject (obvious). The only unique part inside a sent(!) e-mail is the Message-ID header. Procmail/Formail can read e-mail headers so with a procmail recipe it is possible to do whatever ops you want to on e-mail. On "100+ gigs" of unsorted e-mail it might take "some" time though :-]
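
As an illustration of the Message-ID idea above, here is a minimal sketch in Python rather than a procmail recipe (that swap is an editorial choice, not something proposed in the thread). It assumes the mail is already in mbox files; the function name `dedupe_by_message_id` and the file names in the usage note are hypothetical. Messages without a Message-ID header are kept rather than guessed at.

```python
import mailbox


def dedupe_by_message_id(source_paths, dest_path):
    """Copy messages from several mbox files into one mbox,
    skipping any message whose Message-ID was already seen.

    Returns a (kept, skipped) tuple.
    """
    seen = set()
    kept = 0
    skipped = 0
    dest = mailbox.mbox(dest_path)  # created if it does not exist
    try:
        for path in source_paths:
            src = mailbox.mbox(path, create=False)
            try:
                for msg in src:
                    msg_id = (msg.get("Message-ID") or "").strip()
                    if msg_id and msg_id in seen:
                        skipped += 1
                        continue
                    if msg_id:
                        seen.add(msg_id)
                    # Messages with no Message-ID are always kept.
                    dest.add(msg)
                    kept += 1
            finally:
                src.close()
        dest.flush()
    finally:
        dest.close()
    return kept, skipped
```

Called as, for example, `dedupe_by_message_id(["old-netscape.mbox", "eudora-in.mbox"], "merged.mbox")`, it would leave the source files untouched and write only to the destination. On an archive the size described here it keeps one string per message in memory, which may or may not be acceptable.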

Draciron 05-04-2008 11:53 AM

Quote:

Originally Posted by unSpawn (Post 3141176)

* Besides that Linuxquestions.org uses a 0-reply list to show your fellow members threads that haven't been replied to, so by replying to your own OP you only throw away chances for more eyeballs.



E-mail clients are distro-independent, so use the distro you're comfortable with. As far as the "unsupported" formats go, IIRC at least ye aulde Netscape and Eudora use the mbox (plaintext) file format.

The two e-mail retrieval protocols are POP and IMAP. Only real crappy clients support only POP3. With POP3, the default choice for most SOHO situations, you download e-mail from the mailserver to your e-mail client and, by default, the messages are deleted from the mailserver while doing so. With IMAP your e-mail stays on the server and you use your e-mail client just to read and organise e-mail. Setting up an IMAP server is real easy since it doesn't have exotic dependencies, and IMHO the easiest conversion would be to set up an IMAP mailserver, move mailboxes over that are plaintext and copy over e-mail from within clients that use a proprietary format like the ClippyOS one.

Sorting e-mail will require a bit more work because you can't rely on date (skew) or subject (obvious). The only unique part inside a sent(!) e-mail is the Message-ID header. Procmail/Formail can read e-mail headers so with a procmail recipe it is possible to do whatever ops you want to on e-mail. On "100+ gigs" of unsorted e-mail it might take "some" time though :-]


Apologies for breaking etiquette.

It's not downloading that is the problem. These are all emails that were downloaded long ago. Back in ages long past I used Eudora for my home email, the Win 3.11 version mostly, and briefly under NT. Places I worked, however, used Outlook, so I have PAB files lingering around from those days. I switched to Netscape mail after moving to Linux, used that for a year or two, and then switched to Kmail, which has been my primary email client until recently. I like features of both Thunderbird and Kmail and have large stores of email in both formats. I also have a few scattered emails stored from Evolution which I'd also like to consolidate. Sorting dups is going to be tedious, but I can do that easily enough. It's getting them into the same client that is the problem.


Issue.
Kmail cannot even see its own email if the dir structures are not exact, and the dir structures have changed over the years.

I have not yet successfully imported ANY format into Thunderbird. I can point all day but it does not import the email.

I have heard of but have no idea where to find tools to convert Eudora email to Kmail or Thunderbird formats for importing.

Any idea how to get Evolution to read its own mail formats? Again, getting old email archives to import has been unreasonably difficult. I can do it once, but as soon as I add folders a second time, or change inbox to a different name to avoid name conflicts, the folders are ignored just as they are with Thunderbird and Kmail.

I have both pop and imap access to my current email provider which is not an issue. I want to go through my offline email to contact people I've lost touch with, to organize it and most of all to delete the gigs of email that are no longer needed but are taking up huge amounts of space in my archives.

Tools I hope to find.

A way to convert either Thunderbird or Kmail to a format that the other can see.

A tool to convert old Eudora email to a format either Kmail or Thunderbird can import.

A tool to let me import mailboxes into either Kmail or Thunderbird even though I might have 100 boxes named inbox, and without the import function bombing out because I renamed a mailbox. There can only be one inbox, so I am limited to using only ONE archived set of folders at a time, with no way to combine these many archives. Each time I search an archive of either Kmail or Thunderbird I have to copy the entire dir structure, and if I've deleted a single cache file or index file the whole thing fails and I can't see ANY email :(

Tools for finding duplicates based on content would be really cool but I can easily sort by sender and title and get rid of many of them that way.

Any of these tools gets me one step closer.
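
For the "duplicates based on content" wish above, one possible approach is to fingerprint each message by sender, subject and whitespace-normalized body text, then treat equal fingerprints as duplicates. The sketch below is only that, a sketch: the function name `content_fingerprint` is hypothetical, the choice of fields is an assumption, and messages in exotic or mislabeled character sets may need extra handling.

```python
import email
import hashlib
from email import policy


def content_fingerprint(raw_bytes):
    """Return a SHA-256 hex digest built from the sender, the subject
    and the message text with runs of whitespace collapsed.

    Headers that differ between copies of the same mail (Date,
    Received, Message-ID) are deliberately ignored.
    """
    msg = email.message_from_bytes(raw_bytes, policy=policy.default)
    # Prefer the plain-text part, fall back to HTML if that is all there is.
    body = msg.get_body(preferencelist=("plain", "html"))
    text = body.get_content() if body is not None else ""
    normalized = " ".join(text.split())
    key = "\n".join([
        str(msg.get("From", "")),
        str(msg.get("Subject", "")),
        normalized,
    ])
    return hashlib.sha256(key.encode("utf-8", errors="replace")).hexdigest()
```

Two copies of a message that differ only in delivery headers or trailing blank lines get the same fingerprint, so a dictionary keyed on the fingerprint is enough to spot them.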

unSpawn 05-04-2008 08:03 PM

Quote:

Originally Posted by Draciron (Post 3142302)
Apologies for breaking etiquette.

Why? Nothing to break here.


Quote:

Originally Posted by Draciron (Post 3142302)
It's not downloading that is the problem. These are all emails that have long ago been downloaded.

You misunderstood that. That was just me explaining POP vs IMAP and the benefits of IMAP.


Quote:

Originally Posted by Draciron (Post 3142302)
I have heard of but have no idea where to find tools to convert Eudora email to Kmail or Thunderbird formats for importing.

With all due respect, I don't think you understood what I wrote. Through the years I've dealt with most of those MUAs and then some. I've found running a local IMAP server by far the most efficient, easiest way to consolidate e-mail.

tredegar 05-05-2008 05:06 AM

Moving mail to kmail.
I've done this several times. It's a pain, but possible.
Run kmail at least once, so the default directories are created. Now is the time to create "Local folders" for the different classes of email you have (Personal, Work, Purchases, VeryoldEmails etc) or are going to import.
Close kmail.
Assemble all your old emails in mdir format (one file for each message)
There's a tool, mbox2mdir to help you convert mbox (one huge file with lots of individual emails, one after the other) to mdir format.

Make sure kmail is not running.
Navigate to ~/.kde/share/apps/kmail/mail with konqueror
Turn on "Show hidden files"
Delete all the "dot" files that have the word index in them.

Put all your email files (they'll probably have names like 1209136216.5562.1YxDq:2,S, be in plain text, and have all the headers at the beginning of the file) in the appropriate ~/.kde/share/apps/kmail/mail/mailboxname/new directories. Copy only the email files, not directories.

Restart kmail
Wait a long time while all the emails are re-indexed
20,000 emails take about 20 minutes to reindex on a 1.6GHz P4
Close kmail.
Reopen kmail. You may have to click on each mailbox in turn before the "Total" and "Unread" columns are displayed correctly.

Hope this helps.
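
The mbox to one-file-per-message step above can also be sketched with Python's standard `mailbox` module instead of a dedicated converter. This assumes that what the post calls "mdir" is the Maildir layout (the file names and the `new` subdirectory suggest so); the function name `mbox_to_maildir` and the paths in the usage note are hypothetical.

```python
import mailbox


def mbox_to_maildir(mbox_path, maildir_path):
    """Split one mbox file into a Maildir directory, one file per
    message. Returns the number of messages written.
    """
    src = mailbox.mbox(mbox_path, create=False)
    # create=True makes maildir_path with its cur/, new/ and tmp/ subdirs.
    dest = mailbox.Maildir(maildir_path, create=True)
    count = 0
    try:
        for msg in src:
            # MaildirMessage carries over read/replied flags where it can.
            dest.add(mailbox.MaildirMessage(msg))
            count += 1
    finally:
        src.close()
        dest.close()
    return count
```

For example, `mbox_to_maildir("eudora-in.mbox", "converted/VeryoldEmails")` would leave the mbox untouched and write the per-message files under `converted/VeryoldEmails`, from where they could be copied as described above while kmail is closed.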

unSpawn 05-06-2008 07:09 AM

While it's nice to see a conversion procedure for one specific application, a choice based on a protocol like IMAP instead means your e-mail is not tied to usage with one application or on one platform and not at risk if the application breaks. It also means you can access your e-mail from any machine on your LAN.

tredegar 05-06-2008 09:55 AM

unSpawn,

I was only trying to help, but perhaps you can see the bigger picture, which I am having difficulty with.

IMAP sounds useful, and I took a look at it (Eg here: http://www.faqs.org/docs/Linux-HOWTO...MAP.html#ss2.1 ). It looks complicated, but maybe it is really easy when you "just try it".

Are you suggesting that Draciron set up an IMAP server on one of his machines and put all his email there, and then access it from another application (Kmail, thunderbird, whatever), with his IMAP serving up his emails? So future emails are fetched from his ISP (and are deleted from their server) to his IMAP server, which then serves them up to the application of his choice, and never deletes them?

That way his email client would only get mail from his IMAP server, which he could then treat as the equivalent of an especially benevolent ISP who didn't mind 100+GB of mail just sitting there, and never being deleted?

Apologies for all the q's, but I'm trying to understand the concept.

Is moving email from one IMAP server to another a (relatively) straightforward process? Can you just point the new IMAP server at the old one and say "Fetch everything from the old server"?

And what about the mail he sends? That "SentMail" directory is a useful reference. The page I linked to above says
Quote:

Through IMAP the user can create, delete, or rename mailboxes; get new messages; delete messages; and perform search functions on mail. A separate protocol is required for sending mail.
A separate protocol? I hope the answer is not sendmail. That is scary stuff.

Looking at my distro's repositories, there are lots of potential IMAPs: bincimap courier-imap cyrus21-imapd and more. Is this a bit like "KDE vs gnome" - ie they're all IMAPs but a bit different in the ways they are implemented?
Thanks for reading.

unSpawn 05-06-2008 10:49 AM

Quote:

Originally Posted by tredegar (Post 3144597)
I was only trying to help

I know. Helping out is cool. My remark wasn't criticising your conversion procedure.


Quote:

Originally Posted by tredegar (Post 3144597)
IMAP sounds useful, and I took a look at it (Eg here: http://www.faqs.org/docs/Linux-HOWTO...MAP.html#ss2.1 ). It looks complicated, but maybe it is really easy when you "just try it".

It depends what or why you call it complicated. Setting up IMAP is one of the easiest things to do. But Cyrus is harder to set up due to dependencies. Maybe check out other products like Dovecot (compare with this walkthrough for instance: http://www.debian-administration.org/articles/275/print), Courier-IMAP or UW-IMAP.


Quote:

Originally Posted by tredegar (Post 3144597)
Are you suggesting that Draciron set up an IMAP server on one of his machines and put all his email there, and then access it from another application (Kmail, thunderbird, whatever), with his IMAP serving up his emails? So future emails are fetched from his ISP (and are deleted from their server) to his IMAP server, which then serves them up to the application of his choice, and never deletes them?

Exactly! Central storage, unrestricted access.


Quote:

Originally Posted by tredegar (Post 3144597)
That way his email client would only get mail from his IMAP server, which he could then treat as the equivalent of an especially benevolent ISP who didn't mind 100+GB of mail just sitting there, and never being deleted?

Yes.


Quote:

Originally Posted by tredegar (Post 3144597)
Is moving email from one IMAP server to another a (relatively) straightforward process? Can you just point the new IMAP server at the old one and say "Fetch everything from the old server"?

IMAP is just an "enabler", a representation. I know there is one IMAP to IMAP copy tool but since nothing changes on your filesystem (as you keep whatever mailbox format and ~/mail storage structure your users currently use) you could copy that structure to the new server easily.


Quote:

Originally Posted by tredegar (Post 3144597)
And what about the mail he sends? That "SentMail" directory is a useful reference. The page I linked to above says A separate protocol? I hope the answer is not sendmail. That is scary stuff.

Sendmail is NOT a scary MTA. Most of the security considerations are ancient history. Anyway. Any MTA should do, really.


Quote:

Originally Posted by tredegar (Post 3144597)
Looking at my distro's repositories, there are lots of potential IMAPs: bincimap courier-imap cyrus21-imapd and more. Is this a bit like "KDE vs gnome" - ie they're all IMAPs but a bit different in the ways they are implemented?

One comparison is here: http://www.linuxjournal.com/article/6998. It's old and as such hopelessly outdated, since specs have evolved. It does give some insight into why some IMAP servers are more equal than other IMAP servers.

Draciron 05-06-2008 11:41 AM

Quote:

Originally Posted by unSpawn (Post 3144644)
It depends what or why you call it complicated. Setting up IMAP is one of the easiest things to do. But Cyrus is harder to set up due to dependencies. Maybe check out other products like Dovecot (compare with this walkthrough for instance: http://www.debian-administration.org/articles/275/print), Courier-IMAP or UW-IMAP.

Maybe I'm not getting what you're suggesting. These are files that have long since been downloaded. How exactly would I move them up to even a local server, as they are in disparate formats? I'd still need to convert them to something. I also have many duplicates scattered across many backups.

It also makes the assumption I'm running multiple machines. I happen to be running 5, all 5 connected on an internal LAN, and 3 dual-homed to also talk to the net, though two of them I run headless. All are located in the same room, so I really don't need to access them anywhere but from the same console. Women tend to get upset about unsightly cables and spillage like CDs and related toys, so I centralize all my machines to avoid such issues with girlfriends. They just don't go into the computer room unless they want to use a computer, and the mess is out of their sight, except the recording box located next to my guitars (at this moment in the same room with the rest of the computers) and the laptop, which wanders around the house. So I really have no need to access them from anywhere else. I use web-based email for current email. I archive offline. Amazing how fast even Google mail can fill up :)

I archive for many reasons. Back when Hotmail first came out I signed up with them. Then M$ bought them and its less-than-great service went downhill fast. Luckily in the early days they had POP3 access, and I have copies of many of those emails from Hotmail. Same with Yahoo. I've had one Yahoo email account for over 10 years now, the other for at least several. For a long time they allowed free POP3 access and I saved all those emails offline. Good thing, as they won't even let me PAY them to get POP3 access to those accounts. So I save it in archives. I have work emails saved off from various places I've worked. Anyway, you get the idea. 20 years' worth of email that's stacked up and grown into a monster. 75% of it I just want to delete, as it means nothing any more, but that last 25% often contains contact info I really want, or things of sentimental value, or other reasons I really want to be able to occasionally read it. I might go for months without looking at a single email on my clients. I use web-based email for current email. The client-side storage is purely for archival purposes. It's also so I don't have to remember whether that was on x.domain or gmail or yahoo. I'll have a copy in my archives, and I can search through all of the accounts I can download from at the same time.



Quote:

Originally Posted by unSpawn (Post 3144644)
Exactly! Central storage, unrestricted access.

Quote:

Originally Posted by unSpawn (Post 3144644)
IMAP is just an "enabler", a representation. I know there is one IMAP to IMAP copy tool but since nothing changes on your filesystem (as you keep whatever mailbox format and ~/mail storage structure your users currently use) you could copy that structure to the new server easily.

I dunno about that. Some of the formats are REALLY old. Some, like M$'s formats, constantly change just to change. It'd also add another service to run and secure, as well as a potential vulnerability that I don't currently have. If I don't NEED to run a service I usually don't. Just one less thing to worry about. One less thing that can break or that can be compromised.

Quote:

Sendmail is NOT a scary MTA. Most of the security considerations are ancient history. Anyway. Any MTA should do, really.
There are good relative merits for many email transport systems. I've set up several different email systems but don't have a personal favorite. I hate using sendmail though, mostly because of security issues and because, while well documented, its syntax is a little arcane.

I do appreciate the idea. If I were serving up email for an entire family the IMAP idea would be a really good idea. My girl left me though (must have been all the cables LOL) and my kids live with their respective and sometimes disrespectful mothers. So it's just me with two monitors, one hooked to a KVM, the other dedicated to one box. Thank you for the ideas and the response, and I suspect many folks reading this exchange will find the IMAP idea perfect.

Draciron 05-06-2008 11:46 AM

Quote:

Originally Posted by tredegar (Post 3143139)
Moving mail to kmail.
I've done this several times. It's a pain, but possible.
Run kmail at least once, so the default directories are created. Now is the time to create "Local folders" for the different classes of email you have (Personal, Work, Purchases, VeryoldEmails etc) or are going to import.
Close kmail.
Assemble all your old emails in mdir format (one file for each message)
There's a tool, mbox2mdir to help you convert mbox (one huge file with lots of individual emails, one after the other) to mdir format.

Make sure kmail is not running.
Navigate to ~/.kde/share/apps/kmail/mail with konqueror
Turn on "Show hidden files"
Delete all the "dot" files that have the word index in them.

Put all your email files (they'll probably have names like 1209136216.5562.1YxDq:2,S, be in plain text, and have all the headers at the beginning of the file) in the appropriate ~/.kde/share/apps/kmail/mail/mailboxname/new directories. Copy only the email files, not directories.

Restart kmail
Wait a long time while all the emails are re-indexed
20,000 emails take about 20 minutes to reindex on a 1.6GHz P4
Close kmail.
Reopen kmail. You may have to click on each mailbox in turn before the "Total" and "Unread" columns are displayed correctly.

Hope this helps.

That helps a great deal! Hadn't thought about the .index files. That's why moving dirs only worked once. The mbox tool sounds exactly like what I need for many of the formats. That gets me started at least. I've tried the import function in Kmail and in Thunderbird. In both cases they just go away and never finish. One I let run overnight, and 10 hours later all it'd done was create a dir in my home dir with the subdir structure but no emails and nothing imported. The mbox idea will let me convert the bulk of them to that one dir, and I can sort them manually or programmatically from there. It's actually getting them into either Kmail or Thunderbird that is the part that's stumping me.

Thanks again.

unSpawn 05-06-2008 12:48 PM

Quote:

Originally Posted by Draciron (Post 3144684)
Maybe I'm not getting what you're suggesting. These are files that have long since been downloaded. How exactly would I move them up to even a local server, as they are in disparate formats? I'd still need to convert them to something.

If you care to read back my first reply, I said: "set up an IMAP mailserver, move mailboxes over that are plaintext and copy over e-mail from within clients that use a proprietary format like the ClippyOS one". With all due respect, that is almost all you need to know to achieve your goal.


Quote:

Originally Posted by Draciron (Post 3144684)
It also makes the assumption I'm running multiple machines.

No it does not.


Quote:

Originally Posted by Draciron (Post 3144684)
I do appreciate the idea. If I were serving up email for an entire family the IMAP idea would be a really good idea.

No you don't. But dismissing it on the wrong grounds is your right.

tredegar 05-06-2008 01:50 PM

@unSpawn,

Thanks for the clarifications and links (especially the debian-admin one - lots of good stuff there, it's bookmarked).

I understand the concept much better now, and I think I should move to this way of handling emails.

Like Draciron I have accumulated many (since the 80's), and I know there's still a ton of stuff in mbox format from when I ran something called KA9Q which ran in DOS, but worked fine for many years. Right now I just vim that huge file when needed (OK, it's very rarely needed). Anyway, my thanks for your explanation of this fresh (to me) concept of a local IMAP server: It is good!

@Draciron,

Pleased I could help with the kmail thing. You still have a lot of work to do (converting to mdir format for example).

If you want to stick with kmail (and in the short term, it doesn't matter whether you choose that or unSpawn's IMAP solution - you still need to convert formats) then one thing that might save you a lot of time, and encourage reliability, is this:

When you want to import another batch of mail to kmail, create an entirely new directory (AKA "Local Folder") from within kmail (which keeps a lot of stuff in config files, and this method has worked reliably for me in the past) for the new import.

Close kmail.

Navigate to ~/.kde/share/apps/kmail/mail and delete only the .*index* files that refer to the newly created directory. You'll need to delete them even if they are empty. Then dump your mdir format emails into that directory. Then restart kmail. This way the indexes will only have to be created for the new directory, not all of them, all over again.

Apart from saving time, if the import goes horribly wrong, you'll only have to remove the latest directory (from within kmail, please) and start again. The other directories and indexes will be unharmed.
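
The "delete only the index files for the new directory" step can be sketched as a small helper. The naming pattern `.<folder>.index*` is an assumption about how kmail names its index files, and the function name `remove_folder_indexes` is hypothetical. It defaults to a dry run so the match list can be checked before anything is deleted, and kmail should be closed while it runs.

```python
from pathlib import Path


def remove_folder_indexes(mail_root, folder_name, dry_run=True):
    """Find the hidden index files belonging to one kmail folder and,
    unless dry_run is True, delete them.

    Assumes index files are named ".<folder_name>.index*" and sit
    directly inside mail_root. Returns the matching file names.
    """
    root = Path(mail_root).expanduser()
    matches = sorted(
        p for p in root.glob(f".{folder_name}.index*") if p.is_file()
    )
    if not dry_run:
        for p in matches:
            p.unlink()  # removes empty index files as well
    return [p.name for p in matches]
```

For instance, `remove_folder_indexes("~/.kde/share/apps/kmail/mail", "Import1")` would only list what it would remove for a folder named Import1; passing `dry_run=False` performs the deletion and leaves every other folder's indexes alone.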

When you are happy with the new import, you can use kmail's search tools and filter rules (You can select a bunch of emails, R-Click them and then "Message -> Apply Filters") to select emails from the new import, and move them to different, more appropriate directories.

As for some of the more obscure email formats you want to import, if mbox2mdir can't help, you can probably write some bash scripts to make conversions for you. I am no expert with these (sed, awk), though I eventually get them to work, but plenty of people here are.

You have quite a few hoops to jump through, but I think it'll be tedious rather than difficult.

Once you have your emails sorted out, all in mdir format, then I think IMAP is the way to go: unSpawn's answers to my questions were highly illuminating.

Good luck.

unSpawn 05-06-2008 05:43 PM

Quote:

Originally Posted by tredegar (Post 3144790)
As for some of the more obscure email formats you want to import, if mbox2mdir can't help, you can probably write some bash scripts to make conversions for you. (..) Once you have your emails sorted out, all in mdir format, then I think IMAP is the way to go

No. And that's why I talked about choosing an application versus choosing the Internet Message Access Protocol standard. As long as you have the e-mail application and it can "talk" IMAP there's no need to "convert" anything. That's the beauty of it. Plaintext, GPG-encapsulated, multipart MIME or "HTML e-mail", e-mail is e-mail, it really doesn't matter. You can copy over any mails from your MUA to IMAP storage.

billymayday 05-06-2008 05:57 PM

I had a glance through this, and may well have missed bits. Here's my 2c

What unSpawn says is exactly what I'd have suggested (not meant to sound high and mighty btw), and I did something similar some years ago.

All you need to do is set up the server (it will take no time), then start kmail so that it has the native kmail emails in one account and set up another IMAP account from within kmail (at least that's what you'd do in most mail clients - I don't use kmail). Then drag and drop your kmail inbox across to the IMAP account. Done for kmail. Repeat for Evolution, etc. How easy is that?

Won't solve mess within kmail, etc., but you can work on that separately.

Use a conversion utility for mbox to maildir where appropriate and you're most of the way there.

If you use dovecot for example, you probably won't even need to modify the config - it should work out of the box on a properly configured system.

Use fetchmail to collect emails from your ISP's pop(s) account in future.
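
For plaintext mbox files that are not loaded in any client, the copy into the IMAP account can also be scripted instead of dragged and dropped. The following is a minimal sketch using Python's standard `imaplib` and `mailbox` modules; the function name `upload_mbox`, the folder name and the login details in the usage note are all hypothetical, and the date argument is left empty, so the server will typically stamp each message with the upload time rather than the original date.

```python
import imaplib
import mailbox


def upload_mbox(conn, mbox_path, imap_folder):
    """Append every message from an mbox file to a folder on an
    already logged-in IMAP connection (an imaplib.IMAP4 or
    imaplib.IMAP4_SSL object). Returns the number of messages sent.
    """
    # Creating an existing folder just returns "NO"; that is ignored here.
    conn.create(imap_folder)
    uploaded = 0
    box = mailbox.mbox(mbox_path, create=False)
    try:
        for msg in box:
            # flags=None and date_time=None: let the server pick defaults.
            status, _ = conn.append(imap_folder, None, None, msg.as_bytes())
            if status != "OK":
                raise RuntimeError(
                    "server refused message %d: %s" % (uploaded + 1, status)
                )
            uploaded += 1
    finally:
        box.close()
    return uploaded
```

With a local server running, usage might look like `conn = imaplib.IMAP4("localhost")`, then `conn.login("user", "password")`, then `upload_mbox(conn, "eudora-in.mbox", "Archive-Eudora")`, followed by `conn.logout()`. Folder names containing spaces may need quoting, which this sketch does not attempt.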


Rgds

