LinuxQuestions.org
Old 11-11-2021, 10:55 PM   #1
mfoley
Senior Member
 
Registered: Oct 2008
Location: Columbus, Ohio USA
Distribution: Slackware
Posts: 2,638

Export of Thunderbird inbox to Mbox file not quite working


After researching how to export my inbox to an mbox file, I found suggestions to use the 'ImportExportTools NG' add-on. I installed it and ran it as instructed (right-click on the Inbox folder > ImportExportTools NG > Export Folder). The resulting file, Inbox, is not an mbox file, or at least not one like those Linux creates in the /var/spool/mail folder: I cannot open it with mailx. Running 'mail -f Inbox' shows just 1 message (of 477 total). However, it does appear to open with mutt!

Is there some newer format for the mbox file that is not supported by mailx? Is there a way to convert this to the format used in /var/spool/mail (created by Sendmail)?
 
Old 11-12-2021, 07:44 AM   #2
mfoley
Original Poster
After some experimentation, I've been able to solve the problem. First, the 'ImportExportTools NG' add-on for Thunderbird exports in Windows text format with CR/LF line endings. mailx requires a blank line before each "From " line. With CR/LF endings that line is not really blank (it still contains the CR), which messes things up for mailx, but apparently mutt can handle it.
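A quick way to check whether an exported file has this problem is to count the lines that end in a CR. This is only a minimal sketch, assuming bash and a standard grep and sed; the function names count_cr_lines and strip_cr are made up for illustration.

```shell
#!/bin/bash

# Hypothetical helper: print how many lines of the given file end in a carriage return.
count_cr_lines() {
    # $'\r$' is bash quoting for "a literal CR at end of line".
    # grep -c exits non-zero when the count is 0, hence the "|| true".
    grep -c $'\r$' -- "$1" || true
}

# Hypothetical helper: copy stdin to stdout, removing one trailing CR per line.
strip_cr() {
    sed $'s/\r$//'
}
```

For example, 'count_cr_lines Inbox' printing anything other than 0 means the file has Windows line endings, and 'strip_cr < Inbox > Inbox.lf' would write a copy with LF-only endings.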

Next, the "From " line format. The man page for mbox says:
Quote:
A postmark line consists of the four characters "From", followed by a space character, followed by the message's envelope sender address, followed by whitespace, and followed by a time stamp. This line is often called From_ line.

The sender address is expected to be addr-spec as defined in RFC2822 3.4.1. The date is expected to be date-time as output by asctime(3). For compatibility reasons with legacy software, two-digit years greater than or equal to 70 should be interpreted as the years 1970+, while two-digit years less than 70 should be interpreted as the years 2000-2069. Software reading files in this format should also be prepared to accept non-numeric timezone information such as "CET DST" for Central European Time, daylight saving time.

Example:

>From example@example.com Fri Jun 23 02:56:55 2000
The 'ImportExportTools NG' add-on does not create the From line this way. It writes:
Code:
From - Thu Nov 11 23:28:10 2021
with no email address at all. Furthermore, the date in the exported From line is the date/time of the export, not of the message! Again, this messes up mailx, but apparently mutt looks at the actual From: and Date: lines to interpret the sender and date/time of the message.
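To see whether a given "From " line has the shape the man page describes (an address followed by an asctime-style date), something like the following could be used. It is a rough sketch: the function name postmark_ok and the regular expression are assumptions about what such a line looks like, not a statement of what mailx actually checks.

```shell
#!/bin/bash

# Hypothetical check: succeed if the argument looks like
# "From <address> <Day> <Mon> <dd> <hh:mm:ss> <yyyy>".
postmark_ok() {
    # Loose pattern (an assumption): an address containing "@", then an
    # asctime-style date whose day of month may be padded with a space.
    local re='^From [^ ]+@[^ ]+ +[A-Z][a-z]{2} [A-Z][a-z]{2} [ 0-9][0-9] [0-9]{2}:[0-9]{2}:[0-9]{2} [0-9]{4}$'
    [[ $1 =~ $re ]]
}
```

With this, postmark_ok "From example@example.com Fri Jun 23 02:56:55 2000" succeeds, while postmark_ok "From - Thu Nov 11 23:28:10 2021" fails because "-" is not an address.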

There is a pretty interesting page, https://www.loc.gov/preservation/dig...dd000383.shtml, which gives some detail about the mbox format. There are several variations, and mailx apparently expects the MBOXRD format. That page does say, 'The "From " line structure is From sender date moreinfo'. So the 'ImportExportTools NG' add-on is NOT exporting the From line in this format. The add-on I used is the one for Windows Thunderbird, so I suppose its export is meant to be compatible with that OS.
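For reference, in the mboxrd variant a body line that starts with "From " (or with an already quoted ">From ") gets one more ">" put in front of it, so it cannot be mistaken for the start of a new message. A minimal sketch of that quoting rule follows, assuming it is fed message body text only; the function name mboxrd_quote is made up for illustration, and this is not something the add-on is known to do.

```shell
#!/bin/bash

# Hypothetical helper: apply mboxrd-style quoting to body text read from stdin.
# Any line starting with zero or more ">" followed by "From " gets one extra ">".
mboxrd_quote() {
    sed 's/^\(>*From \)/>\1/'
}
```

For example, a body line "From here on it gets easier" would come out as ">From here on it gets easier", while a header-like line "From: someone" is left alone because it has no space right after "From".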

To solve the problem I've created a script to post-process the exported file:
Code:
#!/bin/bash

# Replaces the bogus (for mbox) "From - <export time>" lines with lines in the
# format: "From mike@mydom.com Fri Nov  5 08:04:38 2021"

if [ -z "$1" ]; then echo "path to exported mbox file required" >&2; exit 1; fi

# Convert CR/LF to LF
perl -pe 's/\r$//' < "$1" > booga

# The -n argument to csplit needs to be big enough for all messages, e.g. 1000-9999
# messages in the mbox file need -n 4. If it is not big enough, the cat of these files
# will not cat them in the correct order and the sort bit (at bottom) will be needed.

# Split into one file per message (xx000, xx001, ...), dropping the old "From " lines
csplit --suppress-matched -s -n 3 -z booga "/^From /" '{*}'

for f in xx*
do
    newfile=$(echo "$f" | sed 's/xx/yy/')

    # Take the address from the first "From: " header. Some emails (from me) have
    # "From: mark@mydom.org (Mark Foley)", so remove the parenthesized part as well.
    from=$(grep "^From: " "$f" | head -1 | cut "-d<" -f2- | tr -d "<>" | sed -e 's/(.*)//' -e 's/^From: //' -e 's/ *$//')
    # Take the first "Date: " header and drop the timezone offset and anything after it
    date=$(grep "^Date: " "$f" | head -1 | sed -e 's/^Date: //' -e 's/ [-+][0-9][0-9][0-9][0-9].*$//')
    fromdate=$(date -d "$date" "+%a %b %e %H:%M:%S %Y")
    echo "${newfile}: From $from $fromdate"

    # Write the new "From " line, then the original message after it
    echo "From $from $fromdate" >"$newfile"
    cat "$f" >>"$newfile"
done

cat yy* >newInbox

# To sort, if needed
#cat yy* >booga
#rm newInbox

#mail -f booga <<EOF
#sort date
#s * newInbox
#x
#EOF

rm booga xx* yy*
I used the Windows add-on for Thunderbird because my Linux Thunderbird said "This add-on is not compatible with your version of Thunderbird". However, I just updated my Linux Thunderbird to 91.3.0 and the add-on is now available there. The Linux version fixes the CR/LF problem, but the "From " line issue is still there. I suppose the developer will need to be contacted to fix that.
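The date handling in the script above can also be tried out on its own. Here is a small sketch, assuming GNU date (for the -d option); the function name header_date_to_postmark is hypothetical. It drops the numeric timezone offset from a "Date:" header value and prints the asctime-style timestamp used in the "From " line.

```shell
#!/bin/bash

# Hypothetical helper: turn a "Date:" header value into an asctime-style timestamp.
header_date_to_postmark() {
    local stamp
    # Drop a numeric offset such as " -0400" or " +0100 (CET)" and anything after it
    stamp=$(printf '%s\n' "$1" | sed 's/ [-+][0-9][0-9][0-9][0-9].*$//')
    # TZ=UTC keeps the wall-clock time as written and avoids daylight-saving gaps;
    # LC_ALL=C keeps the day and month names in English. Needs GNU date for -d.
    LC_ALL=C TZ=UTC date -d "$stamp" '+%a %b %e %H:%M:%S %Y'
}
```

For example, header_date_to_postmark 'Fri, 5 Nov 2021 08:04:38 -0400' keeps the sender's local time and gives a timestamp with the day of month padded by a space, as asctime does.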

I hope this can help someone!

Last edited by mfoley; 11-12-2021 at 09:13 AM.
 



