After some experimentation, I've been able to solve the problem. First, the 'ImportExportTools NG' add-on for Thunderbird exports in Windows text format with CR/LF line endings. mailx requires a blank line before each "From " line. The CR left on that otherwise empty line breaks this for mailx, but apparently mutt can handle it.
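A quick way to check whether an exported file has this problem is to count the lines that end in a CR. This is only a sketch, assuming bash and GNU grep; count_crlf_lines is a made-up helper name and the sample file is a throwaway created just for the demonstration.

```shell
#!/bin/bash
# Count lines that end in a carriage return (CR/LF line endings).
# count_crlf_lines is a hypothetical helper, not part of any tool.
count_crlf_lines() {
    # $'\r$' is a literal CR followed by the end-of-line anchor.
    grep -c $'\r$' "$1"
}

# Demonstration on a temporary file written with CR/LF endings.
sample=$(mktemp)
printf 'From - Thu Nov 11 23:28:10 2021\r\nSubject: test\r\n\r\nbody\r\n' > "$sample"
crlf_count=$(count_crlf_lines "$sample")
echo "lines ending in CR: $crlf_count"
rm -f "$sample"
```

A non-zero count on the real export means the CRs need to be stripped before mailx will see the blank separator lines.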
Next, the "From " line format. The man page for mbox says:
Quote:
A postmark line consists of the four characters "From", followed by a space character, followed by the message's envelope sender
address, followed by whitespace, and followed by a time stamp. This line is often called From_ line.
The sender address is expected to be addr-spec as defined in RFC2822 3.4.1. The date is expected to be date-time as output by
asctime(3). For compatibility reasons with legacy software, two-digit years greater than or equal to 70 should be interpreted as the
years 1970+, while two-digit years less than 70 should be interpreted as the years 2000-2069. Software reading files in this format
should also be prepared to accept non-numeric timezone information such as "CET DST" for Central European Time, daylight saving time.
Example:
>From example@example.com Fri Jun 23 02:56:55 2000
The 'ImportExportTools NG' add-on does not create the From line this way. It writes:
Code:
From - Thu Nov 11 23:28:10 2021
with no email address at all. Furthermore, the date in the exported From line is the date/time of the export, not of the message! Again, this messes up mailx, but apparently mutt looks at the actual From: and Date: lines to interpret the sender and date/time of the message.
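To illustrate what a conforming timestamp looks like, here is a small sketch that turns a message's Date: header into the asctime-style form. It assumes GNU date (for the -d option); postmark_date is a made-up helper name, the sample header and address are invented, and converting to UTC is my choice for a reproducible example rather than something the man page requires.

```shell
#!/bin/bash
# Convert an RFC 2822 Date: header value into an asctime(3)-style stamp.
# postmark_date is a hypothetical helper; -d is a GNU date option.
postmark_date() {
    # LC_ALL=C keeps the day and month names in English.
    # -u reports the time in UTC so the result does not depend on the local zone.
    LC_ALL=C date -u -d "$1" "+%a %b %e %H:%M:%S %Y"
}

# Invented sample header value, for illustration only.
stamp=$(postmark_date "Fri, 5 Nov 2021 08:04:38 -0400")
echo "From mike@mydom.com $stamp"
```

Note that %e pads single-digit days with a space, which matches what asctime produces.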
There is a pretty interesting page,
https://www.loc.gov/preservation/dig...dd000383.shtml, which gives some detail about the mbox format. There are several variations, and mailx apparently expects the MBOXRD format. That page says, 'The "From " line structure is From sender date moreinfo'. So the 'ImportExportTools NG' add-on is NOT exporting the From line in this format. The add-on is for Windows Thunderbird, so I suppose its export is meant to be compatible with that OS.
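To see how many postmark lines in a file lack a sender address, something like the following can help. It is a rough sketch, assuming bash and awk; count_bad_postmarks is a made-up name, the sample file is invented, and it simply treats every line starting with "From " as a postmark, so an unescaped "From " at the start of a body line would be counted too.

```shell
#!/bin/bash
# Count "From " lines whose second field does not look like an address.
# count_bad_postmarks is a hypothetical helper name.
count_bad_postmarks() {
    awk '/^From / && $2 !~ /@/ { n++ } END { print n + 0 }' "$1"
}

# Throwaway sample: one exported-style postmark, one conforming postmark.
sample=$(mktemp)
printf '%s\n' \
    'From - Thu Nov 11 23:28:10 2021' \
    'Subject: first' \
    '' \
    'body one' \
    '' \
    'From mike@mydom.com Fri Nov  5 08:04:38 2021' \
    'Subject: second' \
    '' \
    'body two' > "$sample"
bad=$(count_bad_postmarks "$sample")
echo "postmark lines without a sender address: $bad"
rm -f "$sample"
```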
To solve the problem I've created a script to post-process the exported file:
Code:
#!/bin/bash
# Removes bogus (for mbox) "^From " and replaces with format: "From mike@mydom.com Fri Nov 5 08:04:38 2021"
if [ -z "$1" ]; then echo "path to exported mbox file required" >&2; exit 1; fi
# Convert CR/LF to LF
perl -pe 's/\r$//g' < "$1" > booga
# The -n argument to csplit needs to be big enough for all messages. E.g. 1000-9999 messages in the mbox file
# need -n 4. If it is not big enough, the cat of these files will not concatenate them in the correct order and
# the sort bit (at bottom) will be needed.
csplit --suppress-matched -s -n 3 -z booga "/^From /" '{*}'
for f in xx*
do
    newfile=$(echo "$f" | sed 's/xx/yy/')
    # Some emails (from me) have "From: mark@mydom.org (Mark Foley)". So, remove the part in parentheses.
    from=$(grep "^From: " "$f" | head -1 | cut "-d<" -f2- | tr -d "<>" | sed -e 's/(.*)//' -e 's/^From: //')
    # Take the first Date: header and drop a trailing negative UTC offset such as " -0400".
    date=$(grep "^Date: " "$f" | head -1 | sed -e 's/^Date: //' -e 's/ -0.*$//')
    fromdate=$(date -d "$date" "+%a %b %e %H:%M:%S %Y")
    # $from and $fromdate are left unquoted on purpose so that runs of spaces collapse to one.
    echo ${newfile}: From $from $fromdate
    echo From $from $fromdate >"$newfile"
    cat "$f" >>"$newfile"
done
cat yy* >newInbox
# To sort, if needed
#cat yy* >booga
#rm newInbox
#mail -f booga <<EOF
#sort date
#s * newInbox
#x
#EOF
rm booga xx* yy*
I used the Windows add-on for Thunderbird because my Linux Thunderbird said "This add-on is not compatible with your version of Thunderbird". However, I just updated my Linux Thunderbird to 91.3.0 and the add-on is now available there. The Linux version fixes the CR/LF problem, but the "From " line issue is still present. I suppose the developer will need to be contacted to fix that.
I hope this can help someone!