Technicality
26TH OF AUGUST, THE YEAR 2005
CONVERTING MAILDIR TO MBOX
I recently had to convert a bunch of maildir messages to mbox format so I could import them into Thunderbird. Well, maybe Thunderbird reads maildir messages. Didn’t even think of that. Anyway, not a very difficult task. Here’s my script:
<?php
/*
This script should convert a directory full of maildir files to a single mbox file.
Note that if an email has a line starting with From: and containing an @ somewhere before the actual From line, this script will bork. That's pretty unlikely though.
Output is a file called archive.mbox. If you're importing into Mozilla Thunderbird, just drop it into the Mail/Local Folders dir in your Thunderbird Profile.
To use the script, just drop it in the same directory as the maildir messages and run it.
*/
//get list of directory entries (directories are filtered out by is_file() below)
$files_arr = scandir('.');
$final = '';
//foreach file, read it into an array
foreach ($files_arr as $file) {
    //only names that are numeric once trailing dots are stripped count as messages;
    //adjust this test if your maildir files are named differently
    if (is_file($file) && is_numeric(rtrim($file, '.'))) {
        $this_file = file($file);
        //foreach line that starts with From: and contains an @, put a copy without the colon at the top
        foreach ($this_file as $line) {
            //strpos() returns 0 for a match at the start, so compare with === and !==
            if (strpos($line, 'From: ') === 0 && strpos($line, '@') !== false) {
                $line = str_replace('From: ', 'From ', $line);
                array_unshift($this_file, $line);
                break;
            }
        }
        //implode the file to a string and add it onto the end of the mbox
        $this_file_str = implode('', $this_file);
        $final .= "$this_file_str\n";
    }
}
$fp = fopen('archive.mbox', 'w');
fwrite($fp, $final);
fclose($fp);
?>
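For comparison, here is a minimal sketch of the same idea in Python, using the standard library's `mailbox` module. This is not the script above, just an illustration: the function name `files_to_mbox` and its two parameters are made up for this example. It treats every regular file in a directory as one message, and lets `mailbox.mbox` write the `From ` separator line and escape any body line that happens to start with `From `, which is the case the PHP version warns about.

```python
import mailbox
from pathlib import Path


def files_to_mbox(src_dir, mbox_path):
    """Append every regular file in src_dir to an mbox file, one message each.

    Hypothetical helper for illustration. Returns the number of messages written.
    """
    target = Path(mbox_path).resolve()
    out = mailbox.mbox(mbox_path)  # creates the file if it does not exist
    count = 0
    try:
        for path in sorted(Path(src_dir).iterdir()):
            # Skip directories and the output file itself, in case it
            # lives in the same directory as the messages.
            if not path.is_file() or path.resolve() == target:
                continue
            # add() accepts raw bytes, writes the "From " separator line,
            # and escapes body lines that begin with "From ".
            out.add(path.read_bytes())
            count += 1
        out.flush()
    finally:
        out.close()
    return count
```

Calling `files_to_mbox('.', 'archive.mbox')` from the directory holding the messages would mirror how the PHP script is meant to be run. Unlike the PHP script, this sketch does not filter on file names, so it assumes the directory holds nothing but mail.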

16 COMMENTS
is thunderbird better than mail? i haven’t really tried it out at all.
Thanks mate.
Who the fuck writes utility scripts in PHP though? After I finish some other shit I’m going to convert this to Perl and post it here.
PHP is gay and any blog that needs moderation is run by a pussy.
#!/usr/bin/perl
# Originally from http://www.pageofguh.org/technicality/524
# Converted from PHP to Perl by Marx Twain
# This script should convert a directory full of maildir files to a single mbox file.
# Note that if an email has a line starting with From: and containing an @ somewhere before the actual From line, this script will bork. That's pretty unlikely though.
# Output is a file called archive.mbox. If you're importing into Mozilla Thunderbird, just drop it into the Mail/Local Folders dir in your Thunderbird Profile.
# To use the script, just drop it in the same directory as the maildir messages and run it.
#get list of files that aren't directories
foreach (<*>) {
    push @files_arr, $_ if -f $_;
}
$/ = undef; #SLURP HLUAHGLUAHLGHLGUHAL
#foreach file, read it into an array
for $file (@files_arr) {
    chomp $file;
    if( $file =~ /^\d+\./ ) {
        open(FILE, "<", $file) or next;
        $filetext = <FILE>;
        close(FILE);
        @lines = split /\n/, $filetext;
        #for the first line that starts with From:, put a copy without the colon at the top
        $text = "";
        $from = "";
        for $line (@lines) {
            if( $from eq "" and $line =~ /^From: / ) {
                #copy the line so the original From: header stays in the message
                ($copy = $line) =~ s/^From: /From /;
                $from = "\n" . $copy . "\n";
            }
            $text .= $line . "\n";
        }
        #implode the file to a string and add it onto the end of the mbox
        $final .= $from . $text if length($text) > 0 and length($from) > 0;
    }
}
open(FILE, ">", "archive.mbox") or die "can't write archive.mbox: $!";
print FILE $final;
close(FILE);
oh, that foreach should have < * > (html eater ruined it)
Christ.
Reposting using a code tag.
#!/usr/bin/perl
#get list of files that aren't directories
foreach (<*>) {
    push @files_arr, $_ if -f $_;
}
$/ = undef; #SLURP HLUAHGLUAHLGHLGUHAL
#foreach file, read it into an array
for $file (@files_arr) {
    chomp $file;
    if( $file =~ /^\d+\./ ) {
        open(FILE, "<", $file) or next;
        $filetext = <FILE>;
        close(FILE);
        @lines = split /\n/, $filetext;
        #for the first line that starts with From:, put a copy without the colon at the top
        $text = "";
        $from = "";
        for $line (@lines) {
            if( $from eq "" and $line =~ /^From: / ) {
                #copy the line so the original From: header stays in the message
                ($copy = $line) =~ s/^From: /From /;
                $from = "\n" . $copy . "\n";
            }
            $text .= $line . "\n";
        }
        #implode the file to a string and add it onto the end of the mbox
        $final .= $from . $text if length($text) > 0 and length($from) > 0;
    }
}
open(FILE, ">", "archive.mbox") or die "can't write archive.mbox: $!";
print FILE $final;
close(FILE);
I write PHP utility scripts all the time. For one, I know PHP so well that it's easier than doing it in any other language. But also, so does everyone else I work with, since we do PHP web development.
Even so, I think PHP is an excellent utility language.. it has the same power as Perl and is generally more readable. There are no $_, $/, etc. Note also that while regular expressions are certainly powerful (and php supports them), Mr. Ueda didn’t use any which makes it easier to understand what’s going on… not to mention faster.
I don’t mean to start a PHP vs Perl debate… I’ve used and loved both.. my real point is that given today’s PHP market penetration, it makes a great deal of sense to write these types of things in PHP.
Just to note: you want to turn off what I call “retard quotes” (the styled open and closed quotes that Microsoft Office uses and your blog is replacing real quotes with) so people who find this can just copy and paste.
My browser (Firefox on Linux) replaces these “retard quotes” with ‘?’s and I have to go back and fix them…
Done and done. Curly quotes problem solved using this advice.
Hey all, I tried the php script and was able to get it to create a file, but it doesn’t contain anything.
I didn’t get any errors when running it. I am on Redhat 9 / exim . Any ideas?
I wrote a bash script: http://kaneda.bohater.net/files/maildir2mbox.sh
Thanks Kanedaaa, worked like a charm…
[...] I’m in the process of migrating a domain from one server to Google Apps. I’ve got email dating back to 2001 on this server and I don’t want to lose it. However, I don’t really want it all imported to Google either. One problem I ran into is that MailSteward can import mbox files but doesn’t have support for maildir (that I could find). After flexing my Google-fu, I found this blog post which, in the comments, references this bash script. After downloading and running the script on my Maildir, I had a nice little 72Mb mbox file containing approximately 1,130 email. I transferred it to my laptop, fired up MailSteward and imported it. Within five minutes, MailSteward had imported the entire file, with the exception of 20 duplicate emails that it found. It skipped the dupes. Nice. [...]
Why not just use formail (included with procmail)?
Enter into the maildir, and do something like this:
for i in new/* cur/*; do formail <"$i" >>../mbox; done
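If formail isn't installed, the same loop over new/ and cur/ can be sketched with Python's standard `mailbox` module. Treat this as a rough illustration only: the function name `maildir_to_mbox` and its parameters are invented for the example, and it assumes a regular Maildir layout with `new`, `cur` and `tmp` subdirectories.

```python
import mailbox


def maildir_to_mbox(maildir_path, mbox_path):
    """Copy every message found in a Maildir (new/ and cur/) into an mbox file.

    Hypothetical helper for illustration. Returns the number of messages copied.
    """
    # create=False: fail instead of silently creating an empty Maildir
    src = mailbox.Maildir(maildir_path, create=False)
    dst = mailbox.mbox(mbox_path)
    count = 0
    try:
        for message in src:
            # Converting to mboxMessage carries Maildir flags (such as
            # "seen") over to the mbox Status headers.
            dst.add(mailbox.mboxMessage(message))
            count += 1
        dst.flush()
    finally:
        dst.close()
        src.close()
    return count
```

Something like `maildir_to_mbox('Maildir', 'mbox')` would then play the role of the one-liner above. Like the shell loop, it appends to an existing mbox file rather than replacing it.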