
Technicality

· 26TH OF AUGUST, THE YEAR 2005

CONVERTING MAILDIR TO MBOX

I recently had to convert a bunch of maildir messages to mbox format so I could import them into Thunderbird. Well, maybe Thunderbird reads maildir messages. Didn’t even think of that. Anyway, not a very difficult task. Here’s my script:


<?php
/*
This script should convert a directory full of maildir files to a single mbox file.

Note that if an email has a line starting with From: and containing an @ somewhere before the actual From line, this script will bork. That's pretty unlikely though.

Output is a file called archive.mbox. If you're importing into Mozilla Thunderbird, just drop it into the Mail/Local Folders dir in your Thunderbird Profile.

To use the script, just drop it in the same directory as the maildir messages and run it.
*/

// get a list of everything in the current directory
$files_arr = scandir('.');

$final = '';

// foreach file, read it into an array
foreach ($files_arr as $file) {
    // only regular files whose names are numeric once trailing dots are stripped (e.g. "1234.");
    // longer maildir names like "1125000000.M1P2.host:2,S" are skipped, so loosen this check if needed
    if (is_file($file) && is_numeric(rtrim($file, '.'))) {
        $this_file = file($file);

        // find the first line that starts with From: and contains an @, and put a copy without the colon at the top
        foreach ($this_file as $line) {
            // strpos() returns false when there is no match, so compare with !== rather than !=
            if (strpos($line, 'From: ') === 0 && strpos($line, '@') !== false) {
                $line = str_replace('From: ', 'From ', $line);
                array_unshift($this_file, $line);
                break;
            }
        }

        // implode the file to a string and add it onto the end of the mbox
        $this_file_str = implode('', $this_file);
        $final .= "$this_file_str\n";
    }
}

$fp = fopen('archive.mbox', 'w');
fwrite($fp, $final);
fclose($fp);

?>
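
For what it's worth, stricter mbox readers expect the separator line to look like "From sender date", and body lines that begin with "From " are usually escaped with a leading ">". Here is a minimal sketch of that idea, not a drop-in replacement. The function name message_to_mbox_entry, the MAILER-DAEMON fallback, and the use of a caller-supplied timestamp are assumptions made for illustration; folded headers and other edge cases are not handled.

```php
<?php
// Hypothetical helper, not part of the script above: turns one raw message
// into an mbox entry with a "From sender date" separator line.
function message_to_mbox_entry(string $raw, int $timestamp): string
{
    // Normalise line endings, then split headers from body at the first blank line.
    $raw = str_replace("\r\n", "\n", $raw);
    $parts = explode("\n\n", $raw, 2);
    $headers = $parts[0];
    $body = $parts[1] ?? '';

    // Pull an address out of the From: header; fall back to MAILER-DAEMON (an assumption).
    $sender = 'MAILER-DAEMON';
    if (preg_match('/^From:.*?([^\s<>"]+@[^\s<>"]+)/mi', $headers, $m)) {
        $sender = $m[1];
    }

    // Escape body lines that begin with "From " so a reader doesn't split the message there.
    $body = preg_replace('/^From /m', '>From ', $body);

    // asctime-style UTC date with a space-padded day of month.
    $date = gmdate('D M ', $timestamp)
        . str_pad(gmdate('j', $timestamp), 2, ' ', STR_PAD_LEFT)
        . gmdate(' H:i:s Y', $timestamp);

    // Separator, headers, blank line, body, and a blank line before the next entry.
    return "From $sender $date\n" . $headers . "\n\n" . rtrim($body, "\n") . "\n\n";
}
```

Inside the loop, something like message_to_mbox_entry(file_get_contents($file), filemtime($file)) could then stand in for the array_unshift step, with the file's modification time serving as a rough delivery date.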

16 COMMENTS

tony said on August 29th, 2005 at 2:17 am,

is thunderbird better than mail? i haven’t really tried it out at all.

Marx Fucking Twain said on September 24th, 2005 at 3:43 pm,

Thanks mate.
Who the fuck writes utility scripts in PHP though? After I finish some other shit I’m going to convert this to Perl and post it here.

Marx Fucking Twain said on September 24th, 2005 at 9:29 pm,

PHP is gay and any blog that needs moderation is run by a pussy.

#!/usr/bin/perl
# Originally from http://www.pageofguh.org/technicality/524
# Converted from PHP to Perl by Marx Twain

# This script should convert a directory full of maildir files to a single mbox file.
# Note that if an email has a line starting with From: and containing an @ somewhere before the actual From line, this script will bork. That's pretty unlikely though.
# Output is a file called archive.mbox. If you're importing into Mozilla Thunderbird, just drop it into the Mail/Local Folders dir in your Thunderbird Profile.
# To use the script, just drop it in the same directory as the maildir messages and run it.

# get list of files that aren't directories
foreach (<*>) {
    push @files_arr, $_ if -f $_;
}

$/ = undef; #SLURP HLUAHGLUAHLGHLGUHAL

# foreach file, read it into a string
for $file (@files_arr) {
    chomp $file;

    if ( $file =~ /^\d+\./ ) {
        open(FILE, "<", $file) or die "Can't read $file: $!";
        $filetext = <FILE>;
        close(FILE);

        @lines = split /\n/, $filetext;

        # take the first line that starts with From: and contains an @,
        # and put a copy without the colon at the top
        $text = "";
        $from = "";
        for $line (@lines) {
            if ( $from eq "" and $line =~ /^From: .*\@/ ) {
                ( my $sep = $line ) =~ s/^From: /From /;
                $from = "\n" . $sep . "\n";
            }
            # every line, including the original From: header, stays in the message
            $text .= $line . "\n";
        }

        # add the separator and the message onto the end of the mbox
        $final .= $from . $text if length($text) > 0 and length($from) > 0;
    }
}

open(FILE, ">", "archive.mbox") or die "Can't write archive.mbox: $!";
print FILE $final;
close(FILE);

Marx Fucking Twain said on September 25th, 2005 at 8:50 am,

oh, that foreach should have < * > (html eater ruined it)

Marx Fucking Twain said on September 25th, 2005 at 8:56 am,

Christ.
Reposting using a code tag.

#!/usr/bin/perl
# get list of files that aren't directories
foreach (<*>) {
    push @files_arr, $_ if -f $_;
}

$/ = undef; #SLURP HLUAHGLUAHLGHLGUHAL

# foreach file, read it into a string
for $file (@files_arr) {
    chomp $file;

    if ( $file =~ /^\d+\./ ) {
        open(FILE, "<", $file) or die "Can't read $file: $!";
        $filetext = <FILE>;
        close(FILE);

        @lines = split /\n/, $filetext;

        # take the first line that starts with From: and contains an @,
        # and put a copy without the colon at the top
        $text = "";
        $from = "";
        for $line (@lines) {
            if ( $from eq "" and $line =~ /^From: .*\@/ ) {
                ( my $sep = $line ) =~ s/^From: /From /;
                $from = "\n" . $sep . "\n";
            }
            # every line, including the original From: header, stays in the message
            $text .= $line . "\n";
        }

        # add the separator and the message onto the end of the mbox
        $final .= $from . $text if length($text) > 0 and length($from) > 0;
    }
}

open(FILE, ">", "archive.mbox") or die "Can't write archive.mbox: $!";
print FILE $final;
close(FILE);

JD said on September 25th, 2005 at 8:53 pm,

I write PHP utility scripts all the time. For one, I know PHP so well that it's easier than doing it in any other language. But also, so does everyone else I work with, since we do PHP web development.

Even so, I think PHP is an excellent utility language.. it has the same power as Perl and is generally more readable. There are no $_, $/, etc. Note also that while regular expressions are certainly powerful (and php supports them), Mr. Ueda didn’t use any which makes it easier to understand what’s going on… not to mention faster.

I don’t mean to start a PHP vs Perl debate… I’ve used and loved both.. my real point is that given today’s PHP market penetration, it makes a great deal of sense to write these types of things in PHP.

Samat said on September 25th, 2005 at 9:17 pm,

Just to note: you want to turn off what I call “retard quotes” (the styled open and closed quotes that Microsoft Office uses and your blog is replacing real quotes with) so people who find this can just copy and paste.

My browser (Firefox on Linux) replaces these “retard quotes” with ‘?’s and I have to go back and fix them…

ken-ichi said on September 25th, 2005 at 9:48 pm,

Done and done. Curly quotes problem solved using this advice.

mike said on December 13th, 2006 at 3:56 pm,

Hey all, I tried the php script and was able to get it to create a file, but it doesn’t contain anything.

I didn’t get any errors when running it. I am on Redhat 9 / exim . Any ideas?

TRTech said on April 6th, 2007 at 12:53 pm,

Thanks Kanedaaa, worked like a charm…

i really have nothing to say said on August 9th, 2007 at 4:09 pm,

[...] I’m in the process of migrating a domain from one server to Google Apps. I’ve got email dating back to 2001 on this server and I don’t want to lose it. However, I don’t really want it all imported to Google either. One problem I ran into is that MailSteward can import mbox files but doesn’t have support for maildir (that I could find). After flexing my Google-fu, I found this blog post which, in the comments, references this bash script. After downloading and running the script on my Maildir, I had a nice little 72Mb mbox file containing approximately 1,130 email. I transferred it to my laptop, fired up MailSteward and imported it. Within five minutes, MailSteward had imported the entire file, with the exception of 20 duplicate emails that it found. It skipped the dupes. Nice. [...]

Corey said on November 23rd, 2007 at 1:54 pm,

Why not just use formail (Included with procmail)

Enter into the maildir, and do something like this:

for i in new/* ;do formail > ../mbox;done

Corey said on November 23rd, 2007 at 1:56 pm,

for i in new/* cur/*;do formail \\> ../mbox;done

Corey said on November 23rd, 2007 at 1:57 pm,

Ah, forget it…this thing keeps thinking I’m uploading html.

for i in new/* cur/*; do formail <"$i" >>../mbox; done

Corey said on November 23rd, 2007 at 1:58 pm,

That's a less-than before "$i" and two greater-thans before ../mbox.
