~~NOTOC~~
====== Howto extract e-mail MIME attachments using Perl and uudecode ======
This is a perl script that extracts attachments from e-mails. I use it to keep a copy of all attachments in my postfix mail server before downloading e-mails to my e-mail reader.
===== Procedure =====
One way to extract MIME attachments from e-mails is to convert the attachment from its MIME format to base-64 format then use the unix uudecode utility to decode the attachement. The procedure thus consists of thwo steps.
==== 1. Convert the attachment from MIME format to base-64 format ====
* Find the boundary MIME attachment are delimited by boundaries. The first step is to find these boundaries in order to locate the attachment. The boundary is indicated by the e-mail header attribute 'boundary'. Example from an actual e-mail :
boundary=------------ms040402050104070203050000;
* Extract the attachment's filename. This is indicated by the 'filename' attribute in the MIME attachment's header. Example
filename="report.doc"
* Replace the MIME header by the base-64 header. We need to remove the whole MIME header of the attachement and replace it by
begin-base64 0644 filename
where filename is the filename associated to the attachment.
==== 2. Run uudecode on the base-64 format of the attachment ====
Once we converted the attachment to base-64, we just need to run the uudecode utility on it to extract the actual attached file.
===== The script =====
The following Perl script performs the actions described above.
#!/usr/bin/perl
use strict;
my $boundary="none";
my $type="none";
my $fname="none";
my $location="nowhere";
foreach my $line () {
chomp;
if ($line =~ /^Content-type:\s.*\sboundary=.*$/i) {
$line =~ s/\"//g;
($boundary) = $line=~ m/.*=(.*)$/;
next;
} elsif (($line =~ /$boundary/) && ($location =~ /nowhere/)){
$location="header";
next;
}
if ($location =~ /header/) {
if ($line =~ /^Content-type:\s.*;\s.*$/i) {
($type) = $line=~ /^Content-type:\s(.*);\s.*$/i;
} elsif ($line =~ /^Content-Disposition:\sattachment;\s.*$/i) {
$line =~ s/\"//g;
($fname) = $line=~ /^Content-Disposition:\sattachment;\sfilename=(.*)$/i;
} else {
next unless ($line =~ /^$/) ;
if ( not $fname =~ /none/) {
$location="body";
open(ATT, ">.".$fname.".b64") || die "Could not open file\n";
print ATT "begin-base64 0644 ". $fname ."\n";
}else {
$location="nowhere";
}
}
} elsif ($location =~ /body/) {
if ($line =~ /$boundary/){
print ATT "====\n";
close (ATT);
my @args = ("uudecode", ".".$fname.".b64");
system(@args) == 0
or die "system @args failed: $?" ;
print $fname."\n";
$fname="none";
$type="none";
if (not $line =~ /$boundary--/){
$location="header";
}
} else {
print ATT $line;
}
}
}
===== Usage =====
Place the script above in a file (e.g. extract.pl) then try it as follows (//message// is an e-mail that contains attachments.)
$cat message | ./extract.pl
report1.doc
report2.pdf
$ls
extract.pl message report1.doc report2.pdf
The names of the attached file will be printed, and the files will be placed in the current directory. In the example above, the e-mail //message// contained two attached documents 'report1.doc' and 'report2.pdf'.
//Note : There is a similar script using use MIME::Parser [[http://www.gnu.org/software/gnats/mimedecode.html|here]].//
{{tag>howto services coding}}