Howto extract e-mail MIME attachments using Perl and uudecode
This is a perl script that extracts attachments from e-mails. I use it to keep a copy of all attachments in my postfix mail server before downloading e-mails to my e-mail reader.
Procedure
One way to extract MIME attachments from e-mails is to convert the attachment from its MIME format to base-64 format then use the unix uudecode utility to decode the attachement. The procedure thus consists of thwo steps.
1. Convert the attachment from MIME format to base-64 format
- Find the boundary MIME attachment are delimited by boundaries. The first step is to find these boundaries in order to locate the attachment. The boundary is indicated by the e-mail header attribute 'boundary'. Example from an actual e-mail :
boundary=------------ms040402050104070203050000;
- Extract the attachment's filename. This is indicated by the 'filename' attribute in the MIME attachment's header. Example
filename="report.doc"
- Replace the MIME header by the base-64 header. We need to remove the whole MIME header of the attachement and replace it by
begin-base64 0644 filename
where filename is the filename associated to the attachment.
2. Run uudecode on the base-64 format of the attachment
Once we converted the attachment to base-64, we just need to run the uudecode utility on it to extract the actual attached file.
The script
The following Perl script performs the actions described above.
#!/usr/bin/perl
use strict;
my $boundary="none";
my $type="none";
my $fname="none";
my $location="nowhere";
foreach my $line (<STDIN>) {
chomp;
if ($line =~ /^Content-type:\s.*\sboundary=.*$/i) {
$line =~ s/\"//g;
($boundary) = $line=~ m/.*=(.*)$/;
next;
} elsif (($line =~ /$boundary/) && ($location =~ /nowhere/)){
$location="header";
next;
}
if ($location =~ /header/) {
if ($line =~ /^Content-type:\s.*;\s.*$/i) {
($type) = $line=~ /^Content-type:\s(.*);\s.*$/i;
} elsif ($line =~ /^Content-Disposition:\sattachment;\s.*$/i) {
$line =~ s/\"//g;
($fname) = $line=~ /^Content-Disposition:\sattachment;\sfilename=(.*)$/i;
} else {
next unless ($line =~ /^$/) ;
if ( not $fname =~ /none/) {
$location="body";
open(ATT, ">.".$fname.".b64") || die "Could not open file\n";
print ATT "begin-base64 0644 ". $fname ."\n";
}else {
$location="nowhere";
}
}
} elsif ($location =~ /body/) {
if ($line =~ /$boundary/){
print ATT "====\n";
close (ATT);
my @args = ("uudecode", ".".$fname.".b64");
system(@args) == 0
or die "system @args failed: $?" ;
print $fname."\n";
$fname="none";
$type="none";
if (not $line =~ /$boundary--/){
$location="header";
}
} else {
print ATT $line;
}
}
}
Usage
Place the script above in a file (e.g. extract.pl) then try it as follows (message is an e-mail that contains attachments.)
$cat message | ./extract.pl report1.doc report2.pdf $ls extract.pl message report1.doc report2.pdf
The names of the attached file will be printed, and the files will be placed in the current directory. In the example above, the e-mail message contained two attached documents 'report1.doc' and 'report2.pdf'.
Note : There is a similar script using use MIME::Parser here.
| Labels: howto, services, coding |
|

Comment