~~NOTOC~~ ====== Howto extract e-mail MIME attachments using Perl and uudecode ====== This is a perl script that extracts attachments from e-mails. I use it to keep a copy of all attachments in my postfix mail server before downloading e-mails to my e-mail reader. ===== Procedure ===== One way to extract MIME attachments from e-mails is to convert the attachment from its MIME format to base-64 format then use the unix uudecode utility to decode the attachement. The procedure thus consists of thwo steps. ==== 1. Convert the attachment from MIME format to base-64 format ==== * Find the boundary MIME attachment are delimited by boundaries. The first step is to find these boundaries in order to locate the attachment. The boundary is indicated by the e-mail header attribute 'boundary'. Example from an actual e-mail : boundary=------------ms040402050104070203050000; * Extract the attachment's filename. This is indicated by the 'filename' attribute in the MIME attachment's header. Example filename="report.doc" * Replace the MIME header by the base-64 header. We need to remove the whole MIME header of the attachement and replace it by begin-base64 0644 filename where filename is the filename associated to the attachment. ==== 2. Run uudecode on the base-64 format of the attachment ==== Once we converted the attachment to base-64, we just need to run the uudecode utility on it to extract the actual attached file. ===== The script ===== The following Perl script performs the actions described above. #!/usr/bin/perl use strict; my $boundary="none"; my $type="none"; my $fname="none"; my $location="nowhere"; foreach my $line () { chomp; if ($line =~ /^Content-type:\s.*\sboundary=.*$/i) { $line =~ s/\"//g; ($boundary) = $line=~ m/.*=(.*)$/; next; } elsif (($line =~ /$boundary/) && ($location =~ /nowhere/)){ $location="header"; next; } if ($location =~ /header/) { if ($line =~ /^Content-type:\s.*;\s.*$/i) { ($type) = $line=~ /^Content-type:\s(.*);\s.*$/i; } elsif ($line =~ /^Content-Disposition:\sattachment;\s.*$/i) { $line =~ s/\"//g; ($fname) = $line=~ /^Content-Disposition:\sattachment;\sfilename=(.*)$/i; } else { next unless ($line =~ /^$/) ; if ( not $fname =~ /none/) { $location="body"; open(ATT, ">.".$fname.".b64") || die "Could not open file\n"; print ATT "begin-base64 0644 ". $fname ."\n"; }else { $location="nowhere"; } } } elsif ($location =~ /body/) { if ($line =~ /$boundary/){ print ATT "====\n"; close (ATT); my @args = ("uudecode", ".".$fname.".b64"); system(@args) == 0 or die "system @args failed: $?" ; print $fname."\n"; $fname="none"; $type="none"; if (not $line =~ /$boundary--/){ $location="header"; } } else { print ATT $line; } } } ===== Usage ===== Place the script above in a file (e.g. extract.pl) then try it as follows (//message// is an e-mail that contains attachments.) $cat message | ./extract.pl report1.doc report2.pdf $ls extract.pl message report1.doc report2.pdf The names of the attached file will be printed, and the files will be placed in the current directory. In the example above, the e-mail //message// contained two attached documents 'report1.doc' and 'report2.pdf'. //Note : There is a similar script using use MIME::Parser [[http://www.gnu.org/software/gnats/mimedecode.html|here]].// {{tag>howto services coding}}