[techtalk] mail format (was re: ftp for non users)

Jeff Dike jdike at karaya.com
Sun Nov 28 00:18:01 EST 1999


Below is a little Perl script I wrote that removes HTML parts from messages 
that have text parts as well.  I wrote it because exmh goes ballistic when it 
sees such mail.  The only downside that I've seen is that it can't tell whether 
the two parts have the same content, so if someone sends you a plain text email 
with an HTML attachment, the HTML gets thrown away.

I deal with this with a procmail rule which saves everything away before 
mucking with it:

# Save unmodified mail in inbox/allmail
:0 c
| /usr/lib/mh/rcvstore +inbox/allmail

# Pass it through mail.pl and run it through the rest of the rules
:0 f
| perl mail.pl

				Jeff

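use strict;
use warnings;

my %boundaries;          # MIME boundary strings seen so far
my @parts;               # the message, split at boundary lines
my $html_index = -1;     # index in @parts of the text/html part
my $text_index = -1;     # index in @parts of the text/plain part
my $part = "";           # the part currently being accumulated

while(<>){
    chomp;               # chomp, not chop: only strip a trailing newline
    # Remember every boundary declared in a Content-Type header,
    # quoted or unquoted.
    if(/boundary="([^"]*)"/i || /boundary=([^\s;"]+)/i){
        $boundaries{$1} = 1 if length $1;
    }
    if(/^Content-Type: text\/plain/i){
        $text_index = $#parts + 1;
    }
    elsif(/^Content-Type: text\/html/i){
        $html_index = $#parts + 1;
    }
    else {
        # A line containing a known boundary starts a new part.  The
        # boundary is compared as a literal string, so characters that
        # are special in a regexp need no escaping.
        foreach my $boundary (keys(%boundaries)){
            if(index($_, $boundary) >= 0){
                push @parts, $part;
                $part = "";
                last;
            }
        }
    }
    $part .= "$_\n";
}
push @parts, $part;

# Print every part except the html one.  The html part is dropped only
# when both a text part and an html part were seen and they differ.
for my $i (0..$#parts){
    if(($html_index == -1) || ($text_index == -1) ||
       ($html_index == $text_index) || ($i != $html_index)){
        print $parts[$i];
    }
}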
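
One possible way to narrow the downside mentioned at the top (a text mail
with an HTML attachment losing its attachment) is to strip HTML only when
the message says its parts are alternatives.  MIME uses
multipart/alternative for parts that carry the same content in different
formats, while attachments normally travel in multipart/mixed.  The helper
below is a minimal sketch and not part of mail.pl: the name has_alternative
is made up for illustration, and it looks for the header anywhere in the
message, so a nested or quoted header would fool it.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of mail.pl: returns 1 when the message text
# contains a Content-Type header declaring multipart/alternative, else 0.
# Assumption: the header starts at the beginning of a line.
sub has_alternative {
    my ($message) = @_;
    return $message =~ m{^Content-Type:\s*multipart/alternative\b}mi ? 1 : 0;
}

# Two tiny sample headers, invented for this example.
my $alternative = "Content-Type: multipart/alternative; boundary=\"b1\"\n\n--b1\n";
my $mixed       = "Content-Type: multipart/mixed; boundary=\"b2\"\n\n--b2\n";

print has_alternative($alternative) ? "strip html\n" : "keep html\n";
print has_alternative($mixed)       ? "strip html\n" : "keep html\n";
```

In mail.pl the same test could be made on each line inside the while loop,
setting a flag that the final print loop checks before dropping the HTML
part.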



************
techtalk at linuxchix.org   http://www.linuxchix.org



