[Techtalk] stripping attachments with postfix

Kai MacTane kmactane at GothPunk.com
Sat Jul 24 01:18:21 EST 2004


At 7/23/04 11:10 PM , Carla Schroder wrote:
>On Friday 23 July 2004 11:03 pm, Carla Schroder wrote:
>
>This is supposed to work as a global Windows executable rejecter:
>
>/^TVqQAAMAAAAEAAA/ REJECT
>
>then run a body_checks directive on it. The filter works by looking for the
>base64 encoded start of a windows executable, which supposedly always looks
>the same.
>
>Can it be for real? Can it be this easy?

It sure could. (Note, I'm not saying "it is"; just "it could". As DJB puts 
it, "profile, don't speculate." Well, I haven't got the time to install 
Postfix and do all the profiling, but I'll at least mark my speculations as 
such, rather than letting them be mistaken for profiling.)

Anyway, I expect this is working the same way as file(1) does when it 
returns something like "MS Windows executable (EXE)". file works by 
actually scanning the first few bytes of the file, looking to see if they 
match any of the magic numbers listed in (usually) /etc/magic. Naturally, 
this means the first few bytes of every MS EXE are the same.
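
A minimal sketch of that kind of magic-number check, in Python. The function 
name and the bare two-byte "MZ" test are only an illustration of the idea, 
not what file(1) or /etc/magic actually does internally:

```python
def looks_like_dos_exe(data: bytes) -> bool:
    """Return True if the data starts with the two-byte 'MZ' magic (0x4d 0x5a)."""
    return data[:2] == b"MZ"


# First 16 bytes of CALC.EXE, retyped from the hex dump in this message.
calc_head = bytes.fromhex("4d 5a 90 00 03 00 00 00 04 00 00 00 ff ff 00 00")

print(looks_like_dos_exe(calc_head))      # the dump starts with 'MZ'
print(looks_like_dos_exe(b"\x7fELF\x02"))  # an ELF header does not
```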

Okay, I *did* try some profiling... I uploaded calc.exe and notepad.exe to 
my home Linux box and ran file on them. Here's what I got:

bin/CALC.EXE:            MS-DOS executable (EXE), OS/2 or MS Windows
bin/NOTEPAD.EXE:         MS Windows PE 32-bit Intel 80386 GUI executable

(Then I tried with a bunch of other common and not-so-common executables, 
and got one or the other of those same two results.)

However, looking at them in vi, I saw the same first few bytes anyway. 
Specifically, the first 128 bytes (bytes 0 through 127) differ in only one 
byte, byte 60. Even more specifically, they read as follows. Here's the 
beginning of CALC.EXE:

  offset    0  1  2  3   4  5  6  7   8  9  a  b   c  d  e  f  0123456789abcdef
00000000  4d 5a 90 00  03 00 00 00  04 00 00 00  ff ff 00 00  MZ..........ÿÿ..
00000010  b8 00 00 00  00 00 00 00  40 00 00 00  00 00 00 00  ¸.......@.......
00000020  00 00 00 00  00 00 00 00  00 00 00 00  00 00 00 00  ................
00000030  00 00 00 00  00 00 00 00  00 00 00 00  c8 00 00 00  ............È...
00000040  0e 1f ba 0e  00 b4 09 cd  21 b8 01 4c  cd 21 54 68  ..º..´.Í!¸.LÍ!Th
00000050  69 73 20 70  72 6f 67 72  61 6d 20 63  61 6e 6e 6f  is program canno
00000060  74 20 62 65  20 72 75 6e  20 69 6e 20  44 4f 53 20  t be run in DOS
00000070  6d 6f 64 65  2e 0d 0d 0a  24 00 00 00  00 00 00 00  mode....$.......
00000080  d6 c3 38 22  92 a2 56 71  92 a2 56 71  92 a2 56 71  ÖÃ8".¢Vq.¢Vq.¢Vq

NOTEPAD.EXE begins just the same, up to byte 128 (line 00080), except for 
that 0xC8 byte in line 00030, offset c (byte 60), which is a 0x80 instead. 
Starting at line 00080 offset 0 (byte 128), it has 'PE' and then a couple 
of null bytes (0x00).

According to the /etc/magic file on my machine, that "PE\0\0" string at 
byte 128 is what identifies it as an MS Windows PE rather than an MS-DOS 
executable.
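
One possible reading of that single differing byte, sketched in Python: in 
the DOS header layout, bytes 60-63 hold a little-endian offset to the 
"PE\0\0" signature, which would be 0x80 (128) for NOTEPAD.EXE and 0xC8 
(200) for CALC.EXE. That CALC.EXE really carries the signature at byte 200 
is an assumption here, not something checked against the file. The helper 
names and the synthetic headers are made up for illustration, and this is 
not how the /etc/magic rule works:

```python
import struct

PE_SIGNATURE = b"PE\0\0"


def pe_offset(data: bytes) -> int:
    # Assumption: bytes 60-63 hold a little-endian 32-bit file offset.
    return struct.unpack_from("<I", data, 60)[0]


def has_pe_signature(data: bytes) -> bool:
    """Follow the offset stored at byte 60 and look for 'PE\\0\\0' there."""
    if len(data) < 64 or data[:2] != b"MZ":
        return False
    offset = pe_offset(data)
    return data[offset:offset + 4] == PE_SIGNATURE


def fake_exe(offset: int) -> bytes:
    """Build a synthetic header (not a real executable) for illustration."""
    header = bytearray(offset)          # zero-filled up to the signature
    header[0:2] = b"MZ"
    struct.pack_into("<I", header, 60, offset)
    return bytes(header) + PE_SIGNATURE


print(has_pe_signature(fake_exe(0x80)))  # offset value seen in NOTEPAD.EXE
print(has_pe_signature(fake_exe(0xC8)))  # offset value seen in CALC.EXE
```

A check at a fixed byte 128 would only match the first of these, which 
might explain the two different answers from file.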

I'm not sufficiently versed in base64 encoding to speculate on whether the 
first 60 bytes of the input file being the same is enough to ensure that 
the first 16 characters of the base64 encoding are identical. But from a 
cursory reading of http://www.freesoft.org/CIE/RFC/2065/56.htm, it looks 
like any given 3 bytes of an input file, taken in groups from the start of 
the file, will always produce the same 4 characters of base64. So the first 
60 bytes in Windows executables (20 groups of 3) *should* suffice to ensure 
that any base64-encoded Windows executable has the same first 80 characters 
(20 groups of 4) as a recognizable fingerprint.
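
The arithmetic above can be checked with a short Python sketch. It encodes 
the first 60 bytes from the CALC.EXE dump with the standard base64 module, 
then appends the two variants of byte 60 (0xC8 and 0x80) to see where the 
encodings start to differ. The variable names are made up; the data is just 
the hex dump retyped:

```python
import base64
import re

# Bytes 0-59, identical in CALC.EXE and NOTEPAD.EXE per the hex dump.
common = bytes.fromhex(
    "4d 5a 90 00 03 00 00 00 04 00 00 00 ff ff 00 00"
    "b8 00 00 00 00 00 00 00 40 00 00 00 00 00 00 00"
    "00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00"
    "00 00 00 00 00 00 00 00 00 00 00 00"
)
assert len(common) == 60

# The pattern quoted from the body_checks rule.
pattern = re.compile(r"^TVqQAAMAAAAEAAA")

# Bytes 60-63 are the only part that differs between the two files.
calc = base64.b64encode(common + bytes.fromhex("c8 00 00 00")).decode("ascii")
notepad = base64.b64encode(common + bytes.fromhex("80 00 00 00")).decode("ascii")

print(calc[:16])                                         # TVqQAAMAAAAEAAAA
print(bool(pattern.match(calc)), bool(pattern.match(notepad)))
print(calc[:80] == notepad[:80])                         # True
print(calc[80], notepad[80])                             # y g
```

So the 15-character pattern depends only on the first 12 bytes of the file, 
and the two encodings agree for exactly 80 characters before byte 60 makes 
them diverge.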

At least, I'd wager this will catch 95% or more of them.

(Jeez, I know I said I wasn't going to profile. Oops. It's late and my mind 
tends to wander. Sorry about that.)

                                                 --Kai MacTane
----------------------------------------------------------------------
"I am the storm. My voice is the river.
  Take from me, I fade into you..."
                                                 --The Last Dance,
                                                  "Fairytale (the Storm)"


