[Techtalk] stripping attachments with postfix
Kai MacTane
kmactane at GothPunk.com
Sat Jul 24 01:18:21 EST 2004
At 7/23/04 11:10 PM , Carla Schroder wrote:
>On Friday 23 July 2004 11:03 pm, Carla Schroder wrote:
>
>This is supposed to work as a global Windows executable rejecter:
>
>/^TVqQAAMAAAAEAAA/ REJECT
>
>then run a body_checks directive on it. The filter works by looking for the
>base64 encoded start of a windows executable, which supposedly always looks
>the same.
>
>Can it be for real? Can it be this easy?
It sure could. (Note, I'm not saying "it is"; just "it could". As DJB puts
it, "profile, don't speculate." Well, I haven't got the time to install
Postfix and do all the profiling, but I'll at least mark my speculations as
such, rather than letting them be mistaken for profiling.)
Anyway, I expect this is working the same way as file(1) does when it
returns something like "MS Windows executable (EXE)". file works by
actually scanning the first few bytes of the file, looking to see if they
match any of the magic numbers listed in (usually) /etc/magic. Naturally,
this means the first few bytes of every MS EXE are the same.
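To make the magic-number idea concrete, here is a minimal Python sketch of that kind of check. It is not how file(1) is actually implemented; the three-entry signature table and the sniff() helper are made up for illustration, and a real magic file is far larger and also matches at offsets other than zero.

```python
# Hypothetical, hand-picked signature table; a real magic file has
# thousands of entries and richer matching rules.
MAGIC_NUMBERS = [
    (b"MZ", "DOS/Windows executable"),
    (b"\x7fELF", "ELF executable"),
    (b"%PDF", "PDF document"),
]


def sniff(data: bytes) -> str:
    """Return the label of the first signature that the data starts with."""
    for signature, label in MAGIC_NUMBERS:
        if data.startswith(signature):
            return label
    return "unknown"


# First 16 bytes of CALC.EXE, copied from the hex dump further down.
calc_head = bytes.fromhex("4d5a90000300000004000000ffff0000")
print(sniff(calc_head))
```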
Okay, I *did* try some profiling... I uploaded calc.exe and notepad.exe to
my home Linux box and ran file on them. Here's what I got:
bin/CALC.EXE: MS-DOS executable (EXE), OS/2 or MS Windows
bin/NOTEPAD.EXE: MS Windows PE 32-bit Intel 80386 GUI executable
(Then I tried a bunch of other common and not-so-common executables,
and got one or the other of those two results every time.)
However, looking at them in vi, I saw the same first few bytes anyway.
Specifically, the first 128 bytes (offsets 0 through 127) differ in only
one place, byte 60. Even more specifically, they read as follows. Here's the
beginning of CALC.EXE:
offset 0 1 2 3 4 5 6 7 8 9 a b c d e f 0123456789abcdef
00000000 4d 5a 90 00 03 00 00 00 04 00 00 00 ff ff 00 00 MZ..........ÿÿ..
00000010 b8 00 00 00 00 00 00 00 40 00 00 00 00 00 00 00 ¸.......@.......
00000020 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00000030 00 00 00 00 00 00 00 00 00 00 00 00 c8 00 00 00 ............È...
00000040 0e 1f ba 0e 00 b4 09 cd 21 b8 01 4c cd 21 54 68 ..º..´.Í!¸.LÍ!Th
00000050 69 73 20 70 72 6f 67 72 61 6d 20 63 61 6e 6e 6f is program canno
00000060 74 20 62 65 20 72 75 6e 20 69 6e 20 44 4f 53 20 t be run in DOS
00000070 6d 6f 64 65 2e 0d 0d 0a 24 00 00 00 00 00 00 00 mode....$.......
00000080 d6 c3 38 22 92 a2 56 71 92 a2 56 71 92 a2 56 71 ÖÃ8".¢Vq.¢Vq.¢Vq
NOTEPAD.EXE begins just the same, up to byte 128 (line 00000080), except for
that 0xC8 byte in line 00000030, offset c (byte 60), which is a 0x80 instead.
Starting at line 00000080 offset 0 (byte 128), it has 'PE' and then a couple
of null bytes (0x00).
According to the /etc/magic file on my machine, that "PE\0\0" string at
byte 128 is what identifies it as an MS Windows PE rather than an MS-DOS
executable.
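Here is a rough Python sketch of that distinction. It leans on one fact that isn't stated above: bytes 60 through 63 of an MZ header hold a little-endian offset to where the PE signature lives. That would presumably explain the one differing byte: NOTEPAD.EXE points at byte 0x80 (128), while CALC.EXE points at byte 0xC8 (200), where a magic file that only looks at byte 128 wouldn't find it. The classify() helper and the synthetic NOTEPAD-like header are illustrative only, not taken from either real file.

```python
import struct


def classify(header: bytes) -> str:
    """Rough MZ-versus-PE check on the leading bytes of a file."""
    if header[:2] != b"MZ":
        return "not an MZ executable"
    if len(header) < 0x40:
        return "MZ executable (header too short to check for PE)"
    # Bytes 60-63 (0x3c-0x3f): little-endian offset of the PE signature.
    (pe_offset,) = struct.unpack_from("<I", header, 0x3C)
    if header[pe_offset:pe_offset + 4] == b"PE\0\0":
        return "PE executable"
    return "MZ executable (no PE signature in the bytes given)"


# Synthetic NOTEPAD-like header: 'MZ', byte 60 = 0x80, 'PE\0\0' at byte 128.
notepad_like = bytearray(0x84)
notepad_like[0:2] = b"MZ"
notepad_like[0x3C] = 0x80
notepad_like[0x80:0x84] = b"PE\0\0"

# Synthetic CALC-like header: byte 60 = 0xc8, but only 144 bytes supplied,
# so whatever sits at byte 200 is not available to inspect.
calc_like = bytearray(0x90)
calc_like[0:2] = b"MZ"
calc_like[0x3C] = 0xC8

print(classify(bytes(notepad_like)))
print(classify(bytes(calc_like)))
```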
I'm not sufficiently versed in base64 encoding to say for certain whether
the first 60 bytes of the input file being the same is enough to ensure that
the first 16 characters of the base64 encoding are identical. But from a
cursory reading of http://www.freesoft.org/CIE/RFC/2065/56.htm, it looks
like any given 3 bytes of an input file (counted in groups of three from the
start) will always produce the same 4 characters of base64. So the identical
first 60 bytes in Windows executables *should* suffice to ensure that any
base64-encoded Windows executable always has the same first 80 characters
as a recognizable fingerprint.
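That reasoning is easy to check with a few lines of Python. This is only a sketch: the CALC.EXE bytes are typed in from the hex dump above, and the "NOTEPAD" bytes are just that dump with byte 60 changed to 0x80, not the real file. One thing worth keeping in mind is that mail software normally wraps base64 at 76 characters or fewer per line, so only a prefix of those 80 characters sits on the first line of an attachment; the 15-character pattern from the quoted filter fits comfortably.

```python
import base64
import re

# First 64 bytes of CALC.EXE, from the hex dump above.
calc = bytes.fromhex(
    "4d5a90000300000004000000ffff0000"
    "b8000000000000004000000000000000"
    + "00" * 16
    + "00" * 12 + "c8000000"
)

# Stand-in for NOTEPAD.EXE: identical except byte 60 is 0x80.
notepad = bytearray(calc)
notepad[60] = 0x80
notepad = bytes(notepad)

calc_b64 = base64.b64encode(calc).decode("ascii")
notepad_b64 = base64.b64encode(notepad).decode("ascii")

# The pattern from the quoted body_checks line.
pattern = re.compile(r"^TVqQAAMAAAAEAAA")

print(bool(pattern.match(calc_b64)), bool(pattern.match(notepad_b64)))
# 60 identical bytes -> 80 identical base64 characters.
print(calc_b64[:80] == notepad_b64[:80])
# The differing byte 60 only shows up from character 80 onward.
print(calc_b64[80:84], notepad_b64[80:84])
```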
At least, I'd wager this will catch 95% or more of them.
(Jeez, I know I said I wasn't going to profile. Oops. It's late and my mind
tends to wander. Sorry about that.)
--Kai MacTane
----------------------------------------------------------------------
"I am the storm. My voice is the river.
Take from me, I fade into you..."
--The Last Dance,
"Fairytale (the Storm)"