[Techtalk] Web page downloads with null characters

Brenda Bell k15a-list-linuxchix at theotherbell.com
Tue Oct 21 14:48:21 EST 2003


Quoting Dan Richter <daniel.richter at wimba.com>:

> This page:
>    http://www.w3.org/QA/Tips/iso-date
> seems to look fine under Windows, but on my Red Hat machine it downloads
> with null characters (character zero) inserted between every two
> characters. This is true whether I use Mozilla, Konqueror or wget.

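I'm pretty sure the page is actually encoded as Unicode (probably UTF-16,
which stores each plain ASCII character as that character plus a zero
byte), even though Netscape identifies the charset as ISO-8859.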
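
To illustrate why a Unicode page would show up with nulls, here is a minimal
Python sketch. It assumes the encoding involved is UTF-16 (little-endian),
which is my guess rather than something I have confirmed for this page; the
helper name has_interleaved_nulls is made up for the example.

```python
def has_interleaved_nulls(raw: bytes) -> bool:
    """Return True if every second byte of a non-empty byte string is zero."""
    return len(raw) >= 2 and all(b == 0 for b in raw[1::2])


ascii_text = "<?xml"

# The same five characters in a single-byte charset and in UTF-16 (assumed).
as_latin1 = ascii_text.encode("iso-8859-1")
as_utf16 = ascii_text.encode("utf-16-le")

print(as_latin1)  # b'<?xml'
print(as_utf16)   # b'<\x00?\x00x\x00m\x00l\x00'

print(has_interleaved_nulls(as_latin1))  # False
print(has_interleaved_nulls(as_utf16))   # True
```

A program that treats those UTF-16 bytes as ISO-8859 sees a zero character
after every real character, which matches what Dan describes.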

Also, Netscape identifies the content type as text/html, but it's really
text/xml.

To see what I mean, download the page with something like wget, then try
displaying it with programs that understand Unicode (such as Notepad)
and/or XML (such as IE; this may require renaming the file so it has an
.xml extension).
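
If you'd rather check the downloaded file from a script, something like the
following Python sketch might help. It is only a rough heuristic: it looks
for a byte-order mark, then for zero bytes in alternating positions, and
falls back to ISO-8859-1 otherwise. The function names sniff_encoding and
read_page are hypothetical, and the fallback charset is an assumption.

```python
import codecs


def sniff_encoding(raw: bytes) -> str:
    """Guess an encoding from a BOM or from where the null bytes fall."""
    if raw.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"
    if raw.startswith(codecs.BOM_UTF16_LE) or raw.startswith(codecs.BOM_UTF16_BE):
        # The generic utf-16 codec reads the BOM and picks the byte order.
        return "utf-16"

    # No BOM: count zero bytes in even and odd positions of a small sample.
    sample = raw[:200]
    even_nulls = sample[0::2].count(0)
    odd_nulls = sample[1::2].count(0)
    half = len(sample) // 4  # half of the 16-bit units in the sample

    if odd_nulls > half and even_nulls == 0:
        return "utf-16-le"  # mostly-ASCII text, low byte first
    if even_nulls > half and odd_nulls == 0:
        return "utf-16-be"  # mostly-ASCII text, high byte first
    return "iso-8859-1"     # assumed fallback, not detected


def read_page(path: str) -> str:
    """Read a saved page and decode it with the guessed encoding."""
    with open(path, "rb") as handle:
        raw = handle.read()
    return raw.decode(sniff_encoding(raw))
```

For example, read_page("iso-date") would decode the file wget saved,
assuming it was saved under that name in the current directory.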

FYI:  The XMLSpy editor displays the first couple of lines as follows:

<?xml version="1.0" encoding="iso-8859-1"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
        "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">


-- 
Brenda
http://opensource.theotherbell.com


