[Techtalk] Web page downloads with null characters

John Clarke johnc+linuxchix at kirriwa.net
Wed Oct 22 09:35:11 EST 2003


On Tue, Oct 21, 2003 at 07:58:41 +0200, Dan Richter wrote:

> This page:
>    http://www.w3.org/QA/Tips/iso-date
> seems to look fine under Windows, but on my Red Hat machine it downloads 
> with null characters (character zero) inserted between every two 
> characters. This is true whether I use Mozilla, Konqueror or wget.

that's odd, i'm using galeon 1.2.11 on redhat 7.3 and the page displays
fine.  

if i ever have problems with a web page, the first thing i do is bypass
any proxy caches between me and the server.  if that doesn't help, i
try fetching the page manually.  telnet to port 80 on the web server
and send "GET uri HTTP/1.0" (replace "uri" with the page you're trying
to view) followed by a blank line (i.e. press return twice).  the
server will send some headers followed by the page content.  for this
page, i got:

    [johnc@dropbear ~]$ telnet www.w3.org 80
    Trying 18.7.14.127...
    Connected to www.w3.org.
    Escape character is '^]'.
    GET http://www.w3.org/QA/Tips/iso-date HTTP/1.0

    HTTP/1.1 200 OK
    Date: Tue, 21 Oct 2003 23:11:07 GMT
    Server: Apache/1.3.28 (Unix) PHP/4.2.3
    Content-Location: iso-date.html
    Vary: negotiate
    TCN: choice
    P3P: policyref="http://www.w3.org/2001/05/P3P/p3p.xml"
    Cache-Control: max-age=21600
    Expires: Wed, 22 Oct 2003 05:11:07 GMT
    Last-Modified: Tue, 21 Oct 2003 20:19:34 GMT
    ETag: "3f9594d6;3f8f6ba9"
    Accept-Ranges: bytes
    Content-Length: 6374
    Connection: close
    Content-Type: text/html; charset=iso-8859-1

    <?xml version="1.0" encoding="iso-8859-1"?>
    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
            "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
    <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
    <head>
    
    ...

with no nulls anywhere in the content.  i tried using http 1.1 too with
the same result (as expected).  for http 1.1 you send "GET uri
HTTP/1.1" followed by "Host: hostname" (in this case, www.w3.org)
followed by a blank line.
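
if you'd rather script it than type into telnet, here's a rough python
sketch of the same manual fetch.  the function names (build_request,
split_response, fetch) are made up for illustration, and the
"Connection: close" line in the http 1.1 case is my addition so the
server hangs up when it's done.  it doesn't handle chunked responses,
so treat it as a starting point only.

```python
import socket


def build_request(host, uri, version="1.0"):
    """Return the raw bytes of a minimal GET request."""
    lines = ["GET %s HTTP/%s" % (uri, version)]
    if version == "1.1":
        # http 1.1 needs a Host header
        lines.append("Host: %s" % host)
        # assumption: ask the server to close so we can read until EOF
        lines.append("Connection: close")
    # the request ends with a blank line ("press return twice")
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


def split_response(raw):
    """Split a raw response into (header text, body bytes)."""
    head, _, body = raw.partition(b"\r\n\r\n")
    return head.decode("iso-8859-1"), body


def fetch(host, uri, version="1.0", port=80, timeout=10):
    """Send the request and read until the server closes the connection."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(build_request(host, uri, version))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return split_response(b"".join(chunks))
```

something like fetch("www.w3.org", "http://www.w3.org/QA/Tips/iso-date")
gives you the headers and the body, and body.count(b"\x00") tells you
how many nulls came over the wire.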

> Can anyone tell me where the problem is?

try bypassing any proxy caches if possible.  then try fetching the page
manually - do you see any nulls in the content?  use wget to grab the
page and save it to a local file.  open the file in a text editor - are
there any nulls in there?  try displaying the file in your browser.
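
some text editors hide nulls, so it can be easier to count them
directly.  a small python sketch, assuming you saved the page with wget
to a local file (the file name in the usage note is just an example,
and null_report / check_file are names i made up):

```python
def null_report(data, limit=10):
    """Return (number of null bytes, offsets of the first few)."""
    offsets = [i for i, byte in enumerate(data) if byte == 0]
    return len(offsets), offsets[:limit]


def check_file(path):
    """Read a saved page as raw bytes and report any nulls in it."""
    with open(path, "rb") as handle:
        return null_report(handle.read())
```

check_file("iso-date.html") returning a count of 0 means the saved copy
is clean and the problem is somewhere in how it gets displayed.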

maybe it's iso-8859-1 that your browser is having trouble with.  can
you view this page - http://kirriwa.net/ - ok?  my pages are all
iso-8859-1, so if you can view them it's unlikely to be a charset
problem.
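
if you want to check what charset the server claims, you can pull it
out of the Content-Type header from the manual fetch.  a rough sketch
using python's standard email module; charset_of is a hypothetical
helper, and falling back to iso-8859-1 when no charset is given is my
assumption here:

```python
from email.message import Message


def charset_of(content_type, default="iso-8859-1"):
    """Return the charset parameter of a Content-Type header value."""
    msg = Message()
    msg["Content-Type"] = content_type
    # get_content_charset lowercases the value and returns the
    # fallback when the header carries no charset parameter
    return msg.get_content_charset(default)
```

for the headers above, charset_of("text/html; charset=iso-8859-1")
gives "iso-8859-1".  that's a one-byte-per-character encoding, so plain
english text in it shouldn't contain any null bytes.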


cheers,

john
-- 
whois !JC774-AU at whois.aunic.net
GPG key id: 0xD59C360F
http://kirriwa.net/john/

