[Techtalk] Web page downloads with null characters
John Clarke
johnc+linuxchix at kirriwa.net
Wed Oct 22 09:35:11 EST 2003
On Tue, Oct 21, 2003 at 07:58:41 +0200, Dan Richter wrote:
> This page:
> http://www.w3.org/QA/Tips/iso-date
> seems to look fine under Windows, but on my Red Hat machine it downloads
> with null characters (character zero) inserted between every two
> characters. This is true whether I use Mozilla, Konqueror or wget.
that's odd, i'm using galeon 1.2.11 on redhat 7.3 and the page displays
fine.
if i ever have problems with a web page, the first thing i do is bypass
any proxy caches between me and the server. if that doesn't help, the
next thing i try is fetching the page manually: telnet to port 80 on the
web server and send "GET uri HTTP/1.0" (replace "uri" with the page
you're trying to view) followed by a blank line (i.e. press return
twice). the server will send some headers followed by the page content.
for this page, i got:
[johnc at dropbear ~]$ telnet www.w3.org 80
Trying 18.7.14.127...
Connected to www.w3.org.
Escape character is '^]'.
GET http://www.w3.org/QA/Tips/iso-date HTTP/1.0

HTTP/1.1 200 OK
Date: Tue, 21 Oct 2003 23:11:07 GMT
Server: Apache/1.3.28 (Unix) PHP/4.2.3
Content-Location: iso-date.html
Vary: negotiate
TCN: choice
P3P: policyref="http://www.w3.org/2001/05/P3P/p3p.xml"
Cache-Control: max-age=21600
Expires: Wed, 22 Oct 2003 05:11:07 GMT
Last-Modified: Tue, 21 Oct 2003 20:19:34 GMT
ETag: "3f9594d6;3f8f6ba9"
Accept-Ranges: bytes
Content-Length: 6374
Connection: close
Content-Type: text/html; charset=iso-8859-1

<?xml version="1.0" encoding="iso-8859-1"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
...
with no nulls anywhere in the content. i tried using http 1.1 too with
the same result (as expected). for http 1.1 you send "GET uri
HTTP/1.1" followed by "Host: hostname" (in this case, www.w3.org)
followed by a blank line.
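if you'd rather script it than type into telnet, here is a rough python sketch of the same two requests. the names build_request and fetch_raw are made up for illustration, and the "Connection: close" header in the http 1.1 form is an addition of mine so that the server closes the connection when it is done, rather than something you have to send.

```python
import socket


def build_request(uri, host=None):
    """return the raw bytes for a manual GET request.

    with no host this is the http 1.0 form; with a host it is the
    http 1.1 form, which needs a Host: header.
    """
    if host is None:
        lines = ["GET %s HTTP/1.0" % uri]
    else:
        # Connection: close is an assumption added here so the server
        # ends the reply instead of keeping the connection open
        lines = [
            "GET %s HTTP/1.1" % uri,
            "Host: %s" % host,
            "Connection: close",
        ]
    # the final \r\n\r\n ends the last line and adds the blank line
    # that tells the server the request is complete
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


def fetch_raw(server, request, port=80, timeout=10):
    """send the request and return everything the server sends back."""
    with socket.create_connection((server, port), timeout=timeout) as sock:
        sock.sendall(request)
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # server closed the connection
                break
            chunks.append(data)
    return b"".join(chunks)


# example use (needs network access, so it is left as a comment):
#   reply = fetch_raw("www.w3.org",
#                     build_request("/QA/Tips/iso-date", host="www.w3.org"))
#   headers, _, body = reply.partition(b"\r\n\r\n")
#   print(headers.decode("iso-8859-1"))
#   print("nulls in body:", body.count(b"\x00"))
```

the reply comes back as raw bytes, so nothing has a chance to hide or reinterpret the nulls before you count them.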
> Can anyone tell me where the problem is?
try bypassing any proxy caches if possible. then try fetching the page
manually - do you see any nulls in the content? use wget to grab the
page and save it to a local file. open the file in a text editor - are
there any nulls in there? try displaying the file in your browser.
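some text editors quietly hide nulls, so it may be safer to count them. a small python sketch, assuming you saved the page with something like "wget -O iso-date.html http://www.w3.org/QA/Tips/iso-date"; the function names and the file name are only for illustration.

```python
def null_report(data):
    """count the null bytes in data and note where the first few are."""
    positions = [i for i, byte in enumerate(data) if byte == 0]
    return {
        "length": len(data),
        "nulls": len(positions),
        "first": positions[:10],  # offsets of the first ten nulls
    }


def null_report_for_file(path):
    """read the file as raw bytes (no decoding) and report on it."""
    with open(path, "rb") as f:
        return null_report(f.read())


# example use, with a hypothetical file name:
#   print(null_report_for_file("iso-date.html"))
```

if "nulls" comes back as 0 for the saved file, the nulls are being added somewhere after the download, which points at the browser rather than the server or a proxy.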
maybe it's iso-8859-1 that your browser is having trouble with. can
you view this page - http://kirriwa.net/ - ok? my pages are all
iso-8859-1, so if you can view them it's unlikely to be a charset
problem.
cheers,
john
--
whois !JC774-AU at whois.aunic.net
GPG key id: 0xD59C360F
http://kirriwa.net/john/