[Techtalk] Web site setup

Kathryn Andersen kat_lists at katspace.com
Wed Apr 20 20:50:56 EST 2005


On Wed, Apr 20, 2005 at 10:15:12AM +0100, Dan wrote:
> I have an interesting problem with web sites that I'd like your comments
> on. Currently our web site is static and is maintained by a
> non-programmer who knows just enough HTML to get by. Since all our web
> pages have the same style (and not just CSS), we'd like to use PHP to
> dynamically generate the top and bottom of each page.

That *is* one approach, yes.  Not the only one, though.

I've done a lot of thinking about this question over the years, as my
own site grew from something small and simple into something big and
sprawling.  I wanted to keep a common look-and-feel with nice
navigation and so on, while cutting down on the amount of repetitive
work I had to do.

I've flitted from one thing to another, used things, wrote things, threw
them away and wrote other things...

The simplest approach, IMHO, is static generation, because one can
pre-generate the site as a whole, with some sort of templating system,
get it all working nicely with simple static pages on one's own PC at
home, and then upload the whole thing (say, with ftp or rsync) to the
real website, and know it's going to work, because the site itself isn't
dynamic.
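
To make the static-generation idea concrete, here is a minimal sketch in
Python.  It is not taken from WebMake or any other tool named in this
message; the template text, the $title/$content placeholder names and the
generate_site() function are all assumptions made up for illustration.

```python
from pathlib import Path
from string import Template

# Hypothetical site-wide template; $title and $content are placeholder
# names chosen for this sketch only.
PAGE_TEMPLATE = Template(
    "<html>\n<head><title>$title</title></head>\n"
    "<body>\n<h1>$title</h1>\n$content\n</body>\n</html>\n"
)


def generate_site(pages, out_dir):
    """Write one static HTML file per (name, title, content) entry.

    Returns the list of files written, ready to be uploaded as-is.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for name, title, content in pages:
        target = out / f"{name}.html"
        html = PAGE_TEMPLATE.substitute(title=title, content=content)
        target.write_text(html, encoding="utf-8")
        written.append(target)
    return written
```

Running something like generate_site([("index", "Home", "<p>Hi</p>")], "out")
on the home PC leaves plain .html files in "out", which can then be
uploaded unchanged.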

For example WebMake (http://webmake.taint.org/) is a system which
concentrates on pre-generating a site, rather than trying to make sure
that one's web-host has lots of dynamic generation stuff.

The next level is dynamic scripting; going even further, one gets into
fancy and not-so-fancy Content Management Systems.

(Look here for a long list of Content Management Systems to get confused
by...  http://www.la-grange.net/cms)

> The person who maintains our web site is intelligent but isn't a
> programmer, so we want to make this as easy as possible for her. I'd
> like to have each page look something like this:
> 
> <?php
> $title = 'Foo';
> $content = '
> <p>The main content of the page goes here.</p>
> ';
> include('main-library.inc');
> ?>
> 
> That's quite feasible except that you would have to escape all the
> single quotes in the HTML content, which would be a pain for big
> documents.

You wouldn't actually have to do that.  The point of PHP is that it is
embeddable in HTML, not that the whole page has to be PHP.
So, an alternative to that could be:

<html>
<head>
<?php
$title = 'Foo';
include('head-part.inc');
?>
</head>
<body>
<?php include('page-head.inc'); ?>
<p>The main content of the page goes here.</p>
<?php include('page-foot.inc'); ?>
</body>
</html>

On the other hand, something even nicer, IMAO, would be a setup where
you didn't have to worry about even that level of "special markup":
you just use your normal HTML files, and put the markup in the
*template* file instead.
Which is why I wrote etalpmet (grin)
http://www.katspace.net/tools/etalpmet/

The idea is basically that HTML is well-formed enough that one can
extract the <head> stuff from between the <head> tags of your source
HTML file, and the page-content from between the <body> tags, and plug
them into the user-defined template file at the appropriate spots,
putting all your look-and-feel stuff into the template file.

This is only suitable for static generation, however, and needs to be
applied with something like a makefile... which is why I eventually went
on to write Posy (http://www.katspace.net/tools/posy/) after taking a side
trip through a messy collection of embedded perl, perl scripts and XSLT
glued together with makefiles... Posy is much nicer and cleaner (grin).

> allow here-documents that you don't have to escape, but Perl isn't a
> great choice because we're not doing any heavy-duty processing.

I'm not sure I understand why you think perl is a problem because you
aren't "doing any heavy-duty processing".  Huh?  Is Perl only supposed
to be used for heavy-duty processing?
 
> An alternative is to store the main content in a separate text file, but
> we don't want "unfriendly URIs" (URIs where the page is specified after
> a question mark, like http://example.com/site?page=foo/bar/baz) because
> they're not processed as well e.g. by search engines. And using a script
> as the root of the URI tree (such as
> http://example.com/script.pl/virtual/path) means that we can't later
> decide to put something in the path that the script doesn't process
> (such as http://example.com/script.pl/process-form.cgi).

This is where Apache Rewrite Rules can come to the rescue -- if your
web-host allows them, and runs Apache.
I had exactly this same problem, with how to hide the usage of a script
(posy.cgi) to generate the dynamic content of the site, and all three of
your "unfriendly URIs" above can be solved!

Here is the relevant part of my .htaccess file, which I will explain
below:

# Apache rewrite
RewriteEngine on
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ index.cgi/$1 [L,QSA]
 
The idea is that the script which processes the site sits at the top
level, called "index.cgi".  *All* the files which you want it to process
live somewhere else, in a separate "data" directory, not in your HTML
tree.  (You set that location when configuring the script, so that it
knows where to look.)
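
As a rough illustration of the script's side of this arrangement, the
Python sketch below maps the extra path that arrives after "index.cgi"
(the CGI PATH_INFO value) onto a file in the data directory.  It is not
how posy.cgi works internally; resolve_source(), the "index.html"
fallback and the traversal check are assumptions for this example.

```python
from pathlib import Path


def resolve_source(path_info, data_dir):
    """Map a PATH_INFO such as '/virtual/path' to a file under data_dir.

    Returns the resolved Path, or None if there is no such file or the
    path tries to escape the data directory.
    """
    root = Path(data_dir).resolve()
    candidate = (root / path_info.lstrip("/")).resolve()
    # Refuse anything outside the data directory (e.g. '/../../etc/passwd').
    if candidate != root and root not in candidate.parents:
        return None
    # Assumption for this sketch: a directory is served via its index.html.
    if candidate.is_dir():
        candidate = candidate / "index.html"
    return candidate if candidate.is_file() else None
```

A real script would go on to read that file and wrap it in the site's
look-and-feel before printing it.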

The two RewriteCond lines say that the rule applies only when the
requested path is neither an existing file (!-f) nor an existing
directory (!-d), that is, when there is nothing by that name
underneath the DOCUMENT_ROOT.  Both conditions have to hold before the
rewrite rule is followed.

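And the RewriteRule says: take everything that's been passed on to us
(that's the "^(.*)$" part of it, matching everything) and hand it to the
index.cgi script, with the original path tacked on after the script name
(the "$1").  The [L,QSA] flags mean "stop processing rules here" and
"keep any query string".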
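
The decision these rules make can be modelled in a few lines of Python.
This is a simplified model and not how Apache implements it: it ignores
query strings and everything else mod_rewrite does, and the rewrite()
function is made up for this sketch.

```python
from pathlib import Path


def rewrite(request_path, document_root, script="index.cgi"):
    """Return the path that would be served for request_path (simplified)."""
    relative = request_path.lstrip("/")
    target = Path(document_root) / relative
    # RewriteCond !-f / !-d: if the request names a real file or a real
    # directory under the document root, leave it alone.
    if target.is_file() or target.is_dir():
        return "/" + relative
    # RewriteRule ^(.*)$ index.cgi/$1: otherwise hand the whole path to
    # the script as trailing path information.
    return f"/{script}/{relative}"
```

With a document root that contains shop/process_form.cgi but nothing
called "virtual", rewrite("/virtual/path", root) gives
"/index.cgi/virtual/path", while rewrite("/shop/process_form.cgi", root)
comes back unchanged.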

This way, you actually get the best of both worlds, because...

http://www.example.com/virtual/path

will be passed onto the script (silently transformed into
http://www.example.com/index.cgi/virtual/path)

However

http://www.example.com/shop/process_form.cgi

(assuming that shop/process_form.cgi IS sitting there under
the DOCUMENT_ROOT directory)

will get passed straight through to process_form.cgi, because it does
exist, so the "!-f" RewriteCond fails and the rule is skipped.

Nifty, huh?

Kathryn Andersen
-=-=-=-=-=-=-=-=-
Hofstadter's Law:
	It always takes longer than you expect, even when you take
	Hofstadter's Law into account.
-- 
 _--_|\     | Kathryn Andersen	<http://www.katspace.com>
/      \    | 
\_.--.*/    | GenFicCrit mailing list <http://www.katspace.com/gen_fic_crit/>
      v     | 
------------| Melbourne -> Victoria -> Australia -> Southern Hemisphere
Maranatha!  |	-> Earth -> Sol -> Milky Way Galaxy -> Universe

