[Actionchix] HTML cleanup

jennyw jennyw at dangerousideas.com
Tue May 9 13:09:30 EST 2006

Mary Gardiner wrote:
> Is this going to need to happen? I notice on the test import that
> http://linuxchix.dangerousideas.com/docs/press/chixinnews.html/ looks
> pretty shocking, and has some bad HTML in the page. Jen, do you have a
> sense of how much should be looked over before the final move?

Yep, a lot of cleanup is going to need to happen! What you see on
http://linuxchix.dangerousideas.com/ was the first import I did (a few
weeks ago). Even before that, Gloria had done a bunch of work looking at
an automated import and at the HTML itself, and we had decided that a
fully automated process was not practical. I still thought, though, that
some things would benefit from automation, so I converted all the old
docs to XHTML with a script and then imported them into Plone (using the
standard Plone skin) just to see what they'd look like. I also took a
closer look at the HTML and ended up asking Chris Hardy for help.

Some findings: a) the URLs and titles are inconsistent; b) the markup,
while usually consistent, isn't what we'd want semantically (font tags
used in place of headers); c) sometimes the structure isn't consistent;
d) while there are some things, like the headers, we can automate, a lot
of it is just going to need to be done by hand. 
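
To give a feel for the kind of header fix that could be automated, here
is a rough sketch. It assumes the old pages mark headings as
<font size="+N">...</font>, which is a guess on my part; the
size-to-level mapping and the function name are made up for
illustration, and any font tag that doesn't fit the pattern is left
alone for hand cleanup.

```python
import re

# Assumption: headings in the old pages look like <font size="+2">Title</font>.
# This size-to-header mapping is hypothetical, not taken from the real docs.
SIZE_TO_HEADER = {"+3": "h1", "+2": "h2", "+1": "h3"}

# Match only font tags whose sole attribute is one of the sizes above.
FONT_RE = re.compile(
    r'<font\s+size="(\+[123])"\s*>(.*?)</font>',
    re.IGNORECASE | re.DOTALL,
)


def fonts_to_headers(html):
    """Replace heading-like font tags with real header elements."""
    def swap(match):
        tag = SIZE_TO_HEADER[match.group(1)]
        text = match.group(2).strip()
        return "<{0}>{1}</{0}>".format(tag, text)

    return FONT_RE.sub(swap, html)
```

Something like fonts_to_headers(page_text) would run over each converted
document before the import; anything it misses still needs a person.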

Chris is working on analyzing the HTML documents to figure out what can
be automated (I'm guessing not too much), and what we need to do by hand
(I'm guessing a lot). Nothing is going to be changed until we have a
process for documenting those changes. In particular, we need to
document any URL changes, so we can create a redirect map.  We'll also
want to come up with standards for URLs, titles, the semantic structure
of the pages, and possibly changes to the organization (all in a
proposal to be approved here before it's actually implemented). 
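
As a starting point for discussion, the redirect map could be built from
a plain list of (old URL, new URL) pairs recorded as pages are moved.
The sketch below is only an illustration: the function name, the
tab-separated output, and the example new path are assumptions, not
anything we have agreed on.

```python
def build_redirect_map(changes):
    """Return sorted 'old<TAB>new' lines for every URL that actually moved.

    `changes` is an iterable of (old_path, new_path) pairs. Unchanged URLs
    are skipped, and an old URL recorded with two different targets raises
    an error so the conflict gets resolved by hand.
    """
    seen = {}
    for old, new in changes:
        if old == new:
            continue
        if old in seen and seen[old] != new:
            raise ValueError("conflicting targets for " + old)
        seen[old] = new
    return ["{0}\t{1}".format(old, seen[old]) for old in sorted(seen)]


# Hypothetical example: the new path is invented for illustration.
lines = build_redirect_map([
    ("/docs/press/chixinnews.html", "/press/in-the-news"),
    ("/about.html", "/about.html"),
])
```

Keeping the recorded changes in one list like this would also give us
the documentation trail for URL changes in the same place.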

If anyone has ideas or wants to get involved in proposing guidelines for
the import, feel free to speak out (mostly it's me and Chris talking
right now).  Apologies for things taking a while -- I think both Chris
and I have gotten hit with a few things lately (the consulting life --
feast or famine). My schedule at least is going to be a bit more
reasonable again soon.

