[prog] advice on data format

Kathryn Andersen kat_lists at katspace.com
Fri Feb 6 09:29:22 EST 2004


I have this suite of perl scripts (scripts, modules and CGI) that I use
as part of the stuff for generating my website, and I keep tinkering
with it, like a guy tinkers with an old car... The suite generates
reports from, searches, and edits a text file containing data.  (And if
you ask "Why don't you just use MySQL?" then
I'll say "Because I don't want to!")

When I originally wrote this suite, I was using my own format, in a
variation on Key:Value pairs.  Then I decided to get trendy and change
the format to XML, so that it would be more useful to others if and when
I released my suite as Open Source.  Erm... using the fastest XML parser
I had access to doubled the time it took to generate the report files.
No go.  However, with hand-parsing and some other tuning, I managed to
get the speed back to what it was originally.

Now a friend of mine mentioned Tie::File as a nifty Perl module that he
was looking at for some of his stuff.  I took a look at it, and it does
indeed appear to be nifty -- particularly for random access to a file,
but even when you intend to process the whole file, not having to read
it all into memory seems good, at least in the memory-saving department.

However, whichever way I look at it, I can't see a way I could use
Tie::File on a file which is in XML format.  Yes, you can redefine the
record-separator to be something other than newline, but XML doesn't
have record *separators* -- even if the contents of the XML file were
all "record" tags, you still have the enclosing tags to deal with;
something which Tie::File would not comprehend.
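
To illustrate the problem, here is a minimal sketch. It assumes a
hypothetical file with one "record" element per line inside an enclosing
"data" element (the tag names and layout are made up for the example,
not my real format). Even in this best case, Tie::File just sees lines,
so the enclosing tags come back as if they were records:

```perl
use strict;
use warnings;
use Tie::File;
use File::Temp qw(tempfile);

# Hypothetical XML data file: one <record> element per line,
# wrapped in an enclosing <data> element.
my ($fh, $filename) = tempfile(UNLINK => 1);
print $fh "<data>\n",
          "<record><title>One</title></record>\n",
          "<record><title>Two</title></record>\n",
          "</data>\n";
close $fh or die "close failed: $!";

# The default record separator is newline; autochomp strips it on read.
tie my @lines, 'Tie::File', $filename
    or die "cannot tie $filename: $!";

my $count = @lines;      # counts the wrapper lines too
my $first = $lines[0];   # the opening tag, not a record
my $last  = $lines[-1];  # the closing tag, not a record

print "$count lines; first is '$first', last is '$last'\n";
untie @lines;
```

Any code using the tied array would have to know to skip the first and
last elements, and a push onto the array would land after the closing
tag.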

So far as I can see, my options are as follows:
a) forget about the whole thing and stick to what I have now
b) change the data format to something more compatible with Tie::File
even though it loses me the interoperability of XML; write scripts to
import from and export to XML
c) try to subclass off Tie::File even though the author doesn't
guarantee the interface (ugh!)
d) grab the code from Tie::File holus-bolus and write my own Tie::MyData
module, even though the code for Tie::File looks horribly complicated
and I don't want to have to delve through it (ugh!)
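
As a rough sketch of what (b) might look like: this assumes a
hypothetical format of "Key:Value" lines with a blank line terminating
every record (the field names, the helper subs parse_record and
record_to_xml, and the XML layout are all invented for the example).
Tie::File is given the blank line as its record separator via the
recsep option, and the XML export is done per record:

```perl
use strict;
use warnings;
use Tie::File;
use File::Temp qw(tempfile);

# Hypothetical record format: "Key:Value" lines, with a blank line
# terminating every record (including the last one).
my ($fh, $filename) = tempfile(UNLINK => 1);
print $fh "Title:One\nAuthor:Alice\n\n",
          "Title:Two\nAuthor:Bob & Co\n\n";
close $fh or die "close failed: $!";

# Each element of @records is one whole record, separator removed.
tie my @records, 'Tie::File', $filename, recsep => "\n\n"
    or die "cannot tie $filename: $!";

# Turn the text of one record into a hash of fields.
sub parse_record {
    my ($text) = @_;
    my %field;
    for my $line (split /\n/, $text) {
        my ($key, $value) = split /:/, $line, 2;
        $field{$key} = $value;
    }
    return \%field;
}

# Export one record as an XML element, escaping markup characters.
sub record_to_xml {
    my ($field) = @_;
    my $xml = "<record>";
    for my $key (sort keys %$field) {
        (my $value = $field->{$key}) =~ s/&/&amp;/g;
        $value =~ s/</&lt;/g;
        $value =~ s/>/&gt;/g;
        $xml .= "<$key>$value</$key>";
    }
    return "$xml</record>";
}

my $count  = @records;                     # number of records in the file
my $second = parse_record($records[1]);    # random access to one record
my $xml    = record_to_xml($second);
print "$xml\n";
untie @records;
```

The trade-off is the one named in (b): the file on disk is no longer
XML, so anyone wanting XML has to go through the export script.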

I'm personally in favour of (a) or (b), but it would be good to get some
advice.

Kathryn Andersen
(currently trying to Document the suite...)
-- 
 _--_|\     | Kathryn Andersen	<http://www.katspace.com>
/      \    | 
\_.--.*/    | GenFicCrit mailing list <http://www.katspace.com/gen_fic_crit/>
      v     | 
------------| Melbourne -> Victoria -> Australia -> Southern Hemisphere
Maranatha!  |	-> Earth -> Sol -> Milky Way Galaxy -> Universe

