[prog] Sql query
Conor Daly
conor.daly-linuxchix at cod.homelinux.org
Wed Jul 13 18:45:27 EST 2005
On Wed, Jul 13, 2005 at 09:02:03AM +0200 or so it is rumoured hereabouts,
Tiera thought:
>
> Yes, I know... but I can't come up with anything else that works at the
> moment. The problem with you query above is that the only status I'm
> interested in is the status for the last period for every product. So if I
> have a product that has two periods, the first one with status 'Special'
> and the last one with status 'Ordinary' and I use your query to pick out
> all products with status 'Special' I would get this product, but that's
> not correct since this product now has status 'Ordinary'.
> The problem would be much easier to solve if there always was one period
> with no enddate...but unfortunately that's not the case. Or another column
> that would tell if it was the last period or not...should have thought of
> that earlier perhaps...
I've started using an extra 'rowid' column to order associated rows in
some of my tables. Basically, where I have a loader program that must
make a decision whether a dataset it wants to load is already in (so it
gets ignored) or is a superset of an existing row (so it replaced the
existing row) or whether it should load aswell. For any set of associated
rows (using 'location' and 'date' for my data, 'productid' for yours),
'rowid' starts at 1 and increases for each row. With your dataset, if
mysql allows you to alter the table and if your 'startdate' is always
increasing, you could add such a rowid. I don't think it would help in
this case though since subselects seem to be necessary for pretty much any
of the methods that come to mind. Of course, you could add a 'current' or
'latest' column but that requires more maintenance:
If you add a 'latest' column to the period table and you want to use it to
identify the latest row (per productid of course), you've got to use an
on/off datavalue for it. Now, when you add a new period, you've got to
look for the last corresponding period and set its 'latest' column to
'off' while setting the new period's 'latest' column to 'on'.
OTOH, what would it cost to add a 'latest' table containing 'period.id',
'period.productid' where it gets updated each time you load a row to
'period' with a query of the form:
update latest set id=$foo where productid=$bar
or
insert into latest values ($foo, $bar)
This assumes period.id is unique in the 'period' table.
You don't say how tables product and period are updated. Are you running
programs to update them or is it plain sql? All of our data insertions
are done by loader programs so we can do the multiple querying necessary
to make all this happen. If you're loading by hand, there's an extra step
to do.
You can get the latest id, productid pair with:
select id, productid, max(startdate) from period
group by id, productid
to load the 'latest' table.
Conor (_not_ a database designer...)
--
Conor Daly <conor.daly at oceanfree.net>
Domestic Sysadmin :-)
---------------------
Hobbiton.cod.ie
09:22:26 up 6 days, 12:16, 1 user, load average: 0.08, 0.03, 0.01
More information about the Programming
mailing list