[prog] Introduction and Perl: Flat File DB Query Question

Jacinta Richardson jarich at perltraining.com.au
Wed Mar 15 12:12:51 EST 2006


Katherine Spice wrote:

> $k = 0;
> $found = '';
> foreach $comparezip (@zip1) {
>         if ($comparezip eq $rc_zip) {
>                 $found = 'yes';
>                 last;
>         }
>         $k++;
> }

I don't think you mean to initialise $k where you do.  As written, $k is reset
to 0 on every pass through the outer loop (once for each line of the file).

While this could have the effect intended as far as ensuring that each zip code
is unique, it will have the unintended side-effect of slowing the program
execution down a *lot* on large files.

What this code does is search through a growing array for every new line in the
file.  So, if your file is 100 lines long, then you'll perform this search 100
times, over arrays that are 0 to 99 elements long.  In Comp. Sci parlance, this
means the algorithm is O(N^2), which usually means it's not efficient for large
N (the length of the file in this case).
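
To put a number on that, here is a rough sketch that counts the comparisons the
linear scan makes in the worst case, where every key is new.  The sub name
count_scan_comparisons is made up for illustration and is not from your program.

```perl
use strict;
use warnings;

# Count the string comparisons a linear scan performs when each of
# $n keys is new, so every scan runs to the end of the array.
# (count_scan_comparisons is a hypothetical name, for illustration only.)
sub count_scan_comparisons {
    my ($n) = @_;
    my @seen;
    my $comparisons = 0;
    for my $key (1 .. $n) {
        my $found = 0;
        for my $old (@seen) {
            $comparisons++;
            if ($old eq $key) { $found = 1; last; }
        }
        push @seen, $key unless $found;
    }
    return $comparisons;
}

print count_scan_comparisons(100), "\n";    # 0 + 1 + ... + 99 = 4950
```

For 1,000 unique lines the same count is 499,500, which is why the slowdown
only really shows up on large files.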

A better solution would be to use a hash:

   my %seen;
   while (<INPUTFILE>) { #begin while
       chomp;   # chomp only removes a trailing newline; chop removes any last character
       ($rc_name, $street_address, $city, $state, $rc_zip) = split (/\|/);

       # If we've already seen this do something.
       if( $seen{$rc_zip}++ ) {
              ....;
              next;
       }

       # It's a new zip.
       ....;
   } #end while
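
For something you can run as-is, here is a minimal sketch of the same idea.  The
sub name unique_zips and the sample records are made up for illustration; the
field layout is the pipe-delimited one from your split.

```perl
use strict;
use warnings;

# Return the zip codes from a pipe-delimited filehandle in first-seen
# order, skipping duplicates.  Each lookup in %seen takes roughly
# constant time, so the whole pass is O(N).
# (unique_zips is a hypothetical name, for illustration only.)
sub unique_zips {
    my ($fh) = @_;
    my (%seen, @zips);
    while (my $line = <$fh>) {
        chomp $line;
        my ($rc_name, $street_address, $city, $state, $rc_zip)
            = split /\|/, $line;
        next unless defined $rc_zip;    # skip blank or short lines
        next if $seen{$rc_zip}++;       # seen before, so skip it
        push @zips, $rc_zip;            # it's a new zip
    }
    return @zips;
}

# Made-up sample data, read through an in-memory filehandle.
my $data = join "\n",
    'Centre A|1 First St|Springfield|IL|62701',
    'Centre B|2 Second St|Springfield|IL|62701',
    'Centre C|3 Third St|Chicago|IL|60601';
open my $fh, '<', \$data or die "Cannot open in-memory file: $!";
print join(', ', unique_zips($fh)), "\n";    # 62701, 60601
close $fh;
```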


The other points you raised highlight the real problems with this code.

	* The data file is opened and read twice (neither time with an
	  explicit mode, so the filename itself can decide how the file is
	  opened, which is a security issue).  Fortunately reading it twice
	  is O(2N), which is effectively the same as O(N) in the scheme of
	  things.

	* The zip array and distance are not linked.  This would be a very
	  sensible place to use an array of hashes.

	* If we have to sort the array after creating it, why don't we just
	  use a hash to start with?
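
As a rough sketch of how those three points could fit together: read the file
once with an explicit read mode, and keep one record per zip in a hash so the
zip and its distance travel together.  The names load_centres and by_distance,
the $distance_of callback and the sample distances are all assumptions of
mine; I haven't seen how your program actually works out the distance.

```perl
use strict;
use warnings;

# Read the data once into a hash of records keyed by zip code.
# $source is a filename (or a reference to an in-memory string).
# $distance_of is a hypothetical callback standing in for however
# the real program calculates a distance for a zip code.
sub load_centres {
    my ($source, $distance_of) = @_;
    # Three-argument open: the '<' mode is fixed, whatever $source contains.
    open my $in, '<', $source or die "Cannot read data file: $!";
    my %centre;
    while (my $line = <$in>) {
        chomp $line;
        my ($rc_name, $street_address, $city, $state, $rc_zip)
            = split /\|/, $line;
        next unless defined $rc_zip;       # skip blank or short lines
        next if exists $centre{$rc_zip};   # keep the first record per zip
        $centre{$rc_zip} = {
            name     => $rc_name,
            zip      => $rc_zip,
            distance => $distance_of->($rc_zip),
        };
    }
    close $in;
    return \%centre;
}

# Return the records sorted by distance, nearest first.
sub by_distance {
    my ($centre) = @_;
    return sort { $a->{distance} <=> $b->{distance} } values %$centre;
}

# Made-up sample data and distances.
my $data = "Centre A|1 First St|Springfield|IL|62701\n"
         . "Centre B|9 Lake Rd|Chicago|IL|60601\n";
my %made_up = (62701 => 12.5, 60601 => 3.0);
my $centres = load_centres(\$data, sub { $made_up{ $_[0] } });
printf "%s %s %.1f\n", @{$_}{qw(name zip distance)}
    for by_distance($centres);
```

Because each record carries its own distance, sorting never needs a second
array kept in step with the first.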

If I have some spare time today, I might do a code review and suggest some
greater changes to make things work better.

All the best,

     Jacinta

-- 
   ("`-''-/").___..--''"`-._          |  Jacinta Richardson         |
    `6_ 6  )   `-.  (     ).`-.__.`)  |  Perl Training Australia    |
    (_Y_.)'  ._   )  `._ `. ``-..-'   |      +61 3 9354 6001        |
  _..`--'_..-_/  /--'_.' ,'           | contact at perltraining.com.au |
 (il),-''  (li),'  ((!.-'             |   www.perltraining.com.au   |



