[Courses] [Perl] Part 17: grep and map

Dan Richter daniel.richter at wimba.com
Fri Nov 28 13:01:50 EST 2003


LinuxChix Perl Course Part 17: grep and map

1) Introduction
2) grep - filters a list
3) map - transforms the values of a list
4) What "grep" and "map" have in common
5) Exercises
6) Answer to Previous Exercise
7) Past Information
8) Credits
9) Licensing

             -----------------------------------

1) Introduction

Before we finish looking at arrays in Perl, I thought we should take a 
quick look at two handy Perl functions: "grep" and "map". Both functions 
are technically operators because they allow you to do magical things 
that a function can't do, but syntactically they look like functions, so 
we refer to them as functions here.

Let me add that this will be the last e-mail before January. It's that 
busy-busy-busy time of year and I'm afraid that I have no more time to 
write about Perl than you have to read about it.

             -----------------------------------

2) grep - filters a list

The "grep" function returns only the elements of a list that meet a 
certain condition:

   @positive_numbers = grep($_ > 0, @numbers);

As you can see, each element is referred to as "$_". This (plus the fact 
that parentheses are optional) allows you to write commands that look 
similar to invocations of the Unix "grep" program:

   @non_blank_lines = grep /\S/, @lines;

In addition, you can specify a code block rather than a single condition:

   @non_blank_lines = grep { /\S/ } @lines;     # Equivalent to the above.

Obviously it doesn't matter in this case, but code blocks are helpful 
when you want a complex filter with multiple lines of code. The result 
of the code block is the result of the last statement executed:

   # All positive numbers can be used as exponents,
   # but negative exponents must be integers.
   @can_be_used_as_exponent = grep {
     if ( $_ < 0 ) {
       ! /\./;          # No decimal point -> integer.
     }
     else {
       1;               # Always true.
     }
   } @array;
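
If you want to try the three forms above for yourself, here is a small 
sketch that runs them against made-up data. The contents of "@numbers" 
and "@lines" are invented for illustration; everything else is taken 
from the examples above.

```perl
use strict;
use warnings;

# Sample data, invented for illustration.
my @numbers = (3, -1.5, 0, 7, -2);
my @lines   = ("first line", "", "   ", "last line");

# Expression form: note the comma after the condition.
my @positive_numbers = grep $_ > 0, @numbers;

# Block form: no comma after the closing brace.
my @non_blank_lines = grep { /\S/ } @lines;

# Multi-statement block: the last statement executed decides
# whether the element is kept.
my @can_be_used_as_exponent = grep {
  if ( $_ < 0 ) {
    ! /\./;          # No decimal point -> integer.
  }
  else {
    1;               # Always true.
  }
} @numbers;

print "Positive:  @positive_numbers\n";
print "Non-blank: ", join(' | ', @non_blank_lines), "\n";
print "Exponents: @can_be_used_as_exponent\n";
```

Note that "grep" keeps the elements in their original order, so -2 
still comes last in the third list.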

             -----------------------------------

3) map - transforms the values of a list

The "map" function applies a transformation to each element of a list 
and returns the result, leaving the original list unchanged (unless you 
mess it up; more on that in a moment).

   @lines_with_newlines = map( $_ . "\n", @lines_without_newlines);

As with "grep", each value in the list is referred to as "$_".

"map" can also take a block of code:

   # Replace "x@y.z" with "x at y dot z" to confuse spammers.
   @disguised_addresses = map {
       my $email = $_;
       $email =~ s/\@/ at /;
       $email =~ s/\./ dot /g;
       $email;
     } @email_addresses;

Note that it's important not to change "$_" because that would change 
the original "@email_addresses" (and you wouldn't get what you wanted in 
"@disguised_addresses").

"map" need not be a one-to-one mapping. For example, in the following code:

   @words   =   map   m/\b(\w+)\b/g,   @lines;    # Spaces are for clarity.

the regular expression splits a string into a list of words. The "map" 
function returns the result of joining all the small lists. If a line 
contains no words, the regular expression will return an empty list, and 
that's okay.
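
To see the difference between a one-to-one and a one-to-many mapping, 
here is a minimal sketch you can run. The sample strings are invented; 
the two "map" calls are the ones shown above.

```perl
use strict;
use warnings;

# Sample data, invented for illustration. The middle line is empty.
my @lines_without_newlines = ("the quick fox", "", "jumps over");

# One-to-one: each element yields exactly one result.
my @lines_with_newlines = map( $_ . "\n", @lines_without_newlines );

# One-to-many (or one-to-none): in list context the match returns
# every captured word, and the empty line contributes nothing.
my @words = map m/\b(\w+)\b/g, @lines_without_newlines;

print scalar(@lines_with_newlines), " lines in, ",
      scalar(@words), " words out\n";
print "Words: @words\n";
```

Three lines go in and five words come out, because "map" simply 
concatenates whatever list each evaluation returns.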

             -----------------------------------

4) What "grep" and "map" have in common

"grep" and "map" have a lot in common. They both "magically" take a 
piece of code (either an expression or a code block) as a parameter. You 
need to put a comma after an expression but shouldn't put a comma after 
a code block.

Changing "$_" in "grep" or "map" will change the original list. This 
isn't generally a good idea because it makes the code hard to read. 
Remember that "map" builds a list of results by evaluating an 
expression, NOT by setting "$_".

A consequence of this is that you should not use a bare "s///" as the 
expression in "map". The "s///" operator changes "$_" and returns the 
number of substitutions it made rather than the changed string, so you 
won't get what you would expect if you use it with "map" (and you 
CERTAINLY shouldn't use it with "grep").
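
Here is a short sketch of that pitfall next to the safe version from 
section 3. The addresses and the names "@scratch", "@counts" and 
"@disguised" are made up for this illustration.

```perl
use strict;
use warnings;

# Sample data, invented for illustration.
my @email_addresses = ('alice@example.com', 'bob@example.org');

# Pitfall: s/// edits $_ (an alias for the list element) and
# returns the number of substitutions, not the new string.
my @scratch = @email_addresses;       # A copy we can afford to damage.
my @counts  = map { s/\@/ at / } @scratch;

# Safe: copy $_, change the copy, and make the copy the last
# statement so that it becomes the result.
my @disguised = map {
  my $email = $_;
  $email =~ s/\@/ at /;
  $email =~ s/\./ dot /g;
  $email;
} @email_addresses;

print "Counts:    @counts\n";
print "Scratch:   ", join(', ', @scratch), "\n";
print "Disguised: ", join(', ', @disguised), "\n";
print "Originals: ", join(', ', @email_addresses), "\n";
```

The first "map" returns a list of counts and silently rewrites 
"@scratch"; the second leaves "@email_addresses" alone. If you are on 
Perl 5.14 or later, the "/r" modifier makes "s///" return the changed 
copy instead of modifying "$_", which avoids the temporary variable.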

             -----------------------------------

5) Exercises

a) Write some Perl code that, given a list of numbers, generates a list 
of square roots of those numbers. (The square root function in Perl is 
"sqrt".)

b) Modify the code to filter out any negative numbers. The result should 
be as though the negative numbers were never in the original list.

c) Write a Perl program that reads two files and outputs only the lines 
that are common to both of them.

             -----------------------------------

6) Answer to Previous Exercise

The following program reads the password file and outputs a list of 
usernames and UIDs, ordered by username:

   #!/usr/bin/perl -w
   use strict;

   open FILE, '< /etc/passwd' or die "Couldn't open file: $!";
   my @data = sort(<FILE>);
   close FILE;

   my @result;
   foreach (@data) {
     my @fields = split(/:/);    # Equivalent to split(/:/, $_)
     push @result, $fields[0] . ' -> ' . $fields[2];
   }

   print join("\n", @result) . "\n";

The above program is a nice review of Perl functions. But of course, 
There Is More Than One Way To Do It, and we could replace the bottom 
half with:

   foreach (@data) {
     s/^(.*?):.*?:(\d*):.*$/$1 -> $2/;
   }
   print @data;  # The loop edited @data in place; each line keeps its newline.

Or to make the program really short:

   $_ = join '', @data;
   s/^(.*?):.*?:(\d*):.*$/$1 -> $2/gm;
   print;        # Prints "$_"
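
Since this part is about "map", here is one more way to do it, as a 
sketch rather than the official answer. To keep it self-contained it 
uses a few invented lines in /etc/passwd format (the name "@lines" is 
mine) instead of reading the real file.

```perl
use strict;
use warnings;

# Invented sample lines in /etc/passwd format, so the sketch does
# not depend on the real file.
my @lines = (
  "root:x:0:0:root:/root:/bin/bash\n",
  "daemon:x:1:1:daemon:/usr/sbin:/bin/sh\n",
  "alice:x:1000:1000:Alice:/home/alice:/bin/bash\n",
);
my @data = sort @lines;

# Build the output list directly; $_ is only read, never changed.
my @result = map {
  my @fields = split /:/;          # Equivalent to split(/:/, $_)
  "$fields[0] -> $fields[2]";
} @data;

print join("\n", @result), "\n";
```

Unlike the "s///" versions, this one leaves "@data" untouched, which 
fits the advice in section 4.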

             -----------------------------------

7) Past Information

Part 16: Array Functions
      http://linuxchix.org/pipermail/courses/2003-November/001359.html

Part 15: More About Lists
      http://linuxchix.org/pipermail/courses/2003-November/001351.html

Part 14: Arrays
      http://linuxchix.org/pipermail/courses/2003-October/001350.html

Part 13: Perl Style
      http://linuxchix.org/pipermail/courses/2003-October/001349.html

Part 12: Side Effects with Perl Variables
      http://linuxchix.org/pipermail/courses/2003-October/001347.html

Part 11: Perl Variables
      http://linuxchix.org/pipermail/courses/2003-October/001345.html

Parts 1-10: see the end of:
      http://linuxchix.org/pipermail/courses/2003-October/001345.html

             -----------------------------------

8) Credits

Works cited:
a) man perlfunc
b) Kirrily Robert, Paul Fenwick and Jacinta Richardson's
    "Intermediate Perl", which you can find (along with their
    "Introduction to Perl") at:
    http://www.perltraining.com.au/notes.html

Thanks to Jacinta Richardson for fact checking.

             -----------------------------------

9) Licensing

This course (i.e., all parts of it) is copyright 2003 by Alice Wood and 
Dan Richter, and is released under the same license as Perl itself 
(Artistic License or GPL, your choice). This is the license of choice to 
make it easy for other people to integrate your Perl code/documentation 
into their own projects. It is not generally used in projects unrelated 
to Perl.



