[prog] perl string question

Thu Feb 27 20:59:14 EST 2003

oh! sorry... i got distracted by the regexes and mis-split your string!

if you want everything from Message-ID onward to go on the new line, you
can't rely on the colon followed by the space (that splits after
"Message-ID:" instead of before it)...

so i think this would work better:

$string = "^~00002439:0000007179:000016:Message-ID: <3E33C3BD.21CD8BB6 at att.net>";
$string =~ /^(.*:)(Message-ID.*)$/;
$first_part = $1;
$second_part = $2;
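
here is a small self-contained sketch of the same match, in case it helps
to run it. the "use strict" / "my" bits and the if/else check are my
additions, not something the script above requires: the idea is just that
$1 and $2 are only meaningful when the match actually succeeded. the
sample string is copied as it appears in this message (the list archive
shows " at " where the address had an "@").

```perl
use strict;
use warnings;

# Sample line from the original question. Single quotes keep perl from
# interpolating anything inside the string.
my $string = '^~00002439:0000007179:000016:Message-ID: <3E33C3BD.21CD8BB6 at att.net>';

my ($first_part, $second_part);

# Only trust the captures when the match succeeds; after a failed match,
# $1 and $2 still hold whatever the last successful match left in them.
if ($string =~ /^(.*:)(Message-ID.*)$/) {
    ($first_part, $second_part) = ($1, $2);
    print "$first_part\n$second_part\n";
} else {
    warn "no Message-ID header found in: $string\n";
}
```

the greedy .* first grabs the whole line and then backs off until the rest
of the pattern (a colon followed by "Message-ID") can match, which is why
the first capture ends right after the third colon.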

and now it occurs to me that you could actually use substitution as
well--

$string =~ s/^(.*:)(Message-ID.*)$/$1\n$2/;

so a substitution looks a lot like a plain match, but it has three /'s
rather than just two:  s///
what happens is that perl matches the string against the regular
expression between the first two /'s, and then it replaces the matched
portion of the string (which is the entire string in this case) with
what is between the second and third /.  just like we used $1 and $2
after the match in the plain regex, you can use $1 and $2 in the
replacement part of the substitution, and you can also insert a newline
between them.
this way you can just print $string and it will already contain the
line break...

the new string should be:
^~00002439:0000007179:000016:\nMessage-ID: <3E33C3BD.21CD8BB6 at att.net>
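
a runnable sketch of the substitution, under the same assumptions as the
sample string above (the $changed variable and the if/else are only there
for illustration). s/// returns a true value when it replaced something,
so you can check it the same way you would check a plain match:

```perl
use strict;
use warnings;

my $string = '^~00002439:0000007179:000016:Message-ID: <3E33C3BD.21CD8BB6 at att.net>';

# s/// returns the number of replacements made (false if there were none),
# and modifies $string in place.
my $changed = ($string =~ s/^(.*:)(Message-ID.*)$/$1\n$2/);

if ($changed) {
    # $string now holds two lines: the numeric prefix, then the header.
    print "$string\n";
} else {
    warn "line left unchanged: $string\n";
}
```

if a line has no Message-ID in it, the substitution simply does nothing and
$string is left as it was.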

ok, i hope that wasn't too much (and also that it was helpful and not
things you already knew!)

kristina

On Thu, Feb 27, 2003 at 07:44:04PM -0500, k.clair wrote:
- Hello...
- 
- I'm just going to do the string bit and not the printing bit because the
- string bit is long enough...
- 
- well you have a few options:
- 
- --you can still use split:
-     since you know that the string has 4 colons, you know how many
-     strings split will split your string into:
-     ($one, $two, $three, $four, $five) = split(/:/, $yourstring);
- 
-     then you will of course need to re-insert the colons back into the
-     string you print:
- 
-     print FILE "$one:$two:$three:\n";
-     print FILE "$four:$five\n";   # $five already starts with the space
- 
- --you can use a regular expression:
- 
-     regular expressions are "greedy" which means that if you use an
-     operator in a regular expression like + or *, it will capture as
-     much of the string as possible, so--
- 
-     $string = "^~00002439:0000007179:000016:Message-ID: <3E33C3BD.21CD8BB6 at att.net>";
-     $string =~ /^(.*:)(\s.*)$/;
-     $first_part = $1;
-         ### $first_part is "^~00002439:0000007179:000016:Message-ID:"
-     $second_part = $2;
-         ### $second_part is " <3E33C3BD.21CD8BB6 at att.net>"
- 
-     what's going on here is--
-     -- the parentheses in the regex mean that perl should remember what
-     the part of the string was that matched that part of the regex.
-     perl reserves variables $1, $2, $3, etc, that correspond to the
-     order of the parentheses in the regex.
-     -- the ^ at the beginning of the regex means that perl should start
-     at the beginning of the string (which it would anyway but i always
-     include it for good measure... sometimes it really is necessary)
-     -- the $ at the end of the regex means to match until the very end
-     of the string
-     -- the . means to match any character except a newline (spaces
-     included)
-     -- the * means to match the pattern before it (which in this case is
-     the .) as many times as possible, so in this case it means to match
-     any character as many times as possible
-     -- \s matches a single whitespace character (space, tab, newline)
-     -- so perl starts at the beginning of the string and says "match
-     every character until i find a colon followed by a space".  now, it
-     finds a colon after the "9", but it won't stop there because there
-     is not a space after it.  so the space after the last colon was
-     handy in this case... without the \s it would still split at the
-     last colon, though, because the * operator is "greedy", so perl
-     would have the regex eat up the entire string until the last colon,
-     putting as many characters into that first * as possible while still
-     being able to match the regular expression.
- 
-     well that was a bit of a crash course in regular expressions! let me
-     know if you have any questions about that!!
- 
-     usually you'd probably want to use a regex that's a little more
-     specific so that you can account for any erratic data that might be
-     in your file, but that should get you started...
- 
- hope that helps a little,
- kristina
-     
- 
- On Thu, Feb 27, 2003 at 05:22:51PM -0700, mc wrote:
- - Hi all!  I am writing a perl script that eventually will return a file
- - that can be used to pull some posting statistics from an nntp server. 
- - Amazing since I am teaching myself perl at the same time (can't wait for
- - the course to start here!)
- - 
- - Anyway.  I think I have made good progress in that I have the output
- - file to the point where I have all the header information I need for
- - each post on the server.  My problem is that the first line of the
- - headers contains two bits of info, and I would like to separate them. 
- - An example:
- - 
- - ^~00002439:0000007179:000016:Message-ID: <3E33C3BD.21CD8BB6 at att.net>
- - 
- - I would like to output this to my working file so it ends up as:
- - 
- - ^~00002439:0000007179:000016:
- - Message-ID: <3E33C3BD.21CD8BB6 at att.net>
- - 
- - I thought I could do this with the split function, but since I only want
- - to split it at the third ":" I am not sure how to write that.  I am also
- - lost as to how then to print the results to my file.  I am sure it is a
- - print statement, just a bit lost :)
- - 
- - Once I get this figured out, I am pretty sure I have the next part
- - figured out and am almost done. 
- - 
- - Any ideas or pointers to the best way to do this would be very much
- - appreciated!!!
- - 
- - 
- - -- 
- - mc
- - I haven't lost my mind,
- - It is backed up on disk somewhere.
- - 4M
- - 
- 
- ### my gpg key can be found here:
- http://www.klerp.net/gpgkey
- lynx --dump --source http://www.klerp.net/gpgkey | gpg --import
- Key fingerprint = 6B2F AB26 A8A9 DE4D 91FD  8E93 7A6B 387A 2795 714B
