[prog] perl string question
k.clair
k at klerp.net
Thu Feb 27 20:59:14 EST 2003
oh! sorry... i got distracted by regexes and mis-split your string!
if you want to have everything after Message-ID be on the new line, you
can't rely on the colon followed by the space...
so i think this would work better:
$string = "^~00002439:0000007179:000016:Message-ID: <3E33C3BD.21CD8BB6 at att.net>";
$string =~ /^(.*:)(Message-ID.*)$/;
$first_part = $1;
$second_part = $2;
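
A minimal sketch of the same match with a guard around it, since $1 and $2 only hold fresh values after a successful match. The warn message and the single-quoted sample line are illustrative choices, not part of the original script:

```perl
use strict;
use warnings;

# same sample line as above; single quotes so nothing gets interpolated
my $string = '^~00002439:0000007179:000016:Message-ID: <3E33C3BD.21CD8BB6 at att.net>';

my ($first_part, $second_part);
if ($string =~ /^(.*:)(Message-ID.*)$/) {
    # copy the captures right away, before another regex overwrites them
    ($first_part, $second_part) = ($1, $2);
    print "first:  $first_part\n";
    print "second: $second_part\n";
} else {
    warn "line did not look like a Message-ID header: $string\n";
}
```

If the line has no "Message-ID" in it, the else branch runs and the two variables stay undefined, so stale captures from an earlier match never leak through.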
and now it occurs to me that you could actually use substitution as
well--
$string =~ s/^(.*:)(Message-ID.*)$/$1\n$2/;
so the substitution looks a lot like a regular expression match, but it
has three /'s rather than just two: s///
what happens is that perl matches the string against the regular
expression between the first two /'s, and then it replaces the matched
portion of the string (which is the entire string in this case, because
of the ^ and the $) with what is between the second and third /. just
like we used $1 and $2 after the matching in the straight regex, you can
use the $1 and $2 variables in the replacement part of the substitution,
and you can also insert a newline (\n) between them.
this way, now you can just print $string and it will already contain the
line separation...
the new string should be:
^~00002439:0000007179:000016:\nMessage-ID: <3E33C3BD.21CD8BB6 at att.net>
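
Here is a small runnable sketch of the substitution, assuming the sample line from the original question. The $changed variable and the warn message are illustrative additions; s/// returns a true value only when it actually replaced something:

```perl
use strict;
use warnings;

my $string = '^~00002439:0000007179:000016:Message-ID: <3E33C3BD.21CD8BB6 at att.net>';

# the replacement side is interpolated like a double-quoted string,
# so \n becomes a real newline between the two captured parts
my $changed = ($string =~ s/^(.*:)(Message-ID.*)$/$1\n$2/);

if ($changed) {
    print "$string\n";    # prints the header on two lines
} else {
    warn "no Message-ID found, line left unchanged\n";
}
```

To write to the working file instead of the screen, the print would take the filehandle as in the earlier split example (print FILE "$string\n";).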
ok, i hope that wasn't too much (and also that it was helpful and not
things you already knew!)
kristina
On Thu, Feb 27, 2003 at 07:44:04PM -0500, k.clair wrote:
- Hello...
-
- I'm just going to do the string bit and not the printing bit because the
- string bit is long enough...
-
- well you have a few options:
-
- --you can still use split:
- since you know that the string has 4 colons, you know how many
- strings split will split your string into:
- ($one, $two, $three, $four, $five) = split(/:/, $yourstring);
-
- then you will of course need to re-insert the colons back into the
- string you print:
-
- print FILE "$one:$two:$three:\n";
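- ### $five still starts with the space that followed the last colon,
- ### so no extra space is needed here
- print FILE "$four:$five\n";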
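
A possible variation on the split approach, sketched under the assumption that every line has the same three leading fields: split takes an optional third argument (a limit), so asking for at most 4 pieces splits at the first three colons only and leaves "Message-ID: ..." in one piece. The names $rest and $header_line are made up for this example:

```perl
use strict;
use warnings;

my $string = '^~00002439:0000007179:000016:Message-ID: <3E33C3BD.21CD8BB6 at att.net>';

# limit of 4: split at the first three colons, keep everything else together
my ($one, $two, $three, $rest) = split(/:/, $string, 4);

# put the three colons back on the first line
my $header_line = "$one:$two:$three:";

print "$header_line\n";
print "$rest\n";
```

This way the colon and space inside the Message-ID part never have to be re-inserted by hand.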
-
- --you can use a regular expression:
-
- regular expressions are "greedy" which means that if you use an
- operator in a regular expression like + or *, it will capture as
- much of the string as possible, so--
-
- $string = "^~00002439:0000007179:000016:Message-ID: <3E33C3BD.21CD8BB6 at att.net>";
- $string =~ /^(.*:)(\s.*)$/;
- $first_part = $1;
- ### $first_part is "^~00002439:0000007179:000016:Message-ID:"
- $second_part = $2;
- ### $second_part is " <3E33C3BD.21CD8BB6 at att.net>"
-
- what's going on here is--
- -- the parentheses in the regex mean that perl should remember what
- the part of the string was that matched that part of the regex.
- perl reserves variables $1, $2, $3, etc, that correspond to the
- order of the parentheses in the regex.
- -- the ^ at the beginning of the regex means that perl should start
- at the beginning of the string (which it would anyway but i always
- include it for good measure... sometimes it really is necessary)
- -- the $ at the end of the regex means to match until the very end
- of the string
- -- the . means to match any character except a newline (spaces included)
- -- the * means to match the pattern before it (which in this case is
- the .) as many times as possible, so in this case it means to match
- any character as many times as possible
- -- \s matches a whitespace character (a space, a tab, etc)
- -- so perl starts at the beginning of the string and says "match
- every character until i find a colon followed by a space". now, it
- finds a colon after the "9", but it can't stop there because there
- is no space after it. so the space after the last colon was
- handy in this case... without the \s in the regex it would still
- split at the last colon, though, because the * operator is "greedy":
- perl has the regex eat up the entire string and then back off to the
- last colon, putting as many characters into that first .* as possible
- while still being able to match the whole regular expression.
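
To see the greediness described above in action, here is a small sketch comparing the greedy .* with its non-greedy form .*? (the non-greedy quantifier is not used in the explanation above and is shown here only for contrast; $greedy and $lazy are illustrative names):

```perl
use strict;
use warnings;

my $string = '^~00002439:0000007179:000016:Message-ID: <3E33C3BD.21CD8BB6 at att.net>';

# greedy: .* grabs as much as it can, so (.*:) ends at the LAST colon
my ($greedy) = $string =~ /^(.*:)/;

# non-greedy: .*? grabs as little as it can, so (.*?:) ends at the FIRST colon
my ($lazy) = $string =~ /^(.*?:)/;

print "greedy: $greedy\n";
print "lazy:   $lazy\n";
```

The greedy capture comes out as everything up to and including "Message-ID:", while the non-greedy one stops right after the first field.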
-
- well that was a bit of a crash course in regular expressions! let me
- know if you have any questions about that!!
-
- usually you'd probably want to use a regex that's a little more
- specific so that you can account for any erratic data that might be
- in your file, but that should get you started...
-
- hope that helps a little,
- kristina
-
-
- On Thu, Feb 27, 2003 at 05:22:51PM -0700, mc wrote:
- - Hi all! I am writing a perl script that eventually will return a file
- - that can be used to pull some posting statistics from an nntp server.
- - Amazing since I am teaching myself perl at the same time (can't wait for
- - the course to start here!)
- -
- - Anyway. I think I have made good progress in that I have the output
- - file to the point where I have all the header information I need for
- - each post on the server. My problem is that the first line of the
- - headers contains two bits of info, and I would like to separate them.
- - An example:
- -
- - ^~00002439:0000007179:000016:Message-ID: <3E33C3BD.21CD8BB6 at att.net>
- -
- - I would like to output this to my working file so it ends up as:
- -
- - ^~00002439:0000007179:000016:
- - Message-ID: <3E33C3BD.21CD8BB6 at att.net>
- -
- - I thought I could do this with the split function, but since I only want
- - to split it at the third ":" I am not sure how to write that. I am also
- - lost as to how then to print the results to my file. I am sure it is a
- - print statement, just a bit lost :)
- -
- - Once I get this figured out, I am pretty sure I have the next part
- - figured out and am almost done.
- -
- - Any ideas or pointers to the best way to do this would be very much
- - appreciated!!!
- -
- -
- - --
- - mc
- - I haven't lost my mind,
- - It is backed up on disk somewhere.
- - 4M
- -
- - _______________________________________________
- - Programming mailing list
- - Programming at linuxchix.org
- - http://mailman.linuxchix.org/mailman/listinfo/programming
-
- ### my gpg key can be found here:
- http://www.klerp.net/gpgkey
- lynx --dump --source http://www.klerp.net/gpgkey | gpg import
- Key fingerprint = 6B2F AB26 A8A9 DE4D 91FD 8E93 7A6B 387A 2795 714B