Operators and Comparators
Operator | Meaning |
** | Exponentiation |
**= | Exponentiation assignment |
() | Null list |
. | String concatenation |
.= | String concatenation assignment |
eq, ne, ge, gt, le, lt | String comparison |
x | String repetition |
.. | Range |
-f, -x, -d, etc. | Unary file test operators. They test properties of a file, such as whether it is a plain file (-f), is executable (-x) or is a directory (-d). |
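The operators in the table can be tried out with a minimal sketch such as the following. The variable names are chosen here for illustration only, and the current directory "." is assumed to be readable.

```perl
use strict;
use warnings;

my $power  = 2 ** 10;                    # exponentiation
my $greet  = "Hello" . ", " . "world";   # string concatenation
my $line   = "-" x 5;                    # string repetition
my @digits = (1 .. 5);                   # range
my @empty  = ();                         # null list

$power **= 2;                            # exponentiation assignment
$greet .= "!";                           # concatenation assignment

# String comparators compare character by character,
# so "10" lt "9" is true even though 10 < 9 is false.
my $string_less  = ("10" lt "9") ? 1 : 0;
my $numeric_less = (10 < 9)      ? 1 : 0;

# -d is a unary file test: true if its operand is a directory.
my $is_dir = (-d ".") ? 1 : 0;

print "$power $greet $line @digits $string_less $numeric_less $is_dir\n";
```

The string and numeric comparators should not be mixed up: eq, lt and friends are for strings, while ==, < and friends are for numbers.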
Control Structures
""        # false
"0"       # false
"00"      # true
$n - $n   # "0": false
undef     # undef (undefined) is converted to "": false.
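The truth rules listed above can be checked with a small sketch. The helper truth() and the hash %result are names invented here for illustration. The value "0.0" is added as one more case: like "00", it is a non-empty string other than "0", so it counts as true.

```perl
use strict;
use warnings;

# Return the string "true" or "false" for any scalar value.
sub truth {
    my ($value) = @_;
    return $value ? "true" : "false";
}

my $n = 7;
my %result = (
    empty_string => truth(""),
    zero_string  => truth("0"),
    double_zero  => truth("00"),
    difference   => truth($n - $n),   # numeric 0, converted to "0"
    undefined    => truth(undef),
    zero_point   => truth("0.0"),     # extra case, not in the list above
);

foreach my $name (sort keys %result) {
    print "$name: $result{$name}\n";
}
```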
unless (some-condition)
{ action;
}
is equivalent to
if (! some-condition)
{ action;
}
DAILY_WORK:
while (1)
{ while (! &time_up_for_the_day)
  { last if &boss_let_go_early;
    last DAILY_WORK if &win_lottery;
    &work_a_while;
    redo if &overtime_not_over;
  } continue
  { &play_a_game_secretly_to_relax;
  }
}
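The example above relies on subroutines that are not defined in these notes. A self-contained sketch of the same idea, with loop bounds and the array @log made up for illustration, shows how next, last and a labeled last differ:

```perl
use strict;
use warnings;

my @log;

DAY:
foreach my $day (1 .. 3) {
    foreach my $hour (1 .. 4) {
        next if $hour == 2;                # skip this hour, go on to the next
        last if $day == 1 && $hour == 3;   # leave the inner loop only
        last DAY if $day == 3;             # leave both loops at once
        push @log, "d${day}h${hour}";
    }
}

print "@log\n";
```

Without a label, last and next always apply to the innermost enclosing loop.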
&work unless &too_tired;
&work if &having_fun;
&work until &too_tired;
&work while &having_fun;
for (stmt_1; stmt_2; stmt_3)
{ stmt_4;
}
is equivalent to:
stmt_1;
while (stmt_2)
{ stmt_4;
stmt_3;
}
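As a concrete instance of this equivalence, the following sketch fills two arrays, once with a for loop and once with the corresponding while loop. The array names and the loop bound are assumptions made for the example.

```perl
use strict;
use warnings;

my @from_for;
for (my $i = 0; $i < 3; $i++) {
    push @from_for, $i * $i;
}

my @from_while;
my $j = 0;                        # stmt_1
while ($j < 3) {                  # stmt_2
    push @from_while, $j * $j;    # stmt_4
    $j++;                         # stmt_3
}

print "@from_for | @from_while\n";
```

One difference is worth keeping in mind: inside the for loop, next still executes stmt_3, whereas in the plain while form it would skip it.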
Note: The variable $num is set to each element of @num_list in turn.
foreach $num (@num_list)
{ print "$num\n";
}
is the same as the following:
foreach (@num_list)
{ print "$_\n";
}
or
foreach (@num_list)
{ print;
print "\n";
}
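One detail that the loops above do not show is that the loop variable is an alias for the current element rather than a copy. A minimal sketch, with made-up data:

```perl
use strict;
use warnings;

my @num_list = (1, 2, 3);

# The loop variable is an alias for each element, not a copy,
# so assigning to it changes the array itself.
foreach my $num (@num_list) {
    $num *= 10;
}

# Without a named variable, $_ plays the same role.
my $total = 0;
foreach (@num_list) {
    $total += $_;
}

print "@num_list total=$total\n";
```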
do
{ &i_like_it_this_way;
print "interesting stuff\n";
}
do
{ &work;
} until &tired;
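A do block followed by until always runs its body at least once, because the condition is tested after the body. The sketch below contrasts this with the plain until modifier, using a made-up flag $tired that is already true before the loop starts.

```perl
use strict;
use warnings;

my $tired = 1;                 # already "tired" before any work is done

my $done_do = 0;
do {
    $done_do++;
} until $tired;                # condition checked after the first pass

my $done_until = 0;
$done_until++ until $tired;    # plain modifier: condition checked first

print "do-until ran $done_do time(s), plain until ran $done_until time(s)\n";
```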
if (&error) { die "ay-ya-ya\n"; }
die "ay-ya-ya\n" if &error;
&error && die "ay-ya-ya\n";
unless (&kiss_me) { &leave_me; }
&leave_me unless &kiss_me;
&kiss_me || &leave_me;
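The && and || forms work because Perl evaluates the right-hand side only when the left-hand side does not already decide the result. The following sketch records which subroutines actually run. All four subroutines are stand-ins written for this example, and both left-hand sides are assumed to return false.

```perl
use strict;
use warnings;

my @calls;
sub error_found { push @calls, "error_found"; return 0; }
sub report      { push @calls, "report";      return 1; }
sub kiss_me     { push @calls, "kiss_me";     return 0; }
sub leave_me    { push @calls, "leave_me";    return 1; }

error_found() && report();     # left side false: report() is skipped
kiss_me()     || leave_me();   # left side false: leave_me() runs

print "@calls\n";
```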
Exercise 1:
Consider the following C statement:
switch (ch)
{ case 'a': a_ct++; break;
case 'e': e_ct++; break;
case 'i': i_ct++; break;
case 'o': o_ct++; break;
case 'u': u_ct++; break;
default : other_ct++; break;
}
Implement the same statement in Perl in two ways.
Regular Expressions
if (/good/)
{ print;
}
# if $_ contains the pattern "good", then $_ is printed.
Note:
Most other backslashed characters match themselves.
Exercise 2:
Find the single character pattern that matches the following description.
(a) all vowels,
(b) all non-vowels,
(c) all characters except lower case letters (other than 'a' to 'z'),
(d) the backspace character,
(e) carriage return or form feed,
(f) the character ^,
(g) any character in my name ("kwok-bun Yue").
/ab1/ matches "ab1".
/a[aeiou]c/ matches "aac", "aec", "aic", "aoc" and "auc".
/a.a/ matches an "a", followed by any character (except newline) and then another "a".
/xy{2,4}/ matches "xyy", "xyyy" and "xyyyy".
/x+y*x+/ matches one or more "x", followed by zero or more "y", followed by one or more "x".
/abc|ace/ matches "abc" or "ace".
/[abc]{4}/ matches a string of four characters, each of which is "a", "b" or "c".
Consider the string "abccccbaccccba".
The pattern /a.*ba/ matches the entire string, not "abccccba".
/a.*ba.*/ matches the entire string, with the first ".*" matching "bccccbacccc".
/(.)a\1/ matches "aaa", "bab", "xax", "5a5", etc., but not "5a6", etc.
/(.*)a\1/ matches "abaab", "a", "cidacid", etc.
/([abc])x([de])y\2x\1/ matches "axdydxa", etc.
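The ".*" in these patterns is greedy: it takes as many characters as it can while still letting the rest of the pattern match. A short sketch makes this visible and, for contrast, uses the non-greedy form ".*?", which takes as few characters as possible. The variable names are made up for the example.

```perl
use strict;
use warnings;

my $text = "abccccbaccccba";

my ($greedy) = $text =~ /(a.*ba)/;     # .* takes as much as it can
my ($lazy)   = $text =~ /(a.*?ba)/;    # .*? takes as little as it can

print "greedy: $greedy\n";
print "lazy:   $lazy\n";
```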
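The backreference patterns can be tested against a few candidate strings. In this sketch the list @candidates and the two result hashes are assumptions made for illustration; a 1 means the pattern matched.

```perl
use strict;
use warnings;

my @candidates = ("bab", "5a5", "5a6", "axdydxa", "axdyexa");

my %same_both_sides;   # results for /(.)a\1/
my %mirrored;          # results for /([abc])x([de])y\2x\1/
foreach my $s (@candidates) {
    $same_both_sides{$s} = ($s =~ /(.)a\1/)               ? 1 : 0;
    $mirrored{$s}        = ($s =~ /([abc])x([de])y\2x\1/) ? 1 : 0;
}

foreach my $s (@candidates) {
    print "$s: $same_both_sides{$s} $mirrored{$s}\n";
}
```

Here \1 and \2 must match exactly the text that the first and second pair of parentheses matched, not merely the same sub-pattern.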
Example:
/\bair\b/ matches " air&", "+air+", "air", etc., but not "hair", "airs", etc.
/\bair\B/ matches "airs", "+airing", etc., but not "air&", "+air", "air", etc.
/^air/ matches "air", "airs", etc., but not "hair".
/^air$/ matches "air" only.
/a|bc*/ is equivalent to /(a)|((b)(c*))/
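The anchor examples can be verified by filtering one list of sample strings with each pattern. The helper matching() and the sample list are written for this sketch and are not part of the notes.

```perl
use strict;
use warnings;

# Return the strings from @strings that match the pattern.
sub matching {
    my ($pattern, @strings) = @_;
    return grep { $_ =~ $pattern } @strings;
}

my @samples = ("air", " air&", "+air+", "hair", "airs", "+airing");

my @whole_word = matching(qr/\bair\b/, @samples);   # boundary on both sides
my @word_start = matching(qr/\bair\B/, @samples);   # boundary before, none after
my @line_start = matching(qr/^air/,    @samples);   # anchored at the beginning
my @exact      = matching(qr/^air$/,   @samples);   # anchored at both ends

print join("|", @whole_word), "\n";
print join("|", @word_start), "\n";
print join("|", @line_start), "\n";
print join("|", @exact), "\n";
```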
Exercise 3:
Give the Perl pattern for each of the following:
(a) either “abcde” or “edcba”.
(b) at least two b followed by at least seven c.
(c) any number of *, followed by any number of $, followed by any number
of +.
(d) a ^ at the beginning of a string, followed by three to four a.
(e) any ten characters, including newline, just before the end of the
string.
(f) any string in which the same word appears two or more times in a row.
A word is defined as a sequence of alphanumeric characters or '_', enclosed by white
spaces or by the beginning or end of the string.
if (/life is (.*)\./)
{ print $1;
}
if (@s = /love is (.*) and hatred is (.*)\./)
{ print "$s[0], not $s[1]";
}
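Both ways of retrieving the parenthesized parts of a match can be tried on a fixed string. The sentence below is made up for the example.

```perl
use strict;
use warnings;

my $sentence = "love is patient and hatred is blind.";

my $first;
if ($sentence =~ /love is (.*) and/) {
    $first = $1;       # text matched by the first pair of parentheses
}

# In list context a match returns all captured substrings.
my @s = $sentence =~ /love is (.*) and hatred is (.*)\./;

print "$first\n";
print "$s[0], not $s[1]\n";
```

$1 should only be used after a successful match, which is why it is read inside the if block.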
# Print all lines in the file example.dat that contain "[n]",
# where n is given by the user.
print "What is the index? ";
$index = <STDIN>;
chomp($index);    # remove the trailing newline
open(IN, "example.dat") || die "cannot open example.dat: $!\n";
while (<IN>)
{ if (/\[$index\]/)
{ print;
}
}
close(IN);
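A variable interpolated into a pattern is itself treated as a pattern, so a user who types "." or "*" gets regular expression behavior rather than a literal match. One possible safeguard, sketched here on made-up sample lines instead of a file, is to quote the variable with \Q...\E.

```perl
use strict;
use warnings;

my @lines = ("a[1] = 5;", "a[10] = 6;", "a[.] = 7;", "b[1*] = 8;");

my $index = "1";
my @plain = grep { /\[$index\]/ } @lines;

# If the index may contain regex metacharacters, \Q...\E quotes them
# so that they are matched literally.
$index = ".";
my @unquoted = grep { /\[$index\]/ }     @lines;   # "." matches any character
my @quoted   = grep { /\[\Q$index\E\]/ } @lines;   # "." matches only a dot

print scalar(@plain), " ", scalar(@unquoted), " ", scalar(@quoted), "\n";
```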
Exercise 4:
Write a Perl program that reads the file "a.a" and prints out all lines that contain all of the characters 'a', 'c', 'e' and 'g'.
Matching Operators
/^\/usr\/bin\/perl/
m#^/usr/bin/perl#
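With the m prefix, almost any character can serve as the pattern delimiter, which avoids escaping every slash in a path. A small sketch, using a made-up string $path:

```perl
use strict;
use warnings;

my $path = "/usr/bin/perl -w";

my $with_slashes = ($path =~ /^\/usr\/bin\/perl/) ? 1 : 0;   # every / escaped
my $with_hashes  = ($path =~ m#^/usr/bin/perl#)   ? 1 : 0;   # no escaping needed
my $with_braces  = ($path =~ m{^/usr/bin/perl})   ? 1 : 0;   # paired delimiters also work

print "$with_slashes $with_hashes $with_braces\n";
```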
# Print all lines from the standard input file that contain the
# string "perl" somewhere in the line, ignoring case.
while (<STDIN>)
{ if (/perl/i)
{ print;
}
}
while ($line = <STDIN>)
{ if ($line =~ /perl/i)
{ print $line;
}
}
...
print "Do you want to quit? [y/n]";
if (<STDIN> =~ /y/i)
{ die "bye, dear.";
}
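The pattern /y/i accepts any reply that contains a "y" anywhere, which includes replies such as "no way". A possible refinement, offered only as a sketch, anchors the pattern at the start of the reply. The subroutine wants_to_quit() and the list of replies are invented for this example.

```perl
use strict;
use warnings;

# Return 1 if the reply starts with "y" or "Y", ignoring leading blanks.
sub wants_to_quit {
    my ($reply) = @_;
    return ($reply =~ /^\s*y/i) ? 1 : 0;
}

my @replies = ("y\n", "Yes\n", "n\n", "no way\n");
my (@anywhere, @anchored);
foreach my $reply (@replies) {
    push @anywhere, ($reply =~ /y/i) ? 1 : 0;   # pattern used in the text
    push @anchored, wants_to_quit($reply);      # anchored variant
}

print "@anywhere\n";
print "@anchored\n";
```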
Substitution and other common operators using regular expressions
$_ = "I love you.";
s/love/hate/;
print; # prints "I hate you."
$_ = "I love you and you love me.";
s/love/hate/;
print; # prints "I hate you and you love me."
$_ = "I love you and you love me.";
s/love/hate/g;
print; # prints "I hate you and you hate me."
The following commands run Perl directly from the command line. The switch -e supplies the program text as a command-line argument. The switch -n wraps that program in a loop that reads each line of the input files into $_ without printing it.
$perl -ne "s/love/hate/g; print;" love_letter.dat
$perl -ne "s/\$i\b/$count/g; print;" < ex1.pl > ex2.pl
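The substitution operator also returns the number of replacements it made, which is handy for counting or for testing whether anything changed. A minimal sketch on strings bound with =~ instead of $_, with variable names chosen for illustration:

```perl
use strict;
use warnings;

my $once = "I love you and you love me.";
my $all  = $once;

my $changed_once = ($once =~ s/love/hate/);     # first occurrence only
my $changed_all  = ($all  =~ s/love/hate/g);    # every occurrence

# With no match the string is left alone and the result is false.
my $none = "I like you.";
my $changed_none = ($none =~ s/love/hate/g) ? 1 : 0;

print "$changed_once: $once\n";
print "$changed_all: $all\n";
print "$changed_none: $none\n";
```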
$line = "merlyn::118:10:Randal:/home/merlyn:/usr/bin/perl";
@fields = split(/:/,$line);
# now @fields is ("merlyn","","118","10","Randal",
#                 "/home/merlyn","/usr/bin/perl")
Exercise 5:
Write a piece of Perl code that reads the file "some.file" and breaks its contents into tokens. A token is a string of non-white-space characters; tokens are separated by white spaces. The tokens should be stored in the variable @words.
$glue = ":";
@list = ("12", "05","59");
print join($glue, @list); # print "12:05:59"
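split and join are roughly inverses of each other. The sketch below splits a colon-separated line, joins it again, and also shows how split treats empty fields. The short string "a:b::" is made up to illustrate the trailing-field rule.

```perl
use strict;
use warnings;

my $line   = "merlyn::118:10:Randal:/home/merlyn:/usr/bin/perl";
my @fields = split(/:/, $line);
my $count  = scalar(@fields);        # the empty second field still counts

# join rebuilds the line as long as no field contains the separator.
my $rebuilt = join(":", @fields);

# Trailing empty fields are dropped unless a negative limit is given.
my @short = split(/:/, "a:b::");
my @full  = split(/:/, "a:b::", -1);

print "$count fields, rebuilt ok: ", ($rebuilt eq $line ? "yes" : "no"), "\n";
print scalar(@short), " vs ", scalar(@full), "\n";
```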
The first command swaps x and y. The second changes all lower case letters to upper case.
$perl -ne 'tr/xy/yx/; print;' < e1.dat > e2.dat
$perl -ne 'tr/a-z/A-Z/; print;' < emp1.dat > emp2.dat
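The same two translations can be tried inside a program. tr works character by character, and it returns the number of characters it matched, so with an empty replacement list it can be used to count characters without changing them. The sample text is an assumption made for this sketch.

```perl
use strict;
use warnings;

my $text = "x marks the spot, y not";

(my $swapped = $text) =~ tr/xy/yx/;      # x -> y and y -> x in one pass
(my $upper   = $text) =~ tr/a-z/A-Z/;    # lower case -> upper case

# An empty replacement list leaves the string unchanged;
# the return value is the number of matching characters.
my $vowels = ($text =~ tr/aeiou//);

print "$swapped\n$upper\n$vowels\n";
```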
Exercise 6:
Write a Perl program that removes all comments from an Ada program, "ex1.ada".
In Ada, anything after -- in a line is discarded by the compiler.
Print out the Ada program without comments to the standard output file.
Suggested Solutions to Classwork Exercises
(1) For example,
{ $ch eq 'a' && ($a_ct++, last);
$ch eq 'e' && ($e_ct++, last);
$ch eq 'i' && ($i_ct++, last);
$ch eq 'o' && ($o_ct++, last);
$ch eq 'u' && ($u_ct++, last);
$other_ct++;
}
or
{ ($a_ct++, last) if ($ch eq 'a');
($e_ct++, last) if ($ch eq 'e');
($i_ct++, last) if ($ch eq 'i');
($o_ct++, last) if ($ch eq 'o');
($u_ct++, last) if ($ch eq 'u');
$other_ct++;
}
or
S1:
{ $ch eq 'a' && do { $a_ct++; last S1; };
$ch eq 'e' && do { $e_ct++; last S1; };
$ch eq 'i' && do { $i_ct++; last S1; };
$ch eq 'o' && do { $o_ct++; last S1; };
$ch eq 'u' && do { $u_ct++; last S1; };
$other_ct++;
}
(2)
(a) [aeiouAEIOU]
(b) [^aeiouAEIOU]
(c) [^a-z]
(d) \010
(e) [\r\f]
(f) \^
(g) [kwo\-bunYe]
(3)
(a) /abcde|edcba/
(b) /b{2,}c{7,}/
(c) /\**\$*\+*/
(d) /^\^a{3,4}/
(e) /(.|\n){10}$/
(f) /\b(\w*)\b(.*\b\1\b)+/
(4) For example,
#!/usr/bin/perl
open(IN, "a.a");
while (<IN>)
{ if ((/a/) && (/c/) && (/e/) && (/g/))
{ print;
}
}
(5) For example,
# Decompose a file into tokens with
# white spaces as delimiters.
open(IN, "some.file");
while (<IN>)
{ # split(' ') skips leading white space, so no empty token is produced.
push(@words, split(' '));
}
(6) For example,
#!/usr/bin/perl
# This does not take care of the problem of
# -- inside a string.
open(IN, "ex1.ada");
while (<IN>)
{ while (/(.*)--/)
{ $_ = $1 . "\n";
}
print;
}
or simply:
# This does not take care of the problem of
# -- inside a string.
# Single quotes keep a Unix shell from expanding $1 and $_.
perl -ne 'chomp; s/^(.*?)--.*/$1/; print qq($_\n);' ex1.ada