Perl

Perl 5

Overview

Variable Syntax:

Scalar:           $ (single value)
Array:           @ (list of values, keyed by number)
Hash:            % (group of values, keyed by string)
Subroutine:    & (callable code)
Typeglob:      * (anything and everything)
Case is significant in Perl
You must chmod a file to make it executable
A module is an independent script which can be called from any other script. A pragma is a module which contains some sort of directive that must be obeyed.
From the command line, perl fileName runs a script you have created. perl -e '...code...;' runs Perl code straight from the command line.
The -e in the above example is a switch. perl -h shows you a whole list of switches. For example, -ne puts a while (<>) {...} construct around whatever code you supply (that code being made allowable by the e). -w turns on warnings (to help you write better code). -d turns on the built in Perl debugger (has a big set of its own commands, including s to step to next line, /xx/ to search code for xx, and much more...)
In UNIX, the first line of every Perl script should be: #!/usr/bin/perl (or wherever perl is installed on the machine) - this removes the need to call Perl at the command line
Any command line option can be called within a script itself, by adding that option to the end of #!/usr/bin/perl (#!perl is sufficient in Windows)
@ARGV is a magic array that holds all input variables passed through the command line (eg. "myScript.pl arg1 arg2")
A filehandle is a name you give to a file, device, socket, etc, as a working reference. Create a filehandle by using the open function. Note that, if the file path contains backslashes, you will need to either escape them, or use forward slashes in their place:

open(FILEHANDLE, "actualFilePath");                 -- reads existing file
open(FILEHANDLE, ">actualFilePath");               -- create file and write to it
open(FILEHANDLE, ">>actualFilePath");             -- append to existing file
open(FILEHANDLE, "| output-pipe command");   -- set up an output filter
open(FILEHANDLE, "input-pipe command |");     -- set up an input filter

STDIN and STDOUT are predefined filehandles - a program's normal input and output channels. A line read from STDIN always has a line break tacked onto its end; you can chomp the value to remove the line break (chomp removes only a trailing newline, whereas chop removes the last character fed to it, whatever it is)
Close a filehandle with: close(fileHandle);
Read a line from a filehandle by enclosing it in <>, eg. $var = <STDIN>;
Use while ($line = <fileHandle>) { ... } to loop through every line of a file. Even shorter, if you drop the $line = part, then the value of each looping will automatically be assigned to $_. Even shorter, you can do: statement while (<fileHandle>)
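
The line-reading loop above can be sketched as follows. This is only an illustration: the scratch file comes from the core File::Temp module, and the three-argument open with a lexical filehandle ($fh) is an assumed modern spelling, not something these notes prescribe.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a scratch file to read back (assumption: any text file would do).
my ($out, $path) = tempfile();
print $out "first\nsecond\nthird\n";
close($out) or die "close failed: $!";

# Read it one line at a time; chomp strips the trailing newline.
open(my $fh, '<', $path) or die "cannot open $path: $!";
my @lines;
while (my $line = <$fh>) {
    chomp $line;
    push @lines, $line;
}
close($fh);
unlink $path;

print scalar(@lines), " lines, last is '$lines[-1]'\n";
```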
To quickly open a file, back it up, make changes to it, and write those changes to the original filename:

@ARGV = ("pathName");
$^I = ".bk";
while (<>) { print; }
- Explanation: setting $^I turns on in-place editing. When <> opens the first element in @ARGV, Perl renames that file to its existing name plus ".bk" at the end. Next, the file is read line by line into $_. Meanwhile, a filehandle called ARGVOUT is opened, and print is targeted upon it, instead of STDOUT as usual. ARGVOUT is opened on the original file name, so all changes are made in the "new" file.
- This can be invoked in one line with: perl -i.bk -ne "print if /regex/" pathName. Recall that -e supplies the code to run, -n adds the loop, and -i.bk sets $^I. Note that this particular one-liner keeps only the lines that match regex.
opendir(FILEHANDLE, "actualPath") can be used like the open function, but to open whole directories instead of files. Then you can read the directory contents with: while ($file = readdir FILEHANDLE), which reads each file name into $file. Note: perl always assumes you are in the directory the script was run from (or the directory designated by the last chdir). So, to test/access $file from within the loop, you'll need to chdir, or explicitly state the file's path before $file (eg. "actualPath/$file"), etc. This is not an issue if you are simply listing the files, but not doing anything with them.
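
A small sketch of the opendir/readdir pattern, including the path prefix needed for file tests. The scratch directory and the file names a.txt and b.txt are made up for the demonstration; File::Temp is a core module.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Scratch directory with two files (assumption, for a self-contained demo).
my $dir = tempdir(CLEANUP => 1);
for my $name ('a.txt', 'b.txt') {
    open(my $fh, '>', "$dir/$name") or die "cannot create $name: $!";
    print $fh "data\n";
    close($fh);
}

opendir(my $dh, $dir) or die "cannot open $dir: $!";
my @found;
while (defined(my $file = readdir $dh)) {
    # Prefix the directory: a bare $file would be looked up in the cwd.
    next unless -f "$dir/$file";     # also skips "." and ".."
    push @found, $file;
}
closedir($dh);

print join(', ', sort @found), "\n";   # a.txt, b.txt
```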

Arithmetic Operators:

modulus:	`%`
exponentiation:	`**`
repetition:	`x` (right value equals number of repeats)
concatenation:	`.`

$var++ and $var-- increment and decrement $var by 1 (after its value has been used in the surrounding expression). Putting the operators before the $var alters its value before it is used.
$var .= $var2 appends $var2 to the end of $var. This works with the other operators, as well.

Logical Operators:

OR	`or`	lower precedence (below assignment and list operators); the whole expression on the left is evaluated first, and the right side is evaluated only if the left is false
	`\|\|`	higher precedence; binds only to the terms immediately on either side of it, and evaluates the right term only if the left one is false. Because it binds tighter than a list operator's argument list, this breaks: `open FILE, $file \|\| print "failed";` is parsed as `open FILE, ($file \|\| print "failed");` (the error message doesn't print, even if the file does not exist, because `$file` evaluates to true). Use `or` here, or put parentheses around the arguments to `open`
AND	`&&`	higher precedence; evaluates the left side and, if it is false, returns it without evaluating the right side
	`and`	lower precedence than `&&`. See OR for details.
	`&`	bitwise AND, not a logical operator; both operands are always evaluated (no short-circuit), even if the left one is false. Remember, in Perl, evaluation means something could change (the value of a variable, for example)
XOR	`xor`	Either/Or; returns false unless one side is true, and one side is false
NOT	`not`	lower precedence
	`!`	higher precedence
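
The precedence difference between `||` and `or` can be seen with a tiny helper. The sub name show is hypothetical and exists only to report which arguments it received.

```perl
use strict;
use warnings;

# Hypothetical helper that reports the arguments it received.
sub show { return "got:@_" }

my $tight = (show 0 || 'x');   # parsed as show(0 || 'x')
my $loose = (show 0 or 'x');   # parsed as show(0) or 'x'

print "$tight\n";   # got:x
print "$loose\n";   # got:0
```

With `||` the fallback is swallowed into the argument list; with `or` the call completes first and the fallback applies to its result.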

Comparison Operators: == is "equal", != is "not equal", < is "less than", > is "greater than", <= is "less than or equal", >= is "greater than or equal", <=> is "0 if equal, 1 if former is greater, -1 if latter is greater" - these operators should be used to compare numeric values. For strings, use eq, ne, lt, gt, le, ge, and cmp (which serve the same purposes).
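
A short illustration of why the numeric and string comparison operators are kept separate, using sort as the test bed (the sample numbers are arbitrary):

```perl
use strict;
use warnings;

my @nums = (10, 9, 100, 1);

# The default sort compares as strings, so "10" sorts before "9".
my @as_strings = sort @nums;
# <=> compares numerically.
my @as_numbers = sort { $a <=> $b } @nums;

print "@as_strings\n";   # 1 10 100 9
print "@as_numbers\n";   # 1 9 10 100

# cmp is the string counterpart of <=>.
my $order = 'apple' cmp 'banana';
print "$order\n";        # -1
```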

File Test Operator:

`-e fileName`	file exists
`-r fileName`	file is readable
`-w fileName`	file is writeable
`-d fileName`	file is a directory
`-f fileName`	file is a regular file
`-T fileName`	file is a text file

Truth in Perl: Any string is true except for "" and "0". Any number is true except for 0. Any reference is true. Any undefined value is false. "A" == "B" evaluates to true, "A" eq "B" evaluates to false; same for > (gt) and < (lt).
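
The truth rules above, checked in code. The sample values are chosen for illustration; note that strings such as '0.0' and '00' are true because only "" and "0" are false strings.

```perl
use strict;
use warnings;

# Values that are false in a boolean test.
my @false = grep { !$_ } ('', '0', 0, undef);
# Values that are true, including some surprising ones.
my @true = grep { $_ } ('0.0', '00', ' ', 'a', 1);
print scalar(@false), " false, ", scalar(@true), " true\n";   # 4 false, 5 true

# eq compares as strings. == would convert both sides to the number 0
# (and warn under "use warnings"), which is why "A" == "B" is true.
my $same_as_string = ('A' eq 'B') ? 'yes' : 'no';
print "A eq B: $same_as_string\n";   # A eq B: no
```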
Within a loop, you can call next to skip to the next loop iteration.
Within a loop, you can call last (or last if ...) to exit the loop immediately, as if the loop condition had returned false. Note that last applies to the enclosing loop, not to a plain if block.
To name a loop: LOOPNAME: while (...) {...} - you can then last, next, and redo that loop specifically (in multilevel loops) - eg. last LOOPNAME if ...;
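
A sketch of loop labels in a nested loop. The label ROW and the sample data are invented for the example.

```perl
use strict;
use warnings;

my @rows = ([1, 2, 3], [4, -1, 6], [7, 8, 9]);
my @seen;

ROW: for my $row (@rows) {
    for my $cell (@$row) {
        # Skip the rest of this row as soon as a negative value appears.
        next ROW if $cell < 0;
        # Leave both loops once we reach 8.
        last ROW if $cell == 8;
        push @seen, $cell;
    }
}

print "@seen\n";   # 1 2 3 4 7
```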
$! is a predefined variable: it contains the most recent operating system error (the message in a string context, the error number in a numeric context)
CTRL-C will interrupt a Perl program
Alias one symbol to another with typeglob: *var = *var2 results in everything named var being a synonym for everything named var2. You can also assign a reference to the glob to alias just one type: *var = \$var2 makes only the scalar vars synonyms, not also @ and %
Save a filehandle with typeglob: eg. $var = *STDOUT
Regular Expressions:
- if ($var =~ /someText/ ) returns true if someText exists within $var
- s/text1/text2/ replaces the first occurrence of text1 with text2; add the g modifier (s/text1/text2/g) to replace all occurrences. This can also be written as s{text1}{text2}
- Within the two slashes, \s represents a single whitespace character, \w represents a single word character (letter, digit, or underscore), \d represents a digit, and \b represents a word boundary (imaginary position, 0 chars wide) - this enables you to find "Fred" as opposed to "Frederick"
- \w+ represents a whole word.
- A period is a wildcard
- Two numbers in braces, separated by a comma (eg. /\d{7,11}/) match on their associated character if it repeats between their number of times. Omit the maximum for no max. Omit the comma and max to limit by exactly the first number
- * next to a character matches zero or more of the character. ? matches zero or one.
- [^...] matches any single character that is NOT listed after the ^ (eg. [^abc] matches anything except a, b, or c)
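
A few of the regex basics above in action. The sample sentence is made up; the `() =` trick for counting matches is a common idiom rather than something from these notes.

```perl
use strict;
use warnings;

my $text = "Fred and Frederick met Fred";

# \b keeps "Fred" from matching inside "Frederick".
my $count = () = $text =~ /\bFred\b/g;

# Without /g only the first occurrence is replaced; with /g, all of them.
(my $first_only = $text) =~ s/\bFred\b/Wilma/;
(my $all        = $text) =~ s/\bFred\b/Wilma/g;

print "$count\n";        # 2
print "$first_only\n";   # Wilma and Frederick met Fred
print "$all\n";          # Wilma and Frederick met Wilma
```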

Details

Arrays:

$#aryName retrieves the last index of an array. Assigning to $#aryName changes the length of the array.
scalar(@aryName) returns the number of elements in an array.
print @aryName writes all the items in array, in one long string; print "@aryName" writes all items, but separates each with a single space.
@aryName[3..5] represents indexes 3 thru 5

Assign a scalar array value: $aryName[indexNum] = strValue;
@aryName = qw(stringNoQuotes1 stringNoQuotes2 etc) inserts values (as strings) into @aryName, split by space(s)

Array functions:

`push(@aryName, value1, value2, etc...)`	adds value(s) to end of array
`pop(@aryName)`	removes and returns value from end of array
`shift(@aryName)`	removes and returns value from beginning of array
`unshift(@aryName, value1, etc...)`	adds value(s) to beginning of array
`splice(@aryName, offset, iRem, values...)`	deletes `iRem` values from array, starting at `offset`, and replaces them with `values` (which are optional)
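
The array functions in the table, applied in sequence to one small array (contents arbitrary), with the state after each step shown in the comments:

```perl
use strict;
use warnings;

my @a = qw(b c d);
push @a, 'e';                 # b c d e
unshift @a, 'a';              # a b c d e
my $last  = pop @a;           # returns 'e'; @a is now a b c d
my $first = shift @a;         # returns 'a'; @a is now b c d
# Remove 1 element at offset 1 and put two new ones in its place.
my @removed = splice(@a, 1, 1, 'x', 'y');   # removes 'c'

print "@a\n";                                        # b x y d
print "last index $#a, count ", scalar(@a), "\n";    # last index 3, count 4
```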

Assign a hashed (associative) array key-value pair:

%aryName = ( "key1" => "value1", "key2" => "value2", );

or

%aryName = ('key1', 'value1', 'key2', 'value2', etc...);

or

$aryName{'key1'} = 'value2'; (*Note the use of {} instead of [] as in scalar arrays)
Perl stores hashes in its own internal order (unlike ordinary arrays) for quick access, so do not rely on the order in which keys come back
Each key of a hash must be unique, though values need not be.
You cannot use the normal array functions (push, pop, etc) on a hash; you can only assign a value to a key, and delete that key/value pair with: delete $aryName{'keyName'}
In addition to the standard way of accessing hash values, you can:
- someCode keys %aryName
- someCode values %aryName
- someCode @aryName{'key1', 'key2'}
- someCode scalar(keys %aryName) - returns # of elements
- someCode exists $aryName{'key1'} - boolean
Iterating through a hash: can be done either with:
foreach (keys %aryName) { print "key $_ contains $aryName{$_}"; }, or:
while (($key, $value) = each %aryName) { print "key $key contains $value"; }
Iterations can be sorted: foreach (sort keys %aryName) and foreach (reverse sort keys %aryName)
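
A compact sketch pulling the hash operations together. The hash %age and its keys are invented for the example.

```perl
use strict;
use warnings;

my %age = (tom => 30, ann => 25, bob => 41);
$age{sue} = 19;          # add a pair
delete $age{bob};        # remove a pair

# Sort the keys for a stable order; the hash itself is unordered.
my @report;
for my $name (sort keys %age) {
    push @report, "$name=$age{$name}";
}
print "@report\n";                       # ann=25 sue=19 tom=30
print scalar(keys %age), " entries\n";   # 3 entries
print "no bob\n" unless exists $age{bob};

# A hash slice fetches several values at once.
my @two = @age{'ann', 'tom'};
print "@two\n";                          # 25 30
```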

Completely destroy a variable with undef $varName - as opposed to just setting it equal to "";
Double-quoted strings are subject to backslash and variable interpolation; single-quoted strings are not (except for \' and \\). Remember: only $ and @ trigger interpolation, so if you want a literal $ or @ in a string, you must either single quote the string or backslash the symbol (% and & are never interpolated).
In a string: \n = newline, \r = carriage return, \t = tab, \f = form feed, \b = backspace, \a = alert (bell), \e = escape, \cC = control-C
To force modification of characters: \u = uppercase next char, \l = lowercase next char, \U = uppercase all following chars, \L = lowercase all following chars, \Q = backslash all following non-alphanumeric chars, \E = end a \U, \L, or \Q
Simple scalar variables can be included in a string and will be interpolated (like in PHP)
Assignment of a value to a scalar variable evaluates the right-hand side in a scalar context, while assignment to an array or hash evaluates the right-hand side in a list context. Basically, this means that a scalar can only hold one value, lists can hold many. Setting $var = @var puts the number of elements in @var into $var (an array in scalar context yields its length); by contrast, assigning a parenthesized comma list, eg. $var = (4, 5, 6), yields the last value. A list can contain other lists as well: eg. (@foo, @bar, &someSub)
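
Scalar versus list context, shown on the same array (variable names are illustrative only):

```perl
use strict;
use warnings;

my @list = ('a', 'b', 'c');

my $count  = @list;      # scalar context: the number of elements
my ($head) = @list;      # list context: the first element
my @flat   = (@list, (1, 2), 'z');   # inner lists are flattened

print "$count $head ", scalar(@flat), "\n";   # 3 a 6
```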
Special scalar contexts include: boolean, void (doesn't want a return value), and interpolative (backslash interpretation within strings, pattern functionality, etc)
An empty list is represented by (). You can empty an entire array by setting it equal to ()
All environment variables (for example, the path variable) can be accessed through %ENV. You can see the value of every environment value simply by calling set from the command prompt.
Input Operators:

Command input operator: $var = `someCommand $someVar` will interpolate $someVar, then interpret the enclosed system command as a whole through the shell. You can run any linux command this way. Using backticks will return the entire output of the process invoked, not just a success or error code. This can also be accomplished with qx/someCommand/ or qx(someCommand). Other ways to call system commands include:
- Via a filehandle (the same way you read a file), ie:
  open HANDLE, "someCommand |";
  while (<HANDLE>) { print "$. $_"; }
  
  The pipe character, |, will cause data to be piped from the specified process in this configuration. Using it as the first character, before someCommand, causes data to be piped to the process.
- exec list - this ends execution of the Perl script and runs whatever is found in list. It returns (with a false value) only if the command could not be started. This apparently doesn't work in Windows
- system("commandName") - runs commandName, waits for it to finish, then carries on with the script. Unlike backticks, it returns the command's wait status rather than its output: 0 for success, otherwise a value whose high byte holds the exit code (so divide it by 256, or shift it right by 8 bits, to get the exit code).
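
The difference between backticks and system can be sketched as below. To stay portable, the example runs the Perl interpreter itself via $^X (the path of the running perl) instead of an OS-specific command; that choice is an assumption of this sketch, not part of the notes.

```perl
use strict;
use warnings;

# Backticks / qx capture the command's output.
my $output = qx("$^X" -e "print 6*7");

# system returns the raw wait status; the exit code is in the high byte.
my $status    = system($^X, '-e', 'exit 3');
my $exit_code = $status >> 8;

print "output: $output\n";         # output: 42
print "exit code: $exit_code\n";   # exit code: 3
```

The list form of system used here bypasses the shell, which avoids quoting problems when arguments contain spaces.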

Line input operator: <FILEHANDLE> yields the next line from the associated file (ending with newline char). STDIN, STDOUT, and STDERR are the most common filehandles (and are pre-opened). Reading from a filehandle into an array (as opposed to a scalar), eg. @lines = <FILEHANDLE>;, results in an indexed list of every remaining line being created within that array.

The code: while (<>) {...} processes each argument (which should be a filename) passed to the script. <> represents @ARGV, only in a specialized way (specific to files). Each filename is presumed open (as if it was a filehandle) and ready to use within the loop.
Filename globbing operator: @aryName = <*something> creates a list of every file in the directory matching something. Similarly, while (<*>) {...} processes all files within the working directory (which can be changed with chdir) - this method is not as efficient as using opendir/readdir

Regular Expressions (& Pattern Matching):

/PATTERN/ - the match operator
?PATTERN? - like match operator, but only matches once between calls to reset (newer versions of Perl require the explicit form m?PATTERN?)

The =~ operator points a regex at a variable other than $_. It is useful, among other things, as the operator in a condition that invokes a regex, ie: if ($varName =~ /pattern/) {...}. This can also be used with a while loop (and global modifier) - useful to make things happen every time a match occurs.
For regex negation with conditional statement, use: !~
m// is the explicit match operator, s/// is the substitution operator. A substitution will operate upon $_ unless a target is specified as follows: $varName =~ s///
The match operator returns true/false, unless you assign it (=) to a variable: @aryName = m//
The match operator, m, can be used with ANY character as regex delimiter; for example, you could run a match with m## or m%% instead of m// - this is good for readability if your pattern has lots of forward slashes in it.
To base a condition on a match (or successful substitution): m// conditionalOperator statement
Most characters in a regular expression match themselves (eg, /Fred/ will look for exactly "Fred"). Those that don't are called metacharacters, but can be matched literally by putting a backslash in front of them. The list: \ | ( ) [ { ^ $ * + ? .
| is the equivalent of "OR". Putting parentheses around a string allows for sub-processing of a pattern, eg. /(Fred|Wilma|Barney) Flintstone/. This also stores the match made by the subpattern in a backreference: \1, \2, etc..., depending on how many groupings came before it. That backreference can be used later in the pattern within the same regex. In addition, its value is also stored in the corresponding variable $1, $2, etc... - useful outside of the regex. Backreferences are good, for example, if you need to find an HTML tag and its corresponding close-tag. To prevent the creation of backreferences (in case you simply don't need them, you just want a subpattern), use ?: in the following manner: (?:somePattern)
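
The tag-matching use of backreferences mentioned above might look like this. The HTML fragment is invented, and the pattern is deliberately simplistic (it is not a real HTML parser).

```perl
use strict;
use warnings;

my $html = 'x <b>bold</b> y <i>broken</b> z';

# \1 inside the pattern must repeat whatever (\w+) captured,
# so only properly closed tags match.
my @pairs;
while ($html =~ /<(\w+)>(.*?)<\/\1>/g) {
    push @pairs, "$1:$2";    # $1 and $2 hold the same captures outside the regex
}
print "@pairs\n";   # b:bold

# (?:...) groups without capturing, so the surname ends up in $1.
my $surname;
if ('Wilma Flintstone' =~ /(?:Fred|Wilma|Barney) (\w+)/) {
    $surname = $1;
}
print "$surname\n";   # Flintstone
```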

Note: you can put variables into a pattern (eg $foo), but be aware that doing so slows down the process considerably. The pattern has to recompile each time through in case the variable changed (unless you use the /o modifier). You can even determine one variable name through another: eg, ${$varName}
The NFA engine that Perl uses to make matches tries things in a certain order. Knowing that order can improve the efficiency of your regex calls. Rule 1: the engine begins its pattern matching attempt just before the first character of the string. If it fails to find a match, it restarts from just before the second character, etc. Note: this means that any pattern which can match null will match at the leftmost position in the string. Rule 2: the engine tries alternatives in a pattern from left to right.

Assertions:

`^`	Matches at the beginning of the string (or line, if /m is used)
`$`	Matches at the end of the string (or line, if /m is used)
`\b`	Matches at word boundary
`\B`	Matches except at word boundary
`\A`	Matches at the beginning of the string
`\Z`	Matches at the end of the string
`\G`	Matches where previous `m//g` left off
`(?=...)`	Matches if engine would match ... next [this is a sub-assertion]
`(?!...)`	Matches if engine wouldn't match ... next [this is a sub-assertion]

Quantifiers say how many of the previous thing should match in a row. They apply only to the single character (or group) immediately to their left, so if you want to repeat multiple chars, group the chars (put them in parentheses). Quantifiers suffixed by a question mark mean "match the smallest number possible" (as opposed to the biggest - also known as "greedy matching"). Greedy matching begins from left to right, but once it finds the start of a match, it proceeds to the end of the string and backtracks to find the match end:

`{n,m}`	`{n,m}?`	Must occur at least n times but no more than m times
`{n,}`	`{n,}?`	Must occur at least n times
`{n}`	`{n}?`	Must match exactly n times
`*`	`*?`	0 or more times
`+`	`+?`	1 or more times
`?`	`??`	0 or 1 time
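
Greedy versus minimal matching on the same input, plus a counted quantifier (sample strings are arbitrary):

```perl
use strict;
use warnings;

my $s = 'say "one" and "two"';

my ($greedy) = $s =~ /"(.*)"/;     # runs to the last quote
my ($lazy)   = $s =~ /"(.*?)"/;    # stops at the first closing quote
my ($digits) = 'id 1234567' =~ /(\d{3,5})/;   # at most 5 digits

print "$greedy\n";   # one" and "two
print "$lazy\n";     # one
print "$digits\n";   # 12345
```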

A "." matches any character except newline (unless you use the /s modifier - then it matches newline as well. Such forward-slash modifiers come at the end of the pattern slashes; eg. s/pattern1/pattern2/s) . The period, in conjunction with the asterisk (.*) matches anything at all, like % in SQL.
A list of characters in square brackets match any one character in the list. You can list characters as [abc] or [a-c]. Use a backslash to protect a hyphen that would otherwise be interpreted as range delimiter in this unique circumstance (however, hyphens appearing as the first or last character within the brackets are interpreted literally - they don't need a backslash).

Special regex shorthand: (note - the character-class items below, \d, \w, and \s, produce the negative effect simply by capitalizing them)

`\a`	alarm (beep)
`\n`	newline
`\r`	carriage return
`\t`	tab
`\f`	formfeed
`\e`	escape
`\d`	digit
`\D`	non-digit
`\w`	word char: letter, digit, or underscore (add a `+` to the end to match an entire word); equals `[a-zA-Z0-9_]` for ASCII text
`\W`	non-word char
`\s`	whitespace char (same as `[ \t\n\r\f]`) - that is, space, tab, newline, formfeed, or carriage return
`\S`	non-whitespace char

A backslashed one-digit number matches (backreferences) a corresponding substring (eg, a value returned by a pattern within parentheses) - eg. \1, \2, etc... Again, this doesn't just repeat that pattern, it references the actual value returned by that pattern when it ran. If your substitution was overwriting a value you wanted to preserve, you could use this to put it right back (within the replacement string). To avoid saving a backreference, use (?: ...). You can backreference outside of a pattern with $1, $2, etc... - scope extends to the end of the enclosing block or eval string, or to the next successful pattern match (whichever comes first).
In addition, you can use $+ to return whatever the last bracket match matched, $& to return the entire matched string, $` to return everything before the matched string, and $' to return everything after the matched string.
A backslashed two or three-digit octal number (eg \033) matches the character with that value, unless a backreference with that number exists. This also works with hexadecimal values.
A backslashed c followed by any character matches the corresponding control character.
Examples:
- s/^([^ ]+) +([^ ]+)/$2 $1/; # swap the first two words
- /(\w+)\s*=\s*\1/; # match "foo = foo"
- /.{80,}/; # match line of at least 80 chars
- if (/Time: (..):(..):(..)/) {    # pull fields out of a line
        $hours = $1;      $minutes = $2;
        $seconds = $3;
  }
Common mistakes: /^fee|fie|foe$/ matches "fee" at the beginning of the string, or "fie" anywhere, or "foe" at the end. /^(fee|fie|foe)$/ matches a string consisting solely of "fee", "fie", or "foe".
Regex extensions:
- The format (?...) represents a regex extension
- (?# someText) is used to comment your patterns (a simple # is sufficient if you've enabled the /x switch)
- (?:...) groups a pattern, but prevents the saving of a backreference
- (?=...) positive lookahead assertion (a zero-width subpattern); succeeds if ... matches next, without consuming any characters. Note: it only looks ahead, so it normally belongs after the text you are matching; putting it before that text does not make it look behind.
- (?!...) negative lookahead assertion (a zero-width subpattern); succeeds if ... does NOT match next. Note: it only looks ahead, so /(?!foo)bar/ does not mean "bar not preceded by foo".

Pattern matching operators:

These come at the end of the regex string (eg. m/somePattern/someOperator):

`g`	match globally (find all occurrences); this returns a list of matches (in a list context) or returns true for every match until it finally returns false (in a scalar context)
`i`	case-insensitive
`m`	treat string as multiple lines
`o`	only compile pattern once
`s`	treat string as single line (`.` also matches `\n`)
`x`	use extended regular expressions (ignore whitespace and `#` in the pattern, so that you can make it more readable)
`e`	only applicable to substitutions; tells the engine to treat the text in the replacement position (eg. `s/PATTERN/REPLACEMENT/e`) as an expression, rather than simple text. `e` operators can be stacked - the more you use, the more passes a regex will make (and, proportionally, the more evaluations)
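
Several of these modifiers combined in one small sketch (the price string is made up):

```perl
use strict;
use warnings;

my $text = 'Price: 10 USD, 25 usd';

# /x lets the pattern be spaced out, /i ignores case, /g finds every match.
my @amounts = $text =~ / (\d+) \s+ usd /xig;

# /e evaluates the replacement as code; /g applies it to every number.
(my $doubled = $text) =~ s/(\d+)/$1 * 2/ge;

print "@amounts\n";   # 10 25
print "$doubled\n";   # Price: 20 USD, 50 usd
```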

tr/SEARCHLIST/REPLACELIST/ doesn't use regular expressions; it scans a string character by character and replaces each character in SEARCHLIST with the corresponding character in REPLACELIST, then returns the number of characters replaced or deleted. This is actually faster than a regex, so it's good to use it when possible. Its modifiers are:
- c - search for every character NOT in SEARCHLIST
- d - delete characters specified in SEARCHLIST (and found), but not partnered by an item in REPLACELIST. If this is not done, the last letter in REPLACELIST will be used to replace every un-partnered character in SEARCHLIST
- s - sequences of characters that were translated to the same character are reduced to a single instance
- example: tr/a-zA-Z//s changes "bookkepper" to "bokeper"
- example: tr/a-zA-Z/ /cs changes non-alphas to a single space
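
The tr operator and its modifiers, tried on a few invented strings. Counting with an empty REPLACELIST works because the string is then left unchanged while the count is still returned.

```perl
use strict;
use warnings;

my $dna = 'AACGTT';

(my $complement = $dna) =~ tr/ACGT/TGCA/;        # map each base to its partner
my $count_a = ($dna =~ tr/A//);                  # count only; $dna is unchanged
(my $squeezed = 'bookkeeper') =~ tr/a-zA-Z//s;   # squeeze repeated letters
(my $clean = 'a1b2!!c') =~ tr/a-zA-Z/ /cs;       # runs of non-letters become one space

print "$complement $count_a $squeezed '$clean'\n";   # TTGCAA 2 bokeper 'a b c'
```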

File Test Operators:

Takes one argument, either a filename or filehandle, and tests for something. Unless otherwise documented, returns 1 for true and "" for false

`-r`	file is readable by effective uid/gid (in caps, by real uid/gid)
`-w`	file is writable by effective uid/gid (in caps, by real uid/gid)
`-x`	file is executable by effective uid/gid (in caps, by real uid/gid)
`-o`	file is owned by effective uid/gid (in caps, by real uid/gid)
`-e`	file exists
`-z`	file has zero size
`-s`	file has non-zero size
`-f`	file is a plain file
`-d`	file is a directory
`-l`	file is a symbolic link
`-p`	file is a named pipe
`-S`	file is a socket
`-b`	file is a block special file
`-c`	file is a character special file
`-t`	filehandle is opened to a tty
`-T`	file is a text file
`-B`	file is a binary file (opposite of `-T`)
`-M`	age of file (at script startup) in days since modification; returned in days (including fractional)
`-A`	age of file (at script startup) in days since last access; returned in days (including fractional)
`-C`	age of file (at script startup) in days since inode change; returned in days (including fractional)

For M, A, and C, if the file changed after the script began running, a negative time is returned
When testing with -T, good to test with -f first (eg, next unless -f $file && -T _;)
Solitary underscore (as in e.g. above) represents same file as was referenced by the last file test operator called
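
A sketch of the file test operators, including the solitary underscore. The scratch file and directory are created with the core File::Temp module purely for the demonstration.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile tempdir);

# Scratch file and directory (assumption: created only for this demo).
my ($fh, $path) = tempfile();
print $fh "hello\n";
close($fh);
my $dir = tempdir(CLEANUP => 1);

my @facts;
push @facts, 'exists'     if -e $path;
push @facts, 'plain file' if -f $path;
push @facts, 'non-empty'  if -s _;          # _ reuses the stat from -f $path
push @facts, 'directory'  if -d $dir;
push @facts, 'recent'     if -M $path < 1;  # modified less than a day ago

unlink $path;
print join(', ', @facts), "\n";
```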

Range Operator: in a list context, .. returns a list of values (counting by ones) from the left value to the right value - useful for loops. In a scalar context, .. returns a boolean value. That value is false as long as the left operand is false. Once the left value is true, the value will be true until the right operand is true
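
Both uses of the range operator, sketched on invented data. In the second loop, `..` is in a scalar (boolean) context and acts as a switch that turns on at the BEGIN line and off after the END line.

```perl
use strict;
use warnings;

# List context: a list of values counting by ones.
my @digits = (1 .. 5);

# Scalar context: true from the line matching the left operand
# through the line matching the right operand.
my @lines = ('intro', 'BEGIN', 'x = 1', 'y = 2', 'END', 'outro');
my @inside;
for (@lines) {
    push @inside, $_ if /^BEGIN$/ .. /^END$/;
}

print "@digits\n";              # 1 2 3 4 5
print join('|', @inside), "\n"; # BEGIN|x = 1|y = 2|END
```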
Conditional Operator: Expression ? If_True_Then : If_False_Then
Statements: Any simple statement may be followed by a single modifier, just before the semicolon. These are: if expr, unless expr, while expr, until expr, and foreach list
For Loop: for ($i = 1; $i < 10; $i++) { ... }
Foreach Loop: foreach $varName (@aryName) { ... }
Referring to a variable that has not been declared makes it global - it is accessible from anywhere, and never expires. To avoid this, declare the variable using my, which will limit the scope to the block, subroutine, eval, or file in question. A block is simply a block of code enclosed in solitary curly braces - useful for limiting scope. Declare multiple values this way by enclosing them in parentheses: eg, my ($var1, $var2); Declare them with local to, at least, make the variable(s) accessible to other subroutines/functions called from within the block in question; otherwise, they won't even be available there. local is also the only way to limit the scope of global special variables.
Pragmas: these force the compiler to act in certain ways. For example, Perl assumes all arithmetic must be done in floating point. Saying use integer; tells the compiler that it may use integer operations till the end of the enclosing block.
Subroutines: Functions are the same thing as subroutines in Perl. Declare a subroutine using: sub subName(LIST) {...}; Note: (LIST) is not required. Passing in an array as an argument turns the array into a flat list of scalars. The arguments passed into a sub can be accessed through the local @_ variable. Another way of calling subroutines is to prefix the subName with &, which is mostly optional, but makes a difference within a subroutine itself (calling &subName; with no parentheses automatically passes along the current @_; subName by itself does not). Subroutines can return more than one value; for example:
Outside the sub: ($rVar1, $rVar2) = &subName(...);
Inside the sub: return ($var1, $var2);
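
A complete version of the multiple-return pattern. The sub name min_max and its data are hypothetical; the example also shows two arrays being flattened into @_.

```perl
use strict;
use warnings;

# Returns two values; all arguments arrive flattened in @_.
sub min_max {
    my @nums = sort { $a <=> $b } @_;
    return ($nums[0], $nums[-1]);
}

my @first  = (4, 9);
my @second = (1, 7);
my ($min, $max) = min_max(@first, @second);   # both arrays flatten into @_

print "$min $max\n";   # 1 9
```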
Global Special Variables:
- $_ - the default input and pattern-searching space ($ARG)
- $. - current input line number of the last filehandle read ($INPUT_LINE_NUMBER, $NR)
- $/ - input record separator; newline by default ($INPUT_RECORD_SEPARATOR, $RS)
- $! - contains current error number ($OS_ERROR, $ERRNO)
- $` - the string preceding whatever was matched by the last successful pattern match
- $' - the string following whatever was matched by the last successful pattern match
- $, - defines what Perl should put in between elements of lists that are printed
- and more...
Modules: A module is like a subroutine, but differs in that it is an entirely separate script that can be called from any other. Modules are invoked as follows: use moduleName;
To make a module: find out where your copy of Perl is installed. Within that directory there should be one called lib. Within lib, make a directory for your module. Save the script into that directory as scriptName.pm (perl module). If the script is simply one or more subroutines, it should end with: 1; which is necessary because all Perl modules must return true. To use the module, just call it with: use dirName::scriptName; then, you can call any subroutine within the module.

Hundreds of modules exist for download on CPAN, others are built into standard Perl distributions. Some of these include:
- File::Find - enables use of find function, which takes two params: a subroutine that determines what to do with the list of files returned, and a list of directories to be searched. The filenames found are stored in $_ (within your subroutine)
- strict - a pragma which requires all variables, references, and subroutines be explicitly declared
- Win32::ChangeNotify - one of many win32 modules. This one waits until something changes in a directory, then acts. Use as follows:
  
  $someVar = Win32::ChangeNotify->new($pathToWatch, $subTrees, $events);
  while (1) {
    $someVar->wait;
    ...
    $someVar->reset;
  }
  
  $events contains a predefined string saying what sort of event to watch for (for example, "FILE_NAME", which covers filename changes). The wait method (within the loop) causes the program to pause until the directory is changed somehow. The reset method does what it says after a change has occurred, and the code between wait and reset has run.

Functions

abs - returns absolute value of supplied argument (or $_, if no argument)
alarm int - fires an alarm signal; int is the number of seconds from now for it to go off
chdir directory - changes Perl's working directory (you should ...or die "Error"; when using chdir)
join delimiter, @aryName - joins the elements of @aryName into one string, separated by delimiter. A comma-delimited list of strings or scalar variables can be supplied to join in place of @aryName
grep expression, @aryName - returns a new array containing elements of @aryName that match expression. If expression is not a simple regex (say, a statement including conditional logic), it should be enclosed in {} - in which case, a comma is not necessary after it. If expression is not a match condition at all, but an action (say, chop), the chopped remains of elements will be returned. Expression can also be a subroutine call
int(value) - casts value as an integer (no decimal points)
lc(strValue) - returns strValue in lowercase
map expression, @aryName - like grep, it returns a list, but that list contains the result of evaluating expression for each element of @aryName, rather than the matching elements themselves. If expression is not a match condition at all, but an action (say, chop), the letters that were chopped will be returned. Expression can also be a subroutine call. If you want map to return values rather than results, you can add ;$_ to the end of expression (within the curly braces), which results in $_ being the last thing evaluated.
printf FILEHANDLE list - outputs a formatted string to STDOUT if FILEHANDLE is omitted. To simply return, rather than output, use sprintf. The first item in list is a string that indicates how to format the rest of the items. That string should be formatted as follows: %m.nx - where m and n are optional sizes, and x is one of the following:
c - character
d - decimal integer
e - exponential format
f - fixed point format
...
s - string
x - hexadecimal number
X - hex num with uppercase letters
split delimiter, $varName, numElements - turns $varName into a list, split at each occurrence of delimiter (a pattern or string value). The target can be a scalar (in which case the number of elements created is returned), or an array (in which case the elements are plugged into it), or a list of scalars and/or arrays (in which case as many elements as there are room for get plugged in; the rest are discarded). In the latter case, the target list should be enclosed in parentheses. The last parameter, numElements, sets the maximum number of return elements.
sort subname, list - sorts list (array or list of scalars) by standard string comparison, unless you use subname, which should call a subroutine that returns -1, 0, or 1, based on evaluations of $a and $b, which should *not* be modified within the subroutine - just referenced.
srand - initializes the random number generator. After this, you can call rand, which will either generate a number between 0 and 1, or between 0 and whatever number is fed to it as a parameter
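
Several of the functions above used together on one made-up string. The variable names are illustrative only.

```perl
use strict;
use warnings;

my @words  = split /,/, 'pear,Apple,fig,banana';
my @long   = grep { length($_) > 3 } @words;          # pear Apple banana
my @lower  = map  { lc($_) } @words;                  # pear apple fig banana
my @sorted = sort { lc($a) cmp lc($b) } @words;       # Apple banana fig pear
my $joined = join('-', @sorted);                      # Apple-banana-fig-pear
my $line   = sprintf('%5.2f|%03d|%x', 3.14159, 7, 255);   # " 3.14|007|ff"
my $whole  = int(7.9) + abs(-3);                      # 7 + 3 = 10

print "$joined\n$line\n$whole\n";
```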

References and Nested Data Structures

A symbolic reference is a reference to the name of an item (variable, subroutine, etc). A hard reference, on the other hand, is a special reference to the value of that item itself. Symbolic references are easily defined with typeglobs. Hard references are defined with the backslash operator; for example:

$someVar = \$someOtherVar; $someConstant = \someOtherConstant; $someSub = \&someOtherSub;

To access the value of a referenced item, you need to "dereference" it. Otherwise, the reference you create is nothing more than a pointer at the value's space in memory, which is great for passing it around, but useless once you actually need the value itself. For example:

$var = $$var2; &$someSub(1,2,3); $$arrayref[0] = "January";
You can stack references; for example: $refrefref = \\\"hello"; the string "hello" can later be retrieved with $$$$refrefref
You can also dereference a reference with the infix (arrow) operator, ->. For example, instead of: $$arrayref[0] = "January", you can do: $arrayref->[0] = "January"
If you dereference an item that is not a hard reference, the value is treated as a symbolic reference. In other words, the reference is interpreted as a string, which is taken to be the name of the item dereferenced. For example:

$var = "hey"; $$var = 1; # $hey is set to equal 1
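
Hard references and a small nested structure, as a sketch with invented data. It runs under use strict, which also disables the symbolic references shown just above.

```perl
use strict;
use warnings;

my @months = ('January', 'February');
my $aref   = \@months;          # hard reference to the array
$aref->[1] = 'Feb';             # same as $$aref[1] = 'Feb'
my $first  = $$aref[0];         # 'January'

# Nested structure: a hash whose values are array references.
my %data = (evens => [2, 4], odds => [1, 3, 5]);
push @{ $data{evens} }, 6;
my $odd_count = scalar @{ $data{odds} };    # 3
my $last_even = $data{evens}[-1];           # 6

# Code reference, called through the arrow operator.
my $double = sub { return $_[0] * 2 };
my $answer = $double->(21);                 # 42

# Stacked references need one $ per level, plus one for the value.
my $refref = \\"hello";
my $hello  = $$$refref;                     # 'hello'

print "$first $months[1] $odd_count $last_even $answer $hello\n";
```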