Perl Crash Course: References and Complex Data Structures (CDS)

Now that we have a pretty good understanding of the three types of data in Perl, it’s time we bump it up a notch and put them all together. Perl is most famous for its text manipulation using regular expressions, but it is also notable for its ability to create multilevel data structures out of any mix of scalars, arrays, and hashes.

One way that I like to explain multilevel or multidimensional data is by taking the ‘Folders’ panel of a Windows Explorer window as an example:

Example of Multilevel Folders

As you can see (despite the bad image quality), there is the main drive (root variable), which contains several folders (arrays/hashes) or files (scalar elements). These folders in turn can contain other folders (arrays/hashes) and files (scalar elements). In Perl you can have any kind of variable (array, hash, or scalar) containing any number of nested arrays, hashes, or scalars. This is done thanks to references, and anonymous arrays and hashes.

Here’s how it works:

When you create a variable, its name and contents get stored in your computer’s memory. A reference is a single scalar value that points to the memory address of another piece of data. The most common way of getting a reference to a data structure is to put a backslash in front of its identifier symbol ($, @, or %).

$array_ref = \@array;
$hash_ref = \%hash;

print "$array_ref\n";  #WRONG: you will get something similar to ARRAY(0x125f6e0)
print "$hash_ref\n"; #WRONG: you will get something similar to HASH(0x125f6e0)

print $array_ref->[0]; #RIGHT: prints the first element of the referenced array
print ${$array_ref}[0]; #RIGHT: same results as above
print $$array_ref[0]; #RIGHT: same thing

print $hash_ref->{key1}; #RIGHT: prints value for key 'key1'
print ${$hash_ref}{key1}; #RIGHT: same as above
print $$hash_ref{key1}; #RIGHT: same thing

In the example above, we see that we can’t just print the value of a referenced data structure by calling print and the scalar that’s holding it (scalar because it’s holding one reference – no, I can’t stress that enough). We need to dereference our reference.

The two ways of dereferencing a reference (TMTOWTDI) are either by delimiting our ref within ${} or by using the ->. You may also have noticed that we use ->[] for arrays and ->{} for hashes. Use @{$var} and %{$var} if you want to dereference the whole array or hash, instead of just one key or element. Same goes for @$var and %$var.
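Here is a minimal sketch of whole-structure dereferencing, as opposed to fetching a single element or key. The variable names and sample values are made up for illustration; the point is that @{$ref} and %{$ref} behave like the original array and hash, and that changes made through the reference show up in the original variable.

```perl
use strict;
use warnings;

# Sample data (illustrative values, not from the tutorial)
my @array = ('a', 'b', 'c');
my %hash  = (key1 => 'val1', key2 => 'val2');

my $array_ref = \@array;   # reference to the named array
my $hash_ref  = \%hash;    # reference to the named hash

# Dereference the whole array: @{$array_ref} behaves like @array itself
my $count  = scalar @{$array_ref};   # number of elements
my $joined = join ',', @$array_ref;  # short form, same array

# Dereference the whole hash: %{$hash_ref} behaves like %hash itself
my @keys = sort keys %{$hash_ref};

print "$count elements: $joined\n";
print "keys: @keys\n";

# A reference points at the original data, it is not a copy
push @{$array_ref}, 'd';
print "original array now has ", scalar(@array), " elements\n";
```

The last line reports 4 elements, because the push went through the reference into @array itself.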

To create multilevel data, you can nest the references inside the assignment lists:

@arr1 = (1,2,3); # Regular single level array

@multi = (@arr1,4,5); # WRONG: results in (1,2,3,4,5)
@multi = (\@arr1,4,5); # RIGHT: 3-element array whose first element is a reference to @arr1

print $multi[0]; # prints ARRAY(0x1233ad8)
print $multi[1]; # prints 4

print $multi[0]->[0]; # prints 1 - the first element of @arr1
print $multi[0][0]; # same thing (the arrow is optional between subscripts)
print ${$multi[0]}[0]; # ditto
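When an array mixes plain scalars with references, you usually need to tell them apart before dereferencing. A rough sketch of one way to do that uses Perl's built-in ref function, which returns 'ARRAY' for an array reference and an empty string for a plain scalar. The @seen array is just a helper name chosen for this example.

```perl
use strict;
use warnings;

my @arr1  = (1, 2, 3);
my @multi = (\@arr1, 4, 5);   # first element is a reference to @arr1

my @seen;   # collects every value we encounter, flattened
for my $item (@multi) {
    if (ref($item) eq 'ARRAY') {
        # $item is an array reference: dereference the whole array
        push @seen, @{$item};
    }
    else {
        # $item is a plain scalar
        push @seen, $item;
    }
}

print "top level has ", scalar(@multi), " elements\n";
print "flattened: @seen\n";
```

The top level still has 3 elements, while the flattened list has 5.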

It is also possible and even common to assign anonymous (unnamed) data structures to scalars directly. While lists are bound by parens or the qw() operator (seen in detail further along), anonymous arrays are bound by square brackets ([ ]) and anonymous hashes by braces ({ }).

$arr = []; # kick off $arr as a reference to an empty, unnamed array
$arr2 = [ 1,2,3,4 ] ; # does not have to start off empty

$arr->[0] = 'a'; # first element of $arr is now 'a'

$hsh = {}; # empty anonymous hash
$hsh->{somekey} = 'somevalue'; # added key 'somekey' with value 'somevalue'

$hsh2 =  {
                'key1' => 'val1',
                'key2' => 'val2',  # you can leave the comma
             } ; # even though you do not have any more pairs
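One detail worth seeing in action: a backslash gives you a reference to an existing variable, while square brackets build a brand new anonymous array. The sketch below (with made-up names such as $alias, $copy, and $palette) shows the difference, and also nests anonymous arrays inside an anonymous hash in a single statement.

```perl
use strict;
use warnings;

my @colors = ('red', 'blue');

my $alias = \@colors;      # reference to the existing named array
my $copy  = [ @colors ];   # new anonymous array filled with a copy of the elements

push @colors, 'green';     # change the named array

my $alias_count = scalar @{$alias};   # sees the change made to @colors
my $copy_count  = scalar @{$copy};    # independent of @colors

# Anonymous hash whose values are anonymous arrays, built in one statement
my $palette = {
    warm => [ 'red', 'orange' ],
    cool => [ 'blue', 'green' ],   # trailing comma is allowed
};
push @{ $palette->{warm} }, 'yellow';

print "alias: $alias_count, copy: $copy_count\n";
print "warm colors: @{ $palette->{warm} }\n";
```

Here the alias reports 3 elements and the copy 2, since only the alias still points at @colors.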

Now that you have your references, put it all together.

@multi = ($arr, $arr2, $hsh, $hsh2, 1, 'a'); # data structure galore

print $multi[1]->[-1]; # last element of second element: 4

# Let's put it all together, no temp scalars
$VAR1 = [
                  {
                      'key2' => [
                                            0,
                                            1,
                                            2,
                                            3
                                        ],
                      'key1' => 'val1'
                   },
                   'a',
                   'something',
                   [
                      'red',
                      'blue',
                      'green'
                    ]
            ];

# Results:

print $VAR1->[0]->{key1}; # val1
print $VAR1->[0]->{key2}; # ARRAY(0x1233ad8) - needs dereferencing
print $VAR1->[0]->{key2}->[2]; # third element of key2: 2
print $VAR1->[-1]->[-2]; # blue
$VAR1->[-1]->[-2] = 'yellow'; # was blue, is now yellow
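Single-element lookups are only half the story; sooner or later you will want to loop over a nested array. The following sketch rebuilds the same structure under the hypothetical name $data and walks parts of it. It also uses the fact that the arrow is optional between adjacent subscripts.

```perl
use strict;
use warnings;

# Same shape as the structure above, stored in a lexical scalar
my $data = [
    {
        key1 => 'val1',
        key2 => [ 0, 1, 2, 3 ],
    },
    'a',
    'something',
    [ 'red', 'blue', 'green' ],
];

# Arrows between subscripts are optional: same as $data->[0]->{key2}->[2]
my $third = $data->[0]{key2}[2];

# Dereference a nested array to loop over it
my $sum = 0;
$sum += $_ for @{ $data->[0]{key2} };

# Modify an element in place, then read the whole nested array back
$data->[-1][-2] = 'yellow';
my @colors = @{ $data->[-1] };

print "third element of key2: $third\n";
print "sum of key2: $sum\n";
print "colors: @colors\n";
```

Note the @{ ... } wrapper around the whole expression: when the reference comes from a lookup rather than a simple scalar, the braces are required.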

If you find this all mind-boggling, don’t worry – you’re not alone. Fortunately we can rely on a module called Data::Dumper, which is really handy for times like these. More on it later on.
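As a quick preview, here is a small sketch of Data::Dumper printing a nested structure. The module ships with Perl and exports a Dumper function that returns a string. The $data structure is the same illustrative one used earlier, and setting $Data::Dumper::Sortkeys is an optional choice made here so that hash keys come out in a stable order.

```perl
use strict;
use warnings;
use Data::Dumper;

# Optional: print hash keys in sorted order so the output is repeatable
$Data::Dumper::Sortkeys = 1;

my $data = [
    {
        key1 => 'val1',
        key2 => [ 0, 1, 2, 3 ],
    },
    'a',
    'something',
    [ 'red', 'blue', 'green' ],
];

# Dumper returns a string showing the whole structure, nesting included
my $dump = Dumper($data);
print $dump;
```

The output starts with $VAR1 = [ and shows every level of nesting, which is where the $VAR1 name in the listing above comes from.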
