<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>UseStrict Consulting &#187; CDS</title>
	<atom:link href="http://usestrict.net/tag/cds/feed/" rel="self" type="application/rss+xml" />
	<link>http://usestrict.net</link>
	<description>Professional IT Solutions &#38; Training</description>
	<lastBuildDate>Fri, 10 Feb 2012 12:01:05 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
		<item>
		<title>Recursion with Perl and CDS</title>
		<link>http://usestrict.net/2009/06/recursion-with-perl-and-cds/</link>
		<comments>http://usestrict.net/2009/06/recursion-with-perl-and-cds/#comments</comments>
		<pubDate>Wed, 10 Jun 2009 22:16:10 +0000</pubDate>
		<dc:creator>vinny</dc:creator>
				<category><![CDATA[Perl]]></category>
		<category><![CDATA[CDS]]></category>
		<category><![CDATA[complex data structures]]></category>
		<category><![CDATA[recursion]]></category>
		<category><![CDATA[subroutine]]></category>
		<category><![CDATA[trim]]></category>
		<category><![CDATA[trimming]]></category>

		<guid isPermaLink="false">http://usestrict.net/?p=720</guid>
		<description><![CDATA[Recursion on Perl Complex Data Structures made easy.]]></description>
			<content:encoded><![CDATA[<p><strong>Update</strong>: Changed subroutine to comply with <a href="http://www.amazon.com/gp/product/0596001738?ie=UTF8&#038;tag=usst-20&#038;linkCode=as2&#038;camp=1789&#038;creative=9325&#038;creativeASIN=0596001738" target="_blank">Perl Best Practices</a>.</p>
<p><strong>Update2</strong>: Removed the prototype from the subroutine.</p>
<p>I&#8217;ve always had a problem with recursion. Not with the general theory that a function will call itself, etc &#8211; no, that&#8217;s easy. The hard part was when I had to deal with complex data structures in Perl (an array- or hashref containing a hash of arrays of hashes, a gazillion levels deep). Well, I guess anyone would have a hard time with that kind of data.</p>
<p>Anyway, in this post I don&#8217;t intend to get all complicated explaining every kind of recursion out there. If you want that, check <a href="http://en.wikipedia.org/wiki/Recursion_(computer_science)" target="_blank">this article at Wikipedia</a>. What I do want to do is help all of those who are in the situation I was in, by explaining in the simplest way possible how to deal with this scenario.<span id="more-720"></span></p>
<p>Let&#8217;s start with a need. I have a complex data structure whose strings need their leading and trailing spaces trimmed. But since I&#8217;m lazy, I&#8217;d like my subroutine to modify the data directly, and not return the modified value (pass by reference, not pass by value).</p>
<p>Here&#8217;s our data structure:</p>
<p>&nbsp;</p>
<pre class="brush:perl">
    my $data = [
                       {
                          key1 => '   trim me!   ',
                          key2 => '   trim me too!    ',
                       },
                       [
                          'some element to trim   ',
                          '    another one    ',
                       ],
                       '    a simple string needing trimming    ',
                  ];
</pre>
<p>&nbsp;</p>
<p>&nbsp;</p>
<p><code>$data</code> explained: an arrayref containing 3 elements. Element 0 is a hashref with the keys <code>"key1"</code> and <code>"key2"</code>, element 1 is an arrayref of 2 elements, and element 2 is a simple string. All values have some extra spaces that need trimming (or so they say). We could use whatever number of levels and data types we want (except for anonymous subroutines, I guess &#8211; let&#8217;s not get too complicated).</p>
<p>Now, to trim all that, I want to be able to simply call <code>trim()</code> à la PHP. </p>
<p>&nbsp;</p>
<pre class="brush:perl">
     trim($data); # no assignment needed: $data is modified in place
</pre>
<p>&nbsp;</p>
<p>&nbsp;</p>
<p>I also want it to accept simple arrays and hashes, and the references thereof: <code>trim(@array); trim(\@array); trim(%hash); trim(\%hash); trim($string)</code>. After all, I never know what kind of data my colleagues will be working with. Better have it deal with everything.</p>
<p>The logic to do that is this: our subroutine does the actual trimming (<code>s///g</code>) on scalars only. For everything else, it checks whether the data it received is a hash, an array, etc. If it is, it iterates through each element and trims the value&#8230; but only if that element is not itself a hash, array, etc., in which case the same check starts over one level down. Found it confusing? No problem, it really is.</p>
<p>In Perl, if I tried to remove the white space from element 0 of my <code>$data</code> variable directly, it wouldn&#8217;t work. The reason is that element 0 holds a reference: if I printed <code>$data->[0]</code> to the screen, I&#8217;d get funny-looking output, something like <code>HASH(0x1004f5f0)</code>. That&#8217;s Perl&#8217;s way of saying that you have a hash stored at memory address 0x1004f5f0. You can try to trim the spaces off of that string, but it won&#8217;t do you any good. The elements of your hash will still be untouched. That&#8217;s why you need to <em>de-reference</em> your data structures and dive into them.</p>
<p>De-referencing a structure is simple: just add a <code>%</code> in front of the variable if it&#8217;s a hashref (<code>%{$ref}</code>), or an <code>@</code> if it&#8217;s an arrayref (<code>@{$ref}</code>). But how do you know which is which? Use <code>ref()</code>.</p>
<p>&nbsp;</p>
<pre class="brush:perl">
       print ref($data->[0]) . "\n"; # HASH
       print ref($data->[1]) . "\n"; # ARRAY
       print ref($data->[2]) . "\n"; # empty string, which is false
</pre>
<p>&nbsp;</p>
<p>&nbsp;</p>
<p><code>ref()</code> tells you what kind of data you are dealing with. It returns <code>CODE</code> if you have a closure or anonymous subroutine, but we&#8217;re not going there today.</p>
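<p>To make the two steps concrete, here is a minimal sketch that combines <code>ref()</code> with de-referencing. The <code>describe()</code> helper is made up for this illustration and is not part of the final <code>trim()</code>; it only reports what it finds.</p>
<pre class="brush:perl">
use strict;
use warnings;

# describe() is a hypothetical helper used only for this illustration.
sub describe {
    my ($item) = @_;
    my $type = ref $item;    # 'HASH', 'ARRAY', 'CODE', ... or '' for a plain scalar

    if ($type eq 'HASH') {
        # %{ ... } de-references a hashref
        return 'hash with keys: ' . join(', ', sort keys %{$item});
    }
    elsif ($type eq 'ARRAY') {
        # @{ ... } de-references an arrayref
        return 'array with ' . scalar(@{$item}) . ' element(s)';
    }
    return "plain scalar: '$item'";
}

my $data = [
    { key1 => '   trim me!   ', key2 => '   trim me too!    ' },
    [ 'some element to trim   ', '    another one    ' ],
    '    a simple string needing trimming    ',
];

# @{$data} de-references the outer arrayref so we can loop over it
print describe($_), "\n" for @{$data};
</pre>
<p>Each branch only looks one level down. Going deeper than that is exactly the job recursion takes over in the real subroutine.</p>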
<p>So, now that we know how to identify the type of element we&#8217;re going to be working with, we can build our subroutine&#8230;</p>
<p>&nbsp;</p>
<pre class="brush:perl">
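sub trim {
	for my $param (@_) {    # $param aliases the caller's variable
		if (ref($param) eq 'ARRAY') {
			for my $element (@{$param}) {
				trim($element);
			}
		}
		elsif (ref($param) eq 'HASH') {
			for my $val (values %{$param}) {
				trim($val);
			}
		}
		elsif (ref($param) eq 'CODE') {
			next;    # leave code references alone, keep going with the rest
		}
		else {
			$param =~ s/(^\s+|\s+$)//g;    # strip leading and trailing whitespace
		}
	}
	return;
}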
</pre>
<p>&nbsp;</p>
<p>&nbsp;</p>
<p><code>trim()</code> explained:</p>
<p>We&#8217;re passing elements by reference instead of by value. In a subroutine, parameters (in our case, variables) are populated into the special <code>@_</code> array, and each element of <code>@_</code> is an alias for the variable the caller passed in. This means that the elements themselves will be modified &#8211; no need to return any data. The first thing we do is iterate through all parameters passed to <code>trim()</code>, which allows us to call <code>trim($var1, $var2, $var3)</code> if we want.</p>
<p>We iterate through all elements of <code>@_</code> and check whether each one is an array reference. If it is, we iterate through each of its elements once, and call <code>trim()</code> again on them. That will handle as many nested arrays as we want (or as your computer can handle). Now we have to make it deal with hashes. Same technique &#8211; use <code>ref()</code> to see if it&#8217;s a hash. If it is, then iterate through each of its key/value pairs. There are several ways to do that. The subroutine calls <code>values</code>, because the values it hands back are the hash&#8217;s own values rather than copies, so changing them changes the hash. Each value is passed to <code>trim()</code> for more validation. We also check to see if we received a <code>sub { }</code> (anonymous subroutine). In that case, we leave it alone.</p>
<p>Finally, after handling arrays, hashes and anonymous subroutines, we can set up the actual trimming of the strings. We take <code>$param</code>, which is the parameter passed, and remove the leading and trailing spaces with one neat substitution: <code>^\s+</code> stands for leading spaces, <code>\s+$</code> stands for trailing spaces, and it&#8217;s all joined by the <code>(  |  )</code> (this or that). We only call it once because we&#8217;re using the global (g) modifier of the substitution <code>s///g</code>.</p>
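<p>If the idea of <code>@_</code> changing the caller&#8217;s variables feels like magic, this small sketch shows it in isolation. The <code>shout()</code> subroutine is made up for the example and has nothing to do with trimming.</p>
<pre class="brush:perl">
use strict;
use warnings;

# shout() is a hypothetical example sub, not part of trim().
sub shout {
    for my $param (@_) {       # $param aliases each element of @_ ...
        $param = uc $param;    # ... so this changes the caller's variable
    }
    return;
}

my ($first, $second) = ('hello', 'world');
shout($first, $second);

print "$first $second\n";    # HELLO WORLD
</pre>
<p>Note that copying the arguments first, as in <code>my ($copy) = @_;</code>, breaks the link: changes to <code>$copy</code> stay inside the subroutine.</p>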
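<p>The hash branch of the listing loops over <code>values %{$param}</code>. Here is a minimal sketch of why that is enough to change the hash itself, using a plain hash with made-up contents:</p>
<pre class="brush:perl">
use strict;
use warnings;

my %hash = ( key1 => '   trim me!   ', key2 => 'fine as is' );

# values() hands back the hash's own values, so $val is an alias, not a copy
for my $val (values %hash) {
    $val =~ s/(^\s+|\s+$)//g;
}

print "[$hash{key1}]\n";    # [trim me!]
print "[$hash{key2}]\n";    # [fine as is]
</pre>
<p>Looping over <code>keys</code> and working on <code>$hash{$key}</code> would get the same result; it just takes one more lookup per element.</p>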
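<p>Here is the substitution on its own, applied to a single made-up string. The sketch also captures the return value of <code>s///g</code>, which is the number of substitutions performed, to show that both ends were handled in one pass.</p>
<pre class="brush:perl">
use strict;
use warnings;

my $string = "  \t padded on both sides \n";

# With the g modifier, s/// returns how many substitutions it made
my $count = ($string =~ s/(^\s+|\s+$)//g);

print "[$string] after $count substitutions\n";
# [padded on both sides] after 2 substitutions
</pre>
<p>Without the <code>g</code> modifier the substitution would stop after the first match, so only the leading white space would be removed.</p>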
<p>And that&#8217;s all there is to it!</p>
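<p>For anyone who wants to try it, here is a sketch of a complete script that puts the pieces together. It assumes the prototype-free <code>trim()</code> described above (skipping code references and moving on), and uses the core <code>Data::Dumper</code> module only to display the result.</p>
<pre class="brush:perl">
use strict;
use warnings;
use Data::Dumper;

sub trim {
    for my $param (@_) {
        if (ref($param) eq 'ARRAY') {
            for my $element (@{$param}) {
                trim($element);
            }
        }
        elsif (ref($param) eq 'HASH') {
            for my $val (values %{$param}) {
                trim($val);
            }
        }
        elsif (ref($param) eq 'CODE') {
            next;    # nothing to trim in a subroutine
        }
        else {
            $param =~ s/(^\s+|\s+$)//g;
        }
    }
    return;
}

my $data = [
    {
        key1 => '   trim me!   ',
        key2 => '   trim me too!    ',
    },
    [
        'some element to trim   ',
        '    another one    ',
    ],
    '    a simple string needing trimming    ',
];

my @list = ('  plain  ', '  array  ');

trim($data);    # a reference to a nested structure
trim(@list);    # a plain array works too

$Data::Dumper::Sortkeys = 1;    # stable key order in the dump
print Dumper($data, \@list);
</pre>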
<p><em>A note on prototypes:</em> This post generated a healthy discussion on prototypes. I had originally added the dollar prototype (<code>($)</code>) to the <code>trim()</code> subroutine, but that forced it to accept only scalars (strings and references), so it did not work with normal hashes and arrays. Thanks to everyone who participated in the discussion.</p>
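<p>To see what the dollar prototype does to a plain array, here is a minimal sketch. Both subroutines are made up for the comparison and simply return their first argument.</p>
<pre class="brush:perl">
use strict;
use warnings;

# Hypothetical subs that exist only to show the effect of a prototype.
sub first_arg_proto($) { return $_[0] }    # ($) puts the argument in scalar context
sub first_arg          { return $_[0] }    # no prototype: arguments arrive as a list

my @array = ('a', 'b', 'c');

print first_arg_proto(@array), "\n";    # 3 (the number of elements, not the elements)
print first_arg(@array),       "\n";    # a
</pre>
<p>With the prototype in place, the subroutine never sees the array&#8217;s elements at all, which is why it had to go.</p>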
<p>&nbsp;</p>
<p>&nbsp;</p>
]]></content:encoded>
			<wfw:commentRss>http://usestrict.net/2009/06/recursion-with-perl-and-cds/feed/</wfw:commentRss>
		<slash:comments>14</slash:comments>
		</item>
	</channel>
</rss>

