Monday, January 18, 2016

Rule & Pattern Programming Example: Run-Length Encoding

In Sal Mangano's excellent Mathematica Cookbook, he says this is one of his favorite functions due to its concise use of rule-based pattern-matching programming.

runLengthEncode@l_List:=Map[{#,1}&,l]//.{head___,{x_,n_},{x_,m_},tail___}:>{head,{x,n+m},tail}

runLengthEncode@{1,1,2,2,2,1,3,3,3,3}

{{1,2},{2,3},{1,1},{3,4}}

How does this concise rule & pattern program work? The "Map" clause puts a template for the run-length encoding format, {symbol, run-length}, at every expression in the List, then the ReplaceRepeated "//." clause works its magic by repeatedly detecting duplicate consecutive symbols and incrementing their run-length counter until no consecutive duplicate is detected, at which point it moves to the next symbol.

How can we see this? A critical step is where the {symbol, run-length} initial template is set up for each symbol in the List , which is simply this fundamental operation of Function (here in its abbreviated form):

{#,1}&@n

{n,1}

Trace is helpful but can be tedious to parse. One way to reveal one-liner mechanisms is to use a symbolic input. A second way, just as with longer programs, is to break them apart and step through them. You just have to figure out how to divide up the key steps. Here is the first key step for this one, mapping the initial template at each symbol:

Map[{#,1}&,{1,1,2,2,2,1,3,3,3,3}]

{{1,1},{1,1},{2,1},{2,1},{2,1},{1,1},{3,1},{3,1},{3,1},{3,1}}

And the second key step is the consolidation of the counts:

%//.{head___,{x_,n_},{x_,m_},tail___}:>{head,{x,n+m},tail}

{{1,2},{2,3},{1,1},{3,4}}

And here is the Trace.

runLengthEncode@{1,1,2,2,2,1,3,3,3,3}//Trace

{runLengthEncode[{1,1,2,2,2,1,3,3,3,3}],({#1,1}&)/@{1,1,2,2,2,1,3,3,3,3}//. {head___,{Removed[x]:_,n_},{Removed[x]:_,m_},tail___}:>{head,{Removed[x],n+m},tail},{({#1,1}&)/@{1,1,2,2,2,1,3,3,3,3},{({#1,1}&)[1],({#1,1}&)[1],({#1,1}&)[2],({#1,1}&)[2],({#1,1}&)[2],({#1,1}&)[1],({#1,1}&)[3],({#1,1}&)[3],({#1,1}&)[3],({#1,1}&)[3]},{({#1,1}&)[1],{1,1}},{({#1,1}&)[1],{1,1}},{({#1,1}&)[2],{2,1}},{({#1,1}&)[2],{2,1}},{({#1,1}&)[2],{2,1}},{({#1,1}&)[1],{1,1}},{({#1,1}&)[3],{3,1}},{({#1,1}&)[3],{3,1}},{({#1,1}&)[3],{3,1}},{({#1,1}&)[3],{3,1}},{{1,1},{1,1},{2,1},{2,1},{2,1},{1,1},{3,1},{3,1},{3,1},{3,1}}},{{head___,{Removed[x]:_,n_},{Removed[x]:_,m_},tail___}:>{head,{Removed[x],n+m},tail},{head___,{Removed[x]:_,n_},{Removed[x]:_,m_},tail___}:>{head,{Removed[x],n+m},tail}},{{1,1},{1,1},{2,1},{2,1},{2,1},{1,1},{3,1},{3,1},{3,1},{3,1}}//. {head___,{Removed[x]:_,n_},{Removed[x]:_,m_},tail___}:>{head,{Removed[x],n+m},tail},{{1,2},{2,3},{1,1},{3,4}}}