Tuesday, December 30, 2014

String Replacement Methods: StringTemplate

A String Replacement Overview is here.

StringTemplate

StringTemplate saves you the trouble of searching a String for the substring to replace, or of setting up your own marker to flag the StringPosition at which to perform a replacement.

Further, good programming practice dictates that we use selectors and constructors – specialized, dedicated functions to extract a subset of a file or to change a subset of a file – and that we always use those rather than ad hoc one-liners scattered in our functions and programs [1, 2]. StringTemplate conveniently formalizes and enforces the use of selectors and constructors.

StringForm is simpler to understand and use than StringTemplate, so I recommend StringForm when you just need to output a message from your function. The command below does not end with a semicolon, so you can see the TemplateObject that StringTemplate returns, including its default Options.

stringTemplate1=StringTemplate@"The quick brown `` jumped over the lazy white ``."

TemplateObject[{The quick brown ,TemplateSlot[1], jumped over the lazy white ,TemplateSlot[2],.},InsertionFunction->TextString,CombinerFunction->StringJoin]

You can directly Apply any StringTemplate as a function to a List of its arguments that fits its requirements, or use TemplateApply to do the same thing.

stringTemplate1@@{"mink","peccadillo"}

The quick brown mink jumped over the lazy white peccadillo.

Equivalently, use the template as you would any other function: make it the Head of an Expression with its arguments.

stringTemplate1["mink","peccadillo"]

The quick brown mink jumped over the lazy white peccadillo.

Equivalently, using TemplateApply:

TemplateApply[stringTemplate1,{"mink","peccadillo"}]

The quick brown mink jumped over the lazy white peccadillo.
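
StringTemplate also accepts named slots, which take over the job of the hand-made markers mentioned above. Here is a minimal sketch, assuming the illustrative slot names animal1 and animal2 (they are not part of the original example) and supplying the values as an Association:

```mathematica
(* Named slots are written between single backticks; animal1 and animal2 are illustrative names *)
stringTemplate2=StringTemplate@"The quick brown `animal1` jumped over the lazy white `animal2`.";

(* Fill the slots by name rather than by position *)
filled=stringTemplate2[<|"animal1"->"mink","animal2"->"peccadillo"|>]
(* "The quick brown mink jumped over the lazy white peccadillo." *)
```

Because the slots are filled by name, the order of the entries in the Association does not matter.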

1. Maeder, Roman, Computer Science with Mathematica. Cambridge: Cambridge University Press, 2000. Chapter 5.3. Design of Abstract Data Types.

2. Maeder, Roman, M220: Programming in Mathematica  (course given by Wolfram Education Group, which I have taken twice and recommend).

String Replacement Methods: Overview

Here are String replacement methods that I have used in code from one-liners up to programs producing hundreds of thousands of text and html files. In general, use the simplest method or one that you understand clearly. Use StringForm to output messages from your functions and programs. For longer functions or programs, StringTemplate is the new best practice.

There is a function I don't discuss, StringInsert, which inserts a substring at a given StringPosition in a control String. I don't advocate its use since it's very brittle in that if you add or delete even one character before the StringPosition then the insertion point will be wrong.
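
To make the brittleness concrete, here is a small sketch with two made-up sentences. The insertion position 16 is correct for the first String only:

```mathematica
(* Position 16 is where " fox" belongs in this particular String *)
good=StringInsert["The quick brown jumped."," fox",16]
(* "The quick brown fox jumped." *)

(* Add one word earlier in the String and the same position is now wrong *)
bad=StringInsert["The very quick brown jumped."," fox",16]
(* "The very quick  foxbrown jumped." *)
```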

StringForm

Literal Replacement, Markers, and Delimiters

String Replacement Methods: Literal Replacement, Markers, and Delimiters

A String Replacement Overview is here.

Note that the next three methods all use StringReplace. This is in keeping with my principle that the fastest way to learn Mathematica is to become a power user of its 70 or so core functions. In String processing, for instance, StringInsert is not a function you need to know. Instead learn to use the more powerful and robust function, StringReplace.

Literal Replacement

Literal replacement works by using StringReplace to find a literal substring within a String and substitute another substring for it. Literal replacement is very simple and easy to use.

string1="The quick brown fox jumped over the lazy white dog.";

StringReplace[string1,{"fox"->"mink","dog"->"peccadillo"}]

The quick brown mink jumped over the lazy white peccadillo.

Markers

Using markers to indicate the replacement position can improve code legibility. Use StringReplace to replace just the marked text.

string2="The quick brown <animal1> jumped over the lazy white <animal2>.";

StringReplace[string2,{"<animal1>"->"mink","<animal2>"->"peccadillo"}]

The quick brown mink jumped over the lazy white peccadillo.

Delimiters

Use StringReplace to replace text between the delimiters. This is very useful when you want to replace a lot of text in a document, especially in a long document. However, the new function StringTemplate is a superior method overall.

sitemapTemplate="<?xml version=\"1.0\" encoding=\"UTF-8\"?>
<urlset xmlns=\"http://www.sitemaps.org/schemas/sitemap/0.9\">
<!-- put list of urls here with a line feed after each one -->
</urlset>";

urls="<url><loc>http://www.blah.net/page1.html</loc></url>
<url><loc>http://www.blah.net/page2.html</loc></url>";

Note that you use StringExpression (shorthand "~~") to concatenate quoted Strings with Blanks in the String to be found by StringReplace, but you must use StringJoin (shorthand "<>") if you concatenate different Strings in the replacement String.

sitemapTemplateWithURLs=StringReplace[sitemapTemplate,"<!-- put list"~~urlsList__~~"each one -->"->urls]

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<url><loc>http://www.blah.net/page1.html</loc></url>
<url><loc>http://www.blah.net/page2.html</loc></url>
</urlset>
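
One caution when a document contains more than one delimited block: the pattern __ is greedy, so it matches from the first opening delimiter to the last closing one. A short sketch with a made-up String shows the difference that wrapping the pattern in Shortest makes:

```mathematica
text="keep <!-- first --> keep <!-- second --> keep";

(* Greedy: __ runs from the first opening delimiter to the last closing one *)
greedy=StringReplace[text,"<!--"~~__~~"-->"->"X"]
(* "keep X keep" *)

(* Shortest stops at the nearest closing delimiter, so each block is replaced separately *)
lazy=StringReplace[text,"<!--"~~Shortest[__]~~"-->"->"X"]
(* "keep X keep X keep" *)
```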

String Replacement Methods: StringForm

A String Replacement Overview is here.

StringForm

StringForm is a simple, elegant String template function. Use it in your functions where Print alone isn't enough because you need to fill in some variables, such as the results of calculations made on the fly. Inside a function, where lines end with an output-suppressing semicolon, wrap it in Print to display it:

Print@StringForm["control string", variables];

Here the double backtick marks tell Mathematica where to fill in the blanks; the arguments you give are inserted in the order in which they appear. An argument can be a String or an Expression of unlimited complexity, which will be evaluated before insertion. If you don't want the inserted Expression to be evaluated, though, use HoldForm (example below).

StringForm["Use `` for relatively short and simple String templates, such as output messages in your functions. For example, the cube root of `` is ``.","StringForm",27,27^(1/3)]

Use StringForm for relatively short and simple String templates, such as output messages in your functions. For example, the cube root of 27 is 3.
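
Note that StringForm returns a StringForm object that merely displays as text. If you need an actual String, for example to pass to StringJoin or to write to a file, one way (a sketch, with illustrative variable names) is to wrap it in ToString:

```mathematica
message=StringForm["The cube root of `` is ``.",27,27^(1/3)];
Head@message
(* StringForm *)

(* ToString converts the displayed form into a genuine String *)
messageString=ToString@message
(* "The cube root of 27 is 3." *)
StringQ@messageString
(* True *)
```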

If you're going to use an argument twice, switch the order, or use a number of arguments and you want to prevent mistakes, use numbered, rather than ordered, backticks. You often want a line break, for which the \n escape character is used within the quotation marks that are Mathematica's String delimiters.

StringForm["Flying or gliding mammals include `1`, `2`,\n`3`, `4`, `5`, `6`, and `7`.\nThe most common species are in the `3` family.","flying possums","greater glider","bats","flying squirrels","flying lemurs","flying monkeys","cats"]

Flying or gliding mammals include flying possums, greater glider,
bats, flying squirrels, flying lemurs, flying monkeys, and cats.
The most common species are in the bats family.

To prevent the inserted Expression from being evaluated, use HoldForm:

StringForm["For example, the sixth term of the Fibonacci series is the sum of the preceding two terms: ``.",HoldForm[3+5==8]]

For example, the sixth term of the Fibonacci series is the sum of the preceding two terms: 3+5==8.

Tuesday, December 23, 2014

Memory Management Tools

While Mathematica is designed to manage memory for you, under certain circumstances it can get bogged down, mainly because it keeps a record of all your inputs and outputs with In and Out. So if you're using functions that output a lot of computation, or working with large files, you may notice Mathematica slowing down.

There are a number of ways that you can manage memory in Mathematica. Here is a summary (see also How to Find Memory Used in Computations).

Command             | Effect
?Global`*           | Shows all Symbols in a non-accessible table
Names@"Global`*"    | Returns a List of all Symbols that you can access
Clear@symbol        | Clears the value of symbol but leaves its name in memory
Clear@"Global`*"    | Clears the values of all Symbols but leaves their names in memory
Remove@symbol       | Removes the name symbol and its value from memory
Remove@"Global`*"   | Removes all Symbols and their values from memory
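
The difference between Clear and Remove can be seen in a short sketch. The variable name scratchData is hypothetical, and the outputs shown assume an ordinary session in the Global` context:

```mathematica
scratchData=Range[10^6];   (* hypothetical scratch variable holding a large List *)

Clear@scratchData;
afterClear=Names@"Global`scratchData"
(* {"scratchData"}: the value is gone but the name remains *)

Remove@scratchData;
afterRemove=Names@"Global`scratchData"
(* {}: the name itself is gone *)
```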

If you're going to go as far as removing all Global Symbols, consider starting a new session by entering Quit[] in your Notebook or Quit Kernel → Local under the Evaluation menu.

Beginners hesitate to Quit the kernel, but there's little downside. The kernel is a separate process from the front end, so even if you haven't saved your Notebooks, they stay open and you can still save them after quitting.

To automate resuming after quitting the kernel or in general, use Initialization Cells. You can set Initialization in the menu under Cell → Cell Properties or by right-clicking on the cell and selecting Initialization Cell. A little downward tick mark appears in the upper right corner of the cell.

Then, when you re-start the kernel by evaluating any cell, choose Evaluation → Evaluate Initialization Cells, or re-open the Notebook, all the Initialization Cells are re-evaluated automatically. In this way you lose very little time by quitting the kernel and re-starting.

Memory-Management Commands to Use Occasionally


Memory currently used by the kernel:

In[157]:= MemoryInUse[]

Out[157]= 135450976

Memory currently used by the front end (all of your open Notebooks):

In[158]:= MemoryInUse@$FrontEnd

Out[158]= 543264768

The maximum memory used by the kernel during your current Mathematica session:

In[159]:= MaxMemoryUsed[]

Out[159]= 137155304

Clear a stored output that consumed lots of memory in your session (here, output number 537):

Unprotect[Out]; Out[537] =.;
Protect@Out;

Easy Ways to Create Variations of Function

You will often need to create more than one version of a function. Here are two different methods: a piecewise function and an Option. Either is fine, but simply for conciseness I generally use a piecewise function when the function is short (say, fewer than a dozen lines) and an Option when the function is longer.

Here is the dataset for the examples. To see the 6 functions used to create arrays in Mathematica, see Ways to Create Arrays.

dataset=Array[List,{5,3}]

{{{1,1},{1,2},{1,3}},{{2,1},{2,2},{2,3}},{{3,1},{3,2},{3,3}},{{4,1},{4,2},{4,3}},{{5,1},{5,2},{5,3}}}

Here's the first version. It takes a List of data as a first argument and an Integer as its second argument. This simple example uses the index to pick out a Part from the data.

function1[data_List,index_Integer]:=data[[index]]

function1[dataset,4]

{{4,1},{4,2},{4,3}}

The Piecewise Approach

Now we need a second version to take a List of indices in the second argument. We can simply use datatyping by specifying the Head of the second argument. This is one way to do the "piecewise" method. The domain of possible inputs is split into pieces (sub-domains) and each variation of a function is designed to pick out the correct piece (sub-domain) for which it is designed.

function2[data_List,index_List]:=data[[index]]

function2[dataset,{2,4}]

{{{2,1},{2,2},{2,3}},{{4,1},{4,2},{4,3}}}
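
The two pieces can also share a single name, so that Mathematica picks the matching definition from the Head of the second argument. This is only a sketch; the name pickPart and the variable rows are hypothetical:

```mathematica
ClearAll@pickPart;   (* pickPart is a hypothetical name *)

(* One definition per sub-domain of the second argument *)
pickPart[data_List,index_Integer]:=data[[index]]
pickPart[data_List,indices:{__Integer}]:=data[[indices]]

rows=Array[List,{5,3}];
single=pickPart[rows,4]
(* {{4,1},{4,2},{4,3}} *)
several=pickPart[rows,{2,4}]
(* {{{2,1},{2,2},{2,3}},{{4,1},{4,2},{4,3}}} *)
```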

The Options Approach

You can see that if there were a minor change in a long function, copying the function is not as concise as the following approach using Options. There is more overhead in this example to create the Option and handle it in the function, but in a long function that overhead is less than copying as in the Piecewise approach.

Note that for concise creation of Piecewise mathematical functions, the built-in function Piecewise should be used. For an explanation of how I use Options, see A Template for Optional Arguments [to be published].

ClearAll@function3;
Options@function3={"indexOption"->"Integer"};

function3[data_List,index_,options___?OptionQ]:=
Module[{indexOption="indexOption"/.{options}/.Options@function3},

(*Which returns the selected Part of the argument data; Flatten@{index} accepts one index or a List of indices*)
Which[
indexOption=="Integer",data[[index]],
indexOption=="List",data[[Flatten@{index}]]  ]

(*endModule*)]
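
For comparison, the same idea can be sketched with the built-in OptionsPattern and OptionValue, which look up the default from Options for you. The names pickRows, "IndexType", and rows are hypothetical and chosen only for this illustration:

```mathematica
ClearAll@pickRows;   (* pickRows and "IndexType" are hypothetical names *)
Options@pickRows={"IndexType"->"Integer"};

(* OptionValue returns the supplied setting, or the default from Options@pickRows *)
pickRows[data_List,index_,OptionsPattern[]]:=
Switch[OptionValue@"IndexType",
"Integer",data[[index]],
"List",data[[Flatten@{index}]]]

rows=Array[List,{5,3}];
byInteger=pickRows[rows,4]
(* {{4,1},{4,2},{4,3}} *)
byList=pickRows[rows,{2,4},"IndexType"->"List"]
(* {{{2,1},{2,2},{2,3}},{{4,1},{4,2},{4,3}}} *)
```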

Here is the function that prompted me to write this primer. It takes the membrane voltage time series from simulated spiking neuron cells and counts how many spikes occur, with cell range and time series range as arguments. I needed to expand it so I could specify counting spikes in several time series ranges. I didn't want to copy and vary it with the piecewise approach, so I used the options approach. The changes to the original function to implement the new ones are italicized.

ClearAll@batchMeanFiringRateTable;
Options@batchMeanFiringRateTable={"Export"->Off,"MultipleBatch"->Off,"MultipleTimeSeries"->False};

batchMeanFiringRateTable[spikeFileDirectory_String,cellRange_List,timeSeriesRange_List,leftColumnHeading_String:"Neural Group",rightColumnHeading_String:"Average Firing Rate",options___?OptionQ]:=
Module[{exportFileName,firingData,
spikeFiles=getSpikeFiles@spikeFileDirectory,

exportOption="Export"/.{options}/.Options@batchMeanFiringRateTable,
multipleBatchOption="MultipleBatch"/.{options}/.Options@batchMeanFiringRateTable,
multipleTimeSeriesOption="MultipleTimeSeries"/.{options}/.Options@batchMeanFiringRateTable},

(*With "Batch"\[Rule]On, meanFiringRateFromSpikeFile will just return the averageSpikeRate each time it's called*)

(*Need FileBaseName to identify cell group in left column if TableForm; need First@timeSeriesRange to remove the outer List for a single time series range*)

If[multipleTimeSeriesOption==False,firingData=Table[{FileBaseName@batchFile,meanFiringRateFromSpikeFile[batchFile,cellRange,First@timeSeriesRange,"Batch"->On]},{batchFile,spikeFiles}]  (*endIf*)];

If [multipleTimeSeriesOption==True,firingData=Table[{FileBaseName@batchFile,meanFiringRateFromSpikeFile[batchFile,cellRange,timeSeries,"Batch"->On]},{batchFile,spikeFiles},{timeSeries,timeSeriesRange}]/.{{x_,y_},{x_,z_}}->{x,y,z};(*Table will take each spike file and iterate over timeSeriesRange. Don't need First@timeSeriesRange here. *) (*endIf*)];

If[multipleBatchOption==Off,Print@"Multiple batch is off.";Print[firingData//TableForm[#,TableAlignments->Left,TableHeadings->{None,{leftColumnHeading,"Ave AP Rate: "<>ToString@timeSeriesRange}}]&],Return@firingData (*endIf*)];

If[exportOption==On&&multipleBatchOption==Off,exportFileName=FileNameJoin@{DirectoryName@spikeFiles[[1]],ToString@FileNameTake[DirectoryName@spikeFiles[[1]],-1]<>"-BatchFileTable.xlsx"};
Export[exportFileName,firingData];Print@StringForm["File exported to ``.",exportFileName]];
]


batchMeanFiringRateTable["C:\\Users\\Public\\Documents\\UNCuS16_09_2013\\DataFiles\\WDR-Abeta-14-12only\\WDR-Abeta-14-12only-CS8-9-20-RS0p28\\",{1,80},{{1,45},{46,85}},"Neural Group","Method"->"OverSpikingCells","MultipleBatch"->Off,"Export"->Off,"MultipleTimeSeries"->True]





Ways to Create Arrays

Programmers often have to create an array. Of course, in Mathematica, you don't have to allocate memory for an array or dimension it or any of that outdated nonsense. Here are the six standard functions and common methods used to create arrays in Mathematica. Use the easiest-to-implement and most readable method for your application.

First there is the versatile function Array that applies a function to the array indices of any desired array dimension.

Array[g,{2,3}]

{{g[1,1],g[1,2],g[1,3]},{g[2,1],g[2,2],g[2,3]}}

To simply create a nested List of Integers you can use List as the function to apply to the array. "Give me 5 rows of 3 elements each:"

dataset=Array[List,{5,3}]

{{{1,1},{1,2},{1,3}},{{2,1},{2,2},{2,3}},{{3,1},{3,2},{3,3}},{{4,1},{4,2},{4,3}},{{5,1},{5,2},{5,3}}}

It is quite easy to create a similar array with one of the most versatile and commonly-used functions in functional programming, Table. It pays handsomely to become a power user of Table.

Table[{i,j},{i,5},{j,3}]

{{{1,1},{1,2},{1,3}},{{2,1},{2,2},{2,3}},{{3,1},{3,2},{3,3}},{{4,1},{4,2},{4,3}},{{5,1},{5,2},{5,3}}}
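
A quick check, using illustrative variable names, confirms that the two constructions agree and shows the shape of the result:

```mathematica
fromArray=Array[List,{5,3}];
fromTable=Table[{i,j},{i,5},{j,3}];

(* Both approaches build exactly the same nested List *)
fromArray===fromTable
(* True *)

(* 5 rows, 3 elements per row, 2 Integers per element *)
Dimensions@fromArray
(* {5,3,2} *)
```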

Table is highly versatile and can certainly be used to do many things. In some cases there is a built-in function that is more specialized. For instance, Table can be used to create a constant array. "Give me 25 7s:"

Table[7,{25}]

{7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7}

ConstantArray is a specialized function that does the same thing. Apart from making the code a little clearer to read, it doesn't do more than Table.

ConstantArray[7,25]

{7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7}
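
The same holds in higher dimensions. As a small sketch (the variable name zeros is illustrative), a List of dimensions gives a nested constant array, and Table produces the identical result:

```mathematica
(* 2 rows of 3 zeros each *)
zeros=ConstantArray[0,{2,3}]
(* {{0,0,0},{0,0,0}} *)

zeros===Table[0,{2},{3}]
(* True *)
```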

Table can generate random numbers using any of the random function family:

Table[RandomInteger[],{10}]

{1,1,1,0,0,0,0,1,1,1}

But for simple as well as higher-dimensional arrays, the half-dozen or so built-in random functions are more readable than doing the same thing with Table.

RandomInteger[{1,5},{2,3}]

{{2,4,3},{2,1,3}}

Likewise, while Table can easily create a range of numbers, Range is superior.

Table[i,{i,7}]

{1,2,3,4,5,6,7}

Range@7

{1,2,3,4,5,6,7}

From 14 to 28 in steps of 2:

Range[14,28,2]

{14,16,18,20,22,24,26,28}

Those applications yield vectors, but here is a lesser-known usage of Range with a List as argument that generates an array. For each number in the List, Range yields a List of integers from 1 up to that number.

Range[{2,5,3}]

{{1,2},{1,2,3,4,5},{1,2,3}}

To more carefully control aspects of the array such as start, end, and step size, you may need to manually assemble the array with Table or Range.

{Range[14,28,2],Range[10^6,10^5,-2*10^5]}

{{14,16,18,20,22,24,26,28},{1000000,800000,600000,400000,200000}}

{Table[i,{i,14,28,2}],Table[j,{j,10^6,10^5,-2*10^5}]}

{{14,16,18,20,22,24,26,28},{1000000,800000,600000,400000,200000}}

Finally, MapIndexed generates a List by applying a function to each element of a List. The function receives the element as its first argument and the element's index, wrapped in a List, as its second argument.

MapIndexed[f,{a,b,c,d}]

{f[a,{1}],f[b,{2}],f[c,{3}],f[d,{4}]}

Using First strips off the enclosing List of the index.

MapIndexed[{#^2,First@#2}&,Range@5]

{{1,1},{4,2},{9,3},{16,4},{25,5}}