Sunday, October 28, 2012

Getting Help: Mathematica User Groups

MathGroup

MathGroup, www.mathgroup.org, is the original Mathematica user group, founded and moderated by Steven Christensen. This high-quality bulletin board is a treasure trove of Mathematica know-how and an extremely useful resource for answering your Mathematica questions. An introduction and instructions for subscribing and posting can be found here: http://forums.wolfram.com/mathgroup/.


StackExchange Mathematica

There is a newer Mathematica user group in the StackExchange network of collaboratively edited question-and-answer sites: http://mathematica.stackexchange.com/. Its intro says: "Welcome! This is a collaboratively edited question and answer site for users of Mathematica. It's 100% free, no registration required." The functionality of the StackExchange sites is interesting (http://stackexchange.com/sites).


File Operations: Deleting Files


Delete one or more files

DeleteFile deletes one file, or several files if they are given in a List. If you are deleting more than one file, you'd typically first use FileNames to collect the file names from a directory. DeleteFile returns Null (i.e. nothing) if it works and $Failed if it doesn't. Deleting a few files in a given directory is as simple as this.

FileNames["test*", "C:\\Users\\82V\\Documents"] // DeleteFile
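Before pointing DeleteFile at real data, it can help to try it in a throwaway directory. The sketch below is only an illustration: the scratch directory and file names are made up for the demonstration (CreateDirectory[] with no argument makes a fresh temporary directory), and it simply checks the return value described above.

```mathematica
(* Hypothetical scratch area: CreateDirectory[] makes a new temporary directory *)
scratchDirectory = CreateDirectory[];

(* Write three small text files named test1.txt, test2.txt, test3.txt *)
Do[
  Export[FileNameJoin[{scratchDirectory, "test" <> ToString[i] <> ".txt"}],
    "sample text", "Text"],
  {i, 3}];

(* Collect the matching names, then delete them in one call *)
filesToDelete = FileNames["test*", scratchDirectory];
deleteResult = DeleteFile[filesToDelete];

(* Check that DeleteFile returned Null and that no matching files remain *)
deleteResult === Null && FileNames["test*", scratchDirectory] === {}

(* Remove the now-empty scratch directory *)
DeleteDirectory[scratchDirectory];
```

Because the file names come from FileNames rather than being typed by hand, the same two lines work unchanged on a real directory once you trust the pattern.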

Here is a more complicated example in which I delete 3594 files spread across multiple directories.


Acquire FileNames and Delete Files in all subdirectories

The usual sequence of steps goes like this. First, set the directory in which you want to search for files. In Windows, I usually navigate to the directory in Windows Explorer, click in the address bar to select the path, press Ctrl+C to copy it, and paste it between quote marks in Mathematica, which then reads:



In[114]:= SetDirectory@
  "C:\\Users\\82V\\Documents\\Neuroscience\\Arle-Shils\\commandUNCuS\\\
DataFiles";

Search for all the files using a wildcard and store their names in a variable (datFiles). Search in all subdirectories within your directory by using a wildcard for subdirectory names in the optional second argument, and Infinity for depth of subdirectories in the optional third argument.

In[97]:= datFiles = FileNames["*.dat", {"*"}, Infinity];

See how many filenames you captured.

In[98]:= datFiles // Length

Out[98]= 3594

Take a quick peek at the first five files to be deleted to make sure you're capturing the right ones.

In[99]:= datFiles[[1 ;; 5]]

Out[99]= {"Short_Test_File\\tc125_101.dat", "Short_Test_File\\tc125_101pre.dat", \
"Short_Test_File\\tc125_102.dat", "Short_Test_File\\tc125_102pre.dat", \
"Short_Test_File\\tc125_103.dat"}
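With thousands of names, the first five may not be representative. One possible extra check, sketched here on a small made-up list standing in for datFiles (the directory "Long_Test_File" and the list name sampleDatFiles are hypothetical), is to count the files per subdirectory and confirm the extension of every name.

```mathematica
(* Hypothetical stand-in for datFiles; FileNameJoin keeps the paths portable *)
sampleDatFiles = {
   FileNameJoin[{"Short_Test_File", "tc125_101.dat"}],
   FileNameJoin[{"Short_Test_File", "tc125_102.dat"}],
   FileNameJoin[{"Long_Test_File", "tc125_101.dat"}]};

(* Count how many files would be deleted from each subdirectory *)
filesPerDirectory = Tally[DirectoryName /@ sampleDatFiles]

(* Confirm that every captured name really has the .dat extension *)
allAreDatFiles = And @@ (FileExtension[#] === "dat" & /@ sampleDatFiles)
```

Run on the real datFiles, the Tally gives one {directory, count} pair per subdirectory, which makes an unexpected directory easy to spot before anything is deleted.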

Delete all the files in the list.

In[100]:= DeleteFile@datFiles

Reset the directory to whatever it was before you set it to the one in which to search for files.

In[112]:= Directory[]

Out[112]= "C:\\Users\\82V\\Documents\\Neuroscience\\Arle-Shils\\commandUNCuS\\DataFiles"

In[113]:= ResetDirectory[]

Out[113]= "C:\\Users\\82V\\Documents"
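An alternative that avoids SetDirectory and ResetDirectory altogether is to hand FileNames the directory explicitly, in which case the names come back with the directory attached. This is a sketch under assumptions, not the procedure used above: the directory and file are created just for the demonstration, and note that searching {dataDirectory} should also pick up files sitting directly in that directory, whereas the {"*"} form above looks only inside its subdirectories.

```mathematica
(* Hypothetical data directory with one subdirectory and one .dat file *)
dataDirectory = CreateDirectory[];
subDirectory = CreateDirectory[FileNameJoin[{dataDirectory, "Short_Test_File"}]];
Export[FileNameJoin[{subDirectory, "demo_101.dat"}], "sample", "Text"];

directoryBefore = Directory[];

(* An explicit directory in the second argument leaves the current directory alone *)
datFilesWithPaths = FileNames["*.dat", {dataDirectory}, Infinity];

(* The current directory has not moved, so there is nothing to reset *)
directoryUnchanged = (Directory[] === directoryBefore)

(* Delete the files, then tidy up the demonstration directory *)
DeleteFile[datFilesWithPaths];
DeleteDirectory[dataDirectory, DeleteContents -> True];
```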

Note that if you set the directory n times to delete files in each, you need to call ResetDirectory[] n times to get back to where you started. Each SetDirectory pushes the previous directory onto a stack (which you can inspect with DirectoryStack[]), and each ResetDirectory[] pops one directory off it. You can use a Do loop to call ResetDirectory[] n times. Wrapping it in Print shows the sequence of directories as they are popped. I prefaced the command with Directory[], but because it is followed by a semicolon its value is not displayed; use Print@Directory[] if you want to see the current directory first.

In[117]:= Directory[]; Do[Print@ResetDirectory[], {3}]

C:\Users\82V\Documents\commandUNCuS

C:\Users\82V\Documents\Neuroscience\Arle-Shils\commandUNCuS\DataFiles

C:\Users\82V\Documents
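Rather than remembering n, you can let the stack count for you. The following is a minimal sketch: the two directories visited are just $TemporaryDirectory and $HomeDirectory, chosen because they exist on any system, and the variable names are my own.

```mathematica
(* Remember where we are and how deep the stack already is *)
stackDepthAtStart = Length[DirectoryStack[]];
directoryAtStart = Directory[];

(* Two hypothetical visits, each of which adds one entry to the stack *)
SetDirectory[$TemporaryDirectory];
SetDirectory[$HomeDirectory];

(* Pop exactly as many entries as were pushed since the start *)
Do[ResetDirectory[], {Length[DirectoryStack[]] - stackDepthAtStart}];

(* Compare the current directory with the one we began in *)
backAtStart = (Directory[] === directoryAtStart)
```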


Saturday, October 27, 2012

A Mathematica Bibliography, Part 1: How to Use Mathematica and its Programming Language


Consider reading Steve McConnell's book, cited under Programming and Computer Science books, in parallel with these if you undertake any serious programming task or just for the enjoyment and edification of learning programming best practices. I am reminded of the beautiful remark I heard by Marvin Minsky in one of his AI classes: "Anyone who doesn't program is missing one of the more interesting cultural experiences of our time."

Abell, Martha L., Braselton, James P., and Rafter, John A., Statistics with Mathematica (San Diego: Academic Press, 1999). Good introduction, even though it is now dated by the large number of statistics functions added in Mathematica 8.

Blachman, Nancy R., and Williams, Colin P., Mathematica, A Practical Approach (Upper Saddle River NJ: Prentice Hall PTR, 1999). Introduction for beginners. I started with this book.

Dick, Samuel, Riddle, Alfred, and Stein, Douglas, Mathematica in the Laboratory (Cambridge: Cambridge University Press, 1997). An overview of using Mathematica with laboratory data and instruments. It also has excellent practical guidance on importing and exporting files and data and on file operations, and a good introduction to fitting data, although you should look at the new tutorials, such as tutorial/CurveFitting for the basics and tutorial/StatisticalModelAnalysis for more advanced analysis.

Gray, John W., Mastering Mathematica, Programming Methods and Applications, 2nd edition (San Diego: Academic Press, 1998). For beginners to experts.

Maeder, Roman E., Programming in Mathematica 3rd edition (Reading MA: Addison-Wesley, 1997). Maeder was the key player in designing the Mathematica programming language with Wolfram, which I consider a major scientific and linguistic synthesis. He then showed how to use the language in significant programming areas and wrote the Wolfram Education Group course on Programming in Mathematica. I have to say, leaving a comparison of importance to history, that this is like Halley or Newton's other followers showing how to use his synthesis, e.g. to calculate the orbit and time of return of a comet. Maeder's approach is elegantly concise, high-level, and to the point.

Mangano, Sal, Mathematica Cookbook (Sebastopol, CA, USA: O'Reilly, 2010). This is my current favorite book to sample for edification and pure enjoyment. I'd say most of it is for intermediate to advanced users.

Trott, Michael, The Mathematica GuideBook for Programming (New York: Springer, 2004). For beginners to advanced users. I've read most of this book and Trott's style, like Wolfram's, has had a strong influence on my programming style. In particular, I follow Hoare and Trott toward the goal of writing code that can be read like prose, such as using long, descriptive variable names instead of abbreviations. This is one of four 1000-page tomes by the truly prolific Trott, who also compiled the astonishing Wolfram Functions site, with more than 300,000 formulas, at http://functions.wolfram.com. Until I saw a single individual walking down the hall at the 2007 WRI Technology Conference with a badge saying "Michael Trott," I really didn't think he could be one person. I thought he must be another Bourbaki, the legendary French mathematical collaborative that published under that name.

Wagner, David B., Power Programming with Mathematica: The Kernel (New York: McGraw-Hill, 1996). Truly a classic to which I was introduced by MathGroup.

Wellin, Paul, Gaylord, Richard, and Kamin, Samuel, An Introduction to Programming with Mathematica (Cambridge: Cambridge University Press, 2005). For beginners to intermediate users. As clear as can be. Paul Wellin heads the Wolfram Education Group.

Wolfram, Stephen, A New Kind of Science (Champaign, IL: Wolfram Media, 2002). I include NKS (as it's known), even though it's an advanced tome, because the code that is downloadable from the book's website is as exemplary for good functional programming as can be found anywhere. For instance, the simple, elegant development of the code that led to what is now CellularAutomaton and TuringMachine at the start of Chapter 5 had a major influence toward simplifying my programming style and approach to programming as a series of "one liners" (q.v. http://blog.wolfram.com/2011/12/01/the-2011-mathematica-one-liner-competition/ by my WRI namesake, Chris Carlson).

Wednesday, October 3, 2012

How It Works: DeleteDuplicates


Delete Duplicates

Union is commonly used to collect the unique elements of one or more Lists, while DeleteDuplicates is used to select the unique elements of a single List (which Union can do, too). Union sorts the result, while DeleteDuplicates leaves the elements in their original order. Consequently, DeleteDuplicates is the faster function if you do not need the Sort. Both let you specify the test used to decide whether two elements are duplicates, DeleteDuplicates through an optional second argument and Union through its SameTest option, which greatly increases their power and versatility. First, here is DeleteDuplicates' basic functionality.

DeleteDuplicates@{c,a,b,d,a,c,a,e,e,a,a,e}

{c,a,b,d,e}
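For comparison, here is a short sketch of how Union treats the same List, and how it merges several Lists given as separate arguments. The variable name letters is my own, and the symbols a through e are assumed to have no values assigned.

```mathematica
letters = {c, a, b, d, a, c, a, e, e, a, a, e};

(* DeleteDuplicates keeps the order of first occurrence *)
DeleteDuplicates[letters]             (* {c, a, b, d, e} *)

(* Union returns the same elements, but sorted *)
Union[letters]                        (* {a, b, c, d, e} *)

(* Union also merges several Lists passed as separate arguments *)
Union[{c, a, b}, {d, a, c}, {a, e}]   (* {a, b, c, d, e} *)
```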

Note that if you feed DeleteDuplicates a set of Lists, you need to enclose them in an outer List (curly brackets).

DeleteDuplicates[{c,a,b},{d,a,c},{a,e},{e,a},{a,e}]

DeleteDuplicates::argb: DeleteDuplicates called with 5 arguments; between 1 and 3 arguments are expected. >>

DeleteDuplicates[{{c,a,b},{d,a,c},{a,e},{e,a},{a,e}}]

{{c,a,b},{d,a,c},{a,e},{e,a}}


You can use DeleteDuplicates' second argument to broaden its reach by specifying how it detects duplicates. In the example above, by default neither Union nor DeleteDuplicates treats Lists with the same elements in a different order as equivalent, as set theory would, but this can be arranged with a sameness test.


It is relatively straightforward to construct the second argument if you keep in mind that the default is DeleteDuplicates[expr, SameQ]. Extensions of the function can therefore take the form DeleteDuplicates[expr, f@#~SameQ~f@#2&], where the function f applied to each element before the comparison can be as complex as you wish. Here we need Sort because:

{a,b}~SameQ~{b,a}

False

DeleteDuplicates[{{c,a,b},{d,a,c},{a,e},{e,a},{a,e}},Sort@#~SameQ~Sort@#2&]

{{c,a,b},{d,a,c},{a,e}}
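In versions of Mathematica newer than the one used for this post (10 and later), the f@#~SameQ~f@#2& pattern has a built-in shorthand, DeleteDuplicatesBy, which takes f directly. The sketch below checks that it agrees with the two-argument form; the variable names are my own.

```mathematica
listOfLists = {{c, a, b}, {d, a, c}, {a, e}, {e, a}, {a, e}};

(* Two-argument form from the text: compare the sorted versions of each pair *)
viaSamenessTest = DeleteDuplicates[listOfLists, Sort@# ~SameQ~ Sort@#2 &]

(* DeleteDuplicatesBy applies Sort to each element once and compares the results *)
viaDeleteDuplicatesBy = DeleteDuplicatesBy[listOfLists, Sort]   (* {{c, a, b}, {d, a, c}, {a, e}} *)

viaSamenessTest === viaDeleteDuplicatesBy
```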

Here is a second, neat example from the Doc Center. Extending the power of DeleteDuplicates, this function uses Equal instead of SameQ, presumably because Equal yields True when an exact number is compared with the equivalent approximate Real, whereas SameQ does not. I've modified the example to show that.

5==5.

True

5===5.

False

list1 = {{0,0,0,1,0},{1,0,1,0,1},{1.,1.,1.,0.,0.},{0,0,0,0,1},{1,1,1,0,1}};

DeleteDuplicates[list1,Total@#==Total@#2&]

{{0,0,0,1,0},{1,0,1,0,1},{1,1,1,0,1}}

Here is a potential issue that limits the power of DeleteDuplicates. Trace tells us that both 4s get removed as soon as they are compared with the 2, so neither is ever compared to the 16, which therefore survives.

squaresList=Table[{x,x^2},{x,2,4}]//Flatten

{2,4,3,9,4,16}

DeleteDuplicates[squaresList,#2==#^2&]//Trace

{{squaresList,{2,4,3,9,4,16}},DeleteDuplicates[{2,4,3,9,4,16},#2==#1^2&],{(#2==#1^2&)[2,4],4==2^2,{2^2,4},4==4,True},{(#2==#1^2&)[2,3],3==2^2,{2^2,4},3==4,False},{(#2==#1^2&)[2,9],9==2^2,{2^2,4},9==4,False},{(#2==#1^2&)[2,4],4==2^2,{2^2,4},4==4,True},{(#2==#1^2&)[2,16],16==2^2,{2^2,4},16==4,False},{(#2==#1^2&)[3,9],9==3^2,{3^2,9},9==9,True},{(#2==#1^2&)[3,4],4==3^2,{3^2,9},4==9,False},{(#2==#1^2&)[3,16],16==3^2,{3^2,9},16==9,False},{2,3,16}}

We're asking DeleteDuplicates to do something beyond deleting duplicates. We should use DeleteCases to do this job.

DeleteCases[squaresList,x_/;(Sqrt@x//IntegerQ)]

{2,3}
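Note that the IntegerQ test removes every perfect square, whether or not its root is in the List. If the intent is to remove only elements that are the square of another element of the same List, one possible variation (a sketch, with otherList as a made-up second example) tests membership of the square root instead. The values 0 and 1 are their own square roots and would need special handling.

```mathematica
squaresList = {2, 4, 3, 9, 4, 16};

(* Remove x only when its square root is itself an element of the List *)
DeleteCases[squaresList, x_ /; MemberQ[squaresList, Sqrt[x]]]   (* {2, 3} *)

(* The two tests differ when a perfect square has no root in the List *)
otherList = {2, 3, 25};
DeleteCases[otherList, x_ /; IntegerQ[Sqrt[x]]]             (* {2, 3} *)
DeleteCases[otherList, x_ /; MemberQ[otherList, Sqrt[x]]]   (* {2, 3, 25} *)
```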

How It Works

Somewhere Roman Maeder gives a solution to deleting duplicates and implies that his solution is efficient. From memory here is the solution (with my more efficient syntax). We create a simple list of duplicate integers.

dupeList = Table[{i, i}, {i, 10}] // Flatten

{1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6, 7, 7, 8, 8, 9, 9, 10, 10}

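We partition them into pairs with offset 1 (meaning that consecutive pairs overlap by one element of the original List). Note that this method only catches duplicates that sit next to each other, so the List must be sorted first if it isn't already.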

dupeListPartitioned2Offset1 = Partition[dupeList, 2, 1]

{{1, 1}, {1, 2}, {2, 2}, {2, 3}, {3, 3}, {3, 4}, {4, 4}, {4, 5}, {5, 5}, {5, 6}, {6, 6}, {6, 7}, {7, 7}, {7, 8}, {8, 8}, {8, 9}, {9, 9}, {9, 10}, {10, 10}}

Now it's a simple matter to Select the pairs where Part 1 is not the same as Part 2. Select always takes a predicate, sometimes a named function of the form testQ, but here a pure function built with =!=, the operator form of UnsameQ. Unequal would work for numerical entries, but UnsameQ also works for symbols and Strings.

dupeListdupeSetsDeleted = Select[dupeListPartitioned2Offset1, #[[1]] =!= #[[2]] &]

{{1, 2}, {2, 3}, {3, 4}, {4, 5}, {5, 6}, {6, 7}, {7, 8}, {8, 9}, {9, 10}}

We're still left with duplicates. So we take the First entry of each set, and Append the Last entry of the Last set, which would have been left out. I remember thinking at this point that I would have spent time trying to not 'hack' this last part--somehow capture that last entry without another operation--but if it's good enough for Maeder, it's good enough for me. The lesson is to do what is expedient and move on to the next task.

Append[First /@ dupeListdupeSetsDeleted, Last@Last@dupeListdupeSetsDeleted]

{1, 2, 3, 4, 5, 6, 7, 8, 9, 10}
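The steps above can be gathered into a single function. This is a sketch of my own, not Maeder's code: the name adjacentDuplicatesRemoved is hypothetical, and it appends Last of the original List rather than Last@Last of the selected pairs, a small change that also covers a List whose elements are all identical (where no pairs survive the Select). It is shown next to Split, which groups runs of identical neighbours and gives the same result.

```mathematica
(* Hypothetical helper: removes duplicates that sit next to each other *)
adjacentDuplicatesRemoved[list_List] /; Length[list] < 2 := list
adjacentDuplicatesRemoved[list_List] :=
  Append[
    First /@ Select[Partition[list, 2, 1], #[[1]] =!= #[[2]] &],
    Last[list]]

unsortedList = {c, a, b, d, a, c, a, e, e, a, a, e};

(* Sort first so that equal elements become neighbours *)
adjacentDuplicatesRemoved[Sort[unsortedList]]   (* {a, b, c, d, e} *)

(* Split groups runs of identical neighbours; take the first of each run *)
First /@ Split[Sort[unsortedList]]              (* {a, b, c, d, e} *)
```

Because of the Sort, the result matches Union rather than DeleteDuplicates, which keeps the order of first occurrence.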
