Software

I develop a variety of open source software and scripts. I generally use perl for scripts and C/C++ for software, but I also code in R, PHP, javascript, and Matlab.

Major:

  • Brownie: Software for analyzing rate of continuous character evolution and testing for different rates in different groups. A version in beta testing allows investigation of correlation of discrete and continuous traits, many more models for discrete and continuous trait evolution, species delimitation, and more. There is even a graphical user interface! The code hosting has moved to Google code
  • phylobase: I co-organized a hackathon in comparative methods in R; the project I became involved in was a new package for R for loading and manipulating phylogenetic trees and data. My main contribution, developed with Derrick Zwickl, was code to load in NEXUS files of both trees and data.
  • MrFisher: MrBayes modified to do likelihood searches. It’s just being tested more before being released. Code available here
  • TreeTapper: The TreeTapper site uses a mixture of PostgreSQL, PHP, and javascript to organize and display data, such as automatically linking authors to their collaborators. It’s still in very active development, but will be released once it’s stabilized. See the development blog for more info.
  • PhyloRunner: PhyloRunner is a little program that allows you to double-click an icon to start a program and use Finder to navigate to the input file you want. It’s a way of running command-line only programs (r8s, recent PAUP, raxml, etc.) without having to use Terminal or learning Unix commands.

Minor:

  • Superdouble: A header file with a new number class for C++ to reduce overflow and underflow errors (such as when calculating likelihoods on a tree and log-transforming isn’t an option). See blog post about this.
  • DBGraphNav: This was proposed by me but written and largely designed by Paul McMillan, a UC Berkeley undergraduate I mentored through Google Summer of Code. It allows visual navigation of elements in a relational database by generating GraphViz image maps. It’s used in TreeTapper for coauthorship networks.
  • ModelPartition: A simple online tool for generating partitioned search commands for MrBayes.
  • SGE scripts: A series of bash and perl scripts I use to manage my jobs on a Rocks Sun Grid Engine Cluster. It basically allows queuing on top of the SGE queue, so that if I have thousands of jobs to submit, I can do so gradually rather than preventing other users from getting on the queue.
  • Concatenate nexus files script (.zip compressed): A simple perl script that takes a list of taxon names and a folder containing nexus files for individual genes and makes a new nexus file containing all the genes, dealing with missing taxa from some of the datasets appropriately, and makes charsets representing each of the genes. If the gene datasets have information on codon position from Mesquite or MacClade, this is used to make charsets for each gene and codon position, including introns.
  • Compare clades script (.zip compressed): A simple perl script that takes a list of observed newick gene trees, a list of simulated newick gene trees, and a list of assigments of samples to species and uses them to count the number of times each observed tree occurs in the list of simulated trees taking into account the many to one mappings of samples to species. I.e., if species A has samples 1,2,3 and species B has samples 4,5,6, tree ((1,(2,4)),(3(5,6)) matches tree ((2,(3,6)),(1(5,4)) [because they are both ((A,(A,B)),(A(B,B))]
  • ProtTest to RAxML script (.zip compressed): A simple perl script that takes a ProtTest output file and runs the appropriate RAxML model. It is still a bit rough (though it seems to work), so double check (that is why it outputs info on the models it is not using). It is currently configured to start a 200 rep quick bootstrap run automatically — you can turn this off by deleting the line that says “system($systemstring)”
  • Research Tools: Software to manage collections and sequencing projects, including posting real-time information to the web and making dichotomous keys. Still in development, but see it in action on the Myrmecocystus pages.

Helpful links:

  • Nexus Class Library: Allows C++ software to read NEXUS files. An older version is used in Brownie.
  • wxWidgets: Cross-platform library for creation of graphical user interfaces.
  • GNU Scientific Library: C code for math (linear algebra, numerical optimization, etc.).
  • Rod Page: Brownie uses his TreeLib.
  • ms: A program to simulate gene trees under various models of population structure.
  • Yahoo! User Interface Library (YUI): Javascript library for Ajax, tables, etc. (competing libraries are Dojo and Prototype). YUI has the best documentation and examples of any of the various javascript libraries I’ve seen.