Lab 29 - MrBayes 3.0b4

Zool 575 Introduction to Biosystematics, (Sikes) Winter 2006

This lab, and many of the subsequent labs, will not be graded. I will be available during lab next week to answer questions (or email me at any time). You can complete this lab at your leisure. Although MrBayes is on the computers in BI 182, it is a free download and you can install it on any Unix, PC, or Mac computer on which you have permission to do so. The program is available here: http://morphbank.ebc.uu.se/mrbayes/

This lab is based heavily on labs prepared by two phylo-statisticians who know far more about Bayesian analysis than I do: Paul Lewis and Fredrik Ronquist. These guys really know this stuff (they write the programs, and Ronquist is the co-developer of MrBayes).

At this site are some links to the originals by Ronquist: http://www.csit.fsu.edu/~ronquist/mrbayes/

Here is the original lab by Lewis for a course he is currently teaching. A very interesting and useful section near the end describes running MrBayes without data (so the posterior probability is equal to the prior); this is a great way to investigate the priors that are used:

http://hydrodictyon.eeb.uconn.edu/people/plewis/courses/phylogenetics/labs/mrbayes.html

The Ronquist site includes a few tutorials and other documents to help you understand how (and why) to conduct a Bayesian analysis. (Thanks to Abdul for finding this site!) Consider the Lewis and Ronquist sites optional reading if you want to extend your skills beyond the very introductory level presented here.

1. Copy the primate-mtDNA-interleaved.nex file from the PAUP* sample Nexus files folder to the MrBayes folder

The primate-mtDNA-interleaved.nex file is in the PAUP* folder, inside the SAMPLE NEXUS FILES folder. Click to highlight the file, right-click to get a menu, and choose 'copy'. Then navigate to the MrBayes folder in the program files folder and open it. Make sure you can see the executable program (MrBayes3.0_b4.exe), although the extension '.exe' may be hidden on your computer. Right-click on the background of the folder and choose 'paste' from the menu.

Important: Then click into the name of the file primate-mtDNA-interleaved.nex and change it to primates.nex (for some reason MrBayes has trouble with filenames that include dashes or underscores).

MrBayes expects the datafile it analyzes to be in the same folder (directory) as the program. It will also write some results files to this same folder.
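The copy-and-rename step can also be scripted. The following is a minimal Python sketch, not part of the original lab: the function name copy_datafile is hypothetical, and the folder paths you pass in depend on where PAUP* and MrBayes are installed on your computer. Following the warning above, it refuses target names that contain dashes or underscores.

```python
import shutil
from pathlib import Path


def copy_datafile(source, mrbayes_dir, new_name="primates.nex"):
    """Copy a Nexus file into the MrBayes folder under a simpler name.

    `source` and `mrbayes_dir` are placeholders for your own paths.
    """
    # The lab warns that MrBayes has trouble with dashes and underscores
    # in filenames, so reject such names before copying anything.
    if "-" in new_name or "_" in new_name:
        raise ValueError("choose a name without dashes or underscores")
    target = Path(mrbayes_dir) / new_name
    shutil.copyfile(source, target)
    return target
```

For example, copy_datafile("primate-mtDNA-interleaved.nex", "MrBayes") would leave a copy named primates.nex in a folder called MrBayes, assuming both paths exist relative to where you run the script.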

You can use your own dataset instead if you prefer, but this lab is written to work best with the primate datafile.

2. Prepare the datafile for MrBayes

MrBayes is not as flexible in reading Nexus files as some other programs like PAUP*. A file that executes fine with PAUP* might cause errors with MrBayes. There are two primate datafiles in the PAUP* sample Nexus folder, but only the interleaved file will work with MrBayes (not because it is interleaved but because the other file uses a unique system of coding the states), and you must modify it as described below:

Once you have the file in the MrBayes folder, you need to open it in a text editor (the PAUP* editor will do):

Delete all the text above the data themselves and paste the following text in its place:

#NEXUS
begin data;
dimensions ntax=12 nchar=898;
format datatype=dna interleave=yes gap=-;
matrix

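Then delete all the text below the matrix as well (the assumptions and PAUP blocks), but keep the semicolon and the 'end;' line that close the matrix and the data block. Removing the extra blocks may not be necessary, but I like to strip my datafiles of all extraneous commands to make sure MrBayes doesn't get confused.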
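If you would rather not edit the file by hand, the same clean-up can be sketched in Python. This is only an illustration under stated assumptions, not part of the original lab: the function name strip_for_mrbayes is hypothetical, and the code assumes the file has a line consisting only of the word matrix and that the data block is closed by a line starting with end;. It keeps the matrix rows through that end; line, drops everything after it, and swaps in the header given above.

```python
# Header text taken from the lab instructions above.
NEW_HEADER = """#NEXUS
begin data;
dimensions ntax=12 nchar=898;
format datatype=dna interleave=yes gap=-;
matrix
"""


def strip_for_mrbayes(nexus_text, header=NEW_HEADER):
    """Replace everything above the matrix rows and drop later blocks."""
    lines = nexus_text.splitlines()
    lowered = [line.strip().lower() for line in lines]
    # Assumption: the matrix command sits alone on its own line.
    start = lowered.index("matrix") + 1
    # Assumption: the first 'end;' after the matrix closes the data block.
    end = next(i for i in range(start, len(lines))
               if lowered[i].startswith("end;"))
    body = "\n".join(lines[start:end + 1])
    return header + body + "\n"
```

You would read primates.nex into a string, pass it through strip_for_mrbayes, and write the result back out. Check the output in a text editor before running MrBayes, since real files may not match the assumptions above.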

3. Work through the tutorial prepared by F. Ronquist

This tutorial was written to use the primates.nex file and covers all the basics of a Bayesian analysis.

Ronquist MrBayes tutorial (pdf file)

Please do this tutorial slowly enough to understand what you are doing. If you skip this step and go straight to the next part of this lab you could end up producing a Bayesian analysis without having a clue as to how it was done or what options you selected and why.

4. Add a MrBayes block to the datafile

Typically we don't type most of the commands at the MrBayes prompt. Instead we add a MrBayes block that includes all the commands to create a log file, to set the model, to set up the MCMC chain, and to summarize the results when done.

Here is a MrBayes block that you can use with the primates datafile. Place it below the data matrix:

BEGIN MRBAYES;

log start filename=primates.log.txt;

set autoclose=yes;


lset nst=6 rates=invgamma;

showmodel;


mcmc ngen=25000 samplefreq=100 printfreq=100 nchains=4 savebrlens=yes;

sump filename=primates.nex.p burnin=50;
sumt filename=primates.nex.t burnin=50;
log stop;

END;

This is a small, simple dataset. For larger, more complex datasets it is best to run the MCMC chain longer (e.g., 1 million generations) and to use a correspondingly larger burnin (up to 20% of the samples).
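Because the burnin is given in samples rather than generations, it helps to work out how many samples a run will produce. The short Python sketch below does that arithmetic; it is not part of the original lab, the function names are hypothetical, and it assumes MrBayes writes one sample for the starting state plus one every samplefreq generations (count the lines in your .p file to confirm).

```python
def n_samples(ngen, samplefreq):
    """Number of samples written during a run.

    Assumption: one sample for the starting state, then one every
    `samplefreq` generations. Verify against your own .p file.
    """
    return ngen // samplefreq + 1


def burnin_for(ngen, samplefreq, fraction=0.2):
    """Burnin, in samples, as a fraction of all samples (rounded down)."""
    return int(n_samples(ngen, samplefreq) * fraction)


# Settings from the block above, then a 1-million-generation run.
print(n_samples(25000, 100), burnin_for(25000, 100))
print(n_samples(1_000_000, 100), burnin_for(1_000_000, 100))
```

Under that assumption, the settings in the block above give 251 samples, so burnin=50 is about 20% of them; a run of 1 million generations with the same samplefreq would give 10001 samples and a 20% burnin of 2000.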

With this block saved into your file, all the commands in the block will be read and executed when you execute the file in MrBayes. The commands are explained in the tutorial, except autoclose=yes, which tells MrBayes to close the chains once the MCMC has run for the specified number of steps, without waiting for user input.

Tip: When you have completed a run, take all the files produced by MrBayes, including the dataset file, and move them into a new folder named something like 'MrB run1'. This prevents those files from being overwritten by MrBayes when you next run your dataset. To run your dataset again, copy the datafile back into the same folder as MrBayes and repeat. Keeping the datafile in your results folder is a good idea because we often find ourselves modifying our datafiles (adding new species, etc.), and it can become difficult to determine which datafile was used for which analysis unless you keep them together.
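The housekeeping in this tip can be scripted as well. The Python sketch below is one possible way to do it, not part of the original lab: the function name archive_run is hypothetical, and it assumes every file belonging to the run starts with the same stem as the datafile (primates.nex, primates.nex.p, primates.nex.t, primates.log.txt, and so on). Files that MrBayes names differently would have to be moved by hand.

```python
import shutil
from pathlib import Path


def archive_run(mrbayes_dir, run_name, datafile="primates.nex"):
    """Move one run's files into a new subfolder, then restore the datafile."""
    mrbayes_dir = Path(mrbayes_dir)
    run_dir = mrbayes_dir / run_name
    # Fail rather than overwrite the results of an earlier run.
    run_dir.mkdir(exist_ok=False)
    stem = Path(datafile).stem  # "primates" for "primates.nex"
    # Assumption: all files from the run start with the datafile's stem.
    for path in sorted(mrbayes_dir.glob(stem + "*")):
        if path.is_file():
            shutil.move(str(path), str(run_dir / path.name))
    # Copy the datafile back so the next run can start from it.
    shutil.copy2(run_dir / datafile, mrbayes_dir / datafile)
    return run_dir
```

Calling archive_run("MrBayes", "MrB run1") after a run would leave the results in MrBayes/MrB run1 and a fresh copy of primates.nex beside the program, assuming the MrBayes folder is at that relative path.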

Recall the trees produced by the Parsimony and ML searches we did with this dataset - compare them to the tree produced by MrBayes.

Appendix: BayesPhylogenies

BayesPhylogenies is a program for Bayesian inference that can implement a 'mixture model' (Pagel and Meade, 2004), allowing the user to fit more than one model of sequence evolution without partitioning the data.

For those interested in Bayesian inference, this program is worth investigating. The mixture-model approach is a powerful way to detect pattern heterogeneity in your data, that is, sites that are better described by different models.

http://www.rubic.rdg.ac.uk/meade/Mark/bayesphylogenies.html

Pagel, M. and Meade, A. (2004) A phylogenetic mixture model for detecting pattern-heterogeneity in gene sequence or character-state data. Syst Biol 53: 571-581.