Systematics and Comparative Biology
Instructor: Derek S. Sikes
The course outline is available here. However, the PDF of the course outline does not have the assigned readings or other notes that are listed below - so consider this webpage to be the definitive outline.
Note: I will try to post PDFs of lecture notes before lecture so you can print them out and bring them to lecture. HOWEVER, in most cases they will not be ready until the morning of the day of lecture (or late the previous night).
READINGS - to be completed before lecture if possible or soon after if not (some optional)
Those marked with an asterisk (*) you must find yourself using the library journal system. Those not marked with an asterisk are available through the course website as PDFs. Those marked with two asterisks (**) are available in the reserve reading room of the library and may be read there or photocopied for reading later.
Lecture 1: Introduction to biological systematics (value)
The role and value of systematics. Is biological systematics an endangered discipline? Does the decline in systematic biology matter? This lecture provides a skeleton of the profession which will be fleshed out during the course.
Mayr, E. & P. D. Ashlock (1991) Principles of Systematic Zoology, 2nd Edition, McGraw-Hill, Inc., NY. pp. 1-8 (note: PDF has 1-18 but only read to p. 8). [PDF]
*Godfray, H. C. J. (2002) Challenges for taxonomy. Nature 417: 17-19.
A short newsletter (Zoogoer) piece on the decline of taxonomy by C. Mims (2003): http://nationalzoo.si.edu/Publications/ZooGoer/2003/4/EndangeredScience.cfm
Flowers et al. (2002) Does the decline in systematic biology matter? Chapter 4 of report to the House of Lords (UK). Select Committee on Science & Technology. [webpage] - Note: just skim this.
In addition to the required readings, which include one chapter of the 2002 Science and Technology report to the UK House of Lords, the entire report can be accessed here:
Optional: Erwin, T. L. 1982. Tropical forests: Their richness in Coleoptera and other arthropod species. Coleopterists Bulletin 36:74-75. [PDF]
Lecture 2: Value of biosystematics continued; History of taxonomy
The role and value of beta taxonomy (phylogenetics). Unlike alpha taxonomy, this subdiscipline of biosystematics is not in danger of extinction, due primarily to the avalanche of molecular data that has invigorated and modernized the field.
The history of alpha taxonomy: key people and events.
Required Readings: Catch up on readings - be sure to complete lecture 1 readings...
*Gould, S. J. 2000. Linnaeus's Luck? Natural History. Vol 109 iss.7 : (September): 19-25, 66-76.
Mayr, E. & P. D. Ashlock (1991) Principles of Systematic Zoology, 2nd Edition, McGraw-Hill, Inc., NY. pp. 9-18. [PDF - see link under lecture 1]
Optional: Wheeler, Q. D. 2004. Taxonomic triage and the poverty of phylogeny. Phil. Trans. R. Soc. Lond. B. 359: 571-583.
Lecture 3: Lab taxonomy on the web (web exercise) - meet in the computer lab, Irving 1 (room 303)
Websites exercise - turn in answers during lecture on Monday
This lab will introduce you to a handful of biosystematic web resources.
Required Readings: (review Godfray 2002)
*Gewin, V. (2002) Taxonomy: All living things, online. Nature 418: 362-363. (news feature)
*Mallet, J. & K. Willmott (2003) Taxonomy: renaissance or Tower of Babel? Trends in Ecology and Evolution 18(2): 57-59.
Lecture 4: Species & taxonomy
A brief introduction to The Species Problem. Species are fundamental units of biology, especially of biosystematics. This lecture deals with the questions of "Are species real?", "If so, and of greatest relevance to taxonomy, how are they defined?" Typically these subjects require multiple lectures or even entire courses so this overview is quite brief.
*Sites, J.W., Jr., and Marshall, J.C. (2003). Delimiting species: a renaissance issue in systematic biology. Trends in Ecology and Evolution 18: 462-470. Although this is a review paper, it covers some fairly complex aspects of delimiting species - don't get bogged down in the details of each method. The point of reading this is to get a sense for the variety of methods and the complexity of the issues.
Mayr, E. & P. D. Ashlock (1991) Principles of Systematic Zoology, 2nd Edition, McGraw-Hill, Inc., NY. pp. 39-54. [PDF]
*Wilson, E. O. and W. L. Brown (1953) The subspecies concept and its taxonomic application. Syst. Zool. 2: 97-111. [access via JSTOR]
Optional: *Mousseau, T., Sikes, D. S. 2011. Almost but not quite a subspecies: a case of genetic but not morphological diagnosability in Nicrophorus (Coleoptera: Silphidae). Biological Journal of the Linnean Society 102: 311-333.
Optional: *Amadon, D. 1949. The seventy-five per cent rule for subspecies. The Condor 51: 250-258.
Optional: *Winker, K. 2009. Reuniting Phenotype and Genotype in Biodiversity Research. BioScience 59: 657-665.
Lecture 5: Nomenclature & Classification
Central to all alpha taxonomic work is the issue of names. Which is the correct name for a species? A combination of taxonomy and nomenclature is required. This lecture will introduce you to the International Code of Zoological Nomenclature (ICZN) and hopefully acquaint you with some of the issues that arise during taxonomic work (and how they can be solved).
Mayr, E. & P. D. Ashlock (1991) Principles of Systematic Zoology, 2nd Edition, McGraw-Hill, Inc., NY. pp. 383-406. [PDF]
Thompson, F. C. 2003. Nomenclature and Classification, Principles of. Pp. 798-807. In Resh, V. H. & Cardé, R. T. (Eds.), 2003, Encyclopedia of Insects xxx + 1266 pp. Academic Press, San Diego [NOTE: skim 798-802, read 803-807] [PDF]
**Winston, J. E. (1999) Describing species: Practical Taxonomic Procedure for Biologists. Columbia University Press, NY. pp. 19-40, 407-432.
Lecture 6: Lab Nomenclature exercise & ICZN
Nomenclature Exercise (due in 1 week)
Nomenclature Exercise Key
We will finish our lecture on nomenclature by covering information on type specimens. This week's lab will require you to work through a number of rather ordinary, but possibly challenging, nomenclatural problems.
(review reading of Lecture 5.)
Lecture 7: Specimens, collections, curation
Loose ends on nomenclature and some new web resources. A thorough look at specimens, collections, collecting, and curation, including "is collecting ethical?"
**Wiley, E. O. (1981) Phylogenetics: The Theory and Practice of Phylogenetic Systematics. John Wiley & Sons, Inc. pp. 306-317.
**Winston, J. E. (1999) Describing species: Practical Taxonomic Procedure for Biologists. Columbia University Press, NY. pp. 95-112, 173-188.
Suarez, A., & N. D. Tsutsui. (2004) The value of museum collections for research and society. Bioscience 54(1):66-74 [PDF]
Lecture 8: Modern Taxonomy DNA barcodes, etc.
Solutions to the "taxonomic impediment": digitization & web dissemination of data. On-line identification keys. Use of DNA data for species demarcation. We will spend the final 15 minutes of lecture discussing Hebert's idea of "DNA barcoding."
*Hebert, P. D. N., A. Cywinska, S. L. Ball, and J. R. deWaard. (2003) Biological identifications through DNA barcodes. Proceedings of the Royal Society of London, Series B 270: 313-321.
Janzen, D. H. (2004) Now is the time. Phil. Trans. R. Soc. Lond. B 359: 731-732 [PDF]
*Lipscomb, D., N. Platnick, & Q. Wheeler. (2003) The intellectual content of taxonomy: a comment on DNA taxonomy. Trends in Ecology & Evolution 18(2): 65-66.
Sperling, F. (2003) DNA Barcoding: Deus ex Machina. Newsletter of the Biological Survey of Canada (Terrestrial Arthropods). 22(1)
*Tautz, D., P. Arctander, A. Minelli, R. H. Thomas, & A. P. Vogler. (2003) A plea for DNA taxonomy. Trends in Ecology & Evolution 18(2): 70-74.
*Will, K. W. & D. Rubinoff. (2004) Myth of the molecule: DNA barcodes for species cannot replace morphology for identification and classification. Cladistics 20: 47-55. [PDF]
Optional Readings: An entire issue devoted to DNA barcoding: Phil. Trans. R. Soc. B (2005) 360
Lecture 9: Lab - beetle exercise: finding characters & keys; descriptions and diagnoses
Required Reading: **Winston, J. E. (1999) Describing species: Practical Taxonomic Procedure for Biologists. Columbia University Press, NY. pp. 189-240, 367-381.
Optional Readings: Before we leave alpha taxonomy here is another (very good!) recent article on the importance of this science including a nice contrast with phylogenetics and more discussion of DNA barcoding.
Wheeler, Q. D. (2004) Taxonomic triage and the poverty of phylogeny. Phil. Trans. R. Soc. Lond. B. 359: 571-583. [PDF]
Two additional readings on museums, also optional, are:
Sabloff, J. A. (2002) Whither the Museum? Science, 298:755-756.
Gropp, R. E. (2003) Are University Natural Science Collections Going Extinct? BioScience, 53(6):55
Lecture 10: Phylogenetic inference history / introduction
We begin our study of phylogenetics with this introduction to the history of the methods employed.
*Felsenstein, J. (2001) The troubled growth of statistical phylogenetics. Systematic Biology 50(4): 465-467.
**Wiley, E. O. (1981) Phylogenetics: The Theory and Practice of Phylogenetic Systematics. John Wiley & Sons, Inc. pp. 240-276. [this is dense and long, sorry! skim]
An optional reading:
Felsenstein, J. (2004) Inferring Phylogenies. Sinauer Associates, Inc. Massachusetts, xx + 664 pp. READ: pp. 123-146.
Lecture 11: Homology
This lecture deals with the first steps of phylogenetic analysis: selecting the data.
**Wiley, E. O. (1981) Phylogenetics: The Theory and Practice of Phylogenetic Systematics. John Wiley & Sons, Inc. pp. 115-138.
Salemi, M. and Vandamme, A.-M. (eds). (2003) The Phylogenetic Handbook: A practical approach to DNA and protein phylogeny, Cambridge Univ. Press., 1st Edition. READ: Foreword, Chapter 1
A very good source for those of you who might want to continue with phylogenetics after the course:
Wiley, E. O., D. Siegel-Causey, D. R. Brooks, and V. Funk. 1991. The Compleat Cladist: A Primer of Phylogenetic Procedures. University of Kansas Museum of Natural History, Special Publication 19. This is out of print; a PDF was available from http://www.nhm.ukans.edu/cc.html, but if that link is not working use this temporary link [PDF]
Lecture 12: Lab - beetle exercise, keys
Optional Reading: Edwards, M. & D. R. Morse. 1995. The potential for computer-aided identification in biodiversity research. Trends in Ecology and Evolution 10:153-158.
Lecture 13: Molecular Homology, Alignment
Lemey, P., Salemi, M. and Vandamme, A.-M. (eds). (2009) The Phylogenetic Handbook: A practical approach to Phylogenetic Analysis and Hypothesis Testing, Cambridge Univ. Press., 2nd Edition. READ: Chapter 3
*Kjer, K. M. (2004) Aligned 18S and Insect Phylogeny. Systematic Biology 53:506-514.
Ogden, T. H., M. F. Whiting, W. Wheeler. 2005. Poor taxon sampling, poor character sampling, and non-repeatable analyses of a contrived dataset do not provide a more credible estimate of insect phylogeny: a reply to Kjer. Cladistics, 21:295-302.
**Page, R. D. M. & Holmes, E. C. (1998) Molecular Evolution: A Phylogenetic Approach. Blackwell Science Ltd. pp. 30-33.
Lecture 14: Trees - Parsimony
There are various methods to infer phylogenies. We will begin with one of the easiest to understand. We will only begin parsimony with this lecture but will return to it in various future lectures.
Lemey, P., Salemi, M. and Vandamme, A.-M. (eds). (2009) The Phylogenetic Handbook: A practical approach to Phylogenetic Analysis and Hypothesis Testing, Cambridge Univ. Press., 2nd Edition. READ: Chapter 8 - THEORY.
*Grant, T., Faivovich, J., & Pol, D. (2003) The perils of ‘point-and-click’ systematics. Cladistics 19: 276-285.
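For those who like to see an idea as code: the parsimony criterion introduced in this lecture can be sketched for a single character on a fixed tree. This is a minimal illustration of Fitch's counting procedure, not what PAUP* does internally. The function name `fitch_score` and the nested-tuple tree representation are assumptions made only for this example.

```python
def fitch_score(tree, states):
    """Minimum number of state changes for one unordered character.

    tree   -- a rooted binary tree written as nested 2-tuples of taxon names
              (a hypothetical representation chosen for this sketch)
    states -- dict mapping each taxon name to its observed character state
    """
    changes = 0

    def visit(node):
        nonlocal changes
        if isinstance(node, str):          # a tip: its state set is what we observed
            return {states[node]}
        left, right = (visit(child) for child in node)
        shared = left & right
        if shared:                         # children agree: no change needed here
            return shared
        changes += 1                       # children disagree: count one change
        return left | right

    visit(tree)
    return changes


# One character scored on two competing four-taxon trees.
character = {"A": "0", "B": "0", "C": "1", "D": "1"}
tree_1 = (("A", "B"), ("C", "D"))   # groups taxa sharing a state: 1 change
tree_2 = (("A", "C"), ("B", "D"))   # splits them: 2 changes
```

Summing this score over all characters gives the tree length; the most parsimonious tree is the one with the smallest total.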
Lecture 15: Lab - Alignment, Clustal: Data
This assignment introduces you to both alignment by secondary structure and automated alignments using CLUSTAL-W.
Required Reading: Lemey, P., Salemi, M. and Vandamme, A.-M. (eds). (2009) The Phylogenetic Handbook: A practical approach to Phylogenetic Analysis and Hypothesis Testing, Cambridge Univ. Press., 2nd Edition. READ: Chapter 2 theory & practice.
Lecture 16: Distance methods
This lecture will contrast distance methods with character based methods and also contrast optimality criterion methods with clustering methods. Examples will be shown to illustrate the undesirable nature of clustering methods.
Required Readings: Lemey, P., Salemi, M. and Vandamme, A.-M. (eds). (2009) The Phylogenetic Handbook: A practical approach to Phylogenetic Analysis and Hypothesis Testing, Cambridge Univ. Press., 2nd Edition. READ: Chapter 5 theory.
*Scotland, R. W., R. G. Olmstead, and J. R. Bennett. 2003. Phylogeny reconstruction: The role of morphology. Systematic Biology 52: 539-548. - read before Wiens (2004)
*Wiens, J. 2004. The role of morphological data in phylogeny reconstruction. Systematic Biology 53: 653-661. (reply to Scotland et al.)
Farris, J. S., V. A. Albert, M. Källersjö, D. Lipscomb and A. G. Kluge. 1996. Parsimony jackknifing outperforms neighbor-joining. Cladistics 12: 99-124. [A classic exposure of the more dramatic weaknesses of neighbor-joining methods]
Phillips, M. J., F. Delsuc, D. Penny. 2004. Genome-scale phylogeny and the detection of systematic biases. MBE 21(7):1455-1458 [Demonstration of systematic error with use of Minimum Evolution]
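A small sketch may help make concrete what a distance method starts from: every pair of aligned sequences is reduced to a single number before any tree is built. The example below computes an uncorrected p-distance and, as one possible correction, the Jukes-Cantor distance. The function names are assumptions for this illustration, and gaps or ambiguity codes are not handled.

```python
import math


def p_distance(seq_a, seq_b):
    """Proportion of aligned sites at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    differences = sum(1 for a, b in zip(seq_a, seq_b) if a != b)
    return differences / len(seq_a)


def jukes_cantor_distance(p):
    """Jukes-Cantor corrected distance: d = -3/4 * ln(1 - 4p/3)."""
    if p >= 0.75:
        raise ValueError("p-distance too large for the Jukes-Cantor correction")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


# Two made-up aligned sequences that differ at 1 of 10 sites.
p = p_distance("ACGTACGTAC", "ACGTACGTAT")
d = jukes_cantor_distance(p)   # slightly larger than p, allowing for unseen multiple hits
```

Note what is lost: once the alignment has been collapsed to pairwise numbers, the individual characters are no longer available to the tree-building step, which is the contrast with character-based methods drawn in this lecture.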
Lecture 17: Large datasets - Heuristic searching
This lecture will address solutions to the unique challenges presented by large datasets (>25 OTUs).
*Rice, K. A., M. J. Donoghue, and R. G. Olmstead. 1997. Analyzing large data sets: rbcL 500 revisited. Syst. Biol. 46: 554-563.
*Philippe, H, Brinkmann, H., Lavrov, D. V., Littlewood, T. J., Manuel, M., Wörheide, G., Baurain, D., 2011. Resolving difficult phylogenetic questions: why more sequences are not enough. PLoS Biol 9(3): e1000602. doi:10.1371/ journal.pbio.1000602
Sikes, D. S. and P. O. Lewis. 2001. beta software, version 1. PAUPRat: PAUP* implementation of the parsimony ratchet. Distributed by the authors. Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, USA. read the manual
Nixon, K. C. (1999) The Parsimony Ratchet, a new method for rapid parsimony analysis. Cladistics 15: 407-414.
Lecture 18: Lab - introduction to PAUP*: Distance methods & parsimony
This assignment will get you started using PAUP*. You will analyze a simple fake dataset to see some of the differences between various distance and character methods.
Lemey, P., Salemi, M. and Vandamme, A.-M. (eds). (2009) The Phylogenetic Handbook: A practical approach to Phylogenetic Analysis and Hypothesis Testing, Cambridge Univ. Press., 2nd Edition. READ: Chapter 5 & 7 practice BRING TEXT TO LAB.
Prepare for midterm (Monday 15 Oct). Note also that the lecture notes have example exam / study questions at the end.
Study guide for midterm exam
Lecture 19 - Models, correction, model choice
Handout lec. 19
We begin our study of models of sequence evolution with this lecture. Although we will use the log likelihood score to help decide which model to use you will not learn likelihood until next lecture.
Required Reading: Lemey, P., Salemi, M. and Vandamme, A.-M. (eds). (2009) The Phylogenetic Handbook: A practical approach to Phylogenetic Analysis and Hypothesis Testing, Cambridge Univ. Press., 2nd Edition. READ: pp. 111-125; pp. 303-312. (don't worry too much about all the math, I'll explain a few of the equations in lecture)
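As a rough illustration of how log likelihood scores can help choose among models, here is a sketch that ranks models by the Akaike Information Criterion (AIC = 2k - 2 lnL). AIC is only one possible criterion and is an assumption of this example; the lecture and ModelTest may use others, such as hierarchical likelihood ratio tests. The log likelihood values below are invented for illustration, and the parameter counts cover substitution-model parameters only.

```python
def aic(log_likelihood, n_free_parameters):
    """Akaike Information Criterion: lower values indicate a better trade-off."""
    return 2 * n_free_parameters - 2 * log_likelihood


def rank_models_by_aic(fits):
    """Return (model name, AIC) pairs sorted from best (lowest AIC) to worst.

    fits -- dict mapping model name to (log likelihood, number of free parameters)
    """
    scored = [(name, aic(lnl, k)) for name, (lnl, k) in fits.items()]
    return sorted(scored, key=lambda pair: pair[1])


# Hypothetical scores for three nested models on the same tree and data.
example_fits = {
    "JC69": (-2510.4, 0),    # equal rates, equal base frequencies
    "HKY85": (-2450.1, 4),   # ti/tv ratio plus three free base frequencies
    "GTR": (-2448.9, 8),     # five relative rates plus three free base frequencies
}
ranking = rank_models_by_aic(example_fits)
```

With these made-up numbers the richest model has the best likelihood, but its extra parameters do not buy enough improvement, so a simpler model ranks first.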
Lecture 20 - Lab: introduction to PAUP* II: Nexus file format etc.
This assignment, which will not be graded, will allow you to expand your skills with using PAUP*. No assigned reading. The text material for lab 18 could be useful, however.
Lecture 21 - Maximum Likelihood
Handout lec. 21
*Susko, E., Y. Inagaki, and A. J. Roger. 2004. On inconsistency of the neighbor-joining, least squares, and minimum evolution estimation when substitution processes are incorrectly modeled. Molecular Biology and Evolution 21(9): 1629-1642. READ - this is a follow-up on lecture 20. (Skip the mathematics) Two points are obvious: 1) they dismiss maximum likelihood as infeasible for large problems due to its computational complexity and suggest distance methods are a viable alternative - they fail to point out that Bayesian methods are a superior alternative for these larger problems because they are less computationally demanding than ML and they are character-based methods and therefore use more of the data than do distance methods, and 2) they reiterate an important point - model misspecification can cause any method to fail.
**Page, R. D. M. & Holmes, E. C. (1998) Molecular Evolution: A Phylogenetic Approach. Blackwell Science Ltd. pp. 148-162; 193-196. A very readable description of the maximum likelihood method, should be useful to clarify any confusion you may have on this topic.
or.. [choose either or both of these last two, (Page & Holmes is far easier to understand)]
Lemey, P., Salemi, M. and Vandamme, A.-M. (eds). (2009) The Phylogenetic Handbook: A practical approach to Phylogenetic Analysis and Hypothesis Testing, Cambridge Univ. Press., 2nd Edition. READ: 137-150; skim most of the math but read the text.
Optional Reading: Foster, P. 2001. The Idiot’s Guide to the Zen of Likelihood in a Nutshell in Seven Days for Dummies, Unleashed. - a PDF which does a good job of explaining ML
Lecture 22 - Accuracy and performance of MP & ML
Handout lec. 22
This lecture compares MP to ML and tries to explain the causes of the performance differences. We also cover a number of attributes of different methods to help decide which method is best to use for a given problem.
Required Readings: **Hillis, D. M., B. K. Mable, and C. Moritz. 1996. Applications of molecular systematics: The state of the field and a look to the future. Chapter 12 in Hillis, D. M., C. Moritz, & B. K. Mable (eds). Molecular Systematics (2nd ed). Sinauer Associates, Inc. Massachusetts. xvi + 655 pp. READ: 526-530 & make sure you understand figure 5.
*Siddall, M. E. 1998. Success of Parsimony in the four-taxon case: Long-branch repulsion by likelihood in the Farris zone. Cladistics 14: 209-220. Note that Siddall says "...the number of synapomorphies recovered for a pair of sister taxa need not all actually be homologies for the method to have behaved correctly."
*Swofford, D. L., P. J. Waddell, J. P. Huelsenbeck, P. G. Foster, P. O. Lewis, and J. S. Rogers. 2001. Bias in phylogenetic estimation and its relevance to the choice between parsimony and likelihood methods. Systematic Biology 50(4): 525-539. Reply to Siddall 1998.
Lecture 23 - Lab: Model Selection
Lab Assignment 1 - Bayes Factors
Lab assignment 2 - ModelTest This link will bring you to a lab demonstrating how to use the free program Modeltest to select the best model for a DNA-based likelihood analysis. It interacts with the program PAUP.
This lab will teach you the very basics of conducting a Bayesian analysis and how to use Bayes Factors and ModelTest to select the best-fitting model for your data.
Required Reading: *Nylander, J.A., Ronquist, F., Huelsenbeck, J.P. and Nieves-Aldrey, J.L. (2004). Bayesian phylogenetic analysis of combined data. Systematic Biology 53: 47-67. READ: pp. 48-49 "Bayesian Model Selection" - this is a good source to see how Bayes Factors can be used to select models.
Optional Reading: Lemey, P., Salemi, M. and Vandamme, A.-M. (eds). (2009) The Phylogenetic Handbook: A practical approach to Phylogenetic Analysis and Hypothesis Testing, Cambridge Univ. Press., 2nd Edition. READ: Chap 10 (Selecting models of evolution - Posada) and pp. 303-308 - pay attention to how Swofford & Sullivan recommend one choose a best fitting model. Posada (who wrote the software ModelTest) explains more about model selection, see figure 10.1 for a nice flowchart of models.
Lecture 24 - MP & ML continued, Assessment, Tree Confidence
Handout lec. 24
We finish looking at the differences between MP & ML in this lecture and begin exploring different means to (hopefully) assess the strength of the phylogenetic signal in a dataset. I say "hopefully" because all we can really assess is the repeatability of the inference, and it is up to us to take it on faith that repeatability is related to accuracy.
We will begin with some lesser-used methods (CI, PTP-test) that attempt to assess the strength of the signal in the data. There is no required reading on these methods. You might use the time to get a head start on the readings for lecture 25.
Lecture 25 - Assessment 2: Consensus trees, Decay Indices, Bootstrapping
Handout lec. 25
This lecture covers three of the most common methods to quantify the uncertainty in the data - (none of which are ideal).
Required Readings: Lemey, P., Salemi, M. and Vandamme, A.-M. (eds). (2009) The Phylogenetic Handbook: A practical approach to Phylogenetic Analysis and Hypothesis Testing, Cambridge Univ. Press., 2nd Edition. READ: pp. 156-160, 295-300 for a mechanical & practical explanation of bootstrapping etc. Also READ chapter 12 - Testing Tree Topologies. This chapter doesn't include Bayes Factors as a Bayesian method of topology testing; for information on this, check out the MrBayes Manual "4.4 Testing a Topological Hypothesis".
*Huelsenbeck, J. P. and B. Rannala. (2004) Frequentist properties of Bayesian posterior probabilities of phylogenetic trees under simple and complex substitution models. Systematic Biology 53 (6): 904-913. Note: we will read this in more detail after we cover Bayesian methods. For now, focus on the introduction in which these authors do a good job of explaining Bootstrapping.
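The resampling step at the heart of nonparametric bootstrapping is simple enough to sketch. The example below builds one pseudoreplicate by drawing alignment columns with replacement; in a real analysis each pseudoreplicate is then analyzed with the same tree search as the original data, and branch support is the proportion of replicates recovering that branch. The function name and the dict-of-strings alignment format are assumptions for this illustration.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Return one bootstrap pseudoreplicate of an alignment.

    alignment -- dict mapping taxon name to an aligned sequence string
    rng       -- a random.Random instance (passed in so runs can be repeated)
    """
    n_sites = len(next(iter(alignment.values())))
    # Draw n_sites column indices with replacement: some columns appear
    # several times in the replicate, others not at all.
    columns = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {
        taxon: "".join(sequence[i] for i in columns)
        for taxon, sequence in alignment.items()
    }


# A tiny made-up alignment and one replicate drawn from it.
toy_alignment = {"A": "ACGT", "B": "ACGA", "C": "TCGA"}
replicate = bootstrap_alignment(toy_alignment, random.Random(1))
```

The same column indices are applied to every taxon, so each resampled site remains a genuine column of the original alignment.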
Lecture 26 - Lab: PAUP* IV: Confidence
This lab will teach you how to build consensus trees and conduct bootstrapping searches using PAUP*.
Required Reading: *Buckley, T.R. and Cunningham, C.W. (2002) The effects of nucleotide substitution model assumptions on estimates of nonparametric bootstrap support. Molecular Biology and Evolution 19: 394-405. [This very important paper evaluates a number of difficult but well-known phylogenies and compares how different models, including parsimony, do with estimates of bootstrap support]
Optional Readings: *DeBry, R. (2001) Improving interpretation of the decay index for DNA sequences. Syst. Biol. 50: 742-752. Helps to understand why decay indices appear to not be readily comparable between different branches of the same tree.
*Sullivan, J., Abdo, Z., Joyce, P. and Swofford, D.L. (2005) Evaluating the Performance of a Successive-Approximations Approach to Parameter Optimization in Maximum-Likelihood Phylogeny Estimation. Mol Biol Evol [This paper is a very important test of the question "Can we iteratively estimate model parameters on a poor tree and use them to infer the best tree? Or will these parameters be "starting-point dependent" and bias us away from the best tree?"]
*Abdo, Z., Minin, V.N., Joyce, P. and Sullivan, J. (2005) Accounting for uncertainty in the tree topology has little effect on the decision-theoretic approach to model selection in phylogeny estimation. Mol Biol Evol 22: 691-703. [This paper is related to the prior but examines the process of finding the best model directly - this paper is only indirectly related to ML bootstrapping]
Lecture 27: Statistical Hypothesis Testing - Comparison of two trees; Bayesian Inference 1
Handout lec. 27
We continue with methods to assess the strength of the signal in the data by looking at tests designed to determine if the difference between two trees is greater than might be expected due to sampling error. This leads nicely into Bayesian Phylogenetic Inference, which is a method of obtaining trees and branch support simultaneously. Bayesian inference of phylogeny is a new method (1996+) that is still being developed but appears to offer many advantages over both ML and MP - not without a few disadvantages as well (see lecture 28).
*Lewis, P. O. (2001) Phylogenetic systematics turns over a new leaf. Trends in Ecology & Evolution 16(1): 30-37 Focus on the section & boxes in which Lewis explains Bayesian methods.
*Holder, M. and Lewis, P.O. (2003) Phylogeny estimation: traditional and Bayesian approaches. Nature Reviews Genetics 4: 275-284. This is an excellent survey of all the phylogenetic methods you've learned about so far in comparison to Bayesian. If you have been slacking off and not doing the readings I suggest you read at least this one!
I also recommend highly a little reading on Bayesian versus Frequentist statistics. There are various educational sources of this topic on the web - a good, one page, example is here: Charles Annis - Statistical Engineering: Frequentists and Bayesians - What IS "probability?" Confidence Intervals vs Credible Intervals
Lecture 28: Bayesian Inference 2 - Bootstrapping vs Posterior Probabilities, MCMC
Handout lec. 28
In this lecture you will learn how we apply Bayes' Rule using a method called MCMC and we will get to a controversial issue - a comparison of Bayesian PP with Bootstrap values.
Lemey, P., Salemi, M. and Vandamme, A.-M. (eds). (2009) The Phylogenetic Handbook: A practical approach to Phylogenetic Analysis and Hypothesis Testing, Cambridge Univ. Press., 2nd Edition. READ: Chapter 7
*Huelsenbeck, J. P. and B. Rannala. (2004) Frequentist properties of Bayesian posterior probabilities of phylogenetic trees under simple and complex substitution models. Systematic Biology 53 (6): 904-913. Now read the methods, results & discussion (you read the introduction for lecture 25). These authors point out that to simulate data that follows the assumptions of a Bayesian analysis one must draw the parameters from the prior distributions.
*Erixon, P., B. Svennblad, T. Britton, & B. Oxelman. (2003) Reliability of Bayesian posterior probabilities and bootstrap frequencies in phylogenetics. Systematic Biology 52(5): 665-673. An excellent examination of pp and bootstrapping.
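To see how MCMC applies Bayes' Rule without ever computing the full posterior, here is a toy Metropolis sampler. It is a sketch under strong simplifying assumptions: a single parameter (a proportion p with a uniform prior and binomial data) instead of the trees, branch lengths and model parameters that MrBayes samples. All names and the data values are invented for this example.

```python
import math
import random


def log_posterior(p, successes, trials):
    """Unnormalized log posterior: uniform prior on (0, 1) times binomial likelihood."""
    if not 0.0 < p < 1.0:
        return float("-inf")               # outside the prior's support
    return successes * math.log(p) + (trials - successes) * math.log(1.0 - p)


def metropolis(successes, trials, n_steps, step_size, rng):
    """Sample p with a symmetric uniform proposal of half-width step_size."""
    current = 0.5
    current_lp = log_posterior(current, successes, trials)
    samples = []
    for _ in range(n_steps):
        proposal = current + rng.uniform(-step_size, step_size)
        proposal_lp = log_posterior(proposal, successes, trials)
        # Accept with probability min(1, posterior ratio); only the ratio is
        # needed, so the normalizing constant of Bayes' Rule never appears.
        accept_probability = math.exp(min(0.0, proposal_lp - current_lp))
        if rng.random() < accept_probability:
            current, current_lp = proposal, proposal_lp
        samples.append(current)
    return samples


# 7 "successes" in 10 trials; discard the first 2000 steps as burn-in.
chain = metropolis(7, 10, 20000, 0.2, random.Random(42))
kept = chain[2000:]
posterior_mean = sum(kept) / len(kept)
```

For this toy problem the exact posterior is a Beta(8, 4) distribution with mean 2/3, so the sampled mean can be checked against a known answer, a luxury we do not have with phylogenies.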
Lecture 29: Lab - MrBayes 3.1.2
Lab Assignment 29
In this lab you will learn the basics of Bayesian analysis. Links are provided to resources that will allow you to move from a basic to advanced skill level if you so desire. If you haven't completed the prior labs today you can catch up and / or continue to use the skills of these labs to analyze your chosen datasets.
*Lewis, P.O. (2001). A likelihood approach to estimating phylogeny from discrete morphological character data. Syst Biol 50: 913-925. This paper introduced a likelihood model "improvement" of parsimony (i.e., a model that is not statistically pathological).
Lecture 30: Bayesian Inference 3 & Ancestral State Reconstruction
Handout lec. 30
In this lecture we will finish Bayesian Inference by looking at how well it fares in the Felsenstein Zone relative to ML. We will then shift gears and begin to look at things that can be done once we have a reliable tree - such as inferring ancestral character states.
*Cunningham, C.W., Omland, K. E., Oakley, T. H. (1998) Reconstructing ancestral character states: A critical reappraisal. Trends in Ecology and Evolution 13: 361-366. A citation classic for ancestral state reconstruction - good comparison of Parsimony reconstruction to Likelihood.
*Cummings, M.P., Handley, S.A., Myers, D.S., Reed, D.L., Rokas, A. and Winka, K. (2003) Comparing bootstrap and posterior probability values in the four-taxon case. Syst Biol 52: 477-487. An optional read - demonstration that Bayesian Inference is less able to nail the correct tree in the Felsenstein Zone than is ML. Note that Bayesian Inference is not showing inconsistent behavior in this zone as does MP (i.e., it does not prefer the wrong tree with strong support) but is simply being more equivocal (less decisive) than ML.
Lecture 31: Ancestral Character State Reconstruction 2
Handout lec. 31
We continue with the problems associated with reconstruction of ancestral states. You will read papers that address the contentious claim of Whiting et al. (2003) that stick insects re-evolved wings after millions of years of winglessness.
*Whiting, M.F., Bradler, S. and Maxwell, T. (2003) Loss and recovery of wings in stick insects. Nature 421: 264-267. A controversial study which suggests that stick insects "re-evolved" wings that had been absent for millions of years.
*Trueman, J.W.H., Pfeil, B.E., Kelchner, S.A. and Yeates, D.K. (2004) Did stick insects really regain their wings? Systematic Entomology 29: 138-139. A counter-argument to Whiting et al (2003).
*Whiting, M.F. and Whiting, A.S. (2004) Is wing recurrence really impossible?: a reply to Trueman et al. Systematic Entomology 29: 140-141. A reply to the counter-argument.
*Stone, G. and French, V. (2003) Evolution: Have wings come, gone and come again? Current Biology 13: R436-R438. a good reiteration of the issues involved.
Lecture 32: Lab - Work on projects & Optional Ancestral Character State Reconstruction
OPTIONAL - This link provides tutorials for using BayesTraits, a new program and approach to ACSR.
OPTIONAL - Mesquite, Lab tutorial
(This lab will introduce you to the program Mesquite. This is not a program to infer phylogenies, such as PAUP* or MrBayes; instead, it is a program to study character evolution and trees, among many other analyses. It is a flexible and extensible program that will increase in functionality as more modules are written for it. Being written in Java, it runs on all computer platforms. It is a free package and can be found at this website (which also provides extensive documentation).)
Lecture 33: Troubleshooting Phylogenies
Handout lec. 33
*Sanderson, M.J. and Shaffer, H.B. (2002) Troubleshooting molecular phylogenetic analyses. Annual Review of Ecology and Systematics 33: 49-72. An excellent review paper of the challenges of phylogenetic inference.
McCracken, K.G. and Sorenson, M.D. (2005) Is Homoplasy or Lineage Sorting the Source of Incongruent mtDNA and Nuclear Gene Trees in the Stiff-Tailed Ducks (Nomonyx-Oxyura)? Syst Biol 54: 35-55. [You can download the PDF here] A good example of how challenging some phylogenetic problems can be: "Despite collecting more than 8,000 base pairs of sequence data per taxon, we are left with an unresolved trichotomy..."
Foster, P. G. & Hickey D. A. (1999) Compositional bias may affect both DNA-based and protein-based phylogenetic reconstructions. J Mol Evol 48: 284-290. An optional read - an excellent paper showing the risk of error due to compositional bias
Cummings, M.P., Otto, S. P., and Wakeley, J. (1995) Sampling properties of DNA sequence data in phylogenetic analysis. Molecular Biology and Evolution 12: 814-822. An optional read - another excellent paper - assessing the amount of data needed to reach 95% accuracy with different methods of analysis. Quite relevant to the McCracken & Sorenson (2005) paper.
Zander, R. H. (preprint extract) Unaccounted Assumptions. website This is little more than a list of problems that can afflict phylogenetic analyses. It is long but not exhaustive, and includes some redundancy. Rarely are all of these potential problems addressed in a single study, nor are there many practicing phylogeneticists who are even aware of all these potential problems. Note, however, that not all of these issues need be addressed in any study, and some of them (like unequal clade priors) have been dismissed as non-problems. This is a helpful list akin to Sanderson & Shaffer's 2002 paper above. Unfortunately, unlike Sanderson & Shaffer, Zander doesn't provide solutions to these problems, leaving the reader with perhaps an overly pessimistic view of the field of phylogenetics.
Lecture 34: Molecular Divergence Dating
Handout lec. 34
Required Readings: Lemey, P., Salemi, M. and Vandamme, A.-M. (eds). (2009) The Phylogenetic Handbook: A practical approach to Phylogenetic Analysis and Hypothesis Testing, Cambridge Univ. Press., 2nd Edition. Read pp. 265-269 on testing for a molecular clock.
*Arbogast, B.S., Edwards, S.V., Wakeley, J., Beerli, P. and Slowinski, J.B. (2002) Estimating divergence times from molecular data on phylogenetic and population genetic timescales. Annual Review of Ecology and Systematics 33: 707-740. An excellent review paper on the issues of molecular dating.
[TO BE UPDATED WITH BEAST - IGNORE FOR NOW] We will not have a lab on divergence dating but you are free to explore this subject on your own. However, the software and algorithms for divergence dating are far from complete - a good deal of research remains to be done on the subject and the software is far from user friendly! (read: only for use by the truly brave - succeeding with these packages will earn you respect among systematists). Here are two websites to obtain software and information on divergence dating:
1. Thorne, Kishino, & Yang's Bayesian Dating program multidivtime
2. Sanderson's dating program (nonparametric rate smoothing, penalized likelihood) r8s (pronounced 'rates')
Note: these programs are useful for the common situation in which your data reject a molecular clock. To determine if your data reject a clock you must perform a likelihood ratio test (LRT) - how this can be done is described on Brian O'Meara's webpage (see 'testing for a molecular clock'):
Lemey, P., Salemi, M. and Vandamme, A.-M. (eds). (2009) The Phylogenetic Handbook: A practical approach to Phylogenetic Analysis and Hypothesis Testing, Cambridge Univ. Press., 2nd Edition. Your text has a good description of how to use ModelTest to test for a molecular clock on pp. 275-277.
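The arithmetic of the clock LRT can be sketched in a few lines. The test statistic is twice the difference in log likelihood between the unconstrained and clock-constrained analyses of the same tree, compared to a chi-square distribution with (number of taxa - 2) degrees of freedom, the conventional value for this test. The function names and the log likelihood values below are assumptions for illustration; the chi-square tail probability is computed from its closed form for integer degrees of freedom so that no extra libraries are needed.

```python
import math


def chi_square_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square with integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{j < df/2} (x/2)^j / j!
        term, total = 1.0, 1.0
        for j in range(1, df // 2):
            term *= half / j
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2)) + sqrt(2x/pi) * exp(-x/2) * sum_j x^j / (2j+1)!!
    term, total = 1.0, 0.0
    for j in range((df - 1) // 2):
        if j > 0:
            term *= x / (2 * j + 1)
        total += term
    return math.erfc(math.sqrt(half)) + math.sqrt(2.0 * x / math.pi) * math.exp(-half) * total


def clock_lrt(lnl_clock, lnl_no_clock, n_taxa):
    """Return (test statistic, degrees of freedom, p-value) for the clock LRT."""
    statistic = 2.0 * (lnl_no_clock - lnl_clock)
    df = n_taxa - 2
    return statistic, df, chi_square_sf(statistic, df)


# Hypothetical scores for a 10-taxon dataset analyzed with and without a clock.
statistic, df, p_value = clock_lrt(lnl_clock=-3050.0, lnl_no_clock=-3040.0, n_taxa=10)
```

With these invented values the statistic is 20 on 8 degrees of freedom, which gives a p-value of about 0.01, so the clock would be rejected at the 5% level.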
Hillis, D. M., B. K. Mable, and C. Moritz. (1996) Applications of molecular systematics: The state of the field and a look to the future. Chapter 12 in Hillis, D. M., C. Moritz, & B. K. Mable (eds). Molecular Systematics (2nd ed). Sinauer Associates, Inc. Massachusetts. xvi + 655 pp. [pp. 531-540 address the issue of molecular clocks & divergence dating. Highly recommended (required, actually) for anyone planning on using molecular data to estimate time.]
1. Email me your list of 5 questions with correct answers before last lecture - I will post them on the website as a FAQs page
- Must have date of question (approximate OK)
- Question (proper spelling, grammar, etc.)
- Answer (if answer is ambiguous, or unknown, do not use)
- Worth bonus points of maximum 5%
Lecture 35: Lab (Friday) - Work on PROJECTS
Lecture 36: New Uses for New Phylogenies
Handout lec. 36
This lecture is a series of examples of phylogenies being used to answer questions - from the more basic questions of classification to questions that initially might seem intractable using a phylogenetic approach (like "Did early humans scavenge meat from carcasses in Africa and when did they start doing this?").
*Metzker, M.L., Mindell, D.P., Liu, X.M., Ptak, R.G., Gibbs, R.A. and Hillis, D.M. (2002) Molecular evidence of HIV-1 transmission in a criminal case. Proc Natl Acad Sci U S A 99: 14292-14297. The first use of phylogenetics in a US criminal court case. Branch support values take on a new meaning!
*Grant, T., Faivovich, J., & Pol, D. (2003) The perils of ‘point-and-click’ systematics. Cladistics 19: 276-285. Re-read this paper now that you are better able to understand it. This will help you gauge how much you have learned in the course.
Lecture 37: Lab - work on projects
Handout lec. 37
This lecture covers some of the more difficult material of the course - again: primarily aspects having to do with models, correction, ML and MP
Lecture 38: Recent Methods - Species Trees
Lecture 39: Next Gen Sequencing & systematics
Lecture 40: Lab - work on projects
Lecture 41: TBD
FINAL EXAM: Wed 12 December, 8-10AM
The list of past years' questions can be found here: FAQs