Estimation of phylogeny and invariant sites under the general Markov model of nucleotide sequence evolution

Research output: Contribution to a Journal (Peer & Non Peer) › Article › peer-review

30 Citations (Scopus)

Abstract

The models of nucleotide substitution used by most maximum likelihood-based methods assume that the evolutionary process is stationary, reversible, and homogeneous. We present an extension of the Barry and Hartigan model, which can be used to estimate parameters by maximum likelihood (ML) when the data contain invariant sites and there are violations of the assumptions of stationarity, reversibility, and homogeneity. Unlike most ML methods for estimating invariant sites, we estimate the nucleotide composition of invariant sites separately from that of variable sites. We analyze a bacterial data set where problems due to lack of stationarity and homogeneity have been previously well noted and use the parametric bootstrap to show that the data are consistent with our general Markov model. We also show that estimates of invariant sites obtained using our method are fairly accurate when applied to data simulated under the general Markov model.
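The mixture idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a three-leaf star tree, and the function name, matrices, and all numbers below are hypothetical. Each edge carries its own transition matrix (no stationarity, reversibility, or homogeneity assumed), and the invariant-site class has its own nucleotide composition, separate from the root distribution of the variable sites.

```python
from itertools import product

NUC = "ACGT"

def site_likelihood(pattern, root, edges, p_inv, pi_inv):
    """Likelihood of one site pattern (e.g. "AAG") on a star tree.

    root   : root nucleotide distribution of the variable sites (length 4)
    edges  : one 4x4 row-stochastic matrix per leaf, each free to differ
    p_inv  : proportion of invariant sites
    pi_inv : nucleotide composition of the invariant sites (length 4)
    """
    idx = [NUC.index(c) for c in pattern]
    # Variable class: sum over the unobserved root state.
    var = 0.0
    for r in range(4):
        term = root[r]
        for P, x in zip(edges, idx):
            term *= P[r][x]
        var += term
    # Invariant class: contributes only when every leaf shows the same nucleotide.
    inv = pi_inv[idx[0]] if len(set(idx)) == 1 else 0.0
    return p_inv * inv + (1.0 - p_inv) * var

def uniform_edge(stay):
    """Illustrative matrix: probability `stay` of no change, rest split evenly."""
    off = (1.0 - stay) / 3.0
    return [[stay if i == j else off for j in range(4)] for i in range(4)]

# Hypothetical parameter values, chosen only for illustration.
asymmetric_edge = [
    [0.90, 0.05, 0.03, 0.02],
    [0.10, 0.80, 0.05, 0.05],
    [0.02, 0.08, 0.85, 0.05],
    [0.05, 0.05, 0.10, 0.80],
]
edges = [asymmetric_edge, uniform_edge(0.9), uniform_edge(0.7)]
root = [0.1, 0.2, 0.3, 0.4]      # composition at the root, variable sites
pi_inv = [0.4, 0.1, 0.1, 0.4]    # composition of invariant sites
p_inv = 0.25

# Sanity check: the 64 pattern probabilities should sum to 1 up to rounding.
total = sum(
    site_likelihood("".join(p), root, edges, p_inv, pi_inv)
    for p in product(NUC, repeat=3)
)
```

In a maximum likelihood setting, the log of this quantity would be summed over the observed sites and maximized over all parameters. The optimization itself is omitted here.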

Original language: English
Pages (from-to): 155-162
Number of pages: 8
Journal: Systematic Biology
Volume: 56
Issue number: 2
DOIs
Publication status: Published - Mar 2007
Externally published: Yes

Keywords

  • Invariant sites
  • Maximum likelihood
  • Nonhomogeneous process
  • Nonstationary process
  • Nucleotide substitution
  • Phylogenetics
