
Phylogenetic Biology - Final Project

An Investigation into the Relationship between Procolophonid Biogeography and Tooth Morphology

Introduction and Goals

The goal of my project is to answer the following question: are procolophonids that have similar tooth sizes more closely related to one another than procolophonids with differing tooth sizes? I am interested to see if more closely related procolophonids may have more similar diets and lifestyles.

For this project, I will compare the various dentitions of procolophonids, an extinct clade of anapsid reptiles that lived from the late Permian through the Triassic (~260-200 million years ago). Procolophonids are the only clade within the larger clade Parareptilia to survive the Permo-Triassic mass extinction, the most devastating mass extinction in Earth’s history (Essentials of Geology). Procolophonids are a large and successful group of parareptiles with more than 30 known genera (Cisneros, 2008). They occupied a wide geographic range, including the Americas, Europe, Asia, Africa, and Antarctica (Dias-da-Silva, Modesto and Schultz, 2006; Cisneros and Ruta, 2010). Research has shown that procolophonids likely had a fossorial (burrowing) lifestyle, feeding on insects and plants (Botha-Brink and Smith, 2012). Most researchers think procolophonids were herbivorous or durophagous, but no full analysis of procolophonid teeth has been done in relation to their places on the tree of life.

The methods I will use to do this include phylogenetic generalized least squares (PGLS). The tree I will be using is taken from Cisneros' 2008 paper on procolophonid phylogeny, and I will be using Character #28, which is the presence/absence of labiolingual tooth broadening. Once this analysis is complete, I will then study whether dentition changes correlate with changes in diet. Using procolophonid biogeographical data, which is available in the literature, I will be able to see if shifts in habitat and dentition occur simultaneously. If they do occur simultaneously, it may be hypothesized that a change in diet occurred during this period.

I will use the R package 'caper' to run the PGLS analysis, a phylogenetic regression of dentition characters on geography. Instead of using BioGeoBEARS for this analysis of dentition and geographic location, I decided (with the help of Professor Dunn!) to use this regression. Rather than creating a separate BioGeoBEARS geography text file, I will code geographic location as a character (using states 0 - 6) and add it to the original Nexus file I obtained from Cisneros 2008.

The data I will use are the character state matrix and the phylogeny from Cisneros 2008. To conduct the initial phylogenetic inference, I will use the traits detailed in Juan C. Cisneros' 2008 paper on procolophonid phylogeny, and I will then use the resulting tree (and the traits) to perform a phylogenetic signal analysis based on dentition. Additionally, I will perform a PGLS analysis based on geographic location. This analysis will aid in understanding the evolutionary relationship between dentition and habitat, and it will, hopefully, shed light on procolophonid radiation in the Triassic and diversity in niche occupation.

Because I suspect that there was a Triassic radiation of procolophonids, I would expect there to be lower association between procolophonids that exhibit vastly different tooth sizes and greater association between procolophonids that have similar tooth sizes (and therefore likely had similar diets and occupied similar niches).

Methods

DATA MATRIX from Cisneros 2008.

Cisneros_2008_Data_Matrix

This data matrix was converted into a text file using OCR, and the character information was then put into Nexus format. These data (in Nexus format) are included in the files of this repository. The tree shown below was rendered in IcyTree to visualize the procolophonid phylogeny from Cisneros 2008. This is the original tree, built from a .nex file that used the data matrix given in Cisneros 2008 (the data matrix was included in-text, but not as a separate file in any supplementals, to this student's chagrin). Eventually, in correspondence with Cisneros, I was able to obtain the proper data and file formats needed for this analysis, which I have included in the repository.

Taxa

Owenettidae

Coletta seca

Pintosaurus magnidentis

Sauropareion anoplus

Phaanthosaurus spp.

Eumetabolodon dongshengensis

Theledectes perforatus

Tichvinskia vjatkensis

Timanophon raridentatus

Kapes spp.

Thelephon contritus

Eumetabolodon bathycephalus

Procolophon trigoniceps

Thelerpeton oppressus

Teratophon spinigenis

Pentaedrusaurus ordocianus

Neoprocolophon asiaticus

Sclerosaurus armatus

Scoloparia glyphanodon

Leptopleuron lacertinum

Soturnia caliodon

Hypsognathus fenneri

Phonodus dutoitorum

Kitchingnathus untabeni

Lasasaurus beltanae

Anomoiodon liliensterni

Procolina teresae

Mandaphon nadra

Eomurruna yurrgensis

Cisneros_2008_Procolophonid_Data_Matrix_Icytree

Once the character information was in Nexus format, I ran the file through IQTree on the Grace cluster provided by Yale University. After IQTree rendered this tree, I created a second .nex file that included my own data in the data matrix. Because I did not have tooth width and length data for all of the specimens included in this tree, I used ? to mark data as missing, as is standard in the Nexus format. I then ran this .nex file through IQTree; after fixing the errors I kept getting, I was able to render a tree.

Following this, I then began to use the 'caper' package (and the PGLS analysis) in R.

Geographic Regions

I first had to compile a list of my geographic regions of interest and their relevance to my taxa of interest. Instead of political boundaries, I use modern continents for this geographic list. Because procolophonids are of Permian and Triassic age, the geography of their time was (quite) different from that of the modern day. I use modern continents for my analysis with caper/PGLS and will then consider how this geographic distribution would have looked during the Permian and Triassic in the discussion.

Regions and the numeric state each one is encoded as in my Nexus file:

Africa: 0

South America: 1

North America: 2

Asia: 3

Antarctica: 4

Europe: 5

Oceania: 6

I am planning on adding geographic region as a character to the current Nexus file taken from Cisneros 2008. The list of areas that will be used in this analysis is shown above. The maximum number of areas in the analysis is 7 (1 per state). Because each of the taxa has to exist on a continent, the null range will not be included in possible states. I will use the define_tipranges_object to input my geographic data for my taxa of interest.
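The encoding above can be sketched in a few lines of base R. This is a minimal illustration, not the script used in this project: the taxon names and continent assignments in `occurrences` are placeholders rather than the FossilWorks data, and writing an unknown location as ? is an assumption that borrows the Nexus missing-data convention.

```r
# Continent-to-state lookup, copied from the list of regions above.
region_codes <- c(
  "Africa" = 0, "South America" = 1, "North America" = 2, "Asia" = 3,
  "Antarctica" = 4, "Europe" = 5, "Oceania" = 6
)

# Hypothetical occurrence table: placeholder taxa and continents,
# NOT the real procolophonid records.
occurrences <- data.frame(
  taxon = c("Taxon_A", "Taxon_B", "Taxon_C"),
  continent = c("Africa", "Europe", NA),
  stringsAsFactors = FALSE
)

# Look up each continent's state; unname() drops the continent labels.
codes <- unname(region_codes[occurrences$continent])

# Write the state as text, using "?" where the continent is unknown
# (assumed here to follow the Nexus missing-data symbol).
occurrences$GeoLoc <- ifelse(is.na(codes), "?", as.character(codes))

print(occurrences)
```

Keeping the lookup as a single named vector means the state numbers are defined in one place, so the encoded file and the interpretation of the coefficient table cannot drift apart.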

Tooth Morphology (Whether or not teeth are labio-lingually broadened)

Rather than import data from my own research (due to time constraints and how long I've drawn out this project), I decided in the end to use dentition data from the data matrix provided to me by Cisneros. I had my pick of dentition characters and chose "labio-lingual broadening (widening)" as the factor of interest in this project. "Labio" means "towards the lips" (the outer side of the tooth row), and "lingual" means "towards the tongue" (the inner side), so a labio-lingually broadened tooth is widened across the jaw. Tooth morphology can be connected to diet, and identifying and understanding the dental morphologies of various animals can aid in determining their potential diets. This is, of course, helpful when working with long-extinct animals, as it can be quite difficult to ascertain their ways of life any other way.

Labiolingually-broadened Teeth

Results

Diagnostic Plots (I)

The first two plots show the fit of the residuals to a normal distribution: a density plot of the residuals, and a normal Q-Q plot of the residuals against their expected values under a normal distribution. Both plots are roughly normal, which suggests the residuals are approximately normally distributed and supports the use of this model.

Density_Default_Curve Normal_QQPlot

Diagnostic Plots (II)

The second two plots show the fitted values against both the residuals and the observed values, to look for patterns in the residuals within the model. The residuals show a non-random pattern, which suggests that the model does not capture all of the structure in the data.

ResidualValue vs. Fitted Value FittedValue vs. Observed Value
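Plots like the four described above can be drawn with base R graphics. The sketch below uses simulated data and an ordinary lm() fit as a stand-in for the PGLS model, so every variable name and value in it is an illustrative assumption and none of it reproduces the project's results.

```r
# Simulated data (assumption): a simple linear relationship plus noise.
set.seed(1)
x <- rnorm(40)
y <- 0.5 * x + rnorm(40)

# Ordinary least squares fit, standing in for the PGLS model.
fit <- lm(y ~ x)
res <- residuals(fit)
fitted_vals <- fitted(fit)

# Four diagnostic panels in a 2 x 2 layout.
op <- par(mfrow = c(2, 2))
plot(density(res), main = "Residual density")   # shape of the residuals
qqnorm(res)                                     # residuals vs. normal quantiles
qqline(res)                                     # reference line for the Q-Q plot
plot(fitted_vals, res,
     xlab = "Fitted value", ylab = "Residual value")
plot(fitted_vals, y,
     xlab = "Fitted value", ylab = "Observed value")
par(op)                                         # restore the previous layout
```

In the residual-versus-fitted panel, a shapeless cloud around zero is the pattern to hope for; bands or curves point to structure the model has missed.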

1. Character Data Acquisition

I downloaded the specimen data made available to me by Juan C. Cisneros via messages on ResearchGate. This data matrix (originally formatted for TNT and Mesquite) was then input into IQTree to produce a TREE file. This tree file (of the phylogeny) was read using the “read.tree” command, and the specimen data was read using the “read.nexus.data” command (both come from the ‘ape’ package, which is loaded along with ‘caper’). After this, the data had to be converted into a data matrix and then transposed in order to be read correctly by the “comparative.data” and “pgls” commands.
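The reshaping step can be illustrated in base R without the real files. In this sketch the list `chars` is a hand-made stand-in for what read.nexus.data returns (one character vector per taxon); the taxon names and states are invented, and rbind is used here because it yields the taxa-as-rows orientation directly, which is an alternative to converting and then transposing.

```r
# Stand-in for the output of read.nexus.data(): a named list with one
# character vector of states per taxon. Names and states are invented.
chars <- list(
  Taxon_A = c("0", "1", "?"),
  Taxon_B = c("1", "1", "0"),
  Taxon_C = c("0", "?", "1")
)

# Stack the vectors so that each taxon is a row and each character a column.
char_matrix <- do.call(rbind, chars)

# A matrix without column names becomes a data frame with columns V1, V2, ...
by_taxon <- as.data.frame(char_matrix, stringsAsFactors = FALSE)

# Recode the Nexus missing-data symbol as NA.
by_taxon[by_taxon == "?"] <- NA

# Keep the taxon names in a column so that rows can be matched to tree tips.
by_taxon$taxon <- rownames(by_taxon)

print(by_taxon)
```

The automatic V1, V2, ... column names are how a character from the matrix ends up being referred to by a name such as V28 in a model formula.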

2. Continent Data Acquisition

I compiled the continent data from FossilWorks, a website that provides query, download, and analysis tools that utilize the Paleobiology Database's large relational database, assembled by hundreds of paleontologists from around the world. After gathering this data, I created a file with the encoded continent data (located in the GitHub repository).

3. Phylogenetic Generalized Least Squares (PGLS)

When interpreting these results, the “intercept” is the model's estimate of the tooth-broadening character (coded 0/1) for species from the reference area, which can be read loosely as how likely those species are to present labiolingual tooth broadening. The “intercept” in this coefficient table is, by default, the “0” value (which I have labeled as “Africa”), so the subsequent values (as.factor(GeoLoc)1, etc.) show the estimate for each other continent relative to the intercept (i.e. the “0” value, i.e. Africa). Each of these coefficients therefore shows the difference between species of that continent and species from Africa in how often they present labiolingual tooth broadening.

I used the command: “Geo.from.Max.Teeth <- pgls(as.numeric(V28) ~ as.factor(GeoLoc), cdat)” to examine the relationship between labiolingual tooth broadening and geographic location (continent).
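How the intercept and the as.factor(GeoLoc) coefficients relate to one another can be seen in a small non-phylogenetic analogue. The sketch below uses lm() on invented data with the same formula shape; it ignores the tree entirely, so its numbers are not comparable to the PGLS output, and the data frame `toy` is purely hypothetical.

```r
# Invented data (assumption): a 0/1 character for nine species on three
# continents coded 0, 1, and 5.
toy <- data.frame(
  V28    = c(0, 1, 1,  1, 1, 1,  0, 0, 1),
  GeoLoc = c(0, 0, 0,  1, 1, 1,  5, 5, 5)
)

# Same formula shape as the pgls() call, fitted by ordinary least squares.
fit <- lm(as.numeric(V28) ~ as.factor(GeoLoc), data = toy)

# Mean of the character within each continent, for comparison.
group_means <- tapply(toy$V28, toy$GeoLoc, mean)

# The intercept equals the mean of the reference level ("0"); every other
# coefficient is that continent's mean minus the reference mean.
print(coef(fit))
print(group_means - group_means["0"])
```

With this coding, a coefficient near zero for a continent means its species resemble the reference continent, not that the character is absent there.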

The analysis identified no significant relationship between geographic location and whether or not teeth are labio-lingually broadened.

Discussion

As mentioned above, this study found no significant relationship between geographic location and whether or not teeth are labio-lingually broadened. My hypothesis was that, because there was a Triassic radiation of procolophonids, there would be lower association between procolophonids that exhibit vastly different tooth sizes and greater association between procolophonids that have similar tooth sizes (and therefore likely had similar diets and occupied similar niches). The results of this study do not support my hypothesis.

The values listed beneath “intercept” in the coefficient table are relative to the “intercept” value (i.e. the “0” value, i.e. Africa). Rather than sitting near a 50/50 reference point for the coefficients, the intercept is -0.0596993, which is close to 0.

Adjusted R-Squared

The adjusted R-squared value is -0.001568. A value this close to zero indicates that this model (using geographic location to predict labiolingual tooth broadening) explains essentially none of the variation in tooth morphology. The slightly negative value arises because the adjustment penalizes the number of predictors, so a model that explains almost nothing can fall below zero. Because of these results, I will conduct further analyses and attempt to find a variable (and its respective coefficient of determination) that may better explain the rapid radiation of procolophonids we see in the Triassic.
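For reference, the usual definition of adjusted R-squared shows how a value below zero can arise. The function below is a sketch of that textbook formula, and the inputs in the example call (30 species, 6 predictor terms, a raw R-squared of 0.2) are hypothetical numbers chosen for illustration, not values from this analysis.

```r
# Textbook adjusted R-squared:
#   r2 = raw R-squared, n = number of observations,
#   p  = number of predictor terms (not counting the intercept).
adjusted_r2 <- function(r2, n, p) {
  1 - (1 - r2) * (n - 1) / (n - p - 1)
}

# Hypothetical inputs: a weak fit spread over many predictor terms.
example_value <- adjusted_r2(r2 = 0.2, n = 30, p = 6)
print(example_value)   # falls slightly below zero

# A perfect fit is unaffected by the penalty.
print(adjusted_r2(r2 = 1, n = 30, p = 6))
```

A factor with seven levels contributes six predictor terms, so the penalty is substantial when the number of taxa is small.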

I initially decided to go with an analysis using modern continents because my knowledge of previous environments, climates, etc. is not too developed. In the future, I would like to improve on this. Because of the time constraints I am facing currently, I failed to conduct a more extensive study of biogeographical regions and dental morphology. I plan to conduct a more in-depth study later this year.

Reflection

The biggest difficulty in implementing these analyses was gathering my data and determining what would make the most practical sense given the timeline of the course. Wrangling my data proved the most challenging and frustrating part of this project. Initially, I planned to use a character data matrix provided by Cisneros 2008, but the way the data matrix was uploaded made it nearly unusable for recreating Cisneros' original analysis or for using his data in further research. I reached out to Cisneros to obtain a usable file for my analysis. After this, I decided to try my hand at making my own data matrix (in the proper, downloadable NEXUS format), but I struggled to get IQTree (on the cluster) to read my data. Eventually, my requests were answered! Cisneros messaged me with a usable file for my analysis, which is the MatrixforEomurruna.nex file that I use in this project. I mention all of this because the difficulty in obtaining data from the literature showed me how important it is to leave data easily accessible and available for reproduction of research. It's integral to science!

If I did these analyses again, I would try to outline more clearly what I wanted to do in this analysis and exactly what I needed to do to carry it out. I would also try to be more proactive about seeking help (especially with the technical R/coding portion of this project) earlier in the project.

References

Contributors

selenamart28, caseywdunn

finalproject's Issues

using revbayes for nexus

I need to use revbayes to read my nexus tree. I have fallen behind due to personal reasons. Working on it, I promise!

issue with revbayes in rstudio

having a problem with revbayes:
Error: Failed to install 'RevKnitr' from GitHub:
(converted from warning) cannot remove prior installation of package ‘xfun’

revbayes won't read nexus

Nexus is a very flexible format, but not all programs implement the same features. So a nexus file that is fine for one program may not work for another.

Take a look at this tutorial - https://revbayes.github.io/tutorials/morph_tree/V2.html . It has a nexus file, Cinctans.nex, that works with revbayes. Edit your nexus so that it follows the same conventions. This would include:

  • Remove your TAXA block

  • Include ntaxa in your DIMENSIONS line

  • Remove the tree from the file. You can place this in its own file and import it separately.
