
bhartidk / centipede_macrogenetics


This repository contains R scripts to reproduce results and figures from our macrogenetics study where we investigate the species traits and biogeographic variables correlated with intra-specific genetic diversity in centipede species.

Home Page: https://datadryad.org/stash/share/Z8tG0lFyS-DBIMgLbAQqF7FNIs1n9pwciblORMsI1M0

License: MIT License

Languages: R 100.00%

Topics: biogeography, r, macrogenetics


Genetic diversity varies with species traits and latitude in predatory soil arthropods (Myriapoda: Chilopoda)


Summary of contents

The folders in this repository contain raw data and R scripts to reproduce the results in our manuscript, where we investigate the correlates of intra-specific genetic diversity in centipedes using species traits and biogeography as explanatory variables. The scripts can be used to read in the raw data with sequence and coordinate information, perform species-wise sequence alignments, calculate alignment summary statistics, create input files for beta regression and perform sample size sensitivity analysis. Figures 2 to 5 and Table 1 in the main manuscript, and all the figures and tables in the Supplementary Information can be reproduced using the data and scripts provided.

Input data and folder structure

To reproduce the analysis, download the folders to a suitable directory. Change the working directory at the beginning of each R script in the 'scripts' folder, then run the scripts in the order given by their file names (code01 to code05).
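As a rough illustration of running the scripts in file-name order, the sketch below lists the numbered scripts and sources them one after another. The helper names `list_pipeline_scripts()` and `run_pipeline()` are hypothetical and not part of the repository, and the sketch assumes the working directory inside each script has already been edited as described above.

```r
# Hypothetical helper (not part of the repository): return the numbered
# scripts in the order implied by their file names (code01 ... code05).
list_pipeline_scripts <- function(scripts_dir = "scripts") {
  files <- list.files(
    scripts_dir,
    pattern = "^code[0-9]+[a-z]?_.*\\.R$",
    full.names = TRUE
  )
  # Radix ordering is locale-independent, so code04a sorts before code04b.
  files[order(basename(files), method = "radix")]
}

# Hypothetical helper: source each script in turn.
# Assumes each script sets a valid working directory at its top.
run_pipeline <- function(scripts_dir = "scripts") {
  for (f in list_pipeline_scripts(scripts_dir)) {
    message("Running ", basename(f))
    source(f, echo = FALSE)
  }
  invisible(TRUE)
}

# Usage (run from the directory that contains the 'scripts' folder):
# run_pipeline("scripts")
```

Some steps, such as alignment and bootstrapping, may take a long time, so running the scripts one at a time and inspecting the intermediate files in 'data' can be more practical than a single batch run.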

1. scripts

This folder contains the R code required for data processing, analysis and visualization. The scripts are numbered sequentially as code01 to code05 based on the order in which they should be run.

2. data_raw

This folder contains all the raw data files needed for analysis.

The sequence and coordinate information is present in 'sheet4_popgen_database_analysis_clean_11Apr23.csv' and the species trait information is present in 'centipedes_life_history_11Apr23.csv'.

3. data

This folder contains all the processed data files generated using the R scripts.

This includes the files 'input_betareg_no_introductions_11Apr23.csv' and 'input_betareg_11Apr23.csv', which contain the input data for beta regression analysis.

4. figures

This folder contains all the main and supplementary figures generated using the R scripts.

5. results

This folder contains all the tables generated using the R scripts.

Analysis code

The R scripts present in the 'scripts' folder are:

1. code01_sequence_alignment_11Apr23.R

This script cleans and processes the raw data, queries the accession numbers on GenBank, performs a multiple sequence alignment for each species and saves the alignments to disk.

2. code02_phylogatr_sequences_11Apr23.R

This script processes the phylogatR data, compares our database with the phylogatR database, identifies the accession numbers that are shared between the two and those unique to each, and saves the results.
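The comparison of accession numbers between two databases comes down to set operations. The sketch below is a minimal illustration in base R, not the code used in code02. The function `compare_accessions()` is hypothetical, the accession strings are made-up placeholders, and stripping a trailing version suffix (such as ".1") before comparing is an assumption about how the identifiers should be matched.

```r
# Hypothetical helper: compare two vectors of accession numbers.
compare_accessions <- function(ours, theirs) {
  # Assumption: accessions are matched after trimming whitespace and
  # dropping any trailing version suffix such as ".1".
  clean <- function(x) unique(sub("\\.[0-9]+$", "", trimws(x)))
  ours <- clean(ours)
  theirs <- clean(theirs)
  list(
    shared      = intersect(ours, theirs),  # present in both databases
    only_ours   = setdiff(ours, theirs),    # present only in our database
    only_theirs = setdiff(theirs, ours)     # present only in the other database
  )
}

# Placeholder identifiers, not real GenBank accessions.
ours   <- c("ACC0001.1", "ACC0002", " ACC0003")
theirs <- c("ACC0002.2", "ACC0003", "ACC0004")

cmp <- compare_accessions(ours, theirs)
lengths(cmp)  # number of accessions in each category
```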

3. code03_sequence_summary_11Apr23.R

This script calculates sequence summaries from the multiple sequence alignment of each species, combines them with the species trait and biogeography information, and creates the input files for statistical analysis.
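To make the idea of alignment summary statistics concrete, the sketch below computes two common ones on a toy alignment in base R: the number of segregating sites and nucleotide diversity (mean proportion of differing sites over all sequence pairs). Which statistics code03 actually computes, and how it treats gaps and ambiguous bases, is not stated here. The toy sequences and function names are illustrative assumptions, and the sketch counts every mismatch as a difference.

```r
# Toy alignment (made-up sequences): rows are sequences, columns are sites.
aln <- rbind(
  seq1 = strsplit("ATGCATGCAT", "")[[1]],
  seq2 = strsplit("ATGCATGCAA", "")[[1]],
  seq3 = strsplit("ATGAATGCAT", "")[[1]],
  seq4 = strsplit("ATGCATGCAT", "")[[1]]
)

# Number of alignment columns with more than one character state.
segregating_sites <- function(aln) {
  sum(apply(aln, 2, function(site) length(unique(site)) > 1))
}

# Nucleotide diversity: mean proportion of differing sites across all pairs.
# Simplification: gaps and ambiguity codes are treated like any other state.
nucleotide_diversity <- function(aln) {
  pairs <- combn(nrow(aln), 2)
  per_pair <- apply(pairs, 2, function(p) mean(aln[p[1], ] != aln[p[2], ]))
  mean(per_pair)
}

s_sites <- segregating_sites(aln)     # 2 variable sites (positions 4 and 10)
pi_hat  <- nucleotide_diversity(aln)  # 6 differences / 6 pairs / 10 sites = 0.1
```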

4. code04a_analysis_plots_no_introductions_19Apr23.R

This script prepares the input data for analysis, runs beta regression models, performs model diagnostics, corrects for spatial autocorrelation in model residuals, obtains bootstrapped model coefficients, tests for phylogenetic signal in model residuals, and creates figures and tables. It uses the data without introductions.
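For readers unfamiliar with beta regression, the sketch below fits one from scratch by maximum likelihood in base R, using a logit link for the mean and a constant precision parameter. It is an illustration on simulated data, not the model in code04a: the variable names (`lat_z`, `diversity`), the parameter values and the single predictor are all assumptions. In practice a dedicated package such as betareg is the usual choice.

```r
set.seed(42)

# Simulated data (assumption): a standardized predictor and a response
# bounded in (0, 1), as expected for a diversity-like proportion.
n <- 300
lat_z <- as.numeric(scale(runif(n, 0, 60)))
mu_true <- plogis(-2 + 0.5 * lat_z)  # mean on the (0, 1) scale
phi_true <- 40                       # precision
diversity <- rbeta(n, mu_true * phi_true, (1 - mu_true) * phi_true)

# Negative log-likelihood of a beta regression with logit link.
# par = c(intercept, slope, log(precision)).
neg_loglik <- function(par, y, x) {
  mu  <- plogis(par[1] + par[2] * x)
  phi <- exp(par[3])
  -sum(dbeta(y, shape1 = mu * phi, shape2 = (1 - mu) * phi, log = TRUE))
}

fit <- optim(
  par = c(0, 0, 0), fn = neg_loglik,
  y = diversity, x = lat_z, method = "BFGS"
)

est <- c(
  intercept = fit$par[1],
  slope     = fit$par[2],
  precision = exp(fit$par[3])
)
round(est, 3)
```

The response must lie strictly between 0 and 1 for this likelihood to be defined, which is one reason the input files for beta regression are prepared separately from the raw summaries.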

5. code04b_analysis_plots_11Apr23.R

This script runs the same analysis pipeline as code04a, but on the data that include synanthropic introductions.

6. code05_sample_size_sensitivity_11Apr23.R

This script iteratively resamples the sequence alignment of each species across a range of sample sizes, calculates the variance in estimated genetic diversity for each sample size, and runs beta regression models using the optimal sample size.
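The resampling step can be sketched as follows for a single species. This is a minimal base R illustration under stated assumptions, not the code in code05: the alignment is simulated, sequences are drawn without replacement, genetic diversity is taken to be nucleotide diversity, and the function names, sample sizes and iteration count are arbitrary. How the optimal sample size is chosen is not covered.

```r
set.seed(1)

# Mean proportion of differing sites over all pairs of sequences.
pairwise_pi <- function(aln) {
  pairs <- combn(nrow(aln), 2)
  mean(apply(pairs, 2, function(p) mean(aln[p[1], ] != aln[p[2], ])))
}

# Simulated toy alignment (assumption): 30 sequences x 200 sites, built
# from one reference sequence with a small fraction of sites re-drawn.
n_seq <- 30
n_site <- 200
bases <- c("A", "C", "G", "T")
ref <- sample(bases, n_site, replace = TRUE)
aln <- matrix(rep(ref, each = n_seq), nrow = n_seq)
mutated <- matrix(runif(n_seq * n_site) < 0.02, nrow = n_seq)
aln[mutated] <- sample(bases, sum(mutated), replace = TRUE)

# For each sample size, repeatedly subsample sequences (without
# replacement) and record the variance of the diversity estimates.
resample_variance <- function(aln, sizes, n_iter = 100) {
  data.frame(
    sample_size = sizes,
    variance = vapply(sizes, function(k) {
      est <- replicate(
        n_iter,
        pairwise_pi(aln[sample(nrow(aln), k), , drop = FALSE])
      )
      var(est)
    }, numeric(1))
  )
}

sens <- resample_variance(aln, sizes = c(3, 5, 10, 20, 30))
sens
```

Because sampling is without replacement, the variance shrinks to essentially zero when the sample size equals the number of available sequences. Plotting `variance` against `sample_size` is one way to see where the estimates stabilize.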

Sharing and access information

Please use the following citation for the use of the data compilation or analysis code:

Bharti, D. K., Pawar, P. Y., Edgecombe, G. D., & Joshi, J. (2023). Genetic diversity varies with species traits and latitude in predatory soil arthropods (Myriapoda: Chilopoda). Global Ecology and Biogeography.

Links to other publicly accessible locations of the data:

Detailed information regarding data sources is available in the Supplementary Information associated with our manuscript:

  • Appendix S1: Mitochondrial COI accession numbers and associated sequence coordinates used for data analysis
  • Appendix S2: Centipede species traits and biogeographic variables used for data analysis

Supplementary Information is available at https://doi.org/10.5281/zenodo.7940353
