neotomadb / bulk-baconizing

Using bacon to generate a large number of new chronologies from existing Neotoma records.

License: MIT License


DOI DUB lifecycle Build Status NSF-1550707 NSF-1241868 NSF-1740694

Bulk Baconizing

When a large number of records need to be processed using Bacon, this repository serves as a template for generating the required age files in an organized fashion. It provides default parameters for initial runs and a tracking module that records whether any issues were encountered while constructing the geochronological table.
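
For orientation, a Bacon age file is a small CSV with one row per dated horizon. The sketch below assumes the column layout that rbacon conventionally reads (labID, age, error, depth, plus an optional cc column for the calibration curve). The core name and every value are invented for illustration, and the files this repository writes may carry additional columns.

```r
# Illustrative only: the core name and dates are invented, not Neotoma data.
core <- "ExampleCore"

ages <- data.frame(
  labID = c("Lab-001", "Lab-002", "Lab-003"),
  age   = c(150, 1200, 3400),  # radiocarbon ages (years BP)
  error = c(30, 40, 60),       # reported uncertainty on each age
  depth = c(10, 85, 210),      # depth in core (cm)
  cc    = c(1, 1, 1)           # calibration curve code; 1 is the Northern Hemisphere IntCal curve in rbacon
)

# rbacon conventionally looks for <coredir>/<core>/<core>.csv.
# A temporary directory stands in for the real run directory here.
core_dir <- file.path(tempdir(), "Bacon_runs", core)
dir.create(core_dir, recursive = TRUE, showWarnings = FALSE)
age_file <- file.path(core_dir, paste0(core, ".csv"))
write.csv(ages, age_file, row.names = FALSE, quote = FALSE)

# Read the file back to confirm the layout and that depths increase down-core.
check <- read.csv(age_file)
stopifnot(!is.unsorted(check$depth))
print(check)
```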

Citation

Please cite your use of this repository as software:

Goring SJ, Dawson A, Stegner MA, Wang Y. 2019. Bulk Baconizing. GitHub Repository. DOI: 10.5281/zenodo.2545891

Or import with BibTeX:

@misc{Goring2019,
  author = {Goring, S.J. and Dawson, A. and Stegner, M.A. and Wang, Y.},
  title = {Bulk Baconizing},
  year = {2019},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/NeotomaDB/bulk-baconizing}},
  doi = {10.5281/zenodo.2545891}
}

Contributions

Please note that this project is released with a Contributor Code of Conduct. By participating in this project you agree to abide by its terms.

Maintenance Files

Continuous Integration and Quality Assurance

.travis.yaml, Makefile and DESCRIPTION integrate this repository with the Travis Continuous Integration platform. These files are optional, but they are required if you want each new commit to the master branch to be loaded and run independently in a virtual machine hosted by Travis. For more on the use of Continuous Integration with RMarkdown, please see the post Adding CI to RMarkdown Documents.

Organizational Files

CODE_OF_CONDUCT.md, LICENSE and this README.md are part of best practices for public code. The Code of Conduct defines how we as an organization expect to be treated and how we aspire to treat others. It also governs how individuals who interact with this repository and others should expect to be treated, and how they should treat others.

The LICENSE file uses the MIT license, a permissive license whose only conditions are preservation of copyright and license notices. Neotoma is funded by the National Science Foundation. Neotoma maintains a Data Policy that governs the use of data from the Database itself.

The .gitignore file is used to ensure that local files on the developers' systems do not clutter the master repository.

How to Use This Repository

This repository is intended as a template, not as a complete solution. Generating chronologies is iterative: you select sites, run Bacon, revise parameters and run the script again. Each iteration involves modifying the parameters file and the settings.yaml file. Check your changes carefully as you do this; errors may result in long wait times or in runs that provide no new useful information.

General workflow

The key steps of the workflow process are:

  1. Running the Rmd file.
  2. Reading in the settings.yaml file.
  3. Loading data from Neotoma using the neotoma package.
  4. Setting default parameters for the Bacon runs (accumulation rates, memory, etc.)
  5. Updating parameters based on past runs (if you have files with alternate settings)
  6. Building age files based on chronological controls from Neotoma
  7. Running Bacon
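
Step 4 can be pictured as building one row of settings per core. The sketch below is an assumption about what such a table might look like: the helper `default_params`, the column names and the numeric values are illustrative (they resemble long-standing Bacon priors) and are not taken from this repository.

```r
# Hypothetical helper: one row of default Bacon settings per core handle.
default_params <- function(handles) {
  data.frame(
    handle       = handles,
    acc.mean     = 20,   # prior mean accumulation rate (yr/cm)
    acc.shape    = 1.5,  # shape of the accumulation-rate prior
    mem.strength = 4,    # prior strength for memory
    mem.mean     = 0.7,  # prior mean for memory
    thick        = 5,    # section thickness (cm)
    success      = NA,   # filled in after a Bacon run is attempted
    notes        = "",
    stringsAsFactors = FALSE
  )
}

# Scalar defaults are recycled across all handles.
params <- default_params(c("CORE_A", "CORE_B"))
print(params)
```

Keeping the defaults in a single table means later passes only need to edit the rows for cores whose priors turn out to be a poor fit.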

The implied final step is to modify the settings file after the first run of this workflow and to adjust the parameters in the parameters file generated by the run, so that the Bacon run for each core reflects the best possible age model. When the Rmd is re-run with the settings.yaml and parameters file adjusted (see the Rmd for details), runs are faster because the script only runs cores that do not yet have successful runs. This makes it possible to tinker with the settings for one or a few cores while leaving the rest unchanged.
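
A minimal sketch of that re-run logic, assuming the parameters table carries a logical `success` column where NA means "not yet attempted". The table, column names and core handles here are hypothetical; consult the parameters file generated by your own run for the real layout.

```r
# Hypothetical parameters table after a first pass.
params <- data.frame(
  handle   = c("CORE_A", "CORE_B", "CORE_C"),
  acc.mean = c(20, 20, 20),
  success  = c(TRUE, FALSE, NA),  # NA = not yet attempted
  stringsAsFactors = FALSE
)

# Revise the prior for the one core that failed.
params$acc.mean[params$handle == "CORE_B"] <- 50

# Select only the cores that still lack a successful run.
to_run <- params[is.na(params$success) | !params$success, ]
print(to_run$handle)
```

Cores that already succeeded are skipped, so a second pass costs only as much as the cores being revised.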

Running with RStudio

Open the file using RStudio and click the knit button.

Running from the command line

Navigate to the working directory and execute the command:

Rscript -e "rmarkdown::render('bulk_baconizing.Rmd')"

This should run the code as it is written. Be sure you have set the appropriate bounding box or geographic bounds for your region of interest.
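
Because a mistyped bounding box can trigger a long and useless run, it may help to validate it first. The sketch below assumes the `c(lonW, latS, lonE, latN)` ordering used by the `loc` argument of `get_dataset()` in the neotoma package. The helper `check_bbox` and the example coordinates are hypothetical, and the helper does not handle boxes that cross the antimeridian.

```r
# Hypothetical helper: validate a bounding box before querying Neotoma.
check_bbox <- function(loc) {
  stopifnot(is.numeric(loc), length(loc) == 4, !anyNA(loc))
  names(loc) <- c("lonW", "latS", "lonE", "latN")
  if (loc[["lonW"]] >= loc[["lonE"]]) stop("lonW must be west of lonE")
  if (loc[["latS"]] >= loc[["latN"]]) stop("latS must be south of latN")
  if (any(abs(loc[c("latS", "latN")]) > 90)) stop("latitude out of range")
  if (any(abs(loc[c("lonW", "lonE")]) > 180)) stop("longitude out of range")
  loc
}

bbox <- check_bbox(c(-98, 42, -87, 50))  # illustrative region only
print(bbox)

# With the neotoma package installed and network access available,
# the query would then look something like:
# dataset_list <- neotoma::get_dataset(loc = unname(bbox), datasettype = "pollen")
```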

Feedback, Suggestions and Comments

Please use the issue tracker or email a package maintainer directly.


bulk-baconizing's People

Contributors

araiho, simongoring

bulk-baconizing's Issues

Default settings.yaml for 'settlement'

Hello,

If I'm interpreting the text in the .Rmd file correctly, it appears the default for settings:settlement should be FALSE. However, when I run

cat(paste0(readLines('settings.yaml'), '\n'))

and then

source('R/setup_runs.R', echo=FALSE, verbose=FALSE)

I get:

settings$settlement
[1] "data/input/expert_assessment.csv"

I've tried adding settings$settlement<-FALSE to setup_runs.R, but then I get the following error when trying to generate the Core Age and Depth Files:

Error in file.exists(settings$settlement) : invalid 'file' argument

Any insight as to how to change this so that the chronological controls are used as-is?

Thanks,
Amanda
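
The error above is consistent with `file.exists()` being handed a logical value, since it only accepts character paths. One possible workaround is sketched below. It is not the repository's own code: the helper `has_settlement_file` is hypothetical, and the sketch assumes the downstream step only needs to know whether a usable file path was supplied.

```r
# Hypothetical guard: anything that is not a single existing file path
# (FALSE, NULL, NA, a missing file) counts as "no settlement file".
has_settlement_file <- function(x) {
  is.character(x) && length(x) == 1 && !is.na(x) && file.exists(x)
}

settings <- list(settlement = FALSE)

if (has_settlement_file(settings$settlement)) {
  message("Using settlement assessments from ", settings$settlement)
} else {
  message("No settlement file; using chronological controls as-is")
}
```

Because `&&` short-circuits, `file.exists()` is never reached when the setting is not a character string.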

Default chron issue for dataset 17597

bulk_baconizing.Rmd throws this error at Chunk 9:

There are multiple default models defined for the best age type. Error in build_agefiles(param = params[i, ], ageorder = ageorder, datasets = dataset_list, :
object 'max_chron' not found

The issue is in the build_agefiles.R script, line 78. A logical test is performed using the variable max_chron, which is not defined anywhere else in the function. The issue is reproducible if you run the bulk_baconizing script and set dataset_list <- get_dataset(x = 17597).
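
One way such a case could be handled is to define `max_chron` explicitly before it is tested. The sketch below is a guess at the intent, not the actual contents of build_agefiles.R: the table, its column names and the rule "prefer the highest chronology ID among the defaults" are all assumptions made for illustration.

```r
# Hypothetical chronology table for one dataset; column names are invented.
chrons <- data.frame(
  chronology_id = c(9001, 9002, 9003),
  age_type      = c("Calibrated radiocarbon years BP",
                    "Calibrated radiocarbon years BP",
                    "Radiocarbon years BP"),
  is_default    = c(TRUE, TRUE, TRUE),
  stringsAsFactors = FALSE
)

best_type  <- "Calibrated radiocarbon years BP"
candidates <- chrons[chrons$is_default & chrons$age_type == best_type, ]

if (nrow(candidates) > 1) {
  # Define max_chron before any logical test that uses it.
  # Assumption: a higher ID indicates a more recently added chronology.
  max_chron  <- max(candidates$chronology_id)
  candidates <- candidates[candidates$chronology_id == max_chron, ]
}

print(candidates)
```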
