alrichardbollans / mining_trait_data

Collection of methods for extracting and compiling plant trait data

License: GNU General Public License v3.0

Languages: Python 70.60%, HTML 29.40%
Topics: botany, data-collection, traits, drug-discovery, ethnobotany

mining_trait_data's Introduction

Python packages for gathering plant trait data

Disclaimer

WARNING: The information contained herein is provided as a public service with the understanding that the authors make no warranties, either expressed or implied, concerning the accuracy, completeness, reliability, or suitability of the information.

In particular, concerning lists of poisonous/toxic plants --- just because a plant is not on the list DOES NOT mean that it is not dangerous/poisonous/toxic. Similarly, concerning lists of non_poisonous plants --- just because a plant is on the list DOES NOT mean that it is not dangerous/poisonous/toxic.

Installation

pip install git+https://github.com/alrichardbollans/[email protected]

Sources

See the cite.txt file in each package for its list of sources.

Name resolution

Names from different datasets are matched using the wcvpy package (https://github.com/alrichardbollans/wcvpy).

Metabolite Methods

Note that methods related to metabolite data have been moved to: https://github.com/alrichardbollans/phytochempy

mining_trait_data's Issues

Pykew install

For powo searches, the installation of pykew currently needs modifying as per the commit here: RBGKew/pykew@5363379

However, the relevant functions should be made to work with the existing version of pykew.

Spatial autocorrelation

Add methods to clean_plant_occurrences to thin occurrence datasets, e.g. by keeping only one record per 10 x 10 arcmin grid cell, to limit spatial autocorrelation (as in Pengjuan Zu et al., ‘Pollen Sterols Are Associated with Phylogeny and Environment but Not with Pollinator Guilds’, New Phytologist 230, no. 3 (May 2021): 1169–84, https://doi.org/10.1111/nph.17227).
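
One possible shape for such a thinning step is sketched below. It is not part of clean_plant_occurrences; the function name and the record keys (species, decimalLatitude, decimalLongitude) are assumptions chosen for illustration, and records are plain dictionaries rather than dataframe rows.

```python
import math


def thin_occurrences(records, cell_arcmin=10.0):
    """Keep the first record seen per species in each grid cell.

    `records` is an iterable of dicts with the (assumed) keys
    'species', 'decimalLatitude' and 'decimalLongitude'.
    """
    cell_deg = cell_arcmin / 60.0  # 10 arcmin = 1/6 of a degree
    seen = set()
    kept = []
    for rec in records:
        # Index of the grid cell containing this point.
        row = math.floor(rec["decimalLatitude"] / cell_deg)
        col = math.floor(rec["decimalLongitude"] / cell_deg)
        key = (rec["species"], row, col)
        if key not in seen:
            seen.add(key)
            kept.append(rec)
    return kept


# Made-up example points: the first two fall in the same cell.
example = [
    {"species": "A", "decimalLatitude": 51.48, "decimalLongitude": -0.29},
    {"species": "A", "decimalLatitude": 51.49, "decimalLongitude": -0.30},
    {"species": "A", "decimalLatitude": 52.20, "decimalLongitude": 0.12},
    {"species": "B", "decimalLatitude": 51.48, "decimalLongitude": -0.29},
]
thinned = thin_occurrences(example)
```

Keeping the first record per cell is only one choice; picking a random record per cell would be an equally simple variant.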

Citations

Improve citation guidelines for each package

Source filters

Allow filtering by source, and better document the validity of sources.

Catch empty dataframes in cleaning

Currently, when empty dataframes without the 'Accepted_Name' column are passed to compile_hits, an error is raised. These dataframes should simply be ignored. An error is also raised when an empty list of dataframes is passed to this method; in that case it would be better to output an empty CSV with the usual headings.
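
A minimal sketch of the requested behaviour, assuming the inputs are pandas dataframes. The function name compile_hits_sketch, its signature, and the OUTPUT_COLUMNS headings are hypothetical and do not reflect the real compile_hits.

```python
import io

import pandas as pd

# Hypothetical output headings; the real ones may differ.
OUTPUT_COLUMNS = ["Accepted_Name", "Source"]


def compile_hits_sketch(all_dfs, output_csv):
    """Concatenate usable dataframes and write them to output_csv."""
    # Ignore empty dataframes and those lacking the key column.
    usable = [
        df for df in all_dfs
        if not df.empty and "Accepted_Name" in df.columns
    ]
    if usable:
        out = pd.concat(usable, ignore_index=True)
    else:
        # Nothing to compile: emit a header-only CSV instead of raising.
        out = pd.DataFrame(columns=OUTPUT_COLUMNS)
    out.to_csv(output_csv, index=False)
    return out


# Made-up example: one empty dataframe and one usable dataframe.
buffer = io.StringIO()
hits = pd.DataFrame({"Accepted_Name": ["Aconitum napellus"], "Source": ["example"]})
compiled = compile_hits_sketch([pd.DataFrame(), hits], buffer)
```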

Unnecessary imports on module loading

The modules are set up to import everything necessary for all the methods in the module to run. However, this means that every usage of the module requires all the libraries even if only a variable is required. For example, getting the poison data ends up requiring an install of pykew from powo_searches, which is totally unnecessary. Fix this by only importing libraries within the functions where they are required.
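
The proposed fix could look roughly like the sketch below. The names POISON_DATA_CSV and search_powo_sketch are hypothetical, and the call to powo.search assumes the interface shown in the pykew README rather than anything confirmed in this repository.

```python
# Module-level variables stay cheap: reading them needs no third-party library.
POISON_DATA_CSV = "poisons.csv"  # hypothetical variable name and value


def search_powo_sketch(query):
    """Run a POWO search, importing pykew only when this function is called."""
    # Deferred import: users who only need POISON_DATA_CSV never need pykew.
    import pykew.powo as powo  # assumed interface, as in the pykew README

    return powo.search(query)
```

With this layout, importing the module and reading POISON_DATA_CSV does not require pykew to be installed; a missing pykew only surfaces as an ImportError when search_powo_sketch is actually called.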

Metabolite vars

Improve usability of metabolite searches by simplifying processes.

Methods raising errors with no data

Some methods raise an error during name matching when no data is found, e.g. when search_powo returns no hits, name matching is still attempted on a nonexistent column.