
Fungal Identification Pipeline

License: GNU General Public License v3.0


funid's Introduction


This is a beta release. Bug reports are welcome.

Scheduling

Beta release part 1 (February 2023 ~ until the paper is published, ver 0.3)

  • Will be tested by our lab members to fix bugs and improve features

Beta release part 2 (from paper publication ~ until the pipeline is stable, ver 0.4)

  • Will be tested by peer taxonomists

Stable release (ver 1.0)

FunVIP

"Fun"gal "V"alidation & "I"dentification "P"ipeline

An automatic tree-based sequence identification and validation pipeline for fungal species

  • Automatic tree-based identification
  • Works with multigene datasets
  • Data validation algorithm implemented

See the tutorial for detailed usage

Requirements

Windows

  1. Install Visual C++ here
  2. conda create -n FunVIP "python>=3.8" (the quotes stop the shell from treating > as output redirection)
  3. conda activate FunVIP
  4. pip install FunVIP
  5. Run FunVIP --test Terrei --email [your email] to check the installation
  • To upgrade, run pip install FunVIP --upgrade

Linux

  1. conda create -n FunVIP "python>=3.8" (the quotes stop the shell from treating > as output redirection)
  2. conda activate FunVIP
  3. pip install FunVIP
  4. conda install -c bioconda raxml iqtree modeltest-ng mmseqs2 "blast>=2.12" mafft trimal gblocks fasttree
  5. Run FunVIP --test Terrei --email [your email] to check the installation
  • On Intel Macs this method will probably work, but we could not test it because we do not have an Intel Mac. Feedback from Intel Mac users is welcome.
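
Before running the test command, it can help to confirm that the external programs installed through conda are visible on PATH. The following is a minimal sketch, not part of FunVIP: the helper name missing_tools is hypothetical, and the executable names are assumptions, since a conda package name does not always match the binary it installs (for example, the mmseqs2 package provides mmseqs). Adjust the list to your environment.

```python
import shutil

# Assumed executable names for some of the conda packages listed above.
# Package names and binary names can differ, so treat this list as a guess.
ASSUMED_TOOLS = ["mafft", "mmseqs", "blastn", "makeblastdb", "trimal", "modeltest-ng"]


def missing_tools(names):
    """Return the names that cannot be found on the current PATH."""
    return [name for name in names if shutil.which(name) is None]


if __name__ == "__main__":
    missing = missing_tools(ASSUMED_TOOLS)
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All listed tools were found on PATH.")
```

Run it inside the activated FunVIP environment so that the conda-installed binaries are on PATH.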

Apple Silicon Mac

  1. CONDA_SUBDIR=osx-64 conda create -n FunVIP "python>=3.8" (the quotes stop the shell from treating > as output redirection)
  2. conda activate FunVIP
  3. conda config --env --set subdir osx-64
  4. conda install pyqt
  5. pip install FunVIP
  6. conda install -c bioconda raxml iqtree mmseqs2 "blast>=2.12" mafft trimal gblocks fasttree
  7. Run FunVIP --test Terrei --email [your email] to check the installation

Installation from source (For developers and core users)

  • This route is intended for development
  1. git clone https://github.com/Changwanseo/FunVIP.git
  2. Move into the cloned FunVIP directory
  3. conda create -n FunVIP python=3.10
  4. conda activate FunVIP
  5. pip install ./
  6. Run FunVIP --test Terrei --email [your email] to check the installation

Usage

FunVIP --db {Your database file} --query {Your query file} --email {Your email} --gene {Your genes} --preset {fast or accurate}

Example

FunVIP --db Penicillium.xlsx --query Query.xlsx --email {Your email} --gene ITS BenA RPB2 CaM --preset fast

* See the documentation for detailed usage
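
For scripted or repeated runs, the command line above can be assembled programmatically. This is a minimal sketch that uses only the flags shown in the Usage line (--db, --query, --email, --gene, --preset); the function names build_funvip_command and run_funvip and the placeholder email address are hypothetical and not part of FunVIP.

```python
import subprocess


def build_funvip_command(db, query, email, genes, preset="fast"):
    """Build the FunVIP argument list from the options shown in Usage."""
    if preset not in ("fast", "accurate"):
        raise ValueError("preset must be 'fast' or 'accurate'")
    if not genes:
        raise ValueError("at least one gene name is required")
    return [
        "FunVIP",
        "--db", str(db),
        "--query", str(query),
        "--email", email,
        "--gene", *genes,
        "--preset", preset,
    ]


def run_funvip(command):
    """Run the command; raises CalledProcessError on a non-zero exit code."""
    return subprocess.run(command, check=True)


if __name__ == "__main__":
    cmd = build_funvip_command(
        "Penicillium.xlsx", "Query.xlsx", "you@example.org",
        ["ITS", "BenA", "RPB2", "CaM"],
    )
    # Only print here; call run_funvip(cmd) once FunVIP is installed.
    print(" ".join(cmd))
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.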

How to make a database

Fig 2 Database and command configuration of FunID (ver2)

See an example database here

Results

  • Section Assignment.xlsx : The clustering result. Shows which section each of your sequences was clustered into
  • Identification_result.xlsx : The final identification result. Shows how your sequences were assigned to species through tree-based identification
  • report.xlsx : Overall statistics about the trees. If a taxon name ends with a number, that taxon was found to be paraphyletic and should be checked
  • /Tree/{section}_{gene}.svg : Final collapsed tree in SVG format. Can be edited in vector graphics programs, or in PowerPoint (after ungrouping)
  • /Tree/{section}_{gene}_original.svg : Uncollapsed tree for inspection
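
The file naming above can be turned into a small checklist for a finished run. This is a sketch under stated assumptions: the helper names expected_outputs and ends_with_number are hypothetical, the result directory is assumed to hold the three spreadsheets at its top level, and every section and gene pair is assumed to produce a tree, which may not hold for every dataset. The exact form of the numeric suffix on paraphyletic taxa is also an assumption; here any trailing digits count.

```python
import re
from pathlib import Path


def expected_outputs(result_dir, sections, genes):
    """List the output paths described above for the given sections and genes."""
    root = Path(result_dir)
    paths = [
        root / "Section Assignment.xlsx",
        root / "Identification_result.xlsx",
        root / "report.xlsx",
    ]
    for section in sections:
        for gene in genes:
            # Collapsed tree, then the uncollapsed tree kept for inspection.
            paths.append(root / "Tree" / f"{section}_{gene}.svg")
            paths.append(root / "Tree" / f"{section}_{gene}_original.svg")
    return paths


def ends_with_number(taxon):
    """True if the taxon name ends with digits (assumed paraphyly marker)."""
    return re.search(r"\d+$", taxon.strip()) is not None


if __name__ == "__main__":
    # "results" and the section and gene names are illustrative only.
    for path in expected_outputs("results", ["Terrei"], ["ITS", "BenA"]):
        status = "ok" if path.exists() else "missing"
        print(f"{status:8s}{path}")
```

Taxon names read from report.xlsx can be passed through ends_with_number to list the taxa that need manual checking.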

How does FunVIP work?

figure1 - ver4

License

GPL 3.0

funid's People

Contributors

changwanseo


funid's Issues

Bug fixes & improvements required

  1. Bug fixes
    1.1 An unexpected exception occurs during outgroup selection (the triggering case still needs to be found)
    1.2 When an error occurs during input management, the information goes directly to the result without modification (the triggering case still needs to be found)
    1.3 When no corresponding sequence exists in the query, the result should be "no sequence" rather than "error"

  2. Code improvements
    2.1 Better version control of the bundled software
    2.2 Separate the is_valid_accession and is_valid_sequence functions from manage_input.py
    2.3 Suppress output from unidecode, makeblastdb, modeltest and mafft
    2.4 Add the full input command to the report

  3. Installation issues
    3.1 Change FunID to FunIP in pypi and conda
    3.2 Find installation failures across diverse environments

  4. Long term development
    4.1 Move multiprocessing to Ray
    4.2 Multiprocessing logging

Make tutorial
