MAG_Contig_Purifier

Purpose:

To take an existing set of bins from a metagenomic analysis, reassign unbinned contigs to bins, remove contigs that contaminate bins, and eventually merge or split bins using a combination of de novo and reference-based methods.

Outline

Starting point:

  1. Raw assembly file(s)
  2. Bin set
  3. Bin identification file
     • This is a tab-delimited file with two columns: bin_name and contig_name
     • The bin_name should be in the format: "Bin.[number]"
     • The contig_name should be in the format: ">[contig_name]"
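
The bin identification file format described above can be checked before running any step. The following is a minimal sketch, not part of the tool itself: the function name `read_bin_identification` is hypothetical, and it assumes that a first line reading `bin_name<TAB>contig_name` is an optional header and that blank lines can be ignored.

```python
import io
import re
from collections import defaultdict

# bin_name must look like "Bin.[number]", e.g. "Bin.12"
BIN_PATTERN = re.compile(r"^Bin\.\d+$")


def read_bin_identification(handle):
    """Parse a two-column, tab-delimited bin identification file.

    Returns a dict mapping each bin_name to the list of its contig_names.
    Hypothetical helper for illustration only.
    """
    bins = defaultdict(list)
    for line_number, line in enumerate(handle, start=1):
        line = line.rstrip("\r\n")
        if not line.strip():
            continue  # assumption: blank lines are not meaningful
        fields = line.split("\t")
        if len(fields) != 2:
            raise ValueError(
                f"line {line_number}: expected 2 tab-separated columns, "
                f"got {len(fields)}"
            )
        bin_name, contig_name = fields
        # assumption: an optional header row naming the two columns is skipped
        if line_number == 1 and fields == ["bin_name", "contig_name"]:
            continue
        if not BIN_PATTERN.match(bin_name):
            raise ValueError(
                f"line {line_number}: bin_name {bin_name!r} is not 'Bin.[number]'"
            )
        if not contig_name.startswith(">") or len(contig_name) == 1:
            raise ValueError(
                f"line {line_number}: contig_name {contig_name!r} "
                "is not '>[contig_name]'"
            )
        bins[bin_name].append(contig_name)
    return dict(bins)


if __name__ == "__main__":
    example = io.StringIO("Bin.1\t>contig_1\nBin.1\t>contig_2\nBin.2\t>contig_3\n")
    for name, contigs in read_bin_identification(example).items():
        print(name, contigs)
```

Failing early on a malformed line keeps formatting problems from surfacing later as contigs that silently fail to match a bin.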

Step 0: Standardize all inputs

Step 1a: Remove contamination from bins based on CATBAT annotation scores

Step 1b: Add contigs into bins based on CATBAT annotation scores

Step 2: Add contigs into bins based on ANI relations across samples

Step 3: Merge bins together based on multiple lines of evidence:

  a. Shared taxonomy
  b. ANI patterns across samples
  c. Non-overlapping gene content
