mgs-canopy-algorithm's Introduction

mgs-canopy-algorithm's People

Contributors

epruesse, pdworzynski

mgs-canopy-algorithm's Issues

Draft genome from mgs

Thanks for this useful tool for canopy-based gene clustering.

The pipeline of this method is as follows:

  1. Assemble the metagenomic reads into contigs.
  2. Predict genes and build a non-redundant gene catalogue.
  3. Calculate gene abundance (by mapping reads to the gene catalogue).
  4. Obtain CAGs or MGS by clustering the gene abundance profiles.
  5. Map reads to each MGS and perform MGS-augmented assembly to obtain draft genomes.
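
Step 4 above can be illustrated with a minimal sketch. It is not the tool's actual implementation: it assumes the distance between two genes is 1 minus the Pearson correlation of their abundance profiles, uses a single greedy pass with no canopy walks or merging, and the gene names, abundance values and the `canopy_cluster` helper are all hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length abundance profiles."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def canopy_cluster(profiles, max_dist=0.1):
    """Greedy single pass: each unassigned gene seeds a canopy and pulls in
    every unassigned gene whose distance (1 - Pearson r) is <= max_dist."""
    unassigned = list(profiles)
    canopies = []
    while unassigned:
        seed = unassigned[0]
        members = [g for g in unassigned
                   if 1.0 - pearson(profiles[seed], profiles[g]) <= max_dist]
        canopies.append(members)
        unassigned = [g for g in unassigned if g not in members]
    return canopies


# Hypothetical abundance profiles: one value per sample.
profiles = {
    "gene_a": [1.0, 2.0, 3.0, 4.0],
    "gene_b": [2.1, 4.0, 6.2, 8.0],  # roughly 2x gene_a
    "gene_c": [4.0, 1.0, 3.0, 0.5],
    "gene_d": [8.1, 2.0, 6.1, 1.0],  # roughly 2x gene_c
}
cags = canopy_cluster(profiles)
print(cags)
```

Genes whose abundance rises and falls together across samples end up in the same group, which is the sense in which a CAG or MGS is a set of co-abundant genes rather than a contiguous sequence.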

The MGS are built from sets of genes with similar abundance. These genes are in fact coding sequences (CDS) predicted with MetaGeneMark. In other words, the draft genome is simply a combination of CDS. Is that biologically reasonable?


In addition, when I run `make`, the following error occurs:

g++ -o Stats.o -fopenmp -c -Wall -Wextra -O3 -march=native -I./  Stats.cpp
/tmp/cckNXjgd.s: Assembler messages:
/tmp/cckNXjgd.s:140: Error: suffix or operands invalid for `vbroadcastsd'
/tmp/cckNXjgd.s:141: Error: suffix or operands invalid for `vbroadcastsd'
make: *** [Makefile:63: Stats.o] Error 1

I look forward to your reply.

Error when running makefile

Hi, thanks for this great program!

However, when I run "make -f Makefile" on my local Mac laptop, I get the following error:

clang: error: unsupported option '-fopenmp'

Any ideas on how to solve this issue? Thanks!

Result affected by qsub parameter

Hi,
When I submitted a shell script (run.sh) via qsub, I found that the results differed depending on the qsub parameter (linear). The script runs:

cc.bin -n 16 -i /Path/gene.abundance.tab -o clusters_out -c profiles_out -p CAG_out --max_canopy_dist 0.1 --max_close_dist 0.4 --max_merge_dist 0.1 --min_step_dist 0.005 --max_num_canopy_walks 5 --stop_fraction 1 --canopy_size_stats_file progress_stat_file

Why would simply changing the thread number produce a different result? Does cc.bin apply any special treatment to the input?

Regards,
Da
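
One general property that could be relevant here, offered as an assumption and not as a confirmed explanation of cc.bin's behaviour: greedy seed-based clustering depends on the order in which seeds are processed, and that order can vary when work is split across threads. The sketch below uses hypothetical one-dimensional points and a hypothetical `greedy_canopies` helper to show the same data producing different clusters under two seed orders.

```python
def greedy_canopies(points, order, max_dist):
    """Visit seeds in the given order; each still-unassigned seed claims
    every unassigned point within max_dist of it."""
    unassigned = set(points)
    canopies = []
    for seed in order:
        if seed not in unassigned:
            continue  # already claimed by an earlier canopy
        members = sorted(p for p in unassigned
                         if abs(points[p] - points[seed]) <= max_dist)
        canopies.append(members)
        unassigned -= set(members)
    return canopies


# Hypothetical data: three genes placed on a line.
points = {"g1": 0.0, "g2": 1.0, "g3": 2.0}

run_a = greedy_canopies(points, ["g1", "g2", "g3"], max_dist=1.0)
run_b = greedy_canopies(points, ["g2", "g1", "g3"], max_dist=1.0)

print(run_a)  # [['g1', 'g2'], ['g3']]
print(run_b)  # [['g1', 'g2', 'g3']]
```

If the tool behaves this way, fixing the thread count (or the processing order) between runs would be one way to check whether ordering accounts for the difference.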

Checkpointing possibility

Hello,

Thank you for the implementation.
I had some issues running the canopy clustering successfully. Either it takes a very long time or it gets stuck in the merging canopies step (maybe memory related).
Is it possible to pick up where the job was killed, e.g. with the "--not_processed_points_file" option, and if so, how would this work?

Output from my run:
cc_stderr.txt
