victorian-bioinformatics-consortium / nesoni

High throughput sequencing analysis tools

License: GNU General Public License v2.0

Python 94.17% R 5.83%

nesoni's People

Contributors

drpowell, pfh, simongladman, slugger70

nesoni's Issues

Do you mind 'pin'ing your Python dependencies?

Hi Paul,

I am building a brew formula for nesoni. I will send a PR when done.

Do you mind telling me which versions of the following Python libraries you use/develop with:

  • BioPython
  • numpy
  • matplotlib

Cheers :-)
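
One way to collect the versions in question is to query the installed distributions directly. The sketch below assumes a modern Python (3.8+, for importlib.metadata) rather than the Python 2.7 seen elsewhere on this page; the helper name installed_versions is hypothetical, and the names passed in are PyPI distribution names.

```python
from importlib import metadata


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    found = installed_versions(["biopython", "numpy", "matplotlib"])
    for name, version in found.items():
        # Print in requirements.txt style when a version is found.
        if version:
            print("%s==%s" % (name, version))
        else:
            print("%s (not installed)" % name)
```

The printed name==version lines could then be used as pins in a packaging recipe.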

Issue with nesoni import?

Hi there,

I'm trying to use nesoni to polish a genome assembled from pacbio reads using some moleculo reads aligned with pbalign.py/blasr. To import these I'm trying to use nesoni import, by running:

nesoni import:
--snp-cost 2.000
n2
/Users/ewilbanks/Dropbox/_personal/BERRIES/Metagenomics/pacbio/PSB_binMod_asm/reads_and_annot/pbalign1/mol_otuA_pbalign1.sorted.bam
PROKKA_07022014.sizeSorted.fna

Everything seems to go OK until I get the error shown below. It doesn't seem to have created the appropriate files for importing the BAM alignments, since the later filter / variant detection steps fail. Any thoughts on what I can do to troubleshoot it?

Thanks!!
Lizzy


[[other normal seeming output then....]]

  • Processing contig unitig_59
  • Processing contig unitig_110
    Loaded Genome
    Saving genome map to n2/reference/reference-cs

Traceback:
File "/Users/ewilbanks/anaconda/lib/python2.7/site-packages/nesoni/config.py", line 1069, in shell_run
action.run()
File "/Users/ewilbanks/anaconda/lib/python2.7/site-packages/nesoni/config.py", line 633, in inner
return func(self,*args,**kwargs)
File "/Users/ewilbanks/anaconda/lib/python2.7/site-packages/nesoni/config.py", line 633, in inner
return func(self,*args,**kwargs)
File "/Users/ewilbanks/anaconda/lib/python2.7/site-packages/nesoni/samimport.py", line 21, in run
workspace.setup_reference(self.reference)
File "/Users/ewilbanks/anaconda/lib/python2.7/site-packages/nesoni/working_directory.py", line 16, in setup_reference
reference_directory.Make_reference(path, filenames=filenames, bowtie=bowtie).run()
File "/Users/ewilbanks/anaconda/lib/python2.7/site-packages/nesoni/config.py", line 633, in inner
return func(self,*args,**kwargs)
File "/Users/ewilbanks/anaconda/lib/python2.7/site-packages/nesoni/config.py", line 633, in inner
return func(self,*args,**kwargs)
File "/Users/ewilbanks/anaconda/lib/python2.7/site-packages/nesoni/reference_directory.py", line 225, in run
if config.apply_ifavailable_jar(self.snpeff, 'snpEff.jar'):
File "/Users/ewilbanks/anaconda/lib/python2.7/site-packages/nesoni/config.py", line 269, in apply_ifavailable_jar
try: io.find_jar(jar)
File "/Users/ewilbanks/anaconda/lib/python2.7/site-packages/nesoni/io.py", line 28, in find_jar
raise Error('Couldn't find "%s". Directories listed in JARPATH and PATH were searched. %s' % (jarname, extra_help))

NameError:
global name 'Error' is not defined
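
The final frame suggests that the intended "couldn't find jar" message was never shown: the name Error was not in scope where it was raised, so Python reported a NameError instead. The sketch below reproduces that pattern in isolation. JarNotFoundError, find_jar_broken, find_jar_fixed and describe are hypothetical names for illustration, not nesoni's actual code.

```python
class JarNotFoundError(Exception):
    """Hypothetical exception type standing in for the missing name."""


def find_jar_broken(jarname):
    # 'UndefinedError' is deliberately not defined anywhere, mirroring the bug:
    # looking up the name fails before any exception object is constructed.
    raise UndefinedError('Could not find "%s"' % jarname)  # noqa: F821


def find_jar_fixed(jarname):
    # With the exception class in scope, the intended message reaches the caller.
    raise JarNotFoundError('Could not find "%s"' % jarname)


def describe(func, jarname):
    """Return (exception type name, message) for the error raised by func."""
    try:
        func(jarname)
    except Exception as exc:
        return type(exc).__name__, str(exc)
    return None
```

Calling describe(find_jar_broken, "snpEff.jar") reports a NameError, while describe(find_jar_fixed, "snpEff.jar") reports the intended message. If this reading is right, the underlying condition in the report is simply that snpEff.jar was not found.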

nesoni annotation error

I've been trying to run nesoni make-reference (v0.122) using a .gbk file (mainly to preserve the annotations for nway), but keep getting an error:

jkwong@dna:~/Ecoli/analysis/nesoni/outbreak$ nesoni make-reference test-ref BPH0530.gbk

nesoni make-reference: \
    --ls ifavailable \
    --cs ifavailable \
    --bowtie ifavailable \
    --genome yes \
    --genome-select -source \
    --snpeff ifavailable \
    test-ref \
    BPH0530.gbk


Traceback:
  File "/bio/sw/python/env-pypy/site-packages/nesoni-0.122-py2.7.egg/nesoni/config.py", line 1069, in shell_run
    action.run()
  File "/bio/sw/python/env-pypy/site-packages/nesoni-0.122-py2.7.egg/nesoni/config.py", line 633, in inner
    return func(self,*args,**kwargs)
  File "/bio/sw/python/env-pypy/site-packages/nesoni-0.122-py2.7.egg/nesoni/config.py", line 633, in inner
    return func(self,*args,**kwargs)
  File "/bio/sw/python/env-pypy/site-packages/nesoni-0.122-py2.7.egg/nesoni/reference_directory.py", line 213, in run
    reference.set_sequences(sequences)
  File "/bio/sw/python/env-pypy/site-packages/nesoni-0.122-py2.7.egg/nesoni/reference_directory.py", line 34, in set_sequences
    for name, seq in io.read_sequences(filename, genbank_callback=genbank_callback):
  File "/bio/sw/python/env-pypy/site-packages/nesoni-0.122-py2.7.egg/nesoni/io.py", line 419, in read_genbank_sequence
    from Bio import SeqIO
  File "/bio/sw/python/env-pypy/site-packages/Bio/SeqIO/__init__.py", line 353, in <module>
    from Bio.File import as_handle
  File "/bio/sw/python/env-pypy/site-packages/Bio/File.py", line 35, in <module>
    from sqlite3 import dbapi2 as _sqlite
  File "/bio/sw/python/download/pypy/lib-python/2.7/sqlite3/__init__.py", line 24, in <module>
    from dbapi2 import *
  File "/bio/sw/python/download/pypy/lib-python/2.7/sqlite3/dbapi2.py", line 27, in <module>
    from _sqlite3 import *
  File "/bio/sw/python/download/pypy/lib_pypy/_sqlite3.py", line 52, in <module>
    _ffi = _FFI()
  File "/bio/sw/python/download/pypy/lib_pypy/cffi/api.py", line 59, in __init__
    backend.__version__ == __version__[:3])

AssertionError:

I've also tried using make-reference with a .gff file, but after then using nesoni bowtie to map reads, the resulting report.txt file is empty.

Using a reference file in FASTA format for nesoni make-reference is successful, but adding the annotation in nway (after running make-reference, bowtie, consensus, nway) by specifying the --gbk flag again comes up with an AssertionError with both .gbk and .gff files:

jkwong@dna:~/Ecoli/analysis/nesoni/outbreak$ nesoni nway --output outbreak.txt --as table --gbk BPH0530.gbk Ec_BPH*

nesoni nway: \
    --output outbreak.txt \
    --as table \
    --gbk BPH0530.gbk \
    --evidence yes \
    --consequences yes \
    --reference yes \
    --indels yes \
    --require-all no \
    --require-bisect no \
    --full no \
    Ec_BPH0532 Ec_BPH0657 Ec_BPH0658 Ec_BPH0659


Traceback:
  File "/bio/sw/python/env-pypy/site-packages/nesoni-0.122-py2.7.egg/nesoni/config.py", line 1069, in shell_run
    action.run()
  File "/bio/sw/python/env-pypy/site-packages/nesoni-0.122-py2.7.egg/nesoni/nway_diff.py", line 284, in run
    split_a=self.splitting, split_b=self.from_, f=f)
  File "/bio/sw/python/env-pypy/site-packages/nesoni-0.122-py2.7.egg/nesoni/nway_diff.py", line 344, in nway_main
    from Bio import SeqIO
  File "/bio/sw/python/env-pypy/site-packages/Bio/SeqIO/__init__.py", line 353, in <module>
    from Bio.File import as_handle
  File "/bio/sw/python/env-pypy/site-packages/Bio/File.py", line 35, in <module>
    from sqlite3 import dbapi2 as _sqlite
  File "/bio/sw/python/download/pypy/lib-python/2.7/sqlite3/__init__.py", line 24, in <module>
    from dbapi2 import *
  File "/bio/sw/python/download/pypy/lib-python/2.7/sqlite3/dbapi2.py", line 27, in <module>
    from _sqlite3 import *
  File "/bio/sw/python/download/pypy/lib_pypy/_sqlite3.py", line 52, in <module>
    _ffi = _FFI()
  File "/bio/sw/python/download/pypy/lib_pypy/cffi/api.py", line 59, in __init__
    backend.__version__ == __version__[:3])

AssertionError:

jkwong@dna:~/Ecoli/analysis/nesoni/outbreak$ 

The reference genome is a closed E. coli genome that was annotated with Prokka.

Any ideas? Thanks for the help.
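
Both tracebacks end inside PyPy's own _sqlite3 and cffi modules, reached through "from Bio import SeqIO", which suggests the failure may be reproducible without nesoni at all. A minimal sketch for checking that, using only the standard library; try_import is a hypothetical helper name.

```python
import importlib


def try_import(module_name):
    """Attempt an import; return (True, None) or (False, 'ExcType: message')."""
    try:
        importlib.import_module(module_name)
        return True, None
    except Exception as exc:  # includes AssertionError, as in the traceback
        return False, "%s: %s" % (type(exc).__name__, exc)


if __name__ == "__main__":
    # Ordered from the innermost module in the traceback outwards.
    for name in ["sqlite3", "Bio.File", "Bio.SeqIO"]:
        ok, error = try_import(name)
        print("%-10s %s" % (name, "ok" if ok else error))
```

If sqlite3 already fails on its own under the same interpreter, the problem likely lies in the PyPy/cffi installation rather than in the .gbk or .gff input.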

Nesoni issues with CentOS/RHEL?

I installed Nesoni on two CentOS servers and a laptop running on Debian. It works perfectly fine on the Debian laptop but I get errors whenever I try to run Nesoni on the CentOS computers. I tried installing it using pip, source, and in a virtualenv but I keep getting this:
Traceback (most recent call last):
File "/usr/local/lib/python2.7/runpy.py", line 151, in _run_module_as_main
mod_name, loader, code, fname = _get_module_details(mod_name)
File "/usr/local/lib/python2.7/runpy.py", line 109, in _get_module_details
return _get_module_details(pkg_main_name)
File "/usr/local/lib/python2.7/runpy.py", line 101, in _get_module_details
loader = get_loader(mod_name)
File "/usr/local/lib/python2.7/pkgutil.py", line 464, in get_loader
return find_loader(fullname)
File "/usr/local/lib/python2.7/pkgutil.py", line 474, in find_loader
for importer in iter_importers(fullname):
File "/usr/local/lib/python2.7/pkgutil.py", line 430, in iter_importers
__import__(pkg)
File "nesoni/__init__.py", line 8, in <module>
from reference_directory import Make_reference
File "nesoni/reference_directory.py", line 5, in <module>
from nesoni import io, grace, config, annotation, legion
File "nesoni/io.py", line 7, in <module>
from nesoni import grace, legion, selection
File "nesoni/legion.py", line 997, in <module>
class Make(config.Action):
File "nesoni/legion.py", line 1003, in Make
make_address = os.environ.get('NESONI_ADDRESS') or socket.gethostbyname(socket.gethostname())
socket.gaierror: [Errno -2] Name or service not known
Error in atexit._run_exitfuncs:
Traceback (most recent call last):
File "/usr/local/lib/python2.7/atexit.py", line 24, in _run_exitfuncs
func(*targs, **kargs)
File "nesoni/legion.py", line 500, in _check_stages
if LOCAL.stages:
AttributeError: 'NoneType' object has no attribute 'stages'
Error in sys.exitfunc:
Traceback (most recent call last):
File "/usr/local/lib/python2.7/atexit.py", line 24, in _run_exitfuncs
func(*targs, **kargs)
File "nesoni/legion.py", line 500, in _check_stages
if LOCAL.stages:
AttributeError: 'NoneType' object has no attribute 'stages'
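
The first traceback fails in socket.gethostbyname(socket.gethostname()), which, per the line shown from legion.py, is only reached when the NESONI_ADDRESS environment variable is unset or empty. That suggests the machine's own hostname does not resolve on the CentOS hosts. The sketch below shows one defensive variant of that lookup; resolve_make_address and the 127.0.0.1 fallback are assumptions for illustration, not nesoni's behaviour.

```python
import os
import socket


def resolve_make_address(environ=None, fallback="127.0.0.1"):
    """Prefer NESONI_ADDRESS, then the resolved hostname, then a fallback."""
    environ = os.environ if environ is None else environ
    address = environ.get("NESONI_ADDRESS")
    if address:
        return address
    try:
        return socket.gethostbyname(socket.gethostname())
    except socket.error:
        # Hostname is not resolvable (e.g. missing from /etc/hosts or DNS).
        return fallback
```

Given the "or" in the line shown, setting NESONI_ADDRESS in the environment before running nesoni may sidestep the lookup entirely. This is an inference from the traceback and has not been tested here.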

nesoni nway is broken

nway is looking for reference.fa in the nesoni working directory. This has been replaced with a directory called reference with reference.fa and other files in it.

Command line example:

nesoni nway Agona_SL483/ arizonae_62_z4_z23_RSK2980/ Choleraesuis_SC_B67/

returns:

IOError:
[Errno 2] No such file or directory: 'Agona_SL483/reference.fa'

Checking Agona directory:

[gla048@athena Agona_SL483]$ ll
total 393120
-rw-rw-r-- 1 gla048 gla048 9673526 Jun 26 12:52 alignment.maf
-rw-rw-r-- 1 gla048 gla048 16823401 Jun 26 12:48 alignments.bam
-rw-rw-r-- 1 gla048 gla048 14809603 Jun 26 12:49 alignments_filtered.bam
-rw-rw-r-- 1 gla048 gla048 14809603 Jun 26 12:49 alignments_filtered_sorted.bam
-rw-rw-r-- 1 gla048 gla048 14608 Jun 26 12:49 alignments_filtered_sorted.bam.bai
-rw-rw-r-- 1 gla048 gla048 4905796 Jun 26 12:52 consensus.fa
-rw-rw-r-- 1 gla048 gla048 1325 Jun 26 12:52 consensus_log.txt
-rw-rw-r-- 1 gla048 gla048 4905796 Jun 26 12:52 consensus_masked.fa
-rw-rw-r-- 1 gla048 gla048 4905981 Jun 26 12:44 contigs.fna
-rw-rw-r-- 1 gla048 gla048 6983269 Jun 26 12:50 depths.pickle.gz
-rw-rw-r-- 1 gla048 gla048 113911 Jun 26 12:50 gi_197247299_ref_NC_011148_1_-ambiguous-depth.userplot
-rw-rw-r-- 1 gla048 gla048 113471 Jun 26 12:50 gi_197247299_ref_NC_011148_1_-depth.userplot
-rw-rw-r-- 1 gla048 gla048 823350 Jun 26 12:50 gi_197247299_ref_NC_011148_1_-evidence.txt
-rw-rw-r-- 1 gla048 gla048 1083 Jun 26 12:50 gi_197247299_ref_NC_011148_1_.gff
-rw-rw-r-- 1 gla048 gla048 14396006 Jun 26 12:50 gi_197247352_ref_NC_011149_1_-ambiguous-depth.userplot
-rw-rw-r-- 1 gla048 gla048 14273285 Jun 26 12:50 gi_197247352_ref_NC_011149_1_-depth.userplot
-rw-rw-r-- 1 gla048 gla048 113511293 Jun 26 12:52 gi_197247352_ref_NC_011149_1_-evidence.txt
-rw-rw-r-- 1 gla048 gla048 1083 Jun 26 12:52 gi_197247352_ref_NC_011149_1_.gff
-rw-rw-r-- 1 gla048 gla048 46 Jun 26 12:50 parameters
drwxrwxr-x 2 gla048 gla048 131 Jun 26 12:44 reference
-rw-rw-r-- 1 gla048 gla048 1083 Jun 26 12:52 report.gff
-rw-rw-r-- 1 gla048 gla048 73 Jun 26 12:52 report.txt
-rw-rw-r-- 1 gla048 gla048 5346 Jun 26 12:48 shrimp_log.txt
-rw-rw-r-- 1 gla048 gla048 181433975 Jun 26 12:44 s_single.fa
-rw-rw-r-- 1 gla048 gla048 0 Jun 26 12:48 unmapped_paired.fq
-rw-rw-r-- 1 gla048 gla048 0 Jun 26 12:48 unmapped_single.fq

As can be seen, reference is now a directory not a file...
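
The report implies two on-disk layouts: an older one with reference.fa directly in the working directory, and a newer one with it inside a reference directory. A sketch of a lookup that tolerates both, assuming those are the only two layouts; find_reference_fasta is a hypothetical helper, not part of nesoni.

```python
import os


def find_reference_fasta(working_dir):
    """Return the path to reference.fa under either layout, or None."""
    candidates = [
        os.path.join(working_dir, "reference", "reference.fa"),  # newer layout
        os.path.join(working_dir, "reference.fa"),               # older layout
    ]
    for path in candidates:
        if os.path.isfile(path):
            return path
    return None
```

For the example above, find_reference_fasta("Agona_SL483") would return the path inside the reference directory if reference.fa is present there.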

Nesoni clip: is discarding sequence descriptions

Illumina now produces FASTQ without the /1 and /2 suffixes, and instead uses a FASTA-style description (ID, space, DESC).

@M00855:4:000000000-A16FH:1:1101:14529:1450 2:N:0:1
TGGGCAGCAGCGACTTCTGCCACAGTGTCGGTGACATGCCAAACGGTGGGT

When passing through the clip: tool, the DESC is discarded:

@M00855:4:000000000-A16FH:1:1101:14529:1450
GGGCAGCCTCAGCGCCCCGATGGGCGGAATGGGCCTGTCGGGCGT

i.e. the "2:N:0:1" is missing.
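
The header format described above (ID, space, DESC) can be split and rebuilt without losing the description. The sketch below illustrates that idea only; it is not nesoni's clip: implementation, and split_fastq_header and join_fastq_header are hypothetical names.

```python
def split_fastq_header(line):
    """Split '@ID DESC' into (ID, DESC); DESC is '' when absent."""
    if not line.startswith("@"):
        raise ValueError("not a FASTQ header: %r" % line)
    # Split on the first run of whitespace only, so DESC is kept whole.
    parts = line[1:].rstrip("\r\n").split(None, 1)
    read_id = parts[0] if parts else ""
    description = parts[1] if len(parts) > 1 else ""
    return read_id, description


def join_fastq_header(read_id, description):
    """Rebuild the header, keeping the description when there is one."""
    if description:
        return "@%s %s" % (read_id, description)
    return "@%s" % read_id
```

Carrying the description alongside the ID and writing both back out would preserve fields such as "2:N:0:1" for downstream tools that rely on them.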
