seqenv's People

Contributors

chrisquince, evangelospafilis, xapple

seqenv's Issues

Bug with upui normalisation

Running sequences with the upui normalisation gives the following error:

seqenv AllSites_C05.fa --min_identity 0.95 --num_threads 32 --out_dir All_95_upui --min_coverage 0.95 --max_targets 100 --normalization upui            
seqenv version 1.1.4 (pid 60382)
The exact version of the code is: 4e407e1
Start at: 2016-05-03 13:34:45.570319
--> STEP 1: Parse the input FASTA file.
Elapsed time: 0:00:00.046785
Using: All_95_upui/renamed.fasta
--> STEP 2: Similarity search against the 'nt' database with 32 processes
Elapsed time: 0:01:33.193498
--> STEP 3: Filter out bad hits from the search results
Elapsed time: 0:00:00.017426
--> STEP 4: Parsing the search results
Elapsed time: 0:00:00.027052
--> STEP 5: Setting up the SQLite3 database connection.
Elapsed time: 0:00:00.000913
Got 4114 GI hits and 3851 of them had one or more EnvO terms associated.
--> STEP 6: Computing EnvO term frequencies.
Traceback (most recent call last):
  File "/home/chris/repos/seqenv/seqenv/seqenv", line 68, in <module>
    seqenv.Analysis(input_path, **kwargs).run()
  File "/home/chris/repos/seqenv/seqenv/analysis.py", line 151, in run
    self.outputs.make_all()
  File "/home/chris/repos/seqenv/seqenv/outputs.py", line 40, in make_all
    self.tsv_seq_to_concepts()
  File "/home/chris/repos/seqenv/seqenv/outputs.py", line 93, in tsv_seq_to_concepts
    content = self.df_seqs_concepts.to_csv(None, sep=self.sep, float_format=self.float_format)
  File "/home/chris/repos/seqenv/seqenv/common/cache.py", line 35, in retrieve_from_cache
    else: result = f(self)
  File "/home/chris/repos/seqenv/seqenv/outputs.py", line 53, in df_seqs_concepts
    df = pandas.DataFrame(self.a.seq_to_counts)
  File "/home/chris/repos/seqenv/seqenv/common/cache.py", line 35, in retrieve_from_cache
    else: result = f(self)
  File "/home/chris/repos/seqenv/seqenv/analysis.py", line 396, in seq_to_counts
    if not results: raise Exception("We found no isolation sources with your input. Sorry.")
Exception: We found no isolation sources with your input. Sorry.

Bug if output_dir not set

If the output directory (--out_dir) is not specified, the program throws an error:

Traceback (most recent call last):
  File "/home/chris/repos/seqenv/seqenv/seqenv", line 59, in <module>
    seqenv.Analysis(input_path, **kwargs).run()
  File "/home/chris/repos/seqenv/seqenv/analysis.py", line 128, in __init__
    if not os.path.exists(self.out_dir): os.makedirs(self.out_dir)
  File "/usr/lib64/python2.7/os.py", line 157, in makedirs
    mkdir(name, mode)
OSError: [Errno 17] File exists: 'AllSites_C05.fa/'

This seems to be because it attempts to create a directory with exactly the same name as the input file.
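One possible way around this, as a minimal sketch only: derive the default output directory from the input path plus a suffix, so the two names can never collide, and check explicitly for a regular file before creating the directory. The helper names `default_out_dir` and `ensure_dir` and the `.seqenv_out` suffix are hypothetical and are not taken from the seqenv code base.

```python
import os

def default_out_dir(input_path, suffix=".seqenv_out"):
    """Hypothetical helper: build a default output directory name that
    differs from the input file name (the suffix is an assumption)."""
    return os.path.abspath(input_path).rstrip(os.sep) + suffix

def ensure_dir(path):
    """Create `path` as a directory unless it already is one.
    Raise a clear error if a regular file already uses that name."""
    path = path.rstrip(os.sep)
    if os.path.isdir(path):
        return path
    if os.path.exists(path):
        raise ValueError("Cannot use %r as output directory: "
                         "a file with that name exists." % path)
    os.makedirs(path)
    return path
```

With this sketch, `ensure_dir(default_out_dir("AllSites_C05.fa"))` would create a directory ending in `AllSites_C05.fa.seqenv_out` instead of trying to reuse the name of the FASTA file.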

README.md file placement during install

Running 'pip install seqenv' places the seqenv program in /usr/local/bin. However, the first run fails with an error:

Traceback (most recent call last):
  File "/usr/local/bin/seqenv", line 18, in <module>
    doc_params = re.findall('^### All parameters(.+?)###', readme_contents, flags=re.M|re.DOTALL)[0]
IndexError: list index out of range

I think this is an issue with the placement of the README.md file. Line 13 of /usr/local/bin/seqenv sets the README.md path as:

readme_path = current_dir + '../README.md'

This implies that the README.md file should be in /usr/local. On my server, a different README.md file was already there, so the lookup found a file without the expected section. I downloaded the seqenv README.md file from GitHub and copied it to /usr/local, which seems to have resolved the issue. Seqenv now works.

Could this be avoided in the install, for example by placing the README.md file somewhere else, giving the README.md file a more unique name, or having a specific action to take when another README.md file is already in /usr/local?
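As an illustration of the last option, here is a minimal sketch that reuses the regular expression from the traceback but returns None instead of raising IndexError when the section is missing, for example because an unrelated README.md was picked up. The function name `extract_params_section` is hypothetical, and how the caller reacts to None is left open.

```python
import re

# Same pattern as in the traceback above, compiled once.
PARAMS_PATTERN = re.compile(r'^### All parameters(.+?)###',
                            flags=re.M | re.DOTALL)

def extract_params_section(readme_contents):
    """Return the text between '### All parameters' and the next '###',
    or None if the README does not contain that section."""
    match = PARAMS_PATTERN.search(readme_contents)
    return match.group(1) if match else None
```

A caller could then fall back to a built-in usage string, or print a message saying that the README.md it found does not look like the one shipped with seqenv.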

Changing default number of sequences to use

Hi Lucas,

Rather than defaulting to using the 1000 most abundant sequences, could we just default to using all of them? The --N option can remain as is otherwise as a way for people to speed up analyses.

Thanks,
Chris
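The requested behaviour could look roughly like the sketch below: no limit means every sequence is used, and a limit keeps only the most abundant ones. The function `select_sequences` and its `abundances` mapping are hypothetical and only illustrate the idea; they do not reflect how seqenv stores or ranks sequences internally.

```python
def select_sequences(abundances, n=None):
    """Rank sequence names by abundance (highest first).
    With n=None (the proposed default) all sequences are returned;
    otherwise only the n most abundant ones."""
    ranked = sorted(abundances, key=abundances.get, reverse=True)
    return ranked if n is None else ranked[:n]
```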

0 GI hits and 0 of them had one or more EnvO terms associated with test.fasta

Hi-

I've been trying for a few days now to get seqenv working without success, so I figured it's time to reach out.

I'm attempting to run seqenv on your test.fasta file against the nt database.

I can use local blastn, i.e.:
blastn -db /fdb/blastdb/nt -query test.fasta >> test.out.txt

And I get results for each OTU given, but when I run seqenv using the same input/db parameters:
seqenv test.fasta --search_db /fdb/blastdb/nt

Start at: 2017-10-30 12:31:20.710219
Got 0 GI hits and 0 of them had one or more EnvO terms associated.
--> STEP 6: Computing EnvO term frequencies.
Traceback (most recent call last):
  File "/home/krajacichbj/.conda/envs/krajpy/bin/seqenv", line 64, in <module>
    seqenv.Analysis(input_path, **kwargs).run()
  File "/home/krajacichbj/.conda/envs/krajpy/lib/python2.7/site-packages/seqenv/analysis.py", line 151, in run
    self.outputs.make_all()
  File "/home/krajacichbj/.conda/envs/krajpy/lib/python2.7/site-packages/seqenv/outputs.py", line 41, in make_all
    self.tsv_seq_to_concepts()
  File "/home/krajacichbj/.conda/envs/krajpy/lib/python2.7/site-packages/seqenv/outputs.py", line 108, in tsv_seq_to_concepts
    content = self.df_seqs_concepts.to_csv(None, sep=self.sep, float_format=self.float_format)
  File "/home/krajacichbj/.conda/envs/krajpy/lib/python2.7/site-packages/seqenv/common/cache.py", line 35, in retrieve_from_cache
    else: result = f(self)
  File "/home/krajacichbj/.conda/envs/krajpy/lib/python2.7/site-packages/seqenv/outputs.py", line 55, in df_seqs_concepts
    df = pandas.DataFrame(self.a.seq_to_counts)
  File "/home/krajacichbj/.conda/envs/krajpy/lib/python2.7/site-packages/seqenv/common/cache.py", line 35, in retrieve_from_cache
    else: result = f(self)
  File "/home/krajacichbj/.conda/envs/krajpy/lib/python2.7/site-packages/seqenv/analysis.py", line 398, in seq_to_counts
    if not results: raise Exception("We found no isolation sources with your input. Sorry.")
Exception: We found no isolation sources with your input. Sorry.

I get nothing back. Is this an issue with the seqenv installation/configuration?
blastn is in my path. When I installed seqenv I had to manually add pygraphviz and orange from conda, but other than that things went smoothly.

I'm pretty new to Python and Unix in general, so if you need something more to help diagnose, please let me know.

Thanks for any help you can offer.

-Ben
