phac-nml / galaxy_tools
Contains a set of Galaxy Tools mostly written by the Bioinformatics core at NML
License: Apache License 2.0
Update the staramr Galaxy tool to version 0.8.0 (when released).
The GitHub CI may need to be updated. Test failures seem to be related to how Docker is being run:
Job in error state.. tool_id: staramr_search, exit_code: 125, stderr: docker: invalid spec: ::rw: empty section between colons.
See 'docker run --help'.
It may involve updating our pr.yaml to be consistent with the galaxyproject version.
Hi all,
We would like to request an update to the getmlst tool in the Galaxy Tool Shed repository https://toolshed.g2.bx.psu.edu/repository?repository_id=5634a03143d6f21d.
The problem is that the tool outputs the profile sequences but does not output the MLST definitions for any species. Fixing the tool would be very useful for developing MLST analysis workflows.
Thanks,
Jayanthi Gangiredla
Not having an extension (e.g. .fastq) on the reads used to make an input collection for biohansel results in the final output not having the files paired, even though the dataset collection does.
If I had to guess (and if I remember correctly), this is likely due to how the get_paired_fastq_filename function interacts with the $input.paired_collection.<forward|reverse> name, which I believe uses the underlying name of the dataset used to make the collection and not the name in the collection itself.
Dataset 1 -> Correctly pairs output:
File names used to make up the paired collection:
Output:
TestX | heidelberg | 0.5.0
Dataset 2 -> Outputs are separated
File names used to make up the paired collection:
Output:
TestX_R1 | heidelberg | 0.5.0
TestX_R2 | heidelberg | 0.5.0
Write wrapper for basic usage
spaTyper -f sequence.fasta -r sparepeats.fasta
There are 2 key improvements that our spatyper wrapper could use.
When running a galaxy workflow that includes bio_hansel, the file names are replaced with generic placeholders so you cannot differentiate between SRRs.
stringmlst tests are commented out as they do not use conda, so we cannot run them on Travis CI.
https://github.com/phac-nml/galaxy_tools/blob/master/tools/biohansel/biohansel.xml#L121
Just need quotes around it please.
https://github.com/phac-nml/galaxy_tools/blob/master/tools/assemblystats/assembly_stats_txt.xml starts:
<tool id="assemblystats" name="assemblystats" version="1.0.1">
<description>Summarise an assembly (e.g. N50 metrics)</description>
<requirements>
...
This is the exact same version number as Konrad's original which is also still on the tool shed, which makes it very confusing as your version has a number of changes:
https://toolshed.g2.bx.psu.edu/view/nml/assemblystats/
https://toolshed.g2.bx.psu.edu/view/konradpaszkiewicz/assemblystats/
We tried to assemble a collection of .fastq WGS datasets with SPAdes, then tried to run Biohansel on the dataset list in Galaxy, and got an error. Biohansel does not see any input files. Perhaps the wrapper needs to be fixed. @apetkau @DarianHole
https://github.com/phac-nml/galaxy_tools/blob/master/tools/quasitools/distance.xml
The measure reported by the tool isn't directly an evolutionary measure, but rather an approximation of one, and this should be made clearer.
Additionally, in the description of inputs, the FASTA file isn't necessarily a "consensus" file.
Currently code is:
https://github.com/phac-nml/galaxy_tools/blob/master/tools/assemblystats/assembly_stats_txt.py#L18
def stop_err(msg):
    sys.stderr.write('%s\n' % msg)
    sys.exit()
This will return a zero exit status, i.e. no error will be reported by Galaxy.
Quick fix:
def stop_err(msg):
    sys.stderr.write('%s\n' % msg)
    sys.exit(1)
Better fix: replace all calls to stop_err with sys.exit, which accepts a string, prints it to stderr, and exits with return code one.
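The sys.exit behaviour described above can be checked with a minimal sketch. The helper names fail_if_empty and exit_status_of are hypothetical, used only for illustration; they are not taken from assembly_stats_txt.py.

```python
import subprocess
import sys


def fail_if_empty(path_list):
    """Hypothetical check: abort when no input files were given."""
    if not path_list:
        # Passing a string to sys.exit prints it to stderr and makes
        # the interpreter exit with status 1.
        sys.exit("No input files given")
    return len(path_list)


def exit_status_of(snippet):
    """Run a Python snippet in a child interpreter and return its exit status."""
    result = subprocess.run(
        [sys.executable, "-c", snippet],
        stderr=subprocess.DEVNULL,  # hide the child's error message
    )
    return result.returncode
```

Comparing exit_status_of("import sys; sys.exit()") with exit_status_of("import sys; sys.exit('boom')") shows the difference Galaxy would see: the first reports success, the second a failure.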
Hi there,
Is it possible to add E. coli to the Galaxy wrapper under the pointfinder section?
Or add an option to allow users to select their species of interest from the Galaxy history?
Many thanks
KAT sect sometimes fails when hard linking the database files. It occurs randomly, but only when hundreds of concurrent KAT jobs are all attempting to use the same database file.
One solution would be to use cp instead of hard linking the file with ln.
Hi,
I have a question regarding managing StarAMR-related DBs in Galaxy.
I do not see any data tables in the staramr folder, nor any DM.
How do you manage the DBs then?
We would like to update the DBs on usegalaxy.fr.
Thanks for your help.
Bérénice
Noticed on a real dataset where my Python code did not get the same median contig length, reduced to a minimal test case:
$ cat /tmp/median.fasta
>one
A
>two
AA
>three
AAA
>four
AAAA
There is an even number of sequences (lengths 1, 2, 3, 4), meaning the median should be the mean of the middle two values, which is 2.5 in this trivial case.
As can be seen below, fasta_summary.pl picks the larger of the middle two values instead (which in the context of sequences will ensure the answer is an integer), so in practice the error will often be larger:
$ perl fasta_summary.pl -i /tmp/median.fasta -o /tmp && more /tmp/stats.txt
Directory '/tmp' exists, so the existing fasta_summary.pl output files will be overwritten
Statistics for read lengths:
Min read length: 1
Max read length: 4
Mean read length: 2.50
Standard deviation of read length: 1.12
Median read length: 3
N50 read length: 3
Statistics for numbers of reads:
Number of reads: 4
Number of reads >=1kb: 0
Number of reads in N50: 2
Statistics for bases in the reads:
Number of bases in all reads: 10
Number of bases in reads >=1kb: 0
GC Content of reads: 0.00 %
Simple Dinucleotide repeats:
Number of reads with over 70% dinucleotode repeats: 0.00 % (0 reads)
AT: 0.00 % (0 reads)
CG: 0.00 % (0 reads)
AC: 0.00 % (0 reads)
TG: 0.00 % (0 reads)
AG: 0.00 % (0 reads)
TC: 0.00 % (0 reads)
Simple mononucleotide repeats:
Number of reads with over 50% mononucleotode repeats: 50.00 % (2 reads)
AA: 50.00 % (2 reads)
TT: 0.00 % (0 reads)
CC: 0.00 % (0 reads)
GG: 0.00 % (0 reads)
CC original author @sujaikumar - given the Perl script has been used widely it may be simpler just to document this behaviour? Is there an official repository for this script?
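For comparison, the two median definitions discussed in this report can be reproduced with Python's standard statistics module. This is a minimal sketch: sequence_lengths is a hypothetical helper written for this example, and median_high is used only to mimic the reported output, not as a description of how fasta_summary.pl is implemented.

```python
import statistics


def sequence_lengths(fasta_text):
    """Return the length of each record in a FASTA-formatted string."""
    lengths = []
    current = None  # length of the record being read, None before the first header
    for line in fasta_text.splitlines():
        line = line.strip()
        if line.startswith(">"):
            if current is not None:
                lengths.append(current)
            current = 0
        elif line and current is not None:
            current += len(line)
    if current is not None:
        lengths.append(current)
    return lengths


# Same four records as /tmp/median.fasta in the report
EXAMPLE = ">one\nA\n>two\nAA\n>three\nAAA\n>four\nAAAA\n"
lengths = sequence_lengths(EXAMPLE)

conventional = statistics.median(lengths)  # mean of the two middle values
upper = statistics.median_high(lengths)    # larger of the two middle values
```

Here conventional is 2.5 and upper is 3, matching the expected value and the "Median read length: 3" line in the output respectively.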
Update staramr to version 1.0.0 in Galaxy (when released).
Need to add an option to allow for interleaved and/or compressed reads as well. The default output of fastq-dump from sratoolkit is compressed reads, so it would be best if the tool can work with them.
Please rename README_ASSEMBLY_STATS to README.rst, README.txt or similar so that the Galaxy Tool Shed will display the information.
During testing of bio_hansel's wrapper, we discovered that bio_hansel will fail Travis CI testing without attrs listed in the requirements. This shouldn't be necessary, as the recipe installs attrs anyway. We found this is an issue with the older versions of Galaxy and Conda. When Galaxy is updated to version 17.09, and Conda is updated to version 3.x.x, this workaround should no longer be needed.
Refer to 753a290 for the required work.
Galaxy's default Python version is now 3.X, where it was previously 2.7.X. Tools need to be updated.
We just re-installed gnuplot=5.0.4 on our CentOS 7 machines, and it looks like one of the dependencies of gnuplot (libwebp) got upgraded from libwebp.so.6 to libwebp.so.7. We are getting the following stack trace from the tool.
gnuplot: error while loading shared libraries: libwebp.so.6: cannot open shared object file: No such file or directory
** WARNING: GNUplot pipe returned non-zero status: '32512'
** ERROR: Failed to create /Galaxy/jobs/005/496/5496674/working/dataset_1_files/histogram_bins.dat.png'**
Upgrade gnuplot to the latest version and see if it fixes the issue and still works with Assemblystats.