kevinlibuit / seq_id

Makefile 7.94% Shell 92.06%

seq_id's Issues

README updates

Made a few small changes in 5c87fed.

A few issues:

The Makefile is listed under the "Scripts" heading, but it isn't really a script. Maybe pull it out and list it after everything else?
I'm not sure how to interpret the asterisked line:

*The majority of isolates sequenced at the DCLS are Salmonella spp. and Escherichia coli. For this reason, verification of species does little for quality assurance. E.g. if an entire sequencing run contains Salmonella enterica isolates of different serotypes, confirming the species will not help to identify mislabeled samples within that run.

add example output

Add an example output directory and link to it from the README.

split up oneliner at end of script

I haven't tried this yet, but this line seems to run many things at once. Stringing commands together with && requires each command to succeed before the next one starts. That is fine for simple things, like mkdir -p whatever && cd whatever, but in a longer chain like this one, if any step breaks you never reach the final make, and you probably won't know which step failed. To make this easier to troubleshoot, consider running each command on its own and wrapping it in an if/else that checks the return value, so the script echoes what it tried to do and tells the user if and where something failed.
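One way this could look is sketched below. It is only an illustration: run_step is a hypothetical helper name, and the two steps shown are placeholders, not the actual commands from the seq_id.sh one-liner.

```shell
#!/usr/bin/env bash
# Sketch: run each step on its own and report exactly where a failure occurs.
# run_step is a hypothetical helper, not part of seq_id.

run_step() {
    local description="$1"
    shift
    echo "Running: ${description}"
    if "$@"; then
        echo "OK: ${description}"
    else
        # $? still holds the exit status of the command tested by 'if'.
        local status=$?
        echo "FAILED (exit ${status}): ${description}" >&2
        return "${status}"
    fi
}

# Placeholder steps; substitute the real commands from the one-liner.
outdir="$(mktemp -d)"
run_step "create output directory" mkdir -p "${outdir}/results" || exit 1
run_step "enter output directory" cd "${outdir}/results" || exit 1
# The final build step would follow the same pattern, e.g.:
#   run_step "run make" make || exit 1
```

Because run_step runs the command in the current shell rather than a subshell, steps such as cd still take effect for the commands that follow.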

Consistency for fresh & restarted run

It appears that seq_id.sh sets a bunch of stuff up and also runs make in the appropriate directory. But if something fails, the user needs to cd into the output dir and run make themselves. Wouldn't it be better, for consistency, to have it work the same way in either case?

The easiest way is to have the script not run make at all, and simply echo a message instructing the user to enter the directory and run make themselves.
The alternative is to have the user re-run the script, in which case the script would need to check whether each step has already been done and, if so, skip it.
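For the re-run option, a minimal sketch of a "skip if already done" check is shown here. The helper name link_if_missing and the file names are made up for illustration; the real setup steps in seq_id.sh may need different checks.

```shell
#!/usr/bin/env bash
# Sketch: make a setup step safe to re-run by skipping work that is already done.
# link_if_missing and the file names below are hypothetical.

link_if_missing() {
    local target="$1" link="$2"
    # -e follows symlinks, so also test -L to catch dangling links.
    if [ -e "${link}" ] || [ -L "${link}" ]; then
        echo "Skipping, already present: ${link}"
    else
        ln -s "${target}" "${link}"
        echo "Linked: ${link}"
    fi
}

workdir="$(mktemp -d)"
mkdir -p "${workdir}/reads" "${workdir}/output"   # mkdir -p is already safe to repeat
touch "${workdir}/reads/sample_R1.fastq.gz"

# The first call does the work; the second call detects the link and skips it.
link_if_missing "${workdir}/reads/sample_R1.fastq.gz" "${workdir}/output/sample_R1.fastq.gz"
link_if_missing "${workdir}/reads/sample_R1.fastq.gz" "${workdir}/output/sample_R1.fastq.gz"
```

With every setup step guarded this way, a fresh run and a restarted run would go through the same script, and make itself already skips targets that are up to date.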

does `~/BaseSpace/...` exist?

Does ~/BaseSpace/... exist for all users? Should this be checked first? E.g.:

if [ -d "$DIRECTORY" ]; then
  # Control will enter here if $DIRECTORY exists.
  :  # no-op placeholder; bash rejects a 'then' body that holds only a comment
fi

Basemount typo

In the dependencies file, I think it says basement instead of basemount for BaseSpace. Should be an easy fix!

Split up basespace linking and fastq file inputs

This may just be a usability thing, so feel free to close this and ignore.

To me it feels weird to specify a basespace project rather than fastq files as input. What if you stop using basespace? What if basespace changes its layout/project structure? What if you start using basemount instead (maybe you already are)? What if you want to run this on data you got from some other lab that comes as just fastq files, not a basespace project?

I think it's more intuitive to have the user run the genomics pipeline on the raw genomics data, leaving things flexible, rather than requiring things to be in ~/BaseSpace or wherever. Split up the process of symlinking fastq files and actually running the pipeline?
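As a rough sketch of what the split could look like, the staging step might take any directory of fastq files and symlink them into the run directory, with no knowledge of BaseSpace. The function name stage_fastqs, its arguments, and the *.fastq.gz naming are assumptions for illustration, not the current seq_id interface.

```shell
#!/usr/bin/env bash
# Sketch: stage FASTQ files from any directory, independent of where they came from.
# stage_fastqs and its arguments are hypothetical, not the seq_id interface.

stage_fastqs() {
    local out_dir="$2"
    if [ ! -d "$1" ]; then
        echo "Error: input directory not found: $1" >&2
        return 1
    fi
    # Resolve to an absolute path so the symlinks stay valid from out_dir.
    local fastq_dir
    fastq_dir="$(cd "$1" && pwd)"
    mkdir -p "${out_dir}"
    local count=0 fq
    for fq in "${fastq_dir}"/*.fastq.gz; do
        [ -e "${fq}" ] || continue   # the glob matched nothing
        ln -sf "${fq}" "${out_dir}/$(basename "${fq}")"
        count=$((count + 1))
    done
    echo "Staged ${count} FASTQ file(s) in ${out_dir}"
}

# Demo with empty placeholder files.
demo="$(mktemp -d)"
mkdir -p "${demo}/reads"
touch "${demo}/reads/a_R1.fastq.gz" "${demo}/reads/a_R2.fastq.gz"
stage_fastqs "${demo}/reads" "${demo}/run1"
```

The pipeline itself would then only ever look at the staged directory, so a BaseSpace project, a basemount path, or a folder of files from another lab could all be handled by pointing the staging step at a different input directory.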
