tadkeys / tabsat
Targeted Amplicon Bisulfite Sequencing Analysis Tool
Hi. I've been trying to get the pipeline installed and feel like I put all the right pieces in place, but it seems a few things may not be right still, and I'm stumped. I have attached output streams from both test_tabsat_miseq.sh and test_tabsat_mouse.sh here. In the miseq test, it seems it runs fine but the .csv files produced in the latter stages are all empty except for the header line which (I think) messes up the final stages. In the mouse example it seems to have an error at the lollipop stage. In both cases, it seems to be missing some expected output files at the end, I'm presuming due to the earlier issues.
Also, when running either example, samtools sometimes throws an error at the sort stage even though the target file has been generated. This doesn't always happen and seems weird to me, but I have observed it a few times now. If the sort fails, then obviously the downstream processes are compromised. Fortunately, this doesn't seem to happen frequently.
Can you please look over the attached files and see if you can get any hint of what might not be right with my configuration? And if you have any insight into the samtools error that would be great too.
Finally, if it's of any use for you to know, I'm running in a RedHat 7 environment with 24G of RAM, but I am doing the work on an external hard drive due to lack of space on the hard drives of the machine.
Thanks,
John Martinson
I get the following error when running the MiSeq test (the mv line at the end of the log). It might come from the path (...)-zz_test/SRR3296596_1.fastq, which should not contain the '-' before 'zz_test'. I have tried to figure out why this character was inserted, but have not found the cause so far.
CMD subpopulations: /home/gcristofari/tools/tabsat/tools/MethylSubpop/subpopulations.sh -i /home/gcristofari/tools/tabsat/tabsat_test_output_miseq/COVERAGE_NONDIR_bowtie2/MethylSubpopulations -p 0.7 -t /home/gcristofari/tools/tabsat/tools/zz_test/target_list_miseq.csv
Starting with methylation pattern analysis
Output will be saved in /home/gcristofari/tools/tabsat/tabsat_test_output_miseq/COVERAGE_NONDIR_bowtie2/MethylSubpopulations/Output
Whole Target for /home/gcristofari/tools/tabsat/tabsat_test_output_miseq/COVERAGE_NONDIR_bowtie2/MethylSubpopulations/SRR3296596_trimmed_1.fastq_bismark_bt2_pe.sam
Intermediate Positions for /home/gcristofari/tools/tabsat/tabsat_test_output_miseq/COVERAGE_NONDIR_bowtie2/MethylSubpopulations/SRR3296596_trimmed_1.fastq_bismark_bt2_pe.sam
Paste intermediate Positions for /home/gcristofari/tools/tabsat/tabsat_test_output_miseq/COVERAGE_NONDIR_bowtie2/MethylSubpopulations/SRR3296596_trimmed_1.fastq_bismark_bt2_pe.sam
Intermediate Subpops for /home/gcristofari/tools/tabsat/tabsat_test_output_miseq/COVERAGE_NONDIR_bowtie2/MethylSubpopulations/SRR3296596_trimmed_1.fastq_bismark_bt2_pe.sam
Final Positions for /home/gcristofari/tools/tabsat/tabsat_test_output_miseq/COVERAGE_NONDIR_bowtie2/MethylSubpopulations/SRR3296596_trimmed_1.fastq_bismark_bt2_pe.sam
Paste final Positions for /home/gcristofari/tools/tabsat/tabsat_test_output_miseq/COVERAGE_NONDIR_bowtie2/MethylSubpopulations/SRR3296596_trimmed_1.fastq_bismark_bt2_pe.sam
Final Subpops for /home/gcristofari/tools/tabsat/tabsat_test_output_miseq/COVERAGE_NONDIR_bowtie2/MethylSubpopulations/SRR3296596_trimmed_1.fastq_bismark_bt2_pe.sam
Comparision of first and last methylation positions in all samples
Finding methylation subpopulations
Done with workflow
mv: cannot stat '/home/gcristofari/tools/tabsat/tools/-zz_test/SRR3296596_1.fastq /home/gcristofari/tools/tabsat/tools/zz_test/SRR3296596_2.fastq': No such file or directory
CMD patternmap: /home/gcristofari/tools/tabsat/tools/Patternmap/patternmap.sh -i /home/gcristofari/tools/tabsat/tabsat_test_output_miseq -s /home/gcristofari/tools/tabsat/tabsat_test_output_miseq/copied_inputs -t /home/gcristofari/tools/tabsat/tools/zz_test/target_list_miseq.csv
While running on the test data, the script reaches a point where no more STDERR or STDOUT is produced. These are the last lines of the combined STDERR and STDOUT:
---- Crete target lists in patternmap
-Patternmap:` removing *.log, *.target, *.jsons
.../tabsat/tabsat_test_output_miseq/Patternmap
.../tabsat/tools/Patternmap/patternmap.sh: line 181: $i: ambiguous redirect
.../tabsat/tools/Patternmap/patternmap.sh: line 182: $i: ambiguous redirect
The script keeps running for hours without any new output. The top command shows a tr process running.
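For what it's worth, bash usually prints "ambiguous redirect" when the word after `>` is an unquoted variable that expands to nothing or to more than one word (for example a path containing a space). Lines 181 and 182 of patternmap.sh are not shown in the report, so the snippet below is only a guess at the failure mode, not a reproduction of the real script. The file name and the variable contents are made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical illustration: the real lines 181-182 of patternmap.sh are not
# shown in the report, so this only demonstrates the general failure mode.
workdir="$(mktemp -d)"
i="${workdir}/sample one.target"   # assumed trigger: a path containing a space

# Unquoted: in bash, $i splits into two words and the redirect is refused.
if ( echo "data" > $i ) 2>/dev/null; then
    unquoted_result="ok"
else
    unquoted_result="failed"
fi

# Quoted: the redirect target stays a single word, so the write succeeds.
echo "data" > "$i"
quoted_result="$(cat "$i")"

echo "unquoted: ${unquoted_result}, quoted: ${quoted_result}"
```

If that is indeed the cause, quoting the redirect targets on those two lines (`> "$i"`) might be enough. An empty `$i` produces the same message, so it may also be worth checking that the loop variable is actually set.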
It is not indicated in which folder the reference genome should be saved once downloaded. In tabsat/reference?
In the check_quality.sh script, the following lines (l. 63 & l. 66) give an error:
samtools sort -o ${BAM_FILE} aa | ${INTERSECTBED} -v -a - -b ${QUALITY_DIR}/target_list.bed > ${NON_INTERSECT_BAM}
samtools sort -o ${BAM_FILE} aa | ${INTERSECTBED} -a - -b ${QUALITY_DIR}/target_list.bed > ${INTERSECT_BAM}
If I understand correctly, they should be replaced by:
samtools sort ${BAM_FILE} | ${INTERSECTBED} -v -a - -b ${QUALITY_DIR}/target_list.bed > ${NON_INTERSECT_BAM}
samtools sort ${BAM_FILE} | ${INTERSECTBED} -a - -b ${QUALITY_DIR}/target_list.bed > ${INTERSECT_BAM}
Stephan,
I think I found something that should be modified in the patternmap.sh script. The following line appears in there:
SAMPLE_C="${INDIR}/COVERAGE_NONDIR_${ALIGNER}/MethylSubpopulations/Output/SampleComparison.txt"
I was doing a "DIR" run so my outputs had gone to "COVERAGE_DIR_${ALIGNER}", therefore the following error message appeared at the pattern map stage (I assume due to the hard coding of "NONDIR" in the line above):
cp /media/MyBook/progs/tabsat-master/fhm_test_output_dir_all2/COVERAGE_NONDIR_bowtie2/MethylSubpopulations/Output/SampleComparison.txt /media/MyBook/progs/tabsat-master/fhm_test_output_dir_all2/Patternmap/All_targets.txt
cp: cannot stat ‘/media/MyBook/progs/tabsat-master/fhm_test_output_dir_all2/COVERAGE_NONDIR_bowtie2/MethylSubpopulations/Output/SampleComparison.txt’: No such file or directory
Thought you would like to know.
John
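One way to avoid the hard-coded "NONDIR" (and the hard-coded aligner name) could be to build the path from variables. This is a minimal sketch under assumptions: `sample_comparison_path` and its three arguments are hypothetical names, not something confirmed to exist in patternmap.sh, and the file name spelling follows the line quoted above.

```shell
#!/usr/bin/env bash
# Sketch only: sample_comparison_path and its arguments are hypothetical
# names, not variables confirmed to exist in patternmap.sh.
sample_comparison_path() {
    indir="$1"      # pipeline output directory
    lib_mode="$2"   # "DIR" or "NONDIR"
    aligner="$3"    # e.g. "bowtie2" or "tmap"
    case "$lib_mode" in
        DIR|NONDIR) ;;
        *) echo "unknown library mode: $lib_mode" >&2; return 1 ;;
    esac
    echo "${indir}/COVERAGE_${lib_mode}_${aligner}/MethylSubpopulations/Output/SampleComparison.txt"
}

# Example call with placeholder values.
SAMPLE_C="$(sample_comparison_path "/path/to/output" "DIR" "bowtie2")"
echo "$SAMPLE_C"
```

With something like this, a "DIR" run would look in COVERAGE_DIR_bowtie2 instead of COVERAGE_NONDIR_bowtie2, and an unexpected mode fails early rather than at the cp step.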
Got permission denied while trying to connect to the Docker daemon socket at unix:///var/run/docker.sock: Get http://%2Fvar%2Frun%2Fdocker.sock/v1.38/containers/tabsat/json: dial unix /var/run/docker.sock: connect: permission denied
The folder paths are hard-coded in multiple scripts, which makes it impossible to install tabsat anywhere other than in /home without manually editing all the .sh scripts. I would suggest creating a single configuration file that loads all the relevant paths and variables. Using the export command might be useful.
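To make the suggestion concrete, here is a rough sketch of what such a configuration file could look like. Everything in it is an assumption: the file name tabsat.conf and the variables `TABSAT_HOME`, `TABSAT_TOOLS` and `TABSAT_REFERENCE` are invented for illustration, and the example writes the file to a temporary directory only so that it can be run as-is.

```shell
#!/usr/bin/env bash
# Sketch of a central configuration file. The file name and all variable
# names (TABSAT_HOME, TABSAT_TOOLS, TABSAT_REFERENCE) are hypothetical.
conf_dir="$(mktemp -d)"
cat > "${conf_dir}/tabsat.conf" <<'EOF'
# Installation root; set TABSAT_HOME before sourcing to override the default.
export TABSAT_HOME="${TABSAT_HOME:-$HOME/tabsat}"
export TABSAT_TOOLS="${TABSAT_HOME}/tools"
export TABSAT_REFERENCE="${TABSAT_HOME}/reference"
EOF

# Each pipeline script would source the file instead of hard-coding paths.
TABSAT_HOME="/opt/tabsat"
. "${conf_dir}/tabsat.conf"
echo "$TABSAT_TOOLS"
```

Because the variables are exported, child scripts started by the main script would see the same paths without sourcing the file again.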
In the main tabsat script, the command lines (l. 347 & 366) for samtools sort should be corrected:
samtools sort "${current_sam}_removed_cov_one.bam" "${current_sam}_removed_cov_one_sorted"
into:
samtools sort "${current_sam}_removed_cov_one.bam" > "${current_sam}_removed_cov_one_sorted.bam"
and,
samtools sort "${current_sam}_removed_cov_one.bam" "${current_sam}_removed_cov_one_sorted"
into
samtools sort "${current_sam}_removed_cov_one.bam" > "${current_sam}_removed_cov_one_sorted.bam"
I don't know if this comes from a change in samtools syntax, but the current form does not work with samtools 1.3.1, which is installed on our server.
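As far as I know, samtools 1.3 dropped the legacy `samtools sort in.bam out.prefix` form in favour of the `-o FILE` and `-T PREFIX` options, which would explain the failure. A small wrapper like the one below could keep the call in one place. The function name `sort_bam` and the `DRY_RUN` switch are made up for this sketch, and the dry-run mode only prints the command so the example runs without samtools or real BAM files.

```shell
#!/usr/bin/env bash
# Sketch: sort a BAM file with the option-based syntax (-T/-o) instead of the
# legacy "in.bam out.prefix" form. sort_bam and DRY_RUN are hypothetical.
sort_bam() {
    in_bam="$1"
    out_bam="$2"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        # Only show the command that would be run.
        echo "samtools sort -T ${out_bam%.bam}.tmp -o ${out_bam} ${in_bam}"
    else
        # -T sets the prefix for temporary files, -o names the output file.
        samtools sort -T "${out_bam%.bam}.tmp" -o "$out_bam" "$in_bam"
    fi
}

DRY_RUN=1 sort_bam "sample_removed_cov_one.bam" "sample_removed_cov_one_sorted.bam"
```

Writing the output with `-o` rather than a shell redirect also means a failed sort does not leave behind an output file created by the redirect itself.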
Please provide a list of pre-requisites including:
When starting the test script ./test_tabsat_tmap.sh I get the following error (note that I get the same error when running tabsat manually on the same dataset or when using the MiSeq script):
#################################
#################################
In the patternmap.sh script, the variable SAMPLE_C contains the COVERAGE_NONDIR_tmap path, which does not exist if bowtie2 is used as the aligner: SAMPLE_C="${INDIR}/COVERAGE_NONDIR_tmap/MethylSubpopulations/Output/SampleComparision.txt"
This causes the test_tabsat_miseq.sh script to crash. An ALIGNER variable should be defined (see also this issue: #4).
The script fails to call a few Perl scripts from .../tabsat/tools/MethylSubpop/subpopulations.sh if tabsat is not installed in $HOME. The reason is that it expects the installation path to be $HOME/tabsat/...
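A possible fix would be for subpopulations.sh to resolve its own directory and call the helpers relative to that. The sketch below shows the idea with stand-in files: helper.sh and main.sh are placeholders created in a temporary directory, not the real TABSAT scripts, and the real helpers are Perl rather than shell.

```shell
#!/usr/bin/env bash
# Sketch: locate helper scripts relative to the calling script rather than
# $HOME. The file names below are stand-ins, not the real TABSAT scripts.
install_dir="$(mktemp -d)/anywhere/tabsat/tools/MethylSubpop"
mkdir -p "$install_dir"

# Stand-in for a helper that subpopulations.sh would call.
printf '#!/usr/bin/env bash\necho "helper ran"\n' > "${install_dir}/helper.sh"

# Stand-in for subpopulations.sh: it resolves its own directory first.
cat > "${install_dir}/main.sh" <<'EOF'
#!/usr/bin/env bash
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
bash "${SCRIPT_DIR}/helper.sh"
EOF

result="$(bash "${install_dir}/main.sh")"
echo "$result"
```

The same `SCRIPT_DIR` pattern should work no matter where the tabsat folder is unpacked, as long as the relative layout of the tools directory stays the same.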
Hi,
I recently received paired-end AmpliSeq data,
so I want to run Bismark using TMAP to align the reads to the reference.
But I guess the bismark script in tools/bismakr_tmap does not support paired-end reads when using TMAP,
so I would like to modify that bismark script.
Would you recommend anything?
Or is there some way to run TMAP on paired-end reads?
Best Regards.
Jeongmin