GithubHelp home page GithubHelp logo

write temporary bam files about tigmint HOT 16 CLOSED

bcgsc avatar bcgsc commented on August 19, 2024
write temporary bam files

from tigmint.

Comments (16)

lcoombe avatar lcoombe commented on August 19, 2024

Hi Dario,

I agree that it is probably not tigmint that caused the crash, since it was just bwa mem running at that stage. Using additional threads would speed up the bwa mem alignment step.
Splitting the alignment stage into multiple parallel jobs is certainly an option, and one that I have done before when dealing with a very large dataset.

You could:

  • split your chromium read file into partitions
  • run the above bwa mem command for each partition
  • run samtools merge -tBX to merge together the separate BAM files
  • Then, continue running the tigmint pipeline from after the bwa mem alignment step. You just have to ensure that the BX-sorted, merged BAM file is named correctly for tigmint to recognize that this file exists and can be used for subsequent steps: Ex. myassembly.myreads.sortbx.bam, where myassembly.fa is the draft assembly file, and myreads.fq.gz is the chromium reads file. You can use the -n option with the tigmint makefile to do a 'dry run' of the commands to double check that the pipeline starts up again after the alignment step as desired.

Hope that helps!
Lauren

from tigmint.

dcopetti avatar dcopetti commented on August 19, 2024

Hi Lauren,

Thanks for the suggestion, I will go for it.
I am new to using 10X data, and I am not sure about the structure of the fq.gz file in the input (I produced it with longranger basic).
If it is a normal fastq, I am thinking just to split it in blocks, or if the barcoding information is creating issues, is there a specific way to do it, like with Longranger?
Thanks,

Dario

from tigmint.

lcoombe avatar lcoombe commented on August 19, 2024

Hi Dario,

Yes, you can partition the fq.gz (post-Longranger basic) file the same as you would any other interleaved fastq file. The only special thing about the longranger basic output as opposed to another fastq file is the chromium barcode information encoded in the read header.

Ex:

[lcoombe@lcoombe01 outs]$ gunzip -c barcoded.fastq.gz |head -n 10000 |tail -n 8
@E00247:267:HMVT3CCXX:6:1103:27265:17922 BX:Z:AAACACCAGACAATAC-1
TTATGTGGCAAAACCCAGAAAGATCCATCATGAATCCAAGATACTTTCAGCAAAAAGTTATACCAAAATAAATAAAATAAAATTGAAATAATGCTTAGCTGATCCCAAGTCAAGATTTACGTTTGTAT
+
KKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKFFKKKKKKKKKKKKKKKKKKK
@E00247:267:HMVT3CCXX:6:1103:27265:17922 BX:Z:AAACACCAGACAATAC-1
GCCACGGTGCCCAGCTCACCTACTGGTTTTAAAGAGGGAACTCTGGGGGCAGTGTGGAGGGAGGGACCTACTGCCATAGAATACAAACGTAAATCTTGACTTGGGATCAGCTAAGCATTATTTCAATTTTATTTTATTTATTTTGGTATAA
+
AAFFFKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKFKKFKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKAFKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKAKKFAFK

So as long as your partitioning approach keeps the R1/R2 pairs together, and keeps the "BX:Z" comment in the read header, you will be fine! Running bwa mem with the -pC option as you have in your above command indicates an interleaved reads file, and will put the barcode information in the BX tag of the resulting SAM file, which tigmint will then use in downstream steps.

Hope that helps -- let me know if you have any other questions!
Lauren

from tigmint.

lcoombe avatar lcoombe commented on August 19, 2024

@dcopetti - I'll close this for now, but feel free to re-open if you have further questions!

from tigmint.

dcopetti avatar dcopetti commented on August 19, 2024

Hello,
I have the large bam file (with the sorting/tag/index issues of case #811 above) and I tried to run tigmint (span branch).
After commenting the commands in lines 142-154 in a copy of the script, I get this:

~/bin/tigmint/tigmint-make3 tigmint draft=genome reads=reads
make: Nothing to be done for 'tigmint'.

~/bin/tigmint/tigmint-make3 --dry-run
echo 'Tigmint: Correct misassemblies using linked reads'
echo 'Usage: tigmint-make [COMMAND]... [PARAMETER=VALUE]...'
echo 'Example: tigmint-make tigmint draft=myassembly reads=myreads'
echo 'For more information see https://bcgsc.github.io/tigmint/'

I am not sure how else I should compose the command to use the bam as an input instead.
Thanks,
Dario

from tigmint.

lcoombe avatar lcoombe commented on August 19, 2024

Hi Dario,

First off, you can just use the master branch -- the logic in the tigmint-span branch used to look for spanning molecules to identify misassemblies has been incorporated into the master branch.

Also, no need to comment out commands to ensure that the Makefile starts at the step that you want. You just need to ensure that your BAM file is named myassembly.myreads.sortbx.bam (using notation from the help page above), as expected by tigmint. You can use the --dry-run option with the full tigmint command to ensure that the pipeline is starting where you want it to.

It appears that tigmint detects that the target files of the pipeline are already present in your directory. Could you note the other files in your working directory, as well as what you named the BAM file?

from tigmint.

dcopetti avatar dcopetti commented on August 19, 2024

Hi,

I installed the newest version, but it is giving me an error, same as the older installation (both attempts are here below:

[copettid@kp141-242 tigmint]$ tigmint-make
Tigmint: Correct misassemblies using linked reads
Usage: tigmint-make [COMMAND]... [PARAMETER=VALUE]...
Example: tigmint-make tigmint draft=myassembly reads=myreads
For more information see https://bcgsc.github.io/tigmint/
[copettid@kp141-242 tigmint]$ tigmint-make tigmint
make: *** No rule to make target 'draft.reads.as0.65.nm5.molecule.tsv', needed by 'tigmint'.  Stop.
[copettid@kp141-242 tigmint]$ ~/bin/old_tigmint/tigmint-make
Tigmint: Correct misassemblies using linked reads
Usage: tigmint-make [COMMAND]... [PARAMETER=VALUE]...
Example: tigmint-make tigmint draft=myassembly reads=myreads
For more information see https://bcgsc.github.io/tigmint/
[copettid@kp141-242 tigmint]$ ~/bin/old_tigmint/tigmint-make tigmint
make: *** No rule to make target 'draft.reads.as0.65.nm5.molecule.tsv', needed by 'tigmint'.  Stop.
[copettid@kp141-242 tigmint]$ tigmint-make tigmint raft=Rabiosa_genome reads=reads
make: *** No rule to make target 'draft.reads.as0.65.nm5.molecule.tsv', needed by 'tigmint'.  Stop.

Also adding --dry-run gives me the same error.
The folder looks like this:

-rwx------ 1 copettid mpb 4.3G Feb 20 08:44 Rabiosa_genome.fa
-rw-r--r-- 1 copettid mpb 4.3G Feb 21 10:00 Rabiosa_genome.fa.bwt
-rw-r--r-- 1 copettid mpb 1.1G Feb 21 10:01 Rabiosa_genome.fa.pac
-rw-r--r-- 1 copettid mpb 8.9M Feb 21 10:01 Rabiosa_genome.fa.ann
-rw-r--r-- 1 copettid mpb 2.6M Feb 21 10:01 Rabiosa_genome.fa.amb
-rw-r--r-- 1 copettid mpb 2.2G Feb 21 10:22 Rabiosa_genome.fa.sa
lrwxrwxrwx 1 copettid mpb  103 Mar  2 08:53 input_fastqs -> '/home/copettid/public/Molecular Plant Breeding Group/Dario/Lolium/rabiosa_raw_data/10x_raw_data/fastqs/'
-rw-r--r-- 1 copettid mpb  344 Mar  5 10:08 transfer
lrwxrwxrwx 1 copettid mpb  144 Mar  5 10:11 reads.fq.gz -> '/home/copettid/public/Molecular Plant Breeding Group/Dario/Lolium/rabiosa_raw_data/10x_raw_data/longranger_basic_output/4606/outs/barcoded.fq.gz'
-rwx------ 1 copettid mpb 180G Mar 17 10:09 Rabiosa_genome.reads.sortbx.bam
-rw-r--r-- 1 copettid mpb 8.4M Mar 28 09:39 Rabiosa_genome.fa.fai

I am not sure what else I should do.
Thanks,

Dario

from tigmint.

lcoombe avatar lcoombe commented on August 19, 2024

Looks like you misspelled draft here?
[copettid@kp141-242 tigmint]$ tigmint-make tigmint raft=Rabiosa_genome reads=reads

from tigmint.

dcopetti avatar dcopetti commented on August 19, 2024

Dang!
That is stupid.

With the --dry-run looks like this:

tigmint-make tigmint draft=Rabiosa_genome reads=reads --dry-run
/home/copettid/bin/tigmint/bin/tigmint-molecule -b Rabiosa_genome.reads.sortbx.bam -o Rabiosa_genome.reads.as0.65.nm5.molecule.tsv -a 0.65 -n 5 -q 0 -d 50000
awk 'NR>1 { print $1"\t"$2-1"\t"$3-1"\tReads="$7",Size="$4",Mapq="$8",AS="$9",NM="$10",BX="$5",MI="$6"\t"$7 }' Rabiosa_genome.reads.as0.65.nm5.molecule.tsv \
        | sort -k1,1 -k2,2n -k3,3n >Rabiosa_genome.reads.as0.65.nm5.molecule.bed
awk '$3 - $2 >= 2000' Rabiosa_genome.reads.as0.65.nm5.molecule.bed >Rabiosa_genome.reads.as0.65.nm5.molecule.size2000.bed
/home/copettid/bin/tigmint/bin/tigmint-cut -p8 -w1000 -n20 -t0 -o Rabiosa_genome.reads.as0.65.nm5.molecule.size2000.trim0.window1000.span20.breaktigs.fa Rabiosa_genome.fa Rabiosa_genome.reads.as0.65.nm5.molecule.size2000.bed
ln -sf Rabiosa_genome.reads.as0.65.nm5.molecule.size2000.trim0.window1000.span20.breaktigs.fa Rabiosa_genome.tigmint.fa

Then I run it and I get this error:

tigmint-make tigmint draft=Rabiosa_genome reads=reads
/home/copettid/bin/tigmint/bin/tigmint-molecule -b Rabiosa_genome.reads.sortbx.bam -o Rabiosa_genome.reads.as0.65.nm5.molecule.tsv -a 0.65 -n 5 -q 0 -d 50000
Traceback (most recent call last):
  File "/home/copettid/bin/tigmint/bin/tigmint-molecule", line 16, in <module>
    import pysam
ImportError: No module named 'pysam'
/home/copettid/bin/tigmint/bin/tigmint-make:174: recipe for target 'Rabiosa_genome.reads.as0.65.nm5.molecule.tsv' failed
make: *** [Rabiosa_genome.reads.as0.65.nm5.molecule.tsv] Error 1

so I updated pysam:

conda install pysam
Fetching package metadata .................
Solving package specifications: .
Package plan for installation in environment /home/copettid/miniconda2:
The following NEW packages will be INSTALLED:
    bcftools: 1.7-0        bioconda
The following packages will be UPDATED:
    pysam:    0.8.4-py27_0 bioconda --> 0.14.1-py27_htslib1.7_0 bioconda
[...]

move to a new terminal and I get this:

tigmint-make tigmint draft=Rabiosa_genome reads=reads
make: *** No rule to make target 'reads.fq.gz', needed by 'Rabiosa_genome.reads.sortbx.bam'.  Stop.

as if it is not recognizing the bam file?

from tigmint.

lcoombe avatar lcoombe commented on August 19, 2024

Just checking -- are you in the same working directory as before? It looks like it can't see the file named reads.fq.gz?

from tigmint.

dcopetti avatar dcopetti commented on August 19, 2024

You are right, the links were not activated.
But the error with pysam persists:

/home/copettid/bin/tigmint/bin/tigmint-molecule -b Rabiosa_genome.reads.sortbx.bam -o Rabiosa_genome.reads.as0.65.nm5.molecule.tsv -a 0.65 -n 5 -q 0 -d 50000
Traceback (most recent call last):
  File "/home/copettid/bin/tigmint/bin/tigmint-molecule", line 16, in <module>
    import pysam
ImportError: No module named 'pysam'
/home/copettid/bin/tigmint/bin/tigmint-make:174: recipe for target 'Rabiosa_genome.reads.as0.65.nm5.molecule.tsv' failed

should I maybe redownlaod tigmint after having updated pysam?

from tigmint.

dcopetti avatar dcopetti commented on August 19, 2024

I just downloaded again tigmint, same error

from tigmint.

lcoombe avatar lcoombe commented on August 19, 2024

Can you double check that the python3 that you want to use is being found (ie. in your PATH)?
Ex: when you use which python3, you see the path to your desired miniconda installation?

from tigmint.

dcopetti avatar dcopetti commented on August 19, 2024

Nope, it is the one in /usr/bin/python3.
I will install it tomorrow with my colleague,
thanks for the help!

from tigmint.

lcoombe avatar lcoombe commented on August 19, 2024

No problem! Let me know if you encounter any other issues.

from tigmint.

dcopetti avatar dcopetti commented on August 19, 2024

Hi,
After installing python3.5 and updating the dependencies, I am now able to run tigmint as
tigmint tigmint draft=Rabiosa_genome reads=reads span=4 nm=4 dist=20000 t=8
Regarding the parameters, are the default already stringent? My goal is to break scaffolds with low (or questionable) 10x support.
In parallel, I am now running longranger wgs to visualize the linked reads around my putative junctions. Will ~12x coverage be enough? Or else I can use all the reads (~25x), but it will take longer.
Thanks for the support, very nice of you!

from tigmint.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.