
How to use this tool (paired-seq issue, closed)

cxzhu commented on September 26, 2024
how to use this tool

from paired-seq.

Comments (10)

cxzhu commented on September 26, 2024

@fangling0913

@cxzhu
Sorry, it still doesn't work.

This is the SAM file:
SRR8980195.1:TTTTGCTACT:GGGGTGAGTGGTGCCTGGAGGCTAACTCAAGAACACTGGCGTCTCCTTCATTG:DDCDDIIIIHHHHIIHHHHHHIIGIIIIIIHIIIIIIIHIIHIIGIIIIIGFH 0 19:03:55:05 1 255 24M * 0 0 ACGATTGAAACGTCCAAAATGAGG DDBDDIIGHIIIIG?EGEEHIIII XA:i:2 MD:Z:15C7A0 NM:i:2 XM:i:2

'my @sp = split/:/, $tmp[0];
my $readname = "@".$sp[0].":".$sp[1].":".$sp[2].":".$sp[3].":".$sp[4].":".$sp[5].":".$sp[6].":".$cell_id.":".$sp[7];
my $read = $sp[8];'

@sp has only four items, so how should I write $sp[4] through $sp[8]?
Is there something wrong with my SAM file?

I finally got the point!

The read name in a standard SAM file produced from Illumina FASTQ files has this format: "7001113:1032:HF3CJBCX3:1:1107:8388:2231:TTATTCATAA:CGGACGGACGAATGCTCTGGCCTCTCAAGCACGTGGATTGCTACGGGTCTGAGTTCGCACCGAAACATCGGCCACGTACCTCCTTTATGAATAA:@d@DDHHCHCC?GECG@@HG1CHIFHIC<FCGHFHHEE<CGEHIHFDHDHHHIFGEGHHD<H/<CCEEEGDDHIHIHIHIHH11C@HIIFHEHH"

For FASTQ files dumped from SRA, the read name is shortened to a single field, so please modify the code to: '
my $readname = "@".$sp[0].":".$cell_id.":".$sp[1];
my $read = $sp[2];
'
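
The same conversion can be sketched as a shell function that handles both read-name layouts with one command. This is only an illustration, assuming awk is available. The function name sam_to_fastq and its argument are hypothetical and not part of reachtools or convert.pl. The argument is the number of colon-separated fields in the original read name: 7 for the Illumina-style name above, 1 for names dumped from SRA. As in the Perl script, the quality string is taken from the end of the first column by length, because quality strings may themselves contain ":".

```shell
# Hypothetical helper: sam_to_fastq NAME_FIELDS, reading SAM text on stdin.
# NAME_FIELDS = 7 for Illumina-style read names, 1 for SRA-dumped names.
sam_to_fastq() {
  awk -v n="$1" '
    BEGIN { n += 0 }                 # force the argument to a number
    /^@/      { next }               # skip SAM header lines
    $3 == "*" { next }               # skip reads with no barcode alignment
    {
      k = split($1, sp, ":")
      if (k < n + 3) next            # too few fields for this layout
      name = sp[1]
      for (i = 2; i <= n; i++) name = name ":" sp[i]
      tag  = sp[n + 1]               # short sequence stored before the read
      read = sp[n + 2]
      # quality = last length(read) characters of the first column
      qual = substr($1, length($1) - length(read) + 1)
      printf "@%s:%s:%s\n%s\n+\n%s\n", name, $3, tag, read, qual
    }'
}
```

For an SRA-derived file this could be run as `sam_to_fastq 1 < CZ${s}.sam | gzip > CZ${s}_cov.fq.gz`, which mirrors the output name used by the Perl script.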


cxzhu commented on September 26, 2024

Can you upload the error messages?

I downloaded the raw reads and tried to extract the valid RNA data, but 'proc_RNA.sh' does not run correctly. How can I get RNA data with the correct barcodes from the raw reads (SRR8980195)?


fangling0913 commented on September 26, 2024

@cxzhu
proc_RNA.sh
'''
s=$1

reachtools combine CZ${s}_R1.fq.gz CZ${s}_R2.fq.gz
zcat CZ${s}_combined.fq.gz | bowtie /projects/ps-renlab/chz272/genome_ref/cell_id/cell_id -p 8 -v 3 --norc - -S CZ${s}.sam

reachtools convert CZ${s}.sam
trim_galore -a AAAAAAAAAAAAAAAACCTGCAGGNNNNNNNNNN CZ${s}_cov.fq.gz
trim_galore -a GGGGGGNNNNNNNNNNNNNNNN CZ${s}_cov_trimmed.fq.gz

STAR --runThreadN 6 --genomeDir /projects/ps-renlab/chz272/genome_ref/refdata-cellranger-mm10-3.0.0/star/ --readFilesIn CZ${s}_cov_trimmed_trimmed.fq.gz --readFilesCommand zcat --outFileNamePrefix CZ${s}_mm10
samtools sort CZ${s}_mm10Aligned.sam CZ${s}_mm10
reachtools rmdup CZ${s}_mm10.bam

reachtools bam2Mtx CZ${s}_mm10.bam mm10

'''

  1. The raw reads contain both RNA and DNA reads, but I can't split them in this pipeline.
  2. The 'convert' tool always reports this error:
    [1]+ Segmentation fault (core dumped)
    I noticed that there are similar tools: convertReads and convRead2. What's the difference?


cxzhu commented on September 26, 2024

  1. The raw reads only contain RNA reads. The DNA reads are in separate files.
  2. Can you also upload the output logs from "reachtools combine" and "bowtie"? convertReads and convRead2 are obsolete; only the "convert" function works.


fangling0913 commented on September 26, 2024

@cxzhu
To save time, I tried with only the first 1 million reads.
Log from "reachtools combine":
'''
1000000 read pairs processed.
454235 read pairs passed docking rate.
'''

log from "bowtie"
'''
reads processed: 454235
reads with at least one reported alignment: 408336 (89.90%)
reads that failed to align: 45899 (10.10%)
Reported 408336 alignments
'''
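
The rates implied by the two logs above can be checked with a small shell helper. This is only a sketch: the function name pct is hypothetical, and it assumes awk is available.

```shell
# Hypothetical helper: pct PART TOTAL prints PART as a percentage of TOTAL
# with two decimal places.
pct() {
  awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", 100 * a / b }'
}

pct 454235 1000000   # share of read pairs that passed "reachtools combine"
pct 408336 454235    # share of combined reads that bowtie aligned
```

The second value should agree with the alignment percentage that bowtie reports in its own log.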


cxzhu commented on September 26, 2024


The C error message gives very little information, so please use the following Perl script instead of the "convert" function:
`
#!/usr/bin/perl
use strict;
use warnings;

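open IN, "<", $ARGV[0] or die $!;                            # input SAM file
my $name = substr($ARGV[0],0,length($ARGV[0])-4);           # strip the ".sam" suffix
open OUT, "|gzip - > ${name}_cov.fq.gz" or die $!;           # braces keep Perl from reading this as $name_cov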

while(<IN>){
next if m/^@/;
my @tmp = split/\s+/, $_;
next if $tmp[2] eq '*';
my $cell_id = $tmp[2];
my @sp = split/:/, $tmp[0];
my $readname = "@".$sp[0].":".$sp[1].":".$sp[2].":".$sp[3].":".$sp[4].":".$sp[5].":".$sp[6].":".$cell_id.":".$sp[7];
my $read = $sp[8];
my $l = length($read);
my $mark = "+";
my $qual = substr($tmp[0], -$l, $l);
print OUT "$readname\n$read\n$mark\n$qual\n";
}
close IN;
close OUT;
`


fangling0913 commented on September 26, 2024

@cxzhu
This is the error message from the while loop of the Perl script. I can't tell what you intended to write there.
'
Use of uninitialized value $_ in pattern match (m//) at /data2/gminix/project_new/fangling/software/Paired-seq/convert.pl line 10.
Use of uninitialized value $_ in split at /data2/gminix/project_new/fangling/software/Paired-seq/convert.pl line 11.
Use of uninitialized value $tmp[2] in string eq at /data2/gminix/project_new/fangling/software/Paired-seq/convert.pl line 12.
Use of uninitialized value in split at /data2/gminix/project_new/fangling/software/Paired-seq/convert.pl line 14.
Use of uninitialized value $sp[0] in concatenation (.) or string at /data2/gminix/project_new/fangling/software/Paired-seq/convert.pl line 15.
Use of uninitialized value in concatenation (.) or string at /data2/gminix/project_new/fangling/software/Paired-seq/convert.pl line 15.
Use of uninitialized value in concatenation (.) or string at /data2/gminix/project_new/fangling/software/Paired-seq/convert.pl line 15.
Use of uninitialized value in concatenation (.) or string at /data2/gminix/project_new/fangling/software/Paired-seq/convert.pl line 15.
Use of uninitialized value in concatenation (.) or string at /data2/gminix/project_new/fangling/software/Paired-seq/convert.pl line 15.
Use of uninitialized value in concatenation (.) or string at /data2/gminix/project_new/fangling/software/Paired-seq/convert.pl line 15.
Use of uninitialized value in concatenation (.) or string at /data2/gminix/project_new/fangling/software/Paired-seq/convert.pl line 15.
Use of uninitialized value $cell_id in concatenation (.) or string at /data2/gminix/project_new/fangling/software/Paired-seq/convert.pl line 15.
Use of uninitialized value in concatenation (.) or string at /data2/gminix/project_new/fangling/software/Paired-seq/convert.pl line 15.
Use of uninitialized value $l in negation (-) at /data2/gminix/project_new/fangling/software/Paired-seq/convert.pl line 19.
Use of uninitialized value $l in substr at /data2/gminix/project_new/fangling/software/Paired-seq/convert.pl line 19.
Use of uninitialized value in substr at /data2/gminix/project_new/fangling/software/Paired-seq/convert.pl line 19.
Use of uninitialized value $read in concatenation (.) or string at /data2/gminix/project_new/fangling/software/Paired-seq/convert.pl line 20.
Use of uninitialized value $_ in pattern match (m//) at /data2/gminix/project_new/fangling/software/Paired-seq/convert.pl line 10.
'


cxzhu commented on September 26, 2024

Please refer to "https://github.com/cxzhu/Paired-seq/blob/master/convert.pl". I think GitHub masked some symbols in this comment box.


fangling0913 commented on September 26, 2024

@cxzhu
Sorry, it still doesn't work.

This is the SAM file:
SRR8980195.1:TTTTGCTACT:GGGGTGAGTGGTGCCTGGAGGCTAACTCAAGAACACTGGCGTCTCCTTCATTG:DDCDDIIIIHHHHIIHHHHHHIIGIIIIIIHIIIIIIIHIIHIIGIIIIIGFH 0 19:03:55:05 1 255 24M * 0 0 ACGATTGAAACGTCCAAAATGAGG DDBDDIIGHIIIIG?EGEEHIIII XA:i:2 MD:Z:15C7A0 NM:i:2 XM:i:2

'my @sp = split/:/, $tmp[0];
my $readname = "@".$sp[0].":".$sp[1].":".$sp[2].":".$sp[3].":".$sp[4].":".$sp[5].":".$sp[6].":".$cell_id.":".$sp[7];
my $read = $sp[8];'

@sp has only four items, so how should I write $sp[4] through $sp[8]?
Is there something wrong with my SAM file?
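
A quick way to see which read-name layout a SAM file uses is to count the colon-separated fields in the first column of the first alignment line. This is a minimal sketch, assuming awk is available; the function name count_name_fields is hypothetical. Note that the count can come out higher than the layout suggests when the quality string itself contains ":".

```shell
# Hypothetical helper: print the number of colon-separated fields in the
# first column of the first non-header SAM line read from stdin.
count_name_fields() {
  awk '!/^@/ { print split($1, sp, ":"); exit }'
}
```

It could be run as `count_name_fields < CZ${s}.sam`. The SAM line shown above gives 4 (read name, short sequence, read, quality).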


fangling0913 commented on September 26, 2024

@cxzhu It works now. Thanks

