hadasvolk / complabngs

Computational Lab in Next Generation Sequencing and Genomics Data Analysis - TAU 0411358701

License: MIT License

Roff 99.95% Jupyter Notebook 0.05%

complabngs's Issues

igv problem

Hi,
IGV has started looking blurry, which is very strange and makes it impossible to work with.
I tried updating, reinstalling, closing and reopening Ubuntu, and searching online, but nothing worked.
If anyone is familiar with this and has a solution, that would be great.

Ex6 - installing GATK

I'm having trouble installing GATK. This is what I did:

  1. Downloaded GATK from their website, put it in my repository, and extracted the zip file.
  2. Tried to change the PATH in the ".bashrc" file and run the command "gatk- register".
    That didn't work.

I also tried to install GATK via conda. That didn't work either.

I would appreciate some help on how to install GATK.

Lesson9_Submission of two files

Hello Hadas,

I understood that we have to submit two Jupyter notebooks, correct?
Moodle has a limit of one file per submission.
Can you update this so we can submit two files, or suggest a different solution?

Cheers!

Ex8 Q3

I think I did things right, but I can't find these genes in IGV (YDL083c, YEL003W, YKL180W, YNL162W).
I loaded the reference genome file - S288C_reference_sequence_R64-2-1_20150113.fasta,
the annotation - S288C_reference_annotation_proc.gff,
and the STAR BAM file, indexed with samtools - star_mapAligned.sortedByCoord.out.bam.

Where might I be going wrong?
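
One thing that may be worth checking first is whether the four identifiers appear in the annotation file at all, and under which sequence name. The sketch below is only a diagnostic under stated assumptions: it assumes a tab-separated GFF with the attributes in the ninth column, it matches IDs case-insensitively (the list above mixes "c" and "C"), and the function name find_genes_in_gff and the toy coordinates are made up for illustration.

```python
def find_genes_in_gff(lines, gene_ids):
    """Map each gene ID to the (seqid, start, end) of the features mentioning it."""
    wanted = {g.upper(): g for g in gene_ids}
    hits = {g: [] for g in gene_ids}
    for line in lines:
        if line.startswith("#"):
            continue  # header and directive lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue  # not a feature line (blank line, FASTA section, ...)
        attributes = fields[8].upper()
        for upper_id, original in wanted.items():
            if upper_id in attributes:
                hits[original].append((fields[0], int(fields[3]), int(fields[4])))
    return hits


# Toy input with made-up coordinates; for the real file, pass an open file handle instead.
toy_gff = [
    "##gff-version 3\n",
    "chrIV\tSGD\tgene\t100\t200\t.\t-\t.\tID=YDL083C;Name=YDL083C\n",
]
print(find_genes_in_gff(toy_gff, ["YDL083c", "YEL003W"]))
```

If a gene is listed, the sequence name in the first field can be compared with the sequence names in the FASTA; if the two files name the chromosomes differently, IGV may not be able to place the annotation on the genome.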

Zoom link

Hello,
Could you provide a Zoom link to the meeting so people can join?
Thanks

GATK AddOrReplaceReadGroups activation (hw 6)

Hello :)
After a successful installation of GATK, I tried to run this command:
$ gatk AddOrReplaceReadGroups -I "SRR1569760_vs_S288C.HQ.sort.bam" -O "out.bam" -LB READS -PL ILLUMINA -PU 1 -SM SRR1569760 --CREATE_INDEX

but I got this error -

Using GATK jar /home/nofar/gatk-4.5.0.0/gatk-package-4.5.0.0-local.jar
Running:
    java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -jar /home/nofar/gatk-4.5.0.0/gatk-package-4.5.0.0-local.jar AddOrReplaceReadGroups -I SRR1569760_vs_S288C.HQ.sort.bam -O out.bam -LB READS -PL ILLUMINA -PU 1 -SM SRR1569760 --CREATE_INDEX
Traceback (most recent call last):
  File "/home/nofar/gatk-4.5.0.0/gatk", line 511, in <module>
    main(sys.argv[1:])
  File "/home/nofar/gatk-4.5.0.0/gatk", line 177, in main
    runGATK(sparkRunner, sparkSubmitCommand, dryRun, gatkArgs, sparkArgs, javaOptions, debugPort, debugSuspend)
  File "/home/nofar/gatk-4.5.0.0/gatk", line 360, in runGATK
    runCommand(cmd, dryrun)
  File "/home/nofar/gatk-4.5.0.0/gatk", line 416, in runCommand
    check_call(cmd, env=gatk_env)
  File "/home/nofar/miniconda3/envs/mamba/lib/python2.7/subprocess.py", line 185, in check_call
    retcode = call(*popenargs, **kwargs)
  File "/home/nofar/miniconda3/envs/mamba/lib/python2.7/subprocess.py", line 172, in call
    return Popen(*popenargs, **kwargs).wait()
  File "/home/nofar/miniconda3/envs/mamba/lib/python2.7/subprocess.py", line 394, in __init__
    errread, errwrite)
  File "/home/nofar/miniconda3/envs/mamba/lib/python2.7/subprocess.py", line 1047, in _execute_child
    raise child_exception
OSError: [Errno 2] No such file or directory

The BAM file name in the command matches the file in my directory. I don't understand what the problem is, and I would appreciate some help :)
Thanks in advance,

Nofar :-]
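
One possible reading of the traceback above: the OSError is raised inside Python's subprocess module while the gatk wrapper is launching the java command, which usually means the executable itself (here java) could not be found on PATH, rather than the BAM file. The sketch below checks for that. It assumes Python 3 (shutil.which does not exist in the Python 2.7 shown in the traceback), and the function name missing_executables is made up for illustration.

```python
import shutil


def missing_executables(names):
    """Return the names that cannot be found on the current PATH."""
    return [name for name in names if shutil.which(name) is None]


if __name__ == "__main__":
    # "java" is what the gatk wrapper launches; "gatk" is the wrapper itself.
    missing = missing_executables(["java", "gatk"])
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("java and gatk are both on PATH")
```

If java is reported as missing, installing a Java runtime into the active environment and rerunning the same gatk command would be the next thing to try.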

trimming does not work

I have an issue running Trimmomatic. It seems like I am missing a file needed to run the command. It says that it cannot find one of the files; however, if I list everything in my home directory, this file appears.

I tried deleting ALL files in the environment and rerunning everything, but the same issue appears.
Can anyone help?

The command line I am running:
trimmomatic PE SRR1569760_sub_1.fastqc SRR1569760_sub_2.fastqc -baseout SRR1569760 ILLUMINACLIP:NexteraPE-PE.fa:2:30:10

This is the output:
TrimmomaticPE: Started with arguments:
SRR1569760_sub_1.fastqc SRR1569760_sub_2.fastqc -baseout SRR1569760 ILLUMINACLIP:NexteraPE-PE.fa:2:30:10
Multiple cores found: Using 4 threads
Using templated Output files: SRR1569760_1P SRR1569760_1U SRR1569760_2P SRR1569760_2U
Using PrefixPair: 'AGATGTGTATAAGAGACAG' and 'AGATGTGTATAAGAGACAG'
Using Long Clipping Sequence: 'GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAG'
Using Long Clipping Sequence: 'TCGTCGGCAGCGTCAGATGTGTATAAGAGACAG'
Using Long Clipping Sequence: 'CTGTCTCTTATACACATCTCCGAGCCCACGAGAC'
Using Long Clipping Sequence: 'CTGTCTCTTATACACATCTGACGCTGCCGACGA'
ILLUMINACLIP: Using 1 prefix pairs, 4 forward/reverse sequences, 0 forward only sequences, 0 reverse only sequences
Exception in thread "main" java.io.FileNotFoundException: SRR1569760_sub_1.fastqc (No such file or directory)
at java.io.FileInputStream.open0(Native Method)
at java.io.FileInputStream.open(FileInputStream.java:195)
at java.io.FileInputStream.<init>(FileInputStream.java:138)
at org.usadellab.trimmomatic.fastq.FastqParser.parse(FastqParser.java:135)
at org.usadellab.trimmomatic.TrimmomaticPE.process(TrimmomaticPE.java:265)
at org.usadellab.trimmomatic.TrimmomaticPE.run(TrimmomaticPE.java:555)
at org.usadellab.trimmomatic.Trimmomatic.main(Trimmomatic.java:80)
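
A possible cause, judging only from the command above: the inputs are named with a ".fastqc" extension, while read files usually end in ".fastq" (FastQC is the name of the quality-control tool). Since the directory listing is not shown, this is a guess. The sketch below looks for the exact name in a directory and, if it is absent, lists the most similar names. The function name suggest_existing is made up for illustration.

```python
import difflib
import os


def suggest_existing(filename, directory="."):
    """Return [filename] if it exists in directory, else the closest-named entries."""
    entries = os.listdir(directory)
    if filename in entries:
        return [filename]
    # Best match first; an empty list means nothing similar was found.
    return difflib.get_close_matches(filename, entries, n=3, cutoff=0.6)


if __name__ == "__main__":
    for name in ["SRR1569760_sub_1.fastqc", "SRR1569760_sub_2.fastqc"]:
        print(name, "->", suggest_existing(name))
```

If the suggestions show the same names ending in ".fastq", passing those names to trimmomatic instead should get past the FileNotFoundException.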

Qualimap installation

Hello :)

I couldn't install Qualimap and I don't know why.
I tried to install it following this manual - http://qualimap.conesalab.org/doc_html/intro.html#installation, with no success.
I also tried to run the qualimap command from inside the qualimap directory -
qualimap_v2.3 rnaseq -bam /home/nofar/CompLabNGS/8-RNA1/sortedAligned.sortedByCoord.out.bam -gtf /home/nofar/CompLabNGS/8-RNA1/S288C_reference_annotation_proc.gtf
with no success.

I would be happy to get help,
Thanks in advance :)

How to convert DESeq results to a DataFrame?

Hello :)

After dealing with a lot of errors, I finally got the expected table out of the DESeq script -
`Log2 fold change & Wald test p-value: strain S288C vs RM11
      baseMean  log2FoldChange     lfcSE       stat    pvalue  padj
0     0.000000             NaN       NaN        NaN       NaN   NaN
1     0.000000             NaN       NaN        NaN       NaN   NaN
2     0.000000             NaN       NaN        NaN       NaN   NaN
3     0.160284        0.879404  2.528475   0.347800  0.727990   NaN
4     2.424527        4.707459  2.135109   2.204786  0.027469   NaN
...        ...             ...       ...        ...       ...   ...
6692  0.000000             NaN       NaN        NaN       NaN   NaN
6693  1.950290       -0.904463  1.170501  -0.772715  0.439691   NaN
6694  0.641902        1.432367  1.793678   0.798564  0.424543   NaN
6695  2.444357        0.447478  1.025930   0.436168  0.662715   NaN
6696  1.110214        0.328981  1.382793   0.237910  0.811951   NaN

[6697 rows x 6 columns]`

I am now trying to move forward and filter the table by log2FoldChange > 1 and pvalue < 0.05, but it seems that the results DataFrame I'm trying to get is empty. This is my code -

import pandas as pd
from pydeseq2.default_inference import DefaultInference
from pydeseq2.dds import DeseqDataSet
from pydeseq2.ds import DeseqStats


# Read sample info into DataFrame with 'sample' column as index
sample_info_df = pd.read_csv('sample_info.tsv', sep='\t', index_col='sample')

# Read count data into DataFrame (assuming counts.tsv is in the same directory)
count_data_df = pd.read_csv('counts.tsv', sep='\t', skiprows=[0])

# Drop irrelevant columns
count_data_df = count_data_df.drop(columns=['Geneid', 'Chr', 'Start', 'End', 'Strand', 'Length'])

# Transpose the count data
transposed_df = count_data_df.transpose()

# Create an instance of DefaultInference
inference = DefaultInference(n_cpus=8)

# Create a DESeqDataSet object
dds = DeseqDataSet(
    counts=transposed_df,
    metadata=sample_info_df,
    design_factors=['batch', 'strain'],
    refit_cooks=True,
    inference=inference
)

# Run DESeq2 analysis
dds.deseq2()
results = DeseqStats(dds, inference=inference)

print(results.summary())

summary_df = pd.DataFrame(results.summary())

filtered_results = summary_df[summary_df['pvalue'] < 0.05]

and these are the errors I'm getting (also with log2FoldChange filtering) (ignore the padj I tried to filter by, my mistake)

AttributeError: 'NoneType' object has no attribute 'summary'

File "/home/nofar/new.py", line 40, in <module>
    filtered_results = results.summary()[(results.summary()['log2FoldChange'] > 1) & (results.summary()['padj'] < 0.05)]
                       ~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^
TypeError: 'NoneType' object is not subscriptable

I would be happy to get some help,
Thanks in advance, Nofar :-]
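
Both errors above are consistent with summary() printing the table and returning None. In PyDESeq2 the table is kept on the DeseqStats object as the results_df attribute once summary() has run, so the filter can be applied there. The sketch below assumes pandas is installed and uses a toy table built from three rows of the printed output; the function name filter_significant is made up for illustration.

```python
import pandas as pd


def filter_significant(results_df, lfc_min=1.0, p_max=0.05, p_col="pvalue"):
    """Keep rows with log2FoldChange > lfc_min and p_col < p_max (NaN rows drop out)."""
    mask = (results_df["log2FoldChange"] > lfc_min) & (results_df[p_col] < p_max)
    return results_df[mask]


# Toy table with rows 0, 3 and 4 from the output above. With PyDESeq2 one would use:
#   results.summary()                # prints the table, returns None
#   summary_df = results.results_df  # the DataFrame itself
toy = pd.DataFrame(
    {
        "log2FoldChange": [float("nan"), 0.879404, 4.707459],
        "pvalue": [float("nan"), 0.727990, 0.027469],
    },
    index=[0, 3, 4],
)
print(filter_significant(toy))
```

Comparisons against NaN evaluate to False, so genes without a computed fold change or p-value are dropped by the mask without any extra handling.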

Installing/Using Seaborn

I am having major problems installing seaborn. JupyterLab does not let me install it, and I have no other way of using Python...

import seaborn as sns

ModuleNotFoundError Traceback (most recent call last)
Cell In[2], line 1
----> 1 import seaborn as sns

ModuleNotFoundError: No module named 'seaborn'

I am quite stuck on this issue and do not know how else to solve it.
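
A common reason for this error is that the package was installed into a different Python environment than the one the notebook kernel runs. The sketch below, which can be pasted into a notebook cell, checks whether the running interpreter can see seaborn and builds a pip command tied to that same interpreter. The function names is_installed and pip_install_command are made up for illustration, and whether this matches the setup here is an assumption.

```python
import importlib.util
import sys


def is_installed(module_name):
    """True if the running interpreter can import module_name."""
    return importlib.util.find_spec(module_name) is not None


def pip_install_command(package):
    """Build a pip command bound to the interpreter this kernel is running."""
    return [sys.executable, "-m", "pip", "install", package]


if not is_installed("seaborn"):
    # Run the printed command in a terminal, then restart the kernel.
    print("Run:", " ".join(pip_install_command("seaborn")))
```

Inside a notebook cell, %pip install seaborn is the built-in IPython shortcut that installs into the kernel's own environment; restarting the kernel afterwards may be needed before the import works.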

Can't properly install QUAST

I used the QUAST 5.2.0 manual to try to install QUAST, asked ChatGPT for help, and googled how to install it. I installed a bunch of packages, and I'm not sure they are the correct ones. I also don't understand what you mean by running QUAST on assembly 4. I eventually managed to get some sort of QUAST command to work: I start the command with "python quast.py" followed by the file paths of the contigs and scaffolds. This creates a new directory containing nothing but a log file, which reports a number of errors about the utf-8 codec and some other unclear errors.
Sorry if I'm oversharing, but I'm a little frustrated after a few hours of trying to figure this out. Is there a way to start from scratch, to understand how to install, access, and use assembly4 (when did you talk about it?) and also QUAST?

Jupyter notebook

Unable to connect to the Jupyter notebook; the link in the presentation is incorrect.

Ex8 - gtf file is missing

Hi,
In Ex8, the file S288C_reference_annotation_proc.gtf is missing from the data directory (and it is needed to run Qualimap).
