stevenwingett / lmb-nextflow

Configuration files and other information regarding the Nextflow setup at the LMB

Python 46.18% Shell 3.15% Nextflow 13.32% R 24.22% Dockerfile 13.13%

lmb-nextflow's People

Contributors

stevenwingett

lmb-nextflow's Issues

Should we add new environment variables?

Should we add the following to the script that sets up Nextflow on the cluster?

# Do not fill up your home directory with cache files

mkdir -p /beegfs3/$USER/NEXTFLOW/NXF_HOME
mkdir -p /beegfs3/$USER/NEXTFLOW/NXF_TEMP

export NXF_HOME=/beegfs3/$USER/NEXTFLOW/NXF_HOME
export NXF_TEMP=/beegfs3/$USER/NEXTFLOW/NXF_TEMP

Correct STAR index listing in genome_overview.txt

The genome download script 'download_genomes.py' needs to give the correct STAR index folder in the output file 'genome_overview.txt'.

e.g. change:
star = '/public/genomics/species_references/nextflow/Genome_References/Ensembl/saccharomyces_cerevisiae/R64-1-1/Release_105/STAR_index/'

to

star = '/public/genomics/species_references/nextflow/Genome_References/Ensembl/saccharomyces_cerevisiae/R64-1-1/Release_105/STAR_index/saccharomyces_cerevisiae.R64-1-1.dna.105.STAR_index/'
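
One way the script could build the corrected folder is sketched below. It assumes the index subfolder is always named `<species>.<assembly>.dna.<release>.STAR_index`, a pattern inferred only from the single example above. The function name `star_index_path` and its parameters are hypothetical and are not taken from download_genomes.py.

```python
from pathlib import PurePosixPath


def star_index_path(base, species, assembly, release):
    """Return the STAR index folder, including the index-specific subfolder."""
    release_dir = (
        PurePosixPath(base) / "Ensembl" / species / assembly / f"Release_{release}"
    )
    # Assumed naming pattern, inferred from the one example in this issue
    index_name = f"{species}.{assembly}.dna.{release}.STAR_index"
    # Keep the trailing slash used in genome_overview.txt
    return f"{release_dir / 'STAR_index' / index_name}/"


print(
    star_index_path(
        "/public/genomics/species_references/nextflow/Genome_References",
        "saccharomyces_cerevisiae",
        "R64-1-1",
        105,
    )
)
```

With the yeast values shown, this prints the second `star = ...` path above.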

Suggestions to improve genome download script

Just finished my rescheduled meeting. The format we agreed on is:

The existing species_releases folder will have per-species folders in it. When you're ready to start downloading, you'll move the existing ones into old_data, but leave nextflow there until you've rewritten the code.

Inside each species folder, the structure will look like this:

Ensembl
    GRCh38
        Release 103
            BED
            FASTA
            GTF
            INDEXES
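
The layout above could be created with something like the following sketch. It assumes the assembly folder sits inside the source folder (`Ensembl/GRCh38/...`) and that release folders are written with an underscore (`Release_103`), as in the existing reference paths. The helper name `make_release_dirs` and the `homo_sapiens` folder name are hypothetical.

```python
import tempfile
from pathlib import Path

SUBFOLDERS = ("BED", "FASTA", "GTF", "INDEXES")


def make_release_dirs(species_dir, source, assembly, release):
    """Create <species_dir>/<source>/<assembly>/Release_<release>/ and its subfolders."""
    release_dir = Path(species_dir) / source / assembly / f"Release_{release}"
    for name in SUBFOLDERS:
        # exist_ok makes the call safe to repeat for an existing release
        (release_dir / name).mkdir(parents=True, exist_ok=True)
    return release_dir


# Demonstrate in a throwaway directory rather than on the shared file system
with tempfile.TemporaryDirectory() as tmp:
    release_dir = make_release_dirs(
        Path(tmp) / "homo_sapiens", "Ensembl", "GRCh38", 103
    )
    print(sorted(p.name for p in release_dir.iterdir()))
```

Because the release is a parameter, the same call works for any release of any species.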

In the FASTA folder you will keep the original file names, but also duplicate the genome file under a standard naming scheme for Nextflow, adding .genome or similar to the name to indicate what the file is. You'll also download the cDNA FASTA to this folder.
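
A minimal sketch of the duplication step follows. The standard name `<species>.<assembly>.<release>.genome<extension>` is only one possible scheme, since the exact scheme has not been agreed here. The function name `duplicate_genome_fasta` is hypothetical, and the file name in the demonstration is just an example.

```python
import shutil
import tempfile
from pathlib import Path


def duplicate_genome_fasta(original, species, assembly, release):
    """Copy the genome FASTA to a standardised name in the same folder."""
    original = Path(original)
    # Preserve the extension, including ".gz" for compressed downloads
    if original.suffix == ".gz":
        extension = "".join(original.suffixes[-2:])
    else:
        extension = original.suffix
    standard = original.with_name(
        f"{species}.{assembly}.{release}.genome{extension}"
    )
    # The original file is left in place under its downloaded name
    shutil.copy2(original, standard)
    return standard


with tempfile.TemporaryDirectory() as tmp:
    fasta = Path(tmp) / "Saccharomyces_cerevisiae.R64-1-1.dna.toplevel.fa.gz"
    fasta.write_bytes(b"placeholder")  # stand-in for the downloaded file
    copy = duplicate_genome_fasta(fasta, "saccharomyces_cerevisiae", "R64-1-1", 105)
    print(copy.name)
```

Copying rather than renaming keeps the original file name available for anyone who expects it.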

For indexes, the possibilities will be Bowtie, Bowtie2, Hisat2, STAR (both the version in genomics/soft/bin and the Nextflow version; folder names should indicate which version they were made with), HiCUP, 10X and PARSE. Not all of them will be made for all species (or for all releases?). Can your code be release-specific?
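
One way to make the index choice release-specific is a lookup keyed by species, with optional per-release overrides. This is only a sketch: `INDEX_PLAN`, its entries, `indexes_to_build`, `index_folder_name` and the `<tool>_<version>` folder naming are all hypothetical, and which indexes are actually wanted for each species has not been decided here.

```python
ALL_INDEXES = ("Bowtie", "Bowtie2", "Hisat2", "STAR", "HiCUP", "10X", "PARSE")

# Hypothetical plan: the entries below are illustrative, not agreed choices.
INDEX_PLAN = {
    "homo_sapiens": {
        "default": ["Bowtie2", "Hisat2", "STAR"],
        103: ["Bowtie2", "Hisat2", "STAR", "10X"],  # override for one release
    },
    "saccharomyces_cerevisiae": {"default": ["Bowtie2", "STAR"]},
}


def indexes_to_build(species, release):
    """Return the indexes planned for a species, honouring release overrides."""
    plan = INDEX_PLAN.get(species, {})
    chosen = plan.get(release, plan.get("default", []))
    unknown = set(chosen) - set(ALL_INDEXES)
    if unknown:
        raise ValueError(f"Unknown index type(s): {sorted(unknown)}")
    return list(chosen)


def index_folder_name(tool, version):
    """Name an index folder so it records the tool version used to build it."""
    return f"{tool}_{version}"


print(indexes_to_build("homo_sapiens", 103))
print(indexes_to_build("homo_sapiens", 105))
print(index_folder_name("STAR", "2.7.10a"))  # version string is an example only
```

A species without an entry gets an empty list, so no indexes are built for it unless they are requested explicitly.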

It would be good to add the Human T2T assembly as an option as well.
