Comments (7)

will-rowe commented on September 4, 2024

This looks strange - I'm not entirely sure of the issue. Pinging @jts to see if he has any insights?

Is the command you are using to run nanopolish outside of the pipeline equivalent to the call within the pipeline?

One thing to change is to use fast5/barcode as the input to --fast5-directory, but this is unlikely to be the problem here.

from artic-ncov2019.

jts commented on September 4, 2024

My best guess is that there is something about the sequencing_summary.txt file that nanopolish doesn't like. I suggest running the nanopolish index command outside of the artic wrappers to see what it prints to stderr, which might help diagnose what has gone wrong.

RichardCorbett commented on September 4, 2024

Thanks folks,
Last night I was able to run a different set of data smoothly and quickly, so there must be something fishy with this one flowcell of data. I tried running just one fastq file with this command:

nanopolish index -d ../fast5/ -s ../sequencing_summary.txt -v ../plex_FAP90847_ACCACTGCCATGTATCAAAGTACG_pass_concat.fastq
[readdb] indexing ../fast5/

but I don't get any more output after that. I suspect there is some mismatch between the fastq/fast5/sequencing summary files I am using. I think I can dig in from here. Sorry for the trouble.

RichardCorbett commented on September 4, 2024

Hi folks,

It looks like our system is doing something possibly unique.

We are loading a MinION flowcell, but extracting the fastq files and sequencing summary at a timepoint in the middle of a run. At this time guppy has finished basecalling the available fast5 files, but it will be restarted once new fast5 files are available.

The run continues and fast5 files are generated until someone decides to shut down the sequencing run.

When I get around to starting my analysis I have the following inputs:
- fastq file from the intermediate timepoint
- sequencing summary file from the intermediate timepoint
- fast5 files from a later timepoint

This configuration gives me the hanging nanopolish index commands.

If I change the sequencing_summary.txt file to be the one created after the run completes, the minion command completes in a few minutes. Is it possible that nanopolish index hangs when the sequencing_summary.txt file contains only a subset of the reads contained in the fast5 files?
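
One way to check for this situation before indexing is to list the fast5 files on disk that the sequencing summary never mentions. The following is only a rough sketch: the function name is made up for illustration, and it assumes the summary is tab-separated with the fast5 file name in a column called `filename_fast5` or `filename`, which may not hold for every guppy version.

```python
import csv
from pathlib import Path


def fast5s_missing_from_summary(summary_path, fast5_dir):
    """Return sorted names of fast5 files on disk that the summary does not list."""
    with open(summary_path, newline="") as handle:
        reader = csv.DictReader(handle, delimiter="\t")
        fields = reader.fieldnames or []
        # Assumption: one of these two column names holds the fast5 file name.
        column = next((c for c in ("filename_fast5", "filename") if c in fields), None)
        if column is None:
            raise ValueError("no fast5 filename column found in the summary")
        listed = {Path(row[column]).name for row in reader if row[column]}
    # Search recursively so per-barcode subdirectories are included.
    on_disk = {path.name for path in Path(fast5_dir).rglob("*.fast5")}
    return sorted(on_disk - listed)
```

A non-empty result would mean some fast5 files were written after the summary was extracted.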

jts commented on September 4, 2024

In this case nanopolish is going to revert to the slow indexing method for the subset of fast5s that haven't been basecalled. If the run progressed well this can mean opening and reading thousands of files, which takes a while.

To work around this, I suggest making a directory containing symlinks to the fast5s that are basecalled and present in the sequencing summary, then passing the directory-of-symlinks to nanopolish index. This is what we do to analyze an in-progress run.
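
A minimal sketch of the directory-of-symlinks idea, not the actual script used for in-progress runs. The function names and paths are hypothetical, and it assumes the summary is tab-separated with the fast5 file name in a `filename_fast5` or `filename` column.

```python
import csv
import os
from pathlib import Path


def fast5_names_in_summary(summary_path):
    """Return the set of fast5 file names listed in a sequencing summary."""
    with open(summary_path, newline="") as handle:
        reader = csv.DictReader(handle, delimiter="\t")
        fields = reader.fieldnames or []
        # Assumption: one of these two column names holds the fast5 file name.
        column = next((c for c in ("filename_fast5", "filename") if c in fields), None)
        if column is None:
            raise ValueError("no fast5 filename column found in the summary")
        return {os.path.basename(row[column]) for row in reader if row[column]}


def link_basecalled_fast5s(summary_path, fast5_dir, link_dir):
    """Symlink the fast5 files named in the summary into link_dir.

    Returns the sorted names of the files that were found and linked.
    """
    wanted = fast5_names_in_summary(summary_path)
    link_dir = Path(link_dir)
    link_dir.mkdir(parents=True, exist_ok=True)
    linked = set()
    # Search recursively so per-barcode subdirectories are included.
    for path in Path(fast5_dir).rglob("*.fast5"):
        if path.name not in wanted:
            continue  # written after the summary was extracted; skip it
        target = link_dir / path.name
        if not (target.is_symlink() or target.exists()):
            target.symlink_to(path.resolve())
        linked.add(path.name)
    return sorted(linked)
```

After something like `link_basecalled_fast5s("sequencing_summary.txt", "fast5", "fast5_links")`, the `fast5_links` directory would be the one passed to `nanopolish index -d` in place of the full fast5 directory.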

RichardCorbett commented on September 4, 2024

Wonderful. Thanks @jts

RichardCorbett commented on September 4, 2024

Tried this out and indeed symlinking the fast5s referenced in the sequencing_summary.txt file allows the analysis to take just a minute or two per sample in the pool.
