
nanomux's People

Contributors

jia-xiu, willros

Forkers

jia-xiu

nanomux's Issues

How do you deal with concatenated reads?

concatenate_read_example.pdf
Hi William,
How do you define where the barcode starts and ends in the read?
I found that some reads can be assigned to more than one sample because of read concatenation during nanopore sequencing. I used Dorado Basecaller for basecalling and kept both duplex and simplex reads for demultiplexing. To avoid trimming away parts of the barcodes, I kept the adapters (--no-trim). A small proportion of my reads are concatenated, i.e. they contain 16S amplicons from different samples (please see the example in the PDF file). In this case, if I set --barcode_start to 0 and -bc_end to 2200 (I chose 2200 because the adapters differ in length), the program finds two barcodes and assigns the read to two samples. What do you think about this issue? I am not sure whether trimming adapters is a good idea, as Dorado Basecaller says “The --no-trim option will prevent the trimming of detected barcode sequences as well as the detection and trimming of adapter and primer sequences”. What did you do? Have you used reads after trimming the adapters?
Any insights are welcome.
Thanks,
Xiu
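
To make the concatenation problem concrete, here is a minimal sketch of how a single read can match two barcodes inside a wide search window. It is not nanomux's implementation: the function names, the toy barcodes and the toy read are all made up for illustration, and the matching counts substitutions only (no insertions or deletions), which may differ from nanomux's fuzzy mode.

```python
from typing import Dict, List, Optional


def barcode_hits(read: str, barcode: str, max_mismatch: int,
                 start: int = 0, end: Optional[int] = None) -> List[int]:
    """Start positions in read[start:end] where the barcode matches
    with at most max_mismatch substitutions (Hamming distance)."""
    stop = len(read) if end is None else min(end, len(read))
    hits = []
    for i in range(start, stop - len(barcode) + 1):
        window = read[i:i + len(barcode)]
        mismatches = sum(a != b for a, b in zip(window, barcode))
        if mismatches <= max_mismatch:
            hits.append(i)
    return hits


def samples_in_read(read: str, barcodes: Dict[str, str], max_mismatch: int,
                    start: int = 0, end: Optional[int] = None) -> Dict[str, List[int]]:
    """Map each sample name to the positions where its barcode was found."""
    found = {}
    for sample, barcode in barcodes.items():
        positions = barcode_hits(read, barcode, max_mismatch, start, end)
        if positions:
            found[sample] = positions
    return found


# Hypothetical barcodes and a hypothetical "concatenated" read:
# sampleA's barcode at the start, sampleB's barcode further in.
barcodes = {"sampleA": "ACGTACGT", "sampleB": "TTGGCCAA"}
read = "ACGTACGT" + "G" * 20 + "TTGGCCAA" + "C" * 20

print(samples_in_read(read, barcodes, max_mismatch=1))
# {'sampleA': [0], 'sampleB': [28]}
print(samples_in_read(read, barcodes, max_mismatch=1, start=0, end=20))
# {'sampleA': [0]}
```

In this toy case, narrowing the search window removes the second assignment. Reads that report hits for more than one sample, far apart in the read, are candidates for concatemers that could be split or discarded before demultiplexing.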

Is it possible to consider primer reads during demultiplexing?

Hi William,
I tested this package, and it works. Very cool. Thanks a lot!
However, I have a few questions regarding my output.

  1. I noticed that barcodes are removed during the analysis. Could you please provide an option to keep the barcodes, for further verification or for using the output on other platforms?
  2. I found that many reads appear in more than one sample. Consequently, the "Number of sequences with barcodes" is higher than the "Number of raw sequences" (please see details below). This could be due to duplication between our forward and reverse barcodes (a design flaw in our barcodes). I was wondering whether it is possible to incorporate the primer sequences during demultiplexing, to reduce the number of repeated assignments and increase the accuracy of demultiplexing. For example, could we link our barcodes and primers as a single entity [fwBC’ = fwBC + fwPrimer; rvBC’ = rvBC + rvPrimer]? In this case, I have some degenerate bases in my primers. Can degenerate bases be considered during the analysis? Or is it possible to use the primers as a trailing flank for the front barcode, as in Dorado?
  3. When I submitted my job via Slurm, I noticed that my dataset requires a lot of memory: the job was killed even though I requested 96G of memory for an input fastq.gz file of size 15Gb. Is there a way to optimize memory usage?

Thanks in advance!
Xiu
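
The idea in question 2 of treating barcode plus primer as one search target, with degenerate bases allowed, can be sketched as follows. This is only an illustration and not a nanomux feature: the helper names and the barcode are hypothetical, the primer is the widely used 16S 27F sequence chosen as an example (the primers actually used are not stated in this issue), and the match is exact apart from the IUPAC codes, so sequencing errors are not tolerated.

```python
import re

# Standard IUPAC nucleotide codes expanded to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def iupac_to_regex(seq: str) -> str:
    """Translate a sequence with degenerate bases into a regex string."""
    return "".join(IUPAC[base] for base in seq.upper())


def barcode_primer_pattern(barcode: str, primer: str) -> "re.Pattern[str]":
    """Compile one pattern for barcode immediately followed by primer."""
    return re.compile(iupac_to_regex(barcode + primer))


fw_barcode = "ACGTACGT"              # hypothetical barcode
fw_primer = "AGAGTTTGATCMTGGCTCAG"   # 27F, contains the degenerate base M (A or C)
pattern = barcode_primer_pattern(fw_barcode, fw_primer)

read_ok = "TTT" + "ACGTACGT" + "AGAGTTTGATCCTGGCTCAG" + "GATTACA"
read_barcode_only = "TTT" + "ACGTACGT" + "CCCCCCCCCCCCCCCCCCCC"

print(bool(pattern.search(read_ok)))            # True
print(bool(pattern.search(read_barcode_only)))  # False
```

Requiring the primer directly after the barcode makes a chance match of the short barcode elsewhere in the read much less likely, at the cost of missing reads with errors in the primer region unless some mismatch tolerance is added.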

  • mode: fuzzy
  • mismatch: 4
  • barcode_start: 0
  • barcode_end: 1900
  • read_len_min: 1400
  • read_len_max: 1700
  • minimum_reads: 1
  • parquet: True
  • Running nanomux on simplex_test.fastq
  • Number of raw sequences in simplex_test.fastq: 242093
  • Number of sequences between 1400bp and 1700bp: 183229
  • Barcode: PCRcontrol_r3, contained: 40 reads
  • Number of sequences with barcodes: 1153075
  • Number of barcodes found: 271
  • Number of reads found in more than one sample: 1152785
  • Saving parquet file to results_nanomux_test/simplex_test.parquet
  • Fasta files saved to: results_nanomux_test
  • Nanomux is done!
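
On the memory question (item 3 above), one general way to keep memory bounded is to stream the FASTQ file record by record instead of holding all reads at once. How nanomux reads its input is not stated in this issue, so the sketch below is a generic standard-library illustration with hypothetical function names, assuming plain four-line FASTQ records.

```python
import gzip
from typing import Iterator, Tuple


def stream_fastq(path: str) -> Iterator[Tuple[str, str, str]]:
    """Yield (name, sequence, quality) one record at a time.
    Assumes four lines per record; handles .gz and plain text."""
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rt") as handle:
        while True:
            header = handle.readline()
            if not header:
                break  # end of file
            sequence = handle.readline().rstrip("\n")
            handle.readline()  # the '+' separator line
            quality = handle.readline().rstrip("\n")
            yield header[1:].strip(), sequence, quality


def count_in_length_range(path: str, min_len: int, max_len: int) -> Tuple[int, int]:
    """Return (total reads, reads with min_len <= length <= max_len)
    while keeping only one record in memory at a time."""
    total = kept = 0
    for _name, sequence, _quality in stream_fastq(path):
        total += 1
        if min_len <= len(sequence) <= max_len:
            kept += 1
    return total, kept
```

With the settings from the log, a call such as count_in_length_range("simplex_test.fastq", 1400, 1700) would reproduce the two read counts without loading the file. Splitting the input into chunks and demultiplexing each chunk separately is another option when the tool itself cannot stream.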

after trim two samples lost all the reads

Hi @willros ,

I demultiplexed my reads again with the --trim option. Compared to --no-trim, I lost on average 8.7 reads per sample (Min.: 0, 1st Qu.: 3, Median: 7, 3rd Qu.: 12, Max.: 57). What I don't understand is that two samples are completely lost after trimming, meaning they lost all of their 57 and 52 reads, respectively. These are my negative PCR controls. When I check the reads without trimming, they are assigned correctly. Do you have any idea what happened? How can nanomux prevent this from happening?

Thanks,
Xiu
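
A small helper like the one below can list which samples drop to zero between the two runs, which is a useful first step before inspecting the affected reads. The function names, sample names and most counts are invented for illustration; only the 57 and 52 come from the post above.

```python
from typing import Dict, List


def compare_counts(no_trim: Dict[str, int], trim: Dict[str, int]) -> Dict[str, Dict[str, int]]:
    """Per-sample read counts before and after trimming, plus the loss."""
    report = {}
    for sample, before in no_trim.items():
        after = trim.get(sample, 0)  # a missing sample counts as zero reads
        report[sample] = {"before": before, "after": after, "lost": before - after}
    return report


def fully_lost(report: Dict[str, Dict[str, int]]) -> List[str]:
    """Samples that had reads without trimming and none with trimming."""
    return sorted(s for s, r in report.items() if r["before"] > 0 and r["after"] == 0)


# Made-up example counts; the two controls mirror the 57 and 52 reads above.
no_trim_counts = {"sample_01": 1200, "sample_02": 950, "neg_ctrl_1": 57, "neg_ctrl_2": 52}
trim_counts = {"sample_01": 1193, "sample_02": 938}

report = compare_counts(no_trim_counts, trim_counts)
print(fully_lost(report))  # ['neg_ctrl_1', 'neg_ctrl_2']
```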

cannot install

Hi William,

I've asked our dry lab manager to install your package on our HPC. However, we are encountering a problem. Please refer to the error message below. Do you have any suggestions to solve the issue?

Thanks in advance.

Best,
Xiu

pip install -e .
Obtaining file:///home/tools/nanomux/nanomux
Preparing metadata (setup.py) ... done
Collecting pyfastx==2.0.2
Using cached pyfastx-2.0.2-cp36-cp36m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.1 MB)
Requirement already satisfied: polars in /home/groups/VEO/tools/nanomux/myenv/lib/python3.6/site-packages (from nanomux==0.1.0) (0.12.5)
ERROR: Could not find a version that satisfies the requirement polars-ds (from nanomux) (from versions: none)
ERROR: No matching distribution found for polars-ds

pip install polars-ds==0.1.2
ERROR: Could not find a version that satisfies the requirement polars-ds==0.1.2 (from versions: none)
ERROR: No matching distribution found for polars-ds==0.1.2
