Comments (13)

hoelzer commented on July 24, 2024

I think it's difficult to include this in the final report, at least. Insertions can also arise from sequencing errors, the protocols used, ...

The question would also be what the newest Nextstrain output looks like in this context.

My last idea was to at least add a column for the totalInsertions so that one can easily spot weirdos for further checks

from porecov.

corneliusroemer commented on July 24, 2024

In the upcoming release (today or tomorrow) Nextclade/align will output aa insertions. Maybe this is still relevant?

RaverJay commented on July 24, 2024

Well we are reporting amino acid changes, so these columns:
"aaSubstitutions"
"aaDeletions"

but there is no "aaInsertions" at this time.

Also the "insertions" column is empty in all the runs I just checked

hoelzer commented on July 24, 2024

Yes, I think insertions are rare. Let me check if I can find an example Nextclade output w/ insertions.

hoelzer commented on July 24, 2024

Okay, so it seems insertions are only reported at the nucleotide level. For example, I just checked some sequences and picked a few different outputs:

totalInsertions  insertions
17               21534:A,27373:CTTTCGATCTCT,27380:C,27384:CTC
16               27373:CTTTCGATCTCT,27380:C,27384:CTC
10               9627-9631,20528-20531,22492,28967-28969
28               1569:C,1574:AGAGCTAG,27373:CTTTCGATCTCT,27380:C,27384:CTC,28250:CTG
many             matches"""
248              7010-7049,9543-9561,14331-14515,29760-29767
1                28250:G
17               39,19286,29602,29605,29631,29706,29752,29772,29779,29785,29792-29794,29797,29806,29866,29868-29870

It seems the output format can vary a lot, so I am not sure how meaningful it is to add this. Maybe just report whether there are insertions or not (totalInsertions)?
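
To illustrate the format question, here is a minimal Python sketch that splits an "insertions" field into entries of the form position:SEQUENCE and sets aside anything that does not match (such as the range-style entries in the sample rows). The function names are hypothetical and not part of Nextclade or poreCov, and the position:SEQUENCE layout is assumed only from the rows quoted in this comment.

```python
import re

# One insertion entry as seen in the sample rows: "<position>:<inserted bases>"
INSERTION_RE = re.compile(r"^(\d+):([ACGTN]+)$", re.IGNORECASE)


def parse_insertions(field):
    """Split an 'insertions' field into (position, sequence) pairs.

    Returns (parsed, other): 'other' collects entries that do not follow
    the position:SEQUENCE pattern, e.g. plain positions or ranges.
    """
    parsed, other = [], []
    for entry in (e.strip() for e in field.split(",")):
        if not entry:
            continue
        match = INSERTION_RE.match(entry)
        if match:
            parsed.append((int(match.group(1)), match.group(2).upper()))
        else:
            other.append(entry)
    return parsed, other


def inserted_base_count(field):
    """Sum the lengths of all inserted sequences that could be parsed."""
    parsed, _ = parse_insertions(field)
    return sum(len(seq) for _, seq in parsed)


if __name__ == "__main__":
    row = "21534:A,27373:CTTTCGATCTCT,27380:C,27384:CTC"
    print(parse_insertions(row))
    print(inserted_base_count(row))
```

For the rows that use position:SEQUENCE entries, the summed sequence lengths agree with the totalInsertions value shown next to them, so a simple "totalInsertions > 0" flag would be enough to mark samples for a closer look.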

replikation commented on July 24, 2024

hoelzer commented on July 24, 2024

@replikation @RaverJay actually, this is now getting important with Omicron, especially because of a larger insertion in the Spike:

S:R214REPE

Currently, this is just not shown in the report, which can lead to misleading results.

Not sure what's the best way to solve this; I think currently we could still only extract such information from Nextclade.

Another way would be to use our tool covsonar to call all substitutions, insertions and deletions with one tool and include them in the report... https://gitlab.com/s.fuchs/covsonar

To do so, we would need a process that generates the covsonar database for all genomes and then we could easily extract all information.

RaverJay commented on July 24, 2024

Yeah this is bad

Problem is, Nextclade output is still missing an 'aaInsertions' column
(there is only: substitutions deletions insertions aaSubstitutions aaDeletions)

We could of course calculate it from 'insertions' though

@replikation what do you think, add or switch to covsonar?

hoelzer commented on July 24, 2024

Ahh, we would then need a converter from nt insertions reported by Nextclade to aa insertions. Something like

https://codon2nucleotide.theo.io/

CovSonar might solve that, but the weak point here is that the tool is still under development (people are currently working on a CovSonar2 version), so there might be changes. And it's not tested as extensively as Nextclade, etc...

RaverJay commented on July 24, 2024

Nextclade might take too long to add this: nextstrain/nextclade#319

Maybe we should implement it ourselves at this time - just 'borrow' the code from codon2nucleotide and translate to AAs

hoelzer commented on July 24, 2024

Yes, seems fair to me. Then we still rely on Nextclade, which runs anyway, and just translate the nt insertions for the report. I think having the code from https://github.com/theosanderson/codon2nucleotide as a small script in bin should do the trick?
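
A rough Python sketch of what such a translation step could look like, covering only the simplest case: an in-frame insertion that sits exactly on a codon boundary. This is not the codon2nucleotide code and not what the PR does. The function name, the "GENE:codon:AAs" output format, and the convention that the inserted bases follow the given 1-based position are all assumptions. The Spike coordinates are those of the Wuhan-Hu-1 reference (MN908947.3); other genes would have to be added.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Gene coordinates (1-based, inclusive) on Wuhan-Hu-1 (MN908947.3).
# Only S is listed in this sketch.
GENES = {"S": (21563, 25384)}


def translate_insertion(pos, inserted, genes=GENES):
    """Translate a nucleotide insertion placed after 1-based position `pos`.

    Returns a string like 'S:214:EPE' (hypothetical format), or None if the
    insertion is outside the listed genes, not a multiple of three bases, or
    not located on a codon boundary. Those cases need the reference sequence
    and are deliberately left out here.
    """
    for gene, (start, end) in genes.items():
        if not (start <= pos < end):
            continue
        bases_into_gene = pos - start + 1  # gene bases up to and including pos
        if bases_into_gene % 3 != 0 or len(inserted) % 3 != 0:
            return None
        codon_number = bases_into_gene // 3
        amino_acids = "".join(
            CODON_TABLE.get(inserted[i:i + 3].upper(), "X")
            for i in range(0, len(inserted), 3)
        )
        return f"{gene}:{codon_number}:{amino_acids}"
    return None


if __name__ == "__main__":
    # Assumed nucleotide form of the Omicron Spike insertion discussed above.
    print(translate_insertion(22204, "GAGCCAGAA"))
```

Writing the result in the S:R214REPE style additionally needs the reference amino acid at that codon, and insertions that split a codon or shift the frame need the surrounding reference bases, which is part of why the real implementation is more involved.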

RaverJay commented on July 24, 2024

It's a little bit more complicated than that, see PR

replikation commented on July 24, 2024

@corneliusroemer thanks for informing us
