header format of .tsv did not match! (ivar, OPEN)

zdk427 commented on August 19, 2024
header format of .tsv did not match!

from ivar.

Comments (3)

Alex-Vasile commented on August 19, 2024

We also ran into this issue and dug into it a bit. It's caused by the extra POS_AA column at the end of the header.

Temporary solution for anyone having this issue

If you don't need the POS_AA data column, pre-process your variant files to remove this column.
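For anyone who wants to script that pre-processing step, here is a minimal sketch. It assumes POS_AA is the last column and that every data row has the same number of columns as the header. `strip_trailing_pos_aa` is a hypothetical helper name, not something iVar provides.

```cpp
#include <cstddef>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical helper (not part of iVar): copies a variants TSV from `in` to
// `out`, dropping the last column when the header's last field is POS_AA.
// Assumes every data row has as many columns as the header.
// Returns true if the column was removed, false if the input was left as-is.
bool strip_trailing_pos_aa(std::istream &in, std::ostream &out) {
  const std::string suffix = "\tPOS_AA";
  std::string line;
  if (!std::getline(in, line)) return false;  // empty input
  if (!line.empty() && line.back() == '\r') line.pop_back();  // tolerate CRLF

  // Only strip when the header really ends with the POS_AA column.
  const bool strip =
      line.size() >= suffix.size() &&
      line.compare(line.size() - suffix.size(), suffix.size(), suffix) == 0;
  if (strip) line.erase(line.size() - suffix.size());
  out << line << '\n';

  while (std::getline(in, line)) {
    if (!line.empty() && line.back() == '\r') line.pop_back();
    if (strip) {
      // Drop everything from the last tab onward (the POS_AA value).
      const std::size_t tab = line.rfind('\t');
      if (tab != std::string::npos) line.erase(tab);
    }
    out << line << '\n';
  }
  return strip;
}
```

Wrapping this with `std::ifstream` and `std::ofstream` gives a small filter to run over each variant file before passing it to the affected command. Files that don't end in POS_AA are copied through unchanged.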

In-depth Info

  1. call_variants_from_plup prints a POS_AA column heading (from a hardcoded set of column headings inside the function).

    "\tALT_CODON"
    "\tALT_AA"
    "\tPOS_AA"
    << std::endl;

  2. common_variants calls read_variant_file which first checks if the headers are correct:

    while (std::getline(line_stream, cell, '\t')) {
      if (cell.compare(fields[ctr]) != 0) {
        return -1;
      }
      ctr++;
    }

However, this check has two issues:

  1. It uses a parallel, out-of-date set of header names that is missing POS_AA. This check and call_variants_from_plup should share a single set of fields, so there aren't two parallel structures to update when a change happens.
  2. The code has an out-of-bounds error, which is what's happening now. The loop keeps reading header columns and indexing into fields even after ctr >= NUM_FIELDS. It should terminate if there are more than NUM_FIELDS entries.

    const int NUM_FIELDS = 19;
    const std::string fields[NUM_FIELDS] = {
      "REGION", "POS", "REF", "ALT", "REF_DP",
      "REF_RV", "REF_QUAL", "ALT_DP", "ALT_RV", "ALT_QUAL",
      "ALT_FREQ", "TOTAL_DP", "PVAL", "PASS", "GFF_FEATURE",
      "REF_CODON", "REF_AA", "ALT_CODON", "ALT_AA"};
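To make both points concrete, here is a rough sketch of what a bounds-checked header check against one shared field list could look like. This is not the actual iVar code: `kVariantFields` and `check_header` are names I made up, and the list assumes POS_AA is simply appended after the 19 names quoted above.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Assumed single source of truth for the header. The first 19 names are the
// ones from the existing array; POS_AA is the column the writer also emits.
const std::vector<std::string> kVariantFields = {
    "REGION",    "POS",      "REF",       "ALT",      "REF_DP",
    "REF_RV",    "REF_QUAL", "ALT_DP",    "ALT_RV",   "ALT_QUAL",
    "ALT_FREQ",  "TOTAL_DP", "PVAL",      "PASS",     "GFF_FEATURE",
    "REF_CODON", "REF_AA",   "ALT_CODON", "ALT_AA",   "POS_AA"};

// Hypothetical replacement for the header check: returns 0 when `header_line`
// matches `expected` exactly and -1 otherwise. It never indexes past the end
// of `expected`, so extra columns are rejected instead of read out of bounds.
int check_header(const std::string &header_line,
                 const std::vector<std::string> &expected = kVariantFields) {
  std::istringstream line_stream(header_line);
  std::string cell;
  std::size_t ctr = 0;
  while (std::getline(line_stream, cell, '\t')) {
    // Test the bound before indexing: too many columns is a mismatch.
    if (ctr >= expected.size() || cell != expected[ctr]) {
      return -1;
    }
    ctr++;
  }
  // Too few columns is also a mismatch.
  return ctr == expected.size() ? 0 : -1;
}
```

If the function that writes the header built its output from the same list, adding a column in the future would only need a change in one place.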

Also worth considering is changing the error message from common_variants. It currently gives the incorrect impression that the header formats of A_variant and B_variant do not match each other, when what it actually means is that a file's header does not match the expected header. It would be worth changing that message and also printing both the received header and the expected header.
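As an illustration only, a message along these lines would make the cause obvious. `header_mismatch_message` is a hypothetical helper and the wording is just a suggestion, not what iVar currently prints.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: builds an error message that names the offending file
// and shows the expected header next to the header actually read from it.
std::string header_mismatch_message(const std::string &path,
                                    const std::vector<std::string> &expected,
                                    const std::string &received_line) {
  std::string msg =
      "Header of " + path + " does not match the expected header.\nExpected: ";
  for (std::size_t i = 0; i < expected.size(); ++i) {
    if (i > 0) msg += '\t';  // join expected names the way the file does
    msg += expected[i];
  }
  msg += "\nReceived: " + received_line + "\n";
  return msg;
}
```

Writing the result to `std::cerr` at the point where the header check fails would tell the user which file is at fault and which column differs.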

cmaceves commented on August 19, 2024

Hi! Sorry that you're having issues, thanks for reaching out. Could you be more specific about the origin of "Part 4" and maybe share the .tsv files with me? Based on the given error, I would assume that the SNV variants files being used are not properly formatted but it's hard to tell without sample files!

zdk427 commented on August 19, 2024

Sure, here is the link to the .tsv files I got after process 3, in folder 6.Singlet-SNVs.
Please let me know if you need any further information.
https://usaskca1-my.sharepoint.com/:f:/g/personal/zdk427_usask_ca/EolxZ7UcEDpPo6qFVXKB_KoBajXB_8WyJ6k5Kx3QoU3OjA?e=Y1ihRt
