datahub's Introduction

cBioPortal Public Datahub

The datahub is a repository for data storage only. It contains staging files which are validated and can be loaded directly into the cBioPortal.

Behind the scenes git-lfs is used to manage the large files. https://github.com/github/git-lfs

Test Status

Validation status of all studies on the Datahub master branch. The validation runs weekly using the validation code from the cBioPortal master branch. It also checks whether the studies on cbioportal.org and on Datahub are in sync.

How to Download Data

Downloading zip files of individual studies

On the cbioportal.org datasets page, a zipped file with the staging files of each study can be downloaded. These zip files are compressed versions of the study folders in the master branch of this repository.

Example: downloading an individual study with git-lfs

It is also possible to download uncompressed staging files from this repository with git-lfs.

After you have installed git-lfs, configure it not to download all data files right away:

git lfs install --skip-repo --skip-smudge

Clone the git repository and install lfs hooks into it:

git clone https://github.com/cBioPortal/datahub.git
cd datahub
git lfs install --local --skip-smudge

Download the data files for a study folder, for example brca_tcga:

git lfs pull -I public/brca_tcga

How to Upload Data

Create a new branch from the 'master' branch.

git checkout master
git pull origin master
git checkout -b [name_of_your_new_branch]

For general background on creating and managing branches within GitHub, see: Git Branching and Merging.

Commit changes, and push the branch back to GitHub.

[back to the root directory]
git add .
git commit -m '[notes_for_your_change]'
git push origin [name_of_your_new_branch]

Open a Pull Request on GitHub to the 'master' branch.

For instructions on submitting a pull-request, please see: Using Pull Requests and Sending Pull Requests.

Download a complete MySQL export of the latest database

http://download.cbioportal.org/mysql-snapshots/mysql-snapshots-toc.html

License

The data are available under the ODC Open Database License (ODbL). You are free to share and modify the data as long as you attribute any public use of the database, or works produced from the database; keep the resulting data-sets open; and offer your shared or adapted version of the data-set under the same ODbL license.

TCGA data are available under the Broad Institute GDAC TCGA Analysis Pipeline License. The Cancer Genome Atlas Consortium is pleased to provide the research community with preliminary data prior to publication. Users are requested to carefully consider that these data are preliminary and have yet to be validated. Researchers are warned that the preliminary data have a significant uncertainty, are likely to change, and should be used with caution.

User Assistance

For questions, please post on our user discussion group at: https://groups.google.com/g/cbioportal

datahub's People

Contributors

alexsigaras, alisman, ao508, averyniceday, babyasatravada, dependabot[bot], dionnezaal, dippindots, egarcialara, inodb, jaybee84, jessicath, jim-bo, jjgao, kalletlak, lizabethkatsnelson, mandawilson, matthijspon, migbro, n1zea144, oplantalech, pieterlukasse, rima-waleed, ritikakundra, rmadupuri, rnbatra, sbabyanusha, tgerke, yichaos, zheins

datahub's Issues

TCGA staging files enhancements

This ticket is meant for tackling some of the remaining warning ⚠️ messages we still get from the validation reports. The goal is to have a new version (v2) of the staging files without these issues:

  • meta files are missing for gistic and mutsig data files.
  • in many studies, data_CNA.txt and data_linear_CNA.txt contain "|" in gene symbols, e.g. SNOU13|ENSG00000239166.1. Only the first part before the "|" should be there. Medium: estimate is that this occurs in 2% to 5% of the CNA data.
  • in many studies, data_mutations_extended.txt contains NA in the Entrez gene id column in some of the records. The field should be empty in this case. Minor: seems to occur in less than 0.01% of the records.
  • in a number of studies, the data type of clinical attributes is not correct in data_bcr_clinical_data_patient.txt: LYMPH_NODE_EXAMINED_COUNT, LYMPH_NODES_EXAMINED_HE_COUNT, DAYS_TO_LAST_FOLLOWUP, DAYS_TO_DEATH, INITIAL_PATHOLOGIC_DX_YEAR, SMOKING_YEAR_STARTED, DAYS_TO_BIRTH, KARNOFSKY_PERFORMANCE_SCORE, are defined as STRING (in some studies) while they should be defined as NUMBER
  • in some studies, data_expression_median.txt will have a strange value for the Hugo gene symbol column: "Composite Element REF ". This is caused by a double header in the file, with this value on the second row, followed by a description of the transformation done on the numeric columns (is a minor issue and the row is skipped).
  • case list for the "sequenced" samples is missing (e.g. a case list with stable_id: brca_tcga_sequenced for the BRCA study)
  • column MA:link.PDB in maf files (e.g. in gbm study) contains a mix of [Not Available] and NA for missing values. Should choose one (or perhaps remove this and other columns if they are not used by the portal).
  • data_RNA_Seq_expression_median.txt in study STAD has 68 rows where Entrez gene identifier has non-numeric values such as AC2, HP08777, MIG7. Bug?
  • case_list_description field in many case lists does not match what is imported. For example, the "All tumors" in brca has case_list_description: All tumor samples (2140 samples) while 1105 samples are in this final case list. This generates confusing tooltips (see screenshot)
    image
  • cancer type file is missing in studies. This is not an issue if the type is already present in the DB, but we should not count on this scenario. Ideally each study is accompanied by its respective cancer type file.
  • SWISSPROT column in maf files should contain uniprot accessions instead of names
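
Two of the items above (the "|" in gene symbols and NA in the Entrez gene id column) are simple per-row clean-ups. A minimal sketch of what such a fix could look like, assuming each row is available as a dict keyed by column name and that the columns are called Hugo_Symbol and Entrez_Gene_Id; the function name is hypothetical and this is not the pipeline's actual code:

```python
def clean_gene_row(row):
    """Return a copy of one data row with the two clean-ups applied.

    Assumes the row is a dict with "Hugo_Symbol" and "Entrez_Gene_Id" keys.
    """
    cleaned = dict(row)
    # Keep only the part of the gene symbol before the first "|".
    cleaned["Hugo_Symbol"] = row["Hugo_Symbol"].split("|", 1)[0]
    # A missing Entrez gene id should be an empty field, not the string "NA".
    if row["Entrez_Gene_Id"] == "NA":
        cleaned["Entrez_Gene_Id"] = ""
    return cleaned
```

For example, a row with the symbol SNOU13|ENSG00000239166.1 and Entrez gene id NA would come back with the symbol SNOU13 and an empty Entrez gene id.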

Notify:
@zheins, @n1zea144

Missing archives

Empty segment files in brca study (provisional)

Hi,

Segment data for the brca study is missing. There are files for it, brca_tcga_data_cna_hg19.seg and brca_tcga_meta_cna_hg19_seg.txt, but the .seg file seems to be empty. This segment data is available on cbioportal.org for this study, but just not in this download file.

Kind regards,
Sander

brca_tcga: errors in mass spec data

Fixes that need to be solved:

  • data_protein_quantification.txt and data_protein_quantification_Zscores.txt: replace the headers to Composite.Element.REF instead of Hugo_Symbol.
  • data_protein_quantification.txt and data_protein_quantification_Zscores.txt: fill the empty fields.

AML Data Set

The file called data_mutations_extended.txt in the data set AML contains the following error that needs to be addressed:

Invalid column header, file cannot be parsed

Error in skcm_tcga study

The skcm study seems to contain some errors that show up in the validator. I generated the following report with the latest version of the validator, which is hotfix on Nov 4 + PR1866,1868,1873

Summarised errors:

  • Value in column 'n_ref_count' is invalid (.)
  • Value in column 'Validation_Status' is invalid (---)
  • Sample ID not defined in clinical file (241 times)

Full validation report:

Report_skcm.html.zip

Missing description RNA-Seq normalization

In the meta file for the RNA-Seq of the TCGA provisional data, a description of how the data are normalized is missing. It only states: Expression levels for 20532 genes in 171 gbm cases (RNA Seq V2 RSEM).
The file does not contain raw read counts, because the values are floats. It also contains some extreme outliers. Information on this is probably located on https://gdc.cancer.gov/ which hosts the TCGA data.

As an example, some statistics on the values in gbm_tcga:

> expr_data[1:5,1:5]
          TCGA-02-0047-01 TCGA-02-0055-01 TCGA-02-2483-01 TCGA-02-2485-01 TCGA-02-2486-01
100134869          6.7611         15.6973         13.9398         14.9571          4.8049
10357             54.7036         31.3945         60.3441         91.8238         62.5366
155060           232.9512        162.0182        135.0923        417.6190        276.2195
26823              0.0000          0.5606          0.0000          1.9048          0.0000
280660             0.0000          0.0000          0.0000          0.0000          0.0000

> range(expr_data)
[1]       0 1026361
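
The "values are floats, so these are not raw counts" argument can be written as a quick check. This is only a sketch of that reasoning with a hypothetical function name; it says nothing about which normalization was actually applied:

```python
def looks_like_raw_counts(values):
    """Return True if every value is a non-negative whole number."""
    return all(v >= 0 and float(v).is_integer() for v in values)
```

Applied to values such as 6.7611 or 15.6973 from the excerpt above, the check returns False.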

Boxplots including outliers:
boxplot_rnaseq

Boxplots without outliers:
boxplot_zscores_no-outliers

coadread_tcga & ov_tcga: errors in mass spectrometry data

To circumvent this problem, remove the protein_quantification files from these studies.

  • coadread_tcga: Both data_protein_quantification and data_protein_quantification_Zscores have duplicate column names, with different values in the columns (some even 0, and others >20). This does not look correct: they look a bit too different to be technical replicates.

  • ov_tcga data_protein_quantification.txt: There's a lot of missing data, this does not seem correct. When data is missing, these values should contain NA instead of being empty.

  • ov_tcga data_protein_quantification_Zscores.txt: The last column misses the lower 75% of data.
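
The duplicate column names reported for coadread_tcga can be found from the header line alone. A minimal sketch, assuming a tab-separated header; the function name is hypothetical:

```python
from collections import Counter


def duplicate_columns(header_line, sep="\t"):
    """Return the column names that occur more than once in a header line."""
    counts = Counter(header_line.rstrip("\n").split(sep))
    return sorted(name for name, n in counts.items() if n > 1)
```

An empty result means every column name in the header is unique.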

Missing documentation on seeds for old cBio releases

In a code review thread regarding how we document compatible seed databases across releases of cBioPortal, @inodb, @pieterlukasse and I came to agree that it would be best to maintain a list of seed files for old releases in the corresponding README file on datahub.

Updates to the seed files, such as this one, have so far removed links to the old versions of seed files from the documentation. These links should be restored, referencing the seed files in specific commits, and we should start requiring future backwards-incompatible updates of the seed files to maintain links to previous versions.

z-score file missing for `mrna` profile

E.g. in TCGA BRCA study we have a file for mrna, but no respective file for mrna_median_Zscore profile data.

@zheins @n1zea144 : could you check for this missing file? It seems to be there on the public portal, so it is just missing here.

one line difference in CPTAC brca files

data_protein_quantification_Zscores.txt has 84233 lines while data_protein_quantification.txt has 84234 lines. Is this correct, or was one line left out of the Z-scores file by mistake?
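
One way to find which line differs is to compare the row identifiers of the two files. A minimal sketch, assuming both files are tab-separated with the identifier in the first column; the function names are hypothetical:

```python
def row_ids(lines, sep="\t"):
    """Return the first field of every non-blank line."""
    return [line.split(sep, 1)[0].rstrip("\n") for line in lines if line.strip()]


def missing_rows(reference_lines, other_lines):
    """Return identifiers present in the reference but absent from the other file."""
    other = set(row_ids(other_lines))
    return [rid for rid in row_ids(reference_lines) if rid not in other]
```

Passing the lines of data_protein_quantification.txt as the reference and those of the Z-scores file as the other input would list the identifier of any row that was left out.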

Archive for study hnsc_tcga_pub wrong Metadata

The archive for the study hnsc_tcga_pub contains wrong metadata.
The file meta_RNA_Seq_v2_expression_median.txt has the wrong "datatype" "Z-SCORE" instead of the expected "CONTINUOUS".

The file meta_RNA_Seq_v2_expression_median.txt contains:

cancer_study_identifier: hnsc_tcga_pub
genetic_alteration_type: MRNA_EXPRESSION
datatype: Z-SCORE
stable_id: hnsc_tcga_pub_rna_seq_v2_mrna
show_profile_in_analysis_tab: false
profile_description: Expression levels for 20532 genes in 303 hnsc cases (RNA Seq V2 RSEM).
profile_name: mRNA expression (RNA Seq V2 RSEM)
data_filename: data_RNA_Seq_v2_expression_median.txt
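
The mismatch can be spotted mechanically by parsing the `key: value` lines of the meta file. The sketch below assumes a hypothetical rule of thumb, namely that only files whose name contains "Zscores" should declare datatype Z-SCORE; it is not the validator's own logic:

```python
def parse_meta(text):
    """Parse 'key: value' lines of a meta file into a dict."""
    meta = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        key, _, value = line.partition(":")
        meta[key.strip()] = value.strip()
    return meta


def datatype_mismatch(meta):
    """Return True if the Z-SCORE datatype and the file name disagree (assumed heuristic)."""
    is_zscore_file = "zscore" in meta.get("data_filename", "").lower()
    declares_zscore = meta.get("datatype") == "Z-SCORE"
    return is_zscore_file != declares_zscore
```

For the meta file quoted above, `datatype_mismatch` returns True, because data_RNA_Seq_v2_expression_median.txt is not a Z-score file but the datatype is Z-SCORE.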

brca_tcga.tar.gz what should be the datatype for protein expression?

Hello
I am trying to upload brca_tcga.tar.gz to the latest version of cbioportal. During the validation, files meta_protein_quantification.txt and meta_protein_quantification_Zscores.txt produce errors saying the datatype is incorrect. In http://cbioportal.readthedocs.io/en/latest/File-Formats.html I found that protein expression can be only rppa. But files meta_rppa.txt and meta_rppa_Zscores.txt already exist among brca_tcga.tar.gz files. How to import meta_protein_quantification.txt and meta_protein_quantification_Zscores.txt? Is there any other data type for protein expression?
Best regards,
Marian

Chromosome (field 5) NA for >50% of COADREAD mutation data

Hi there,

It seems that for >50% of all mutations in coadread/tcga/data_mutations_extended.txt, field 5 (chromosome) is NA. To be more precise, only chromosomes >=10 are reported.

$ cut -f5 data_mutations_extended.txt | sort | uniq -c
   3410 10
   4743 11
   4832 12
   1872 13
   2361 14
   2723 15
   2715 16
   3961 17
   1480 18
   5046 19
   1990 20
    821 21
   1288 22
      1 Chromosome
  47344 NA
      1 #version 2.4

I did a quick check on a couple of other indications, and there it seems to happen too.

Thanks,
Markus
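
The same tally can be reproduced without the shell, which may help when checking several studies in one go. A minimal sketch with a hypothetical function name, assuming tab-separated input. Unlike `cut -f5`, which prints lines without a tab unchanged (hence the "#version 2.4" entry in the listing above), this version skips lines that have fewer than five fields:

```python
from collections import Counter


def count_column(lines, column=5, sep="\t"):
    """Count the values in a 1-based column, skipping lines that are too short."""
    counts = Counter()
    for line in lines:
        fields = line.rstrip("\n").split(sep)
        if len(fields) >= column:
            counts[fields[column - 1]] += 1
    return counts
```

Dividing the count for "NA" by the total number of counted rows gives the fraction of mutations without a chromosome.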

Naming inconsistencies

Should these tars be renamed to match their corresponding cancer study identifiers?

Cancer Study Identifier    tar filename
nepc_wcm_2016              prad_cornell_2016.tar.gz
thyroid_mskcc_2016         thca_mskcc_2016.tar.gz
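
If the tars are renamed, the convention is easy to check automatically. A minimal sketch, assuming the intended rule is "tar filename equals cancer study identifier plus .tar.gz"; the function name is hypothetical:

```python
def tar_matches_study(study_id, tar_filename, suffix=".tar.gz"):
    """Return True if the tar file is named after the cancer study identifier."""
    return tar_filename == study_id + suffix
```

Both pairs listed above fail this check, while a pair such as brca_tcga and brca_tcga.tar.gz passes.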

Broken seed data - references to missing tables

The schema file for the seed data doesn't contain a reference to attribute_metadata, yet the data files attempt to lock that table.

At a guess, the seed data is from a version that predates the schema; the table is dropped by a migration anyway.

In any event, you can't load the schema and the data right now.

Should we upload the unpacked files instead of gz files

For discussion:

Often I just want to look at one file, but I have to download the whole study to do that.

Gzipped files are also not good for comparing versions or keeping track of changes.

And if I find a small issue, I want to be able to fix just one txt file instead of uploading the whole gzipped file.

TCGA Disease Free Survival Issue (moved from cbioportal project)

Reported by end user:

I’m trying to reproduce your disease free survival plots but am unclear how your pipelines select from multiple relapse events for the same patients. For example patient TCGA-CU-A72E has two new tumor event times: 256 days and 364 days. Cbioportal selects 364 days. Isn’t 256 more correct if we are trying to measure disease free survival?

More info from user:

Here is a link to gdc data portal from which you can download clinical metadata for TCGA-CU-A72E: https://portal.gdc.cancer.gov/files/4f3ae24f-eecd-4ba3-a592-cb9af99ed6e2

Download, de-compress, open and go to follow-ups section. Under follow-ups you’ll find two new tumor event entries (256 and 364).

BLCA GDAC firehose merged clinical files also shows these two events.

Notes from Ethan: Reached out to Ben, who confirmed that this calculation is done within one of the MSK pipelines (that is not currently on github). Also confirmed with @schultzn that we should be using the first event, not the second.
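
The agreed rule (use the first event) comes down to taking the smallest new tumor event time per patient. A minimal sketch of that rule only; the function name is hypothetical and this is not the MSK pipeline code, which is not public:

```python
def disease_free_days(new_tumor_event_days):
    """Return the earliest new tumor event time, or None if there is none."""
    days = [d for d in new_tumor_event_days if d is not None]
    return min(days) if days else None
```

For patient TCGA-CU-A72E, with events at 256 and 364 days, this yields 256.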

lgg_ucsf_2014: errors when loading the study

The study lgg_ucsf_2014 has missing files. Specifically, it misses the meta_study file and the meta files for the different timepoint data files. After I added this, more errors were raised by the validator:
screenshot 2017-06-29 at 16 07 20
screenshot 2017-06-29 at 16 07 34

CCSK.html

data_cna.txt in the CCSK data set has an issue defined as "Invalid CNA value: possible values are [-2, -1, 0, 1, 2, NA]"
The values encountered are "-0.0072, 0.0078, 0.0799, (1023 more)"
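
The rule quoted in the error message can be checked on its own, for example to count how many cells would need to be converted. A minimal sketch, assuming the cell values are read as strings; the names are hypothetical and this is not the validator's code:

```python
# Allowed discrete CNA values, as listed in the validator message above.
ALLOWED_CNA = {"-2", "-1", "0", "1", "2", "NA"}


def invalid_cna_values(values):
    """Return the values that are not allowed discrete CNA calls."""
    return [v for v in values if v.strip() not in ALLOWED_CNA]
```

Continuous values such as -0.0072 or 0.0799 are reported as invalid, which suggests the file holds continuous rather than discrete copy-number data.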

Number of rows/columns different in rppa/rppa_zscores for brca

Hi,

I was wondering why the number of rows (226) and columns (845) in the data_rppa_Zscores.txt is different from the number of rows (227) and columns (939) in data_rppa.txt for the brca study.

In the Zscore file, the first row after the header seems to be missing. For the columns, I thought maybe there are samples with only NA values for which Zscores couldn't be calculated, but this doesn't seem to be the case.

Kind regards,
Sander

Loading CPTAC data generates many "gene" records

When loading brca study with CPTAC mass spectrometry data, the portal will generate a large amount of new "gene" records to store the data reported for each separate isoform(?) in the CPTAC files (72,159 new records in gene table!)

Here are some concerns:

  • the query page becomes slow when typing "PHOSPHOPROTEIN" in the Genes box (each new protein "gene" also gets this alias). The resulting drop down is very slow.
  • depending on what each symbol means in the CPTAC data file, this solution might not be scalable. For example: are SORBS1_pT72 or SORBS1_pT82_S89 encoding modifications to the canonical protein sequence known for gene SORBS1 rather than symbols of well known isoforms? If so, we risk an explosion of the number of records in the gene table as each study finds new modifications.

Another question I had when looking at the data (see data sample below) is:

  • how is the entry SORBS1|SORBS1 made? Is this an aggregation of all the other SORBS1|* items? How is this aggregation done?

Data sample from file:

SORBS1|SORBS1   0.545571655184  1.31369690336   1.20131762167   1.1320980343    0.54739111875   1.19041192239   2.73163154855   0.948705044244  1.33867851356   2.12510951076   1.01727605533   1.3008073214
SORBS1|SORBS1_pT72      0.0     0.0     0.0     0.0     0.0     0.0     0.0     0.0     0.0     0.319789927879  0.325725496261  0.453594164798  0.0     0.0     0.0     0.0     0.0     0.0     0.0     0.0
SORBS1|SORBS1_pS76      0.0     0.0     0.0     0.0     0.0     0.0     0.802830082054  1.10324826511   1.43253238093   0.0     0.0     0.0     0.0     0.0     0.0     0.0     0.0     0.0     0.0     0.0
SORBS1|SORBS1_pS77      0.0     0.0     0.0     0.0     0.0     0.0     0.0     0.0     0.0     0.0     0.0     0.0     0.0     0.0     0.0     0.0     0.0     0.0     0.0     0.0     0.0     0.0     0.0
SORBS1|SORBS1_pS78      0.0312601610742 0.247415810145  1.62051540467   1.11476716945   0.849155398463  2.08833175784   0.0     0.0     0.0     0.0     0.0     0.0     1.32989448951   0.754650122826  0.78

Timeline file for events like status fails on validation

Event types such as status, specimen, and surgery, which are single-day events, don't have a stop date. The validator fails when it does not find a stop date in a timeline file. A stop date is only needed for durations such as treatments; other events may or may not have one. Therefore, maybe report a missing stop date as a warning rather than an error.
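
The proposed behaviour could look roughly like the sketch below. It assumes each timeline row is a dict with STOP_DATE and EVENT_TYPE keys (the column names are an assumption here) and uses a hypothetical function name; it is not the validator's implementation:

```python
def check_stop_date(row):
    """Return (level, message) for one timeline row.

    A missing stop date yields a warning instead of an error.
    """
    stop = (row.get("STOP_DATE") or "").strip()
    if stop:
        return ("ok", "")
    event_type = row.get("EVENT_TYPE", "unknown")
    return ("warning", "no STOP_DATE for event type " + event_type)
```

With this behaviour a single-day event without a stop date no longer blocks loading, but it still shows up in the report.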

CCSK data set

The file: target_ccsk_pa_01_cna_hg19.seg has failures:

Sample ID not defined in clinical file in lines 2, 3, 4, (2655 more)

Unknown chromosome, must be one of (24|20|21|22|23|1|3|2|5|4|7|6|9|8|Y|X|11|10|13|12|15|14|17|16|19|18) in lines 30, 61, 458, (20 more)

and
Blank cell found in column in line 2497

After uploading a study it does not appear in cBioPortal unless it is restarted

Hello authors

I am trying to use cBioPortal for visualization of our sequencing data. Could you suggest how to overcome a problem with data loading? Newly uploaded data is not visible in cBioPortal until the next restart, which is not optimal. Perhaps you know an easy way to disable some cache so that cBioPortal always queries the database?

Thank you
Best regards,
Marian Caikovski

CBTTC-0005 Data Set

The file called data_expression_median.txt in CBTTC-0005 has the following failure messages:

Value is neither a real number nor NA in lines 2, 3, 4, (22947 more) and column number 21

and

Entrez gene id is non-positive in lines 12, 73, 129, (2414 more)

Add CPTAC data

Would be nice to have the CPTAC data in the next version (Ovarian, breast, and colorectal?)

Notify: @zheins

Missing 5 meta files in thca_tcga study

These files in the thca_tcga study are missing meta files:

  • data_CNA.txt
  • data_methylation_hm450.txt
  • data_mutations_extended.txt
  • data_RNA_Seq_v2_expression_median.txt
  • data_rppa_Zscores.txt

msk_impact_2017

Requested changes

meta_cna.txt

profile_name: MSK-IMPACT Clinical Sequencing Cohort (MSKCC)
to
profile_name: Putative copy-number alterations

Replace value in profile_name and profile_description with correct value. (see #65)

meta_fusions.txt

stable_id: msk_impact_2017_mutations
to
stable_id: fusion

Add: data_filename: data_fusions.txt

data_mutations_extended.txt

Line 7058: pp.C189_A190delinsWH to p.C189_A190delinsWH
Line 13878: p. L482_E483delinsF* to p.L482_E483delinsF*
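
Both corrections follow the same pattern (a doubled "p" prefix, or whitespace after "p."), so they could be applied to the whole column rather than line by line. A minimal sketch with a hypothetical function name; it only handles these two patterns:

```python
import re


def normalize_protein_change(value):
    """Fix a doubled 'pp.' prefix and whitespace after 'p.' in a protein change."""
    value = value.strip()
    # Collapse a repeated prefix such as "pp." into "p.".
    value = re.sub(r"^p+\.", "p.", value)
    # Remove whitespace directly after the prefix.
    value = re.sub(r"^p\.\s+", "p.", value)
    return value
```

Values that are already well-formed pass through unchanged.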

meta_study.txt

Add: add_global_case_list: true

data_fusions.txt

Replace 0 in Entrez_Gene_Id with real values

data_mutations.txt

Replace pp.C189_A190delinsWH with p.C189_A190delinsWH

Gene panels

@zheins mentioned there is gene panel data for msk_impact, but it's currently not included in the files.

IPOP data

  • There is no RNA-Seq count for study 4
  • survival plot is not shown for the 2 new studies
  • Priority

tcga_pub studies cannot be loaded

Currently none of the 16 _tcga_pub studies can be loaded. @pulyakhina and I have validated them all, and these are the most common errors:

  • Clinical data files in old format: data_clinical.txt instead of _data_patient.txt and _data_sample.txt
  • Stable ids in meta files are in an invalid format: stable_id: blca_tcga_pub_gistic instead of stable_id: gistic
  • For .seg and multiple meta files, file type cannot be determined. Could not determine the file type. Please check your meta files for correct configuration. genetic_alteration_type: PROTEIN_LEVEL, datatype: CONTINUOUS
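
The stable id problem in the second item is a plain prefix removal. A minimal sketch, assuming the invalid ids are always the study identifier followed by an underscore; the function name is hypothetical:

```python
def short_stable_id(stable_id, study_id):
    """Strip a leading '<study_id>_' from a stable id, if present."""
    prefix = study_id + "_"
    if stable_id.startswith(prefix):
        return stable_id[len(prefix):]
    return stable_id
```

For blca_tcga_pub this turns blca_tcga_pub_gistic into gistic and leaves an id that is already short untouched.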

Error reports:
Reports_tcga_pub.zip

Affected studies:

  1. blca_tcga_pub
  2. brca_tcga_pub
  3. coadread_tcga_pub
  4. gbm_tcga_pub
  5. hnsc_tcga_pub
  6. kich_tcga_pub
  7. kirc_tcga_pub
  8. laml_tcga_pub
  9. lgggbm_tcga_pub
  10. luad_tcga_pub
  11. lusc_tcga_pub
  12. ov_tcga_pub
  13. prad_tcga_pub
  14. stad_tcga_pub
  15. thca_tcga_pub
  16. ucec_tcga_pub

Missing clinical attributes for paad_icgc

When running the validation from the command line (validateData.py), I noticed that study "paad_icgc" does not contain clinical data ("data_clinical.txt"). When you go to cBioPortal.org, you also don't see clinical data there (only mutation data). However, the publication (http://www.ncbi.nlm.nih.gov/pubmed/23103869) does contain clinical data as far as I understand (e.g., figure 2 has some survival curves), which indicates that the initial study should contain clinical data and we lost it somewhere. Could we check that somehow?

Infant MLL Study summary - uncertainty

When I look at the study summary of Infant MLL-Rearranged Acute Lymphoblastic Leukemia, I see an uncertainty in the list of mutated genes.

The number of samples (#) is higher than the number of profiled samples.
image

Missing CANCER_TYPE_DETAILED in tcga studies?

In cbioportal.org we now see the pancancer histogram view for BRCA study:

image

However, this does not seem to appear in the study loaded from datahub? Is the CANCER_TYPE_DETAILED field missing or empty in the datahub tcga studies?

Sample IDs ov_tcga

@pieterlukasse mentioned in #11:

"Found strange sample ids in the end of the file. So the last samples contain much longer part of the "barcode" (and actually do not comply to the TCGA barcode format...) and potentially overlap with samples in previous lines, given data loader will only use the first part of the barcode, i.e. only TCGA-42-2591-01 portion of TCGA-42-2591-01A-21)"

TCGA-13-0800    TCGA-13-0800-10 cd5a08e6-343a-494e-b73a-2b060c33451d    [Not Available] [Not Available] [Not Available] [Not Available] [Not Available] [Not Available] NO      [Not Available] [Not Availab
TCGA-13-0801    TCGA-13-0801-01 dfaf19b4-03b4-49d3-b39f-01aaeb897a2a    [Not Available] [Not Available] [Not Available] [Not Available] [Not Available] 0.6     NO      1.5     [Not Available] [Not Availab
TCGA-13-0801    TCGA-13-0801-10 d2c929db-171c-4b5c-9de5-613407999d56    [Not Available] [Not Available] [Not Available] [Not Available] [Not Available] [Not Available] NO      [Not Available] [Not Availab
TCGA-13-0802    TCGA-13-0802-01 9e78346d-14d1-4c3e-a144-5b90c32a2731    [Not Available] [Not Available] [Not Available] [Not Available] [Not Available] 1.1     NO      1.4     [Not Available] [Not Availab
TCGA-13-0802    TCGA-13-0802-10 c1ab65a8-155e-40dc-a2ae-d526e0ee5138    [Not Available] [Not Available] [Not Available] [Not Available] [Not Available] [Not Available] NO      [Not Available] [Not Availab
TCGA-13-0803    TCGA-13-0803-01 13227d89-2bd2-4775-80a5-3fc08927dd25    [Not Available] [Not Available] [Not Available] [Not Available] [Not Available] 0.9     NO      1.3     [Not Available] [Not Availab
TCGA-13-0803    TCGA-13-0803-10 81bdc7f0-3cbf-4965-976d-6fb3bd080d72    [Not Available] [Not Available] [Not Available] [Not Available] [Not Available] [Not Available] NO      [Not Available] [Not Availab
TCGA-13-0804    TCGA-13-0804-01 7432f954-a16f-4053-bedd-aa776f939f71    [Not Available] [Not Available] [Not Available] [Not Available] [Not Available] 1.3     NO      1.5     [Not Available] [Not Availab
TCGA-13-0804    TCGA-13-0804-10 47946a17-b529-4490-894b-e74baf6beb3c    [Not Available] [Not Available] [Not Available] [Not Available] [Not Available] [Not Available] NO      [Not Available] [Not Availab
TCGA-13-0805    TCGA-13-0805-01 01cd17b5-aa30-4a47-a1c5-501f9aedb3d2    [Not Available] [Not Available] [Not Available] [Not Available] [Not Available] 0.8     NO      1       [Not Available] [Not Availab
TCGA-13-0805    TCGA-13-0805-10 49ad654d-d1de-4cf8-a689-9193bb0926db    [Not Available] [Not Available] [Not Available] [Not Available] [Not Available] [Not Available] NO      [Not Available] [Not Availab
TCGA-10-0931-01A-21     TCGA-10-0931-01A-21-20  NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA
TCGA-42-2582-01A-21     TCGA-42-2582-01A-21-20  NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA
TCGA-61-2610-01A-21     TCGA-61-2610-01A-21-20  NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA
TCGA-13-0901-01A-21     TCGA-13-0901-01A-21-20  NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA
TCGA-13-1411-01A-21     TCGA-13-1411-01A-21-20  NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA
TCGA-42-2589-01A-21     TCGA-42-2589-01A-21-20  NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA
TCGA-36-2552-01A-21     TCGA-36-2552-01A-21-20  NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA
TCGA-13-1410-01A-21     TCGA-13-1410-01A-21-20  NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA
TCGA-36-2540-01A-21     TCGA-36-2540-01A-21-20  NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA
TCGA-04-1341-01A-21     TCGA-04-1341-01A-21-20  NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA
TCGA-24-1416-01A-21     TCGA-24-1416-01A-21-20  NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA
TCGA-42-2591-01A-21     TCGA-42-2591-01A-21-20  NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA
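
The possible overlap can be checked by truncating every sample id the way the quoted comment describes and looking for ids that collapse to the same value. The truncation rule below is a reading of that comment, not the data loader's actual code, and the names are hypothetical:

```python
import re
from collections import defaultdict

# First four barcode fields, e.g. TCGA-42-2591-01 (assumed truncation rule).
SAMPLE_BARCODE = re.compile(r"^(TCGA-[A-Z0-9]{2}-[A-Z0-9]{4}-\d{2})")


def loader_sample_id(sample_id):
    """Return the leading sample part of a barcode, or the id unchanged."""
    match = SAMPLE_BARCODE.match(sample_id)
    return match.group(1) if match else sample_id


def colliding_ids(sample_ids):
    """Group ids that map to the same truncated id and keep groups larger than one."""
    groups = defaultdict(list)
    for sid in sample_ids:
        groups[loader_sample_id(sid)].append(sid)
    return {key: ids for key, ids in groups.items() if len(ids) > 1}
```

Running `colliding_ids` over the sample id column would list every long id at the end of the file that overlaps with a regular sample id earlier in the file.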

@zheins mentioned in #11 a fix is underway, but this has not been merged yet.
