Small version GN database (<=2GB) (genenetwork2 issue, closed)

 avatar commented on August 22, 2024
Small version GN database (<=2GB)

from genenetwork2.

Comments (21)

 avatar commented on August 22, 2024

Done.

https://github.com/genenetwork/gndatabase/blob/master/db_webqtl_small.zip

compressed: 512MB
uncompressed: 1.3GB

pjotrp avatar pjotrp commented on August 22, 2024

Sorry, I think we ought to move to S3, unless someone tells us how to download this file ;)

lomereiter avatar lomereiter commented on August 22, 2024

I second Pjotr's request. Even though I installed git-lfs and found the git lfs smudge command, it didn't help: the response is '403 Forbidden'. Another advantage of S3 over GitHub LFS servers is fairer pricing.

pjotrp avatar pjotrp commented on August 22, 2024

We are uploading to S3. Kinda surprised; even for a beta I expect better from GitHub.

 avatar commented on August 22, 2024

I have put it onto Amazon S3.
https://s3.amazonaws.com/genenetwork2/db_webqtl_small.zip

lomereiter avatar lomereiter commented on August 22, 2024

Thanks, Lei. It would be good to attach a README with instructions. The procedure I used is:

  1. Create an empty db_webqtl_s database from the mysql console.
  2. Copy the files from the extracted db_webqtl_s directory into /var/lib/mysql/db_webqtl_s.
  3. Set the correct permissions (for me it was chown mysql:mysql and chmod 660 on /var/lib/mysql/db_webqtl_s/*).

I also wish a dataset with case attributes had been included:

> select * from CaseAttributeXRef, ProbeSetFreeze 
>          where CaseAttributeXRef.ProbeSetFreezeId = ProbeSetFreeze.Id;
Empty set (0.04 sec)
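
The same check can be scripted so it runs as part of a smoke test for the small database. What follows is a minimal sketch, not GN2 code: it uses an in-memory SQLite database as a stand-in for MySQL, only the columns `CaseAttributeXRef.ProbeSetFreezeId` and `ProbeSetFreeze.Id` come from the query above, and the `Name` and `Value` columns, the demo rows, and both function names are hypothetical.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory stand-in for the test database (assumption: SQLite, not MySQL)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE ProbeSetFreeze (Id INTEGER PRIMARY KEY, Name TEXT);
        CREATE TABLE CaseAttributeXRef (ProbeSetFreezeId INTEGER, Value TEXT);
        INSERT INTO ProbeSetFreeze VALUES (1, 'demo_dataset_a'), (2, 'demo_dataset_b');
        """
    )
    return conn


def datasets_with_case_attributes(conn):
    """Return the Ids of datasets that have at least one case attribute row."""
    rows = conn.execute(
        "SELECT DISTINCT ProbeSetFreeze.Id "
        "FROM CaseAttributeXRef, ProbeSetFreeze "
        "WHERE CaseAttributeXRef.ProbeSetFreezeId = ProbeSetFreeze.Id "
        "ORDER BY ProbeSetFreeze.Id"
    ).fetchall()
    return [row[0] for row in rows]


if __name__ == "__main__":
    conn = build_demo_db()
    # No cross-reference rows yet, so the join is empty, as in the thread.
    print(datasets_with_case_attributes(conn))
    # After adding one (hypothetical) case attribute, dataset 2 is reported.
    conn.execute("INSERT INTO CaseAttributeXRef VALUES (2, 'F')")
    print(datasets_with_case_attributes(conn))
```

An empty result from `datasets_with_case_attributes` corresponds to the "Empty set" reply above, meaning no dataset in the database has case attributes attached.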

pjotrp avatar pjotrp commented on August 22, 2024

The README can go into the GN2 tree (root level) in INSTALL.md.

Case attributes are required.

lomereiter avatar lomereiter commented on August 22, 2024

I also have a request to have at least one example dataset for each DataScale in the test database. Currently select * from ProbeSetFreeze; returns just two rows, and for both DataScale is log2.

 avatar commented on August 22, 2024

Fixed a bug in the small database.
https://s3.amazonaws.com/genenetwork2/db_webqtl_s.zip

DannyArends avatar DannyArends commented on August 22, 2024

Got it working now, and can search for traits in the dataset: Hippocampus Consortium M430v2 (Jun06)

However, I get an error when I try to run any of the mapping tools:

  Marker regression line 78
  self.markers = dataset.group.get_markers()
  Error: no JSON object could be decoded

Is this due to marker data being missing?
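
One way to test that hypothesis: the message matches what Python 2's `json` module raises (a `ValueError`) when it is handed empty or non-JSON input. The sketch below assumes the markers are read from a JSON file, which the thread does not confirm; `load_markers` and its `path` argument are hypothetical and are not part of GN2. It separates the three failure cases (file missing, file empty, file not valid JSON) so the error says which one occurred.

```python
import json
import os


def load_markers(path):
    """Load marker data from a JSON file and fail with a specific message.

    `path` and the file layout are hypothetical; the thread does not say
    where get_markers() reads its data from.
    """
    if not os.path.exists(path):
        raise FileNotFoundError("marker file not found: " + path)
    with open(path, encoding="utf-8") as handle:
        text = handle.read()
    if not text.strip():
        raise ValueError("marker file is empty: " + path)
    try:
        return json.loads(text)
    except ValueError as err:  # json.JSONDecodeError is a subclass of ValueError
        raise ValueError("marker file is not valid JSON: %s (%s)" % (path, err))
```

If a check like this reports a missing or empty file for the group being mapped, the marker data was probably left out of the small database bundle rather than corrupted.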

Additionally I get errors on:

  • help, references, links, policies and environments pages (error: table db_webqtl_s.Docs doesn't exist)
  • news page (error: table db_webqtl_s.News doesn't exist)

Can we add those two missing tables to the zip file?
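
A check like the following could catch missing tables before the pages fail. It is a minimal sketch under stated assumptions: `cursor` is a DB-API cursor connected to MySQL that accepts the `%s` parameter style, and the names `missing_tables` and `FakeCursor` are hypothetical. `FakeCursor` exists only so the example runs without a database server.

```python
def missing_tables(cursor, schema, required):
    """Return the names in `required` that do not exist in `schema`.

    Assumes a DB-API cursor on MySQL using the `%s` parameter style.
    """
    cursor.execute(
        "SELECT table_name FROM information_schema.tables "
        "WHERE table_schema = %s",
        (schema,),
    )
    present = {row[0] for row in cursor.fetchall()}
    return sorted(set(required) - present)


class FakeCursor:
    """Stand-in cursor (hypothetical) that pretends the given tables exist."""

    def __init__(self, tables):
        self._tables = list(tables)
        self.last_query = None

    def execute(self, query, params=None):
        # Record the call; a real cursor would send it to the server.
        self.last_query = (query, params)

    def fetchall(self):
        return [(name,) for name in self._tables]


if __name__ == "__main__":
    cursor = FakeCursor(["ProbeSetFreeze", "CaseAttributeXRef"])
    print(missing_tables(cursor, "db_webqtl_s", ["Docs", "News", "ProbeSetFreeze"]))
    # prints ['Docs', 'News']
```

With a real connection, the list of required tables would have to come from whatever GN2 actually queries; the three names used here are just the ones mentioned in this thread.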

 avatar commented on August 22, 2024

Added tables:
db_webqtl_s.Docs
db_webqtl_s.News

https://s3.amazonaws.com/genenetwork2/db_webqtl_s.zip

 avatar commented on August 22, 2024

  1. Download https://s3.amazonaws.com/genenetwork2/db_webqtl_s.zip and unzip it in the MySQL data directory (for example /var/lib/mysql).
  2. chown -R mysql:mysql db_webqtl_s/
  3. chmod 700 db_webqtl_s/
  4. chmod 660 db_webqtl_s/*
  5. Restart the MySQL service.
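
The steps above can be sketched as a small script. This is an illustration under assumptions, not an official installer: it assumes the archive has already been downloaded, that it contains a single top-level db_webqtl_s/ directory of table files, and that `data_dir` is the MySQL data directory. The function name `install_small_db` and its parameters are hypothetical. Changing ownership needs root, so `owner` and `group` are optional.

```python
import os
import shutil
import zipfile


def install_small_db(zip_path, data_dir, owner=None, group=None):
    """Unpack the database archive into data_dir and set permissions.

    Assumes the archive holds one top-level db_webqtl_s/ directory.
    Pass owner="mysql", group="mysql" when running as root.
    """
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(data_dir)

    db_dir = os.path.join(data_dir, "db_webqtl_s")
    os.chmod(db_dir, 0o700)  # chmod 700 db_webqtl_s/

    for name in os.listdir(db_dir):
        path = os.path.join(db_dir, name)
        os.chmod(path, 0o660)  # chmod 660 db_webqtl_s/*
        if owner or group:
            shutil.chown(path, user=owner, group=group)

    if owner or group:
        shutil.chown(db_dir, user=owner, group=group)  # chown mysql:mysql
    return db_dir
```

Restarting the MySQL service is left out on purpose, because the command depends on the init system in use.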

DannyArends avatar DannyArends commented on August 22, 2024

Thanks, seems to work...

Could we add the WGCNA example dataset to the genenetwork database (and the small subset)?

Then I can use that as a test dataset for WGCNA integration in GN2.
Additionally, this might be nice for future workshops, since people can then see how to use WGCNA in GN2 compared to using it in R.

The example dataset is at: http://labs.genetics.ucla.edu/horvath/CoexpressionNetwork/Rpackages/WGCNA/Tutorials/FemaleLiver-Data.zip

We do, however, need to reformat it into the GN2 structure.

robwwilliams avatar robwwilliams commented on August 22, 2024

Dear Danny, Lei and team,

This should be easy. That data set (and all other data sets for this cross)
is already in the full GN1 database. In fact, I recently corrected errors in
sex assignment in this database. GN1 has phenotypes, genotypes, and four
gene expression data sets (including the liver data set). The liver data set
is presented as Male, Female, and Combined.

Here is a piece of the CSV file with the case IDs used in the Horvath
example:

Mice Number Mouse_ID Strain sex DOB parents Western_Diet Sac_Date weight_g
length_cm ab_fat other_fat total_fat comments 100xfat_weight Trigly
Total_Chol HDL_Chol UC FFA Glucose LDL_plus_VLDL MCP_1_phys Insulin_ug_l
Glucose_Insulin Leptin_pg_ml Adiponectin Aortic lesions Note Aneurysm
Aortic_cal_M Aortic_cal_L CoronaryArtery_Cal Myocardial_cal BMD_all_limbs
BMD_femurs_only 1 F2_290 290 306-4 BxH ApoE-/-, F2 2 3/22/02 229232 5/14/02
9/11/02 36.9 9.9 2.53 2.26 4.79 NA 12.98102981 53 1167 50 484 121 437 1117
175.85 924 0.472943723 245462 11.274 496250 NA 16 0 17 0 0 NA NA 2 F2_291
291 307-1 BxH ApoE-/-, F2 2 3/22/02 232 5/14/02 9/11/02 48.5 10.7 2.9 2.97
5.87 NA 12.10309278 61 1230 32 592 173 572 1198 92.43 5781 0.098944819
84420.88 7.099 NA NA 16 4 0 2 4 0.0548 0.0773 3 F2_292 292 307-2 BxH
ApoE-/-, F2 1 3/22/02 232 5/14/02 9/11/02 45.7 10.4 1.04 2.31 3.35 NA
7.330415755 41 1285 81 460 96 497 1204 196.398 2074 0.239633558 105889.76
5.795 218500 NA 0 0 11 0 0 0.0554 0.08065 4 F2_293 293 307-3 BxH ApoE-/-, F2
1 3/22/02 232 5/14/02 9/11/02 50.3 10.9 0.91 1.89 2.8 NA 5.566600398 271
1299 64 476 122 553 1235 97.466 11874 0.046572343 100398.68 5.495 61250 NA 0
0 0 0 236 0.0597 0.0868 5 F2_294 294 307-4 BxH ApoE-/-, F2 1 3/22/02 232
5/14/02 9/11/02 44.8 9.8 1.22 2.47 3.69 NA 8.236607143 114 1410 50 516 118
535 1360 95.452 9181 0.058272519 130846.3 6.868 243750 NA 12 10 0 0 0 NA NA
6 F2_295 295 308-1 BxH ApoE-/-, F2 1 3/22/02 232 5/14/02 9/11/02 39.2 10.2
3.06 2.49 5.55 NA 14.15816327 72 1533 18 620 106 382 1515 144.27 485
0.787628866 75166.22 17.328 104250 NA 17 2 0 0 0 0.0557 0.077

(In reply to Danny Arends's comment above, sent Fri, Sep 11, 2015 at 11:55 AM.)

Rob

Robert W. Williams, Ph.D.
UT-ORNL Governor's Chair in Computational Genomics
Chair, Department of Genetics, Genomics and Informatics
University of Tennessee Health Science Center

pjotrp avatar pjotrp commented on August 22, 2024

I have moved the test database to GNU Guix. A direct download is possible through http://files.genenetwork.org/raw_database/

pjotrp avatar pjotrp commented on August 22, 2024

@Lyan6 can you document the steps you took to create this smaller database? Thanks!

 avatar commented on August 22, 2024

Finished.

https://github.com/genenetwork/genenetwork/blob/master/web/webqtl/maintainance/gndb-shrink.sql

pjotrp avatar pjotrp commented on August 22, 2024

Thanks!

pjotrp avatar pjotrp commented on August 22, 2024

@Lyan6 can we deploy the small database on Lily?

leiyan avatar leiyan commented on August 22, 2024

I deployed a small GN database on Lily, and the db name is “db_webqtl_s”.

pjotrp avatar pjotrp commented on August 22, 2024

Thanks!
