c5sire / biostat-578

This project forked from raphg/biostat-578

A repository for all of my teaching material

License: Creative Commons Zero v1.0 Universal

HTML 99.96% R 0.02% Shell 0.03%

biostat-578's Introduction

BIOSTAT 578A: Bioinformatics for Big Omics Data

Important Note: I am modifying the content of this repository in preparation for Winter 2015, and the first few lectures have already been updated. To be informed of all changes, please create a GitHub account and watch the repository. Please also read the "Getting Started" section below, as I expect you to complete a few tasks before the course starts, and log in to MyUW to look for additional information on Canvas.

Instructor: Raphael Gottardo, PhD, Fred Hutchinson Cancer Research Center

If you need to contact me, please email me at [email protected].

Time and location: T & Th 9:00-10:20 HST T439

Prerequisite: BIOSTAT 511/12 or permission of the instructor. Please email me if you're unsure.

Getting Started: Please look at this document to get set up before the first class. This includes some reading and learning about R, Bioconductor, git, and GitHub.

Grading scheme (Tentative): HW (40%), Midterm (30%), Final project (30%)

Important dates: Midterm (Feb 19); final project presentations during the last 2 weeks of class (March 3, 5, 10 & 12).

Scope: This practical, hands-on course in bioinformatics for high-dimensional omics emphasizes how to use statistical methods, the R programming language, and the Bioconductor project as tools to manipulate, visualize, and analyze real-world omics datasets. The course is organized around the following topics:

  • Introduction to computing for Bioinformatics using R: Introduction to R/RStudio, review of main data structures and tools for efficient and reproducible research, data manipulation and visualization
  • Managing "big omics data" using relational databases: Overview of main database management systems (MySQL, Postgres, SQLite), and review of the Structured Query Language and main operations
  • How to connect to a database from R, and alternatives to databases in R (sqldf and data.table)
  • How to evaluate and adjust the data for the presence of batch effects
  • Regression techniques for high-throughput biomedical data: Multiple regression analysis and logistic regression, ANOVA and design of experiments
  • Statistical methods for high-dimensional hypothesis testing: Permutation tests, empirical Bayes, and multiple comparison adjustment
  • Modeling of gene expression data: Introduction to Bioconductor, and basic packages for gene expression analysis (GEOquery, limma, DAVIDquery, etc.)
  • Genome-wide association studies and eQTLs; review of main packages in R/Bioconductor (e.g., R/qtl)
  • Overview of other high-throughput technologies (e.g. RNA-seq, ChIP-seq) and available tools in R/Bioconductor
  • Data integration: Using R to integrate multiple data types and perform "systems biology" type analysis
  • Drawbacks and limitations of high-dimensional omics analysis (overfitting, inference)

Note that this is a tentative outline and minor modifications are likely to occur. Please watch this page regularly for updates.

Lecture notes: Notes are provided as source files (.Rmd) together with the resulting HTML files for online viewing. If you'd like to print these notes, please use the intermediate .md file and gitprint.

biostat-578's People

Contributors

brianhigh, earosenthal
