pwilmart / yeast_carbonsources

Yeast TMT data - 3 different carbon sources (from Gygi lab) analyzed with PAW pipeline and MaxQuant

License: MIT License

Yeast_CarbonSources

Yeast grown with galactose, glucose, or raffinose as the carbon source, in triplicate, and labeled with 10-plex TMT reagents (data from the Gygi lab).

These are analyses of a public dataset (PRIDE PXD002875) from Paulo, O'Connell, Gaun, and Gygi:

Paulo, J.A., O’Connell, J.D., Gaun, A. and Gygi, S.P., 2015. Proteome-wide quantitative multiplexed profiling of protein expression: carbon-source dependency in Saccharomyces cerevisiae. Molecular Biology of the Cell, 26(22), pp.4063-4074.

The dataset has 24 RAW files. The design was 3x3 (three carbon sources, each in triplicate), so nine of the ten TMT channels were used. Data were acquired with the SPS MS3 (MultiNotch) method on a Thermo Fusion instrument.

Analyses:


PAW folder contents

File types:

  • *.ipynb - Jupyter notebooks

  • *.r - code cells from notebooks

  • *.html - notebooks rendered in html

  • results_files folder:

    • *.log - console output log files from pipeline steps
    • *.txt - tab-delimited text results files
      • protein summaries
      • peptide summaries
    • *.xlsx - Excel files
    • R-input.txt - prepped table of TMT data for importing into R
    • CarbonSources_results.txt - statistical testing results from R

MQ folder contents

Files:

  • CarbonSources_MQ.ipynb - Jupyter notebook for statistical analysis
  • CarbonSources_MQ.html - html rendering of notebook
  • CarbonSources_MQ.r - code cells from notebook
  • CarbonSources_results.txt - statistical testing results
  • parameters.txt - summary of MQ parameter settings
  • proteinGroups.txt - main protein-level results file from MQ
  • proteinGroups.xlsx - modified Excel file (for table prepping)
  • R-input.txt - prepped table for import into R
  • summary.txt - summary file from MQ (LC run stats)

R input table prep

Basic steps were similar for both pipelines:

  • flag proteins to exclude
    • common contaminants
    • decoys
    • proteins with no reporter ion signals
  • sort excluded proteins to bottom of table
  • make new tab
    • add column of protein accessions
    • add columns of the TMT channels
  • export the new tab contents to text files
    • table should be well-formed (single header line) and rectangular
  • read text file into R
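
The prep steps above were done in a spreadsheet. As a rough illustration only, the same logic can be sketched in Python with the standard library. The contaminant and decoy prefixes, the column names, and the data values below are all made up for the example; the actual tags depend on the pipeline (PAW or MaxQuant). The sketch also assumes that excluded proteins are left out of the exported table, which is one reading of the steps above.

```python
import csv
import io

# Hypothetical accession prefixes; real tags depend on the pipeline used.
CONTAMINANT_PREFIX = "CON_"
DECOY_PREFIX = "REV_"


def exclusion_flag(accession, intensities):
    """Return a reason to exclude the protein, or '' to keep it."""
    if accession.startswith(CONTAMINANT_PREFIX):
        return "contaminant"
    if accession.startswith(DECOY_PREFIX):
        return "decoy"
    if sum(intensities) == 0:
        return "no reporter ion signal"
    return ""


def prep_table(rows, channels):
    """Write unflagged proteins as a rectangular, tab-delimited table."""
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(["Accession"] + channels)  # single header line
    for row in rows:
        values = [float(row[ch]) for ch in channels]
        if exclusion_flag(row["Accession"], values) == "":
            writer.writerow([row["Accession"]] + values)
    return out.getvalue()


# Made-up channel names and intensities, for illustration only.
channels = ["gal_1", "glu_1", "raf_1"]
rows = [
    {"Accession": "protA", "gal_1": "10", "glu_1": "20", "raf_1": "30"},
    {"Accession": "CON_protB", "gal_1": "5", "glu_1": "5", "raf_1": "5"},
    {"Accession": "REV_protA", "gal_1": "1", "glu_1": "0", "raf_1": "2"},
    {"Accession": "protC", "gal_1": "0", "glu_1": "0", "raf_1": "0"},
]
table = prep_table(rows, channels)
```

Every data row has the same number of fields as the header, so the resulting text can be read into R as a well-formed table.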

Statistical test results in R are collected into a data frame in the same order as the imported proteins. At the end of the notebook, the data frame is written to a text file so the results can be added back to the main protein results spreadsheet. The accessions are included so that row alignment can be verified.
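
The alignment check can also be automated. The following is a minimal sketch in Python, not part of the original notebooks; the function name and the tuple layout of the results are assumptions made for the example.

```python
def merge_results(accessions, results):
    """Attach test results to proteins, verifying row-by-row alignment.

    accessions: accessions in the order of the main protein table
    results: (accession, p_value) tuples in the order of the results file
    """
    if len(accessions) != len(results):
        raise ValueError("row counts differ")
    merged = []
    for i, (acc, (res_acc, p_value)) in enumerate(zip(accessions, results)):
        if acc != res_acc:
            # Stop at the first misaligned row instead of merging silently.
            raise ValueError(f"row {i}: {acc!r} != {res_acc!r}")
        merged.append((acc, p_value))
    return merged
```

Raising an error on the first mismatch is a deliberate choice here: a shifted row would otherwise attach every later test result to the wrong protein.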

Eventually, there needs to be a coherent, comprehensive summary file that contains the proteomics results, the statistical testing results, and any other information that aids biological interpretation (rich annotations, etc.). This will be needed for publication and is a nice thing to include in data repositories. An Excel file is a good format for this because it is easy to add descriptive text and formatting. A basic Excel sheet can be easily distributed in Supplemental files and opened in Open Office applications.
