fgcz / rawdiag

Brings Orbitrap mass spectrometry data to life; multi-platform, fast and colorful R package

Home Page: https://bioconductor.org/packages/rawDiag

R 78.00% TeX 22.00%

fast multiplatform rpackage mass-spectrometry visualization r orbitrap

rawdiag's Issues

add scan filters to readXICs()

Our current readXICs() does not support scan filters. The compiled C# code uses the hard-coded filter

Filter = "ms"

which returns all scans (see line 1046 of fgcz_raw.cs).

It would be useful to give readXICs() an additional parameter that passes a filter to the C# function.
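Until the filter is passed to the C# side, the idea can be prototyped in R by pre-selecting scan numbers by scan type. This is only a minimal sketch: `selectScans`, the `ScanType` column name, and the toy scan type strings are assumptions made for illustration and are not part of rawDiag.

```r
# Hypothetical helper: return the scan numbers whose scan type string
# contains the given filter (case-insensitive, fixed string match).
selectScans <- function(metadata, filter = "ms") {
  stopifnot(is.data.frame(metadata),
            all(c("scanNumber", "ScanType") %in% names(metadata)),
            is.character(filter), length(filter) == 1)
  hit <- grepl(tolower(filter), tolower(metadata$ScanType), fixed = TRUE)
  metadata$scanNumber[hit]
}

# Toy metadata (made up for illustration)
meta <- data.frame(
  scanNumber = 1:4,
  ScanType = c("FTMS + c NSI Full ms [350.00-1500.00]",
               "FTMS + c NSI d Full ms2 [100.00-825.00]",
               "ITMS + c NSI r d Full ms2 [100.00-825.00]",
               "FTMS + c NSI Full ms [350.00-1500.00]"),
  stringsAsFactors = FALSE
)

ms2Scans <- selectScans(meta, filter = "ms2")  # MS2 scans only
allScans <- selectScans(meta, filter = "ms")   # "ms" matches every scan type
```

As with the hard-coded filter, `"ms"` matches every scan in this sketch, while a narrower string such as `"ms2"` restricts the selection.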

dark theme

gp <- PlotMassHeatmap(PXD006932, bins=40)

gp2 <- gp +
  theme(axis.line = element_blank(),
        axis.text.x = element_blank(),
        axis.text.y = element_blank(),
        axis.ticks = element_blank(),
        axis.title.x = element_blank(),
        axis.title.y = element_blank(),
        legend.position = "none",
        panel.background = element_blank(),
        panel.border = element_blank(),
        panel.grid.major = element_blank(),
        panel.grid.minor = element_blank(),
        plot.background = element_blank()) +
  theme(plot.title = element_blank()) +
  theme(plot.subtitle = element_blank()) +
  theme(strip.background = element_blank()) +
  theme(strip.text = element_blank()) +
  theme(plot.background = element_rect(fill = "black")) +
  theme(panel.spacing = unit(-1, "lines"))

ggsave(filename = "graphics/Thumb.png", gp2,
  device = 'png',
  dpi = 300,
  height = 9, width =16)

LFQ demo

code for tweet https://twitter.com/hb9feb/status/1014602529034915840

#R

library(protViz)
library(rawDiag)

f <- function(rawfile, pepSeq, dt = 0.1){
  mass2Hplus <- (parentIonMass(pepSeq) + 1.008) / 2
  X <- readXICs(rawfile = rawfile, masses = mass2Hplus)
  S <- read.raw(rawfile)
  
  idx <- lapply(mass2Hplus, function(m){
    which(abs(S$PrecursorMass - m) < 0.1)
  })
  
  scanNumbers <- lapply(idx, function(x){S$scanNumber[x]})
  
  bestMatchingMS2Scan <- sapply(1:length(pepSeq), function(i){
    peakList <- readScans(rawfile, scans = scanNumbers[[i]])
    
    peptideSpecMatch <- lapply(peakList,
                               function(x){
                                 psm(pepSeq[i], x, FUN = function (b, y){cbind(b, y)}, plot = FALSE)})
    score <- sapply(1:length(peptideSpecMatch), 
                    function(j){
                      sum(peakList[[j]]$intensity[abs(peptideSpecMatch[[j]]$mZ.Da.error) < 0.1])})
    bestFirstMatch <- which(max(score, na.rm = TRUE) == score)[1]
    scanNumbers[[i]][bestFirstMatch]
  })
  
  peakList <- readScans(rawfile, scans = bestMatchingMS2Scan)
  
  pp <- lapply(1:length(pepSeq), function(j){
    jpeg(filename = paste("~/Desktop/rawDiag_", pepSeq[j],".jpeg", sep=''), quality = 100, height = 640)
    op<-par(mfrow = c(2,1), mar = c(5,4,4,3))
    peakplot(pepSeq[j], peakList[[j]], FUN = function (b, y){cbind(b, y)})
    
    t <- S$StartTime[bestMatchingMS2Scan[j]];
    
    peak.idx  <- which((t - dt) < X[[j]]$times & X[[j]]$times < (t + dt))
    
    plot(X[[j]], xlim = c(t - 0.2, t + 0.2), main = paste("RT =", round(t * 60), 'seconds', "[m+2H]2+ =", mass2Hplus[j] ),
         xlab = 'RT [min]', ylab = 'intensity');
    abline(v = t, col = rgb(0.8, 0.1, 0.1, alpha = 0.5), lwd = 3)
    
    # peak fitting
    xx <- X[[j]]$times[peak.idx]
    yy <- X[[j]]$intensities[peak.idx]
    points(xx, yy, pch = 16, col = rgb(0.0, 0.1, 0.8, alpha = 0.5))
    # text(xx, yy, peak.idx, pos = 1)
    peak <- data.frame(logy = log(yy), x = xx)
    x.mean <- mean(peak$x)
    peak$xc <- peak$x - x.mean
    (fit <- lm(logy ~ xc + I(xc^2), data = peak))
    xx <- with(peak, seq(min(xc) - 0.2, max(xc) + 0.2, length = 100))
    lines(xx + x.mean, exp(predict(fit, data.frame(xc = xx))), col=rgb(0.25, 0.25, 0.25, alpha = 0.3), lwd = 5)
    dev.off()
  })
  
}
# https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3918884/

f(rawfile = "/Users/cp/Downloads/20180220_14_autoQC01.raw", 
  c('GAGSSEPVTGLDAK', 'VEATFGVDESNAK', 
    'TPVISGGPYEYR', 'TPVITGAPYEYR', 'DGLDAASYYAPVR',
    'ADVTPADFSEWSK', 'GTFIIDPGGVIR')
    )
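The peak fitting step in the demo regresses log intensity on a quadratic in the centred retention time, which amounts to fitting a Gaussian peak shape. The sketch below is not part of the tweet code: it uses simulated data with made-up values to show how apex position, width, height and area could be recovered from the `lm` coefficients under that Gaussian assumption.

```r
# Simulated Gaussian peak (all values are made up for illustration)
mu <- 20.5      # apex retention time [min]
sigma <- 0.05   # peak standard deviation [min]
height <- 1e6   # apex intensity
x <- seq(mu - 0.08, mu + 0.12, length.out = 21)
y <- height * exp(-(x - mu)^2 / (2 * sigma^2))

# Same model as in the demo: log(y) ~ xc + xc^2 on centred retention times
peak <- data.frame(logy = log(y), xc = x - mean(x))
fit <- lm(logy ~ xc + I(xc^2), data = peak)
cf <- unname(coef(fit))   # a, b, c of log(y) = a + b * xc + c * xc^2

# A Gaussian is a downward parabola on the log scale, so c must be negative
stopifnot(cf[3] < 0)
sigmaHat  <- sqrt(-1 / (2 * cf[3]))
muHat     <- mean(x) - cf[2] / (2 * cf[3])
heightHat <- exp(cf[1] - cf[2]^2 / (4 * cf[3]))
areaHat   <- heightHat * sigmaHat * sqrt(2 * pi)
```

The area estimate is what a label-free quantification would typically use. On real XICs the fit is only as good as the Gaussian assumption, and zero intensities have to be removed before taking the logarithm.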

Number of precursors scheduled for fragmentation for each MS1 scan

First of all, rawDiag is AWESOME!
I guess this is more a feature request, but I am not completely sure this is even possible.
It would be nice to be able to extract from a raw file the list of monoisotopic m/z the mass spectrometer has 'calculated' from the MS1 survey scan. I believe Thermo is using a proprietary (?) algorithm for the MIPS, but probably the output of that step (probably m/z, charge and intensity) is stored in the final raw file for each MS1 scan.
Knowing which precursors were actually fragmented and which were skipped because of the cycle time, one can decide whether it is worth adjusting LC and MS parameters to dig deeper into the sample.
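The list of candidate precursors detected by the instrument is not addressed here, but the number of precursors that were actually fragmented after each MS1 scan can already be derived from the scan metadata. A minimal sketch on toy data, assuming a data frame with `scanNumber` and `MSOrder` columns where MS1 scans are labelled "Ms" and MS2 scans "Ms2":

```r
# Toy scan metadata (made up for illustration)
meta <- data.frame(
  scanNumber = 1:9,
  MSOrder = c("Ms", "Ms2", "Ms2", "Ms2", "Ms", "Ms2", "Ms", "Ms2", "Ms2"),
  stringsAsFactors = FALSE
)

# Each MS1 scan opens a new group; the following MS2 scans inherit its index
ms1Index <- cumsum(meta$MSOrder == "Ms")
ms1ScanNumber <- meta$scanNumber[meta$MSOrder == "Ms"]

# Count the MS2 scans per group (zero counts are kept via the factor levels)
nMs2 <- table(factor(ms1Index[meta$MSOrder == "Ms2"],
                     levels = seq_along(ms1ScanNumber)))

fragmented <- data.frame(ms1Scan = ms1ScanNumber,
                         nFragmented = as.integer(nMs2))
```

Comparing `nFragmented` with the configured top-N setting shows in which cycles the instrument ran out of candidates or out of time.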

wrong variable in version 0.0.5

rawDiag/R/rawDiag.R

Line 445 in 4657018

 rvs <- system2("mono", args = c(exe, shQuote(rawfile), "qc", shQuote(tf)), stdout = tfstdout) 

rename rawfile -> file and test again

Centroided ITMS (ion trap) scans not recognized

Don't know if it ever was intended to work on ITMS data, however trying to get the peaklist of a Fusion Lumos ion trap scan always results in the following error:

Example scan:
Scan Mode: ITMS + c NSI r d Full ms2 [email protected] [100.00-825.00]

extract <- readScans(rawfile = rawfile, scans = c(10638))
# No centroid stream available

Cheers
Daniel

ASMS 2018 poster

INFORMATICS: ALGORITHMS AND STATISTICAL ADVANCES II 374-392
ThP 375
Optimize your Method: rawDiagnostic An R Package to Support Method Development for Bottom-up Proteomics on Orbitrap Instruments

upload HFX data to PE

used in the application section of https://www.biorxiv.org/content/early/2018/04/24/304485
and add the ID to the manuscript.

add testthat case for .calc.transient

This function has to be refactored (using mutate_at) to eliminate the R CMD check NOTE 'no visible binding for global variable'.
Before refactoring, we should have a unit test in place.

test bfabric interface

read.tdf - Bruker timsTOF reader

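read.tdf <- function(filename){
  # the .tdf file is read as an SQLite database; requires the DBI and RSQLite packages
  con <- DBI::dbConnect(RSQLite::SQLite(), filename)
  # close the connection even if the query fails
  on.exit(DBI::dbDisconnect(con))
  rv <- DBI::dbGetQuery(con, "SELECT * FROM Precursors a INNER JOIN Frames b on a.id == b.id;")

  # keep the columns of interest and rename them to the rawDiag column names
  rv <- rv[, c('Id','Time','ScanNumber','Intensity','SummedIntensities',
               'MonoisotopicMz', 'Charge', 'MsMsType')]
  colnames(rv) <- c('scanNumber','StartTime','BasePeakMass','BasePeakIntensity',
                    'totIonCurrent', 'PrecursorMass','ChargeState','MSOrder')
  rv$filename <- basename(filename)
  # translate the MsMsType codes into rawDiag MSOrder labels
  rv$MSOrder[rv$MSOrder == 0] <- "Ms"
  rv$MSOrder[rv$MSOrder == 8] <- "Ms2"
  as.rawDiag(rv)
}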

language setting - digits separator is a comma and not a dot

rawDiag/R/rawDiag.R

Line 469 in 6efdab0

unlink(tfstdout)

possible solution

rv <- as.data.frame(lapply(rv, function(x)
    if(any(grepl(',', x))) as.numeric(gsub(',', '.', x)) else x), stringsAsFactors=FALSE)

or change the language settings

thanks to Yann GUITTON
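The one-liner above converts every column that contains a comma anywhere, so a text column with a comma would turn into NA. A more defensive variant is sketched below. `fixDecimalComma` is a hypothetical helper, not rawDiag code, and it only touches character columns in which every non-missing value looks like a decimal-comma number.

```r
# Convert only columns in which every non-missing value looks like a
# number written with a decimal comma, e.g. "12,5" or "-3,25e2".
fixDecimalComma <- function(df) {
  pattern <- "^[+-]?[0-9]*,?[0-9]+([eE][+-]?[0-9]+)?$"
  df[] <- lapply(df, function(x) {
    if (!is.character(x)) return(x)
    ok <- is.na(x) | grepl(pattern, x)
    if (all(ok) && any(grepl(",", x, fixed = TRUE))) {
      as.numeric(sub(",", ".", x, fixed = TRUE))
    } else {
      x
    }
  })
  df
}

# Toy table (made up): one decimal-comma column, one text column with a comma
rv <- data.frame(StartTime = c("0,01", "0,25"),
                 ScanType = c("FTMS + c NSI Full ms", "ITMS, ms2"),
                 scanNumber = 1:2,
                 stringsAsFactors = FALSE)
rv <- fixDecimalComma(rv)
```

Factor columns are left untouched in this sketch, so the table should be read with `stringsAsFactors = FALSE` first.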

add PlotSOM

https://www.shanelynn.ie/self-organising-maps-for-customer-segmentation-using-r/

add study and lab info to sample information

rawDiag/inst/docker/fgcz_raw.cs

Line 512 in cc8e18e

Console.WriteLine();

Console.WriteLine("   Study: " + rawFile.SampleInformation.UserText[0]); 
Console.WriteLine("   Laboratory: " + rawFile.SampleInformation.UserText[2]);

data analysis task: MaxQuant evidence file for targeted XIC extraction

MQ bfabric workunit with combined course data: 175310

Download the zip, then:

library(dplyr)

xx <- read.csv("paolo_20180716_o4526_MQ_txtFiles/evidence.txt", sep = "\t")
xx %>% head()
# possibly useful columns
relevant <- xx %>% select(Raw.file,
  Sequence, Modified.sequence, Proteins, Charge, MS.MS.m.z, m.z, Mass,
  Retention.time, Retention.length, Score, Delta.score, MS.MS.scan.number,
  Intensity, Type) %>% head()

Define origin scan type for instrument cycle

It would be nice to define which scan type should be used as the marker for the start of an instrument cycle. Here is an example:

We execute cycles of
MS1 -> msxSIM -> MS2 -> ... -> MS1 -> ...

Selecting MS1 as the origin scan would define an instrument cycle. This would allow plotting cycle-specific statistics.

How many cycles did the instrument do?
How long is a cycle?
...
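The two questions can be answered once every scan carries a cycle number. A minimal sketch on toy data: the `scanType` labels, the `assignCycles` helper and the start times are assumptions made for illustration.

```r
# Toy scan list (made up): scan type label and start time in minutes
scans <- data.frame(
  scanType  = c("MS1", "msxSIM", "MS2", "MS2", "MS1", "msxSIM", "MS2",
                "MS1", "msxSIM"),
  StartTime = c(0.00, 0.01, 0.02, 0.03, 0.05, 0.06, 0.07, 0.09, 0.10),
  stringsAsFactors = FALSE
)

assignCycles <- function(scans, origin = "MS1") {
  # every origin scan starts a new cycle; scans before the first origin get 0
  scans$cycle <- cumsum(scans$scanType == origin)
  scans
}

scans <- assignCycles(scans, origin = "MS1")

# How many cycles did the instrument do?
nCycles <- max(scans$cycle)

# How long is a cycle? Time between consecutive origin scans
# (the last cycle has no closing origin scan and is therefore omitted)
cycleStart <- scans$StartTime[scans$scanType == "MS1"]
cycleLength <- diff(cycleStart)

# Number of scans in each cycle
scansPerCycle <- as.integer(table(scans$cycle))
```

Choosing a different `origin`, for example "msxSIM", only changes where the cumulative sum increments, so the same code covers other cycle definitions.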

ggplots for QCs

    gp <- ggplot(data = df, aes(x = log(abundance,10), y = log(intensity,10), fill=filename)) + 
      geom_point(stat='identity', size = 2, aes_string(group = "filename", colour = "filename")) +
      geom_smooth(method = "lm", se = FALSE, aes_string(group = "filename", colour = "filename")) +
      #geom_text(x = -2, y = 7, label = lm_eqn_promega(df), parse = TRUE, aes_string(group = "filename", colour = "filename")) +
      facet_wrap(~ sequence * filename,  scales = "free", nrow = 6)

and

    gp <- ggplot(data = df, aes(x = rt, y = t, fill = filename)) +
      xlab("iRT score") +
      ylab("retention time [minutes]") +
      geom_point(stat = 'identity', size = 2) +
      geom_smooth(method = "lm", se = FALSE, aes_string(group = "filename", colour = "filename"))
      
    
    if (input$plottype == "trellis") {
      gp <- gp + 
        #geom_text(x = 0, y = median(df$t), label = lm_eqn(df), parse = TRUE) +
        #geom_text(x = -2, y = 7, label = lm_eqn(df), parse = TRUE, aes_string(group = "filename", colour = "filename")) +
        facet_wrap(~filename, ncol = 1,  scales = "free")
    }
  
    gp <- gp + scale_fill_manual(values = cbbPalette)

Competitors work

Here is a link to the ASMS poster of Thermo's Tartare:
https://assets.thermofisher.com/TFS-Assets/CMD/posters/po-65226-xcalibur-tartare-rawmeat-asms2018-po65226-en.pdf

lazy-load database '/usr/local /lib/R/site-library/stringr/R/stringr.rdb' is corrupt

Error in unlist(str_split(x, "\n"), recursive = FALSE, use.names = FALSE): lazy-load database '/usr/local /lib/R/site-library/stringr/R/stringr.rdb' is corrupt

implement a read.raw.info method

.read.raw.info <- function(file,
     mono = if(Sys.info()['sysname'] %in% c("Darwin", "Linux")) TRUE else FALSE,
     exe = file.path(path.package(package = "rawDiag"), "exec", "fgcz_raw.exe"),
     mono_path = "",
     argv = "info",
     system2_call = TRUE,
     method = "thermo"){

  if(system2_call && method == 'thermo'){

    tf <- tempfile(fileext = '.tsv')
    tf.err <- tempfile(fileext = '.tsv')

    message(paste("system2 is writing to tempfile ", tf, "..."))

    if (mono){
      rvs <- system2("mono", args = c(exe, shQuote(file), argv),
                     stdout = tf)
    }else{
      rvs <- system2(exe, args = c(shQuote(file), argv),
                     stderr = tf.err,
                     stdout = tf)
    }

    if (rvs == 0){
      rv <- read.csv(tf,  sep = ":",   stringsAsFactors = TRUE, header = FALSE,
                     col.names = c('attribute', 'value'))

      message(paste("unlinking", tf, "..."))

      unlink(tf)
      # unlink(tfstdout)
      return(rv)
    }
  }
  NULL
}

call C# code from R

current setup

exchange fileIO by linking

https://www.mono-project.com/docs/faq/technical/

https://www.mono-project.com/docs/about-mono/languages/cplusplus/

"try" output to console when extracting centroided ITMS scans

When extracting centroided ITMS scans, rawDiag prints "try" to the console for every single scan it extracts. I suggest removing this output or at least making it optional via a flag.

try to solve the Thermo License issue

more columns than column names Fusion Lumos

...

Bigger bug; I guess you didn't have a Fusion Lumos file to test with:
Extracting scans/XICs from the raw file works, but the read.raw() function throws an error. I guess the naming is different; perhaps one can build a fallback flag to ignore the columns if none are found. Otherwise this might have to be adapted to different machines. I could provide raw files for Orbitrap XL, Velos, Elite, QE Plus, QE HF, QE HFX, Fusion Lumos.

metadata <- read.raw(rawfile)

system2 is writting to tempfile C:\Users\danielz\AppData\Local\Temp\RtmpqM2GA5\file139c2bc35ba1tsv ...

Error in read.table(file = file, header = header, sep = sep, quote = quote, :

more columns than column names

If you want to reproduce that, I uploaded a BSA raw file from said Fusion Lumos mass spectrometer ...

installation of NewRawFileReader on MacOS

I tried to follow

"Register the .Net assembly in your system similar to a Linux installation", but it is unclear to me what this implies. The following things are done:

downloaded the NewRawFileReader archive from Thermo
installed mono

What are the next steps? How do I install the NuGet package? Do I need Visual Studio? If not, what are the alternatives? We need to document this in a way that people without any prior experience in this area will be able to complete the installation.

support open file standards through Bioconductor mzR package - make it PSI friendly

To benefit from the fantastic PSI world.

ScanTime plot not working

Hi all,
the ScanTime plot (which so many people wanted to look at) does not work.

Add signal-to-noise data to readScans()

Hello! Thank you for developing this useful package. Is there any way to add the signal-to-noise information to the object returned by readScans()? This is a value that Thermo stores in the .raw file for every peak, besides the m/z and intensity. I believe that it can be obtained from one of the ThermoFisher.CommonCore DLLs already utilized by rawDiag. Thanks!

unit test data

I created some raw files that could be used for unit testing.

a) Calibration mix recording on a FUSION (profile & centroid mode) using direct infusion (no LC separation!)

Pierce LTQ Velos ESI Positive Calibration Solution, Product number: 88323
product homepage
Certificate of analysis

Could be used to test basic functions like:

file header access
scan data retrieval (profile or centroid mode)
XIC generation
m/z peak detection (profile data)

FUSION1_calMix.zip

MSScan_Orbi_centroid.raw contains 50 scans of type
FTMS + c ESI Full ms [150.0000-2000.2000]

scan #50 looks like this in FreeStyle 1.4 (uses RawFileReader)

The file header contains
FileHeader_MSScan_Orbi_centroid.txt

The profile mode file is structured accordingly and displays like this for scan 2

create order hexSticker

https://github.com/GuangchuangYu/hexSticker

`R CMD check rawDiag_0.0.3.tar.gz`

should pass a CRAN submission

Error: Negative length vectors are not allowed

Dear RawDiag Team,

I am trying to extract scans from a RAW file. MS2 scans work, MS1 scan extraction works in general, e.g. if I subselect the first 100 scans to extract. Whenever I submit a large amount of scans (like all MS1 scans of a file), readScans returns:

Error in source(tfo) : negative length vectors are not allowed

I suspect that one of the scans might be empty (have seen that before, but rarely). The behavior is file dependent, some run through, some don't. Are there verbose messages to find at which scan it goes wrong? If it is indeed an empty scan, can one try to catch this error?

This seems to be a memory issue; there are quite a lot of hits for this error. When I chunk the scans (5 x 1000 scans) it runs fine. So I guess the function does not scale well to ~5k MS1 scans (profile) or >80k MS2 scans (these were test files that fail).

RawDiag 0.0.29, R 3.5.2 under 64bit Windows:

file <- "02401_Ecoli_QC_R3.raw"
metaDat <- read.raw(file, rawDiag = FALSE)
idx <- metaDat[ which(metaDat$MSOrder == "Ms"),]$scanNumber
scanDat <- readScans(file, scans = idx)

File that I am using: https://drive.google.com/open?id=1VN4U21jtg5bY10Bb9bnFEZ-mTfRMFKEY

Thanks for the support.
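The chunking workaround described above can be wrapped in a small helper. This is a sketch under assumptions, not a fix inside rawDiag: `readScansChunked` is a hypothetical wrapper, and the stand-in reader only mimics a function with the signature `(rawfile, scans)` that returns one list element per scan.

```r
# Hypothetical wrapper: call a scan reader on chunks of scan numbers and
# concatenate the per-chunk lists into one list.
readScansChunked <- function(rawfile, scans, reader, chunkSize = 1000) {
  stopifnot(chunkSize >= 1)
  chunks <- split(scans, ceiling(seq_along(scans) / chunkSize))
  res <- lapply(chunks, function(idx) reader(rawfile, scans = idx))
  do.call(c, unname(res))
}

# Stand-in reader so that the sketch runs without a raw file
fakeReader <- function(rawfile, scans) {
  lapply(scans, function(s) list(scan = s, mZ = numeric(0)))
}

out <- readScansChunked("dummy.raw", scans = 1:2500, reader = fakeReader)
```

With rawDiag loaded, the call from the report would become `readScansChunked(file, scans = idx, reader = readScans)`, which keeps each external call at 1000 scans or fewer.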

add pdf Download for report and each graphic

single install.R for windows/linux

#R
  
# Christian Panse <[email protected]>
# Functional Genomics Center Zurich 2018

# System Requirements
pkgs <- c( 'devtools',
  'dplyr',
  'ggplot2',
  'hexbin',
  'magrittr',
  'parallel',
  'protViz',
  'rmarkdown',
  'RSQLite',
  'scales',
  'shiny',
  'tidyr',
  'tidyverse',
  'DT')

(pkgs <- pkgs[(!pkgs %in% unique(installed.packages()[,'Package']))])
if(length(pkgs) > 0){install.packages(pkgs)}

# Installation
install.packages('http://fgcz-ms.uzh.ch/~cpanse/rawDiag_0.0.28.tar.gz', repos=NULL)


# Testing
library(rawDiag)
(rawfile <- file.path(path.package(package = 'rawDiag'), 'extdata', 'sample.raw'))
system.time(RAW <- read.raw(file = rawfile))
dim(RAW)
summary.rawDiag(RAW)
PlotScanFrequency(RAW)

# read all dimensions
dim(RAW)
RAW <- read.raw(file = rawfile, rawDiag = FALSE)
dim(RAW)

R.version.string; Sys.info()[c('sysname', 'version')]

run the rawDiag shiny application

library(rawDiag)

# root defines where your raw files are
rawDiagShiny(root="D:/Data2San/")

run as BAT script on the windows box

"c:\Program Files\R\R-3.5.1\bin\R.exe" -e "library(rawDiag); rawDiagShiny(root='D:/Data2San', launch.browser=TRUE)"

or from the Linux/Apple command line

R -e "library(rawDiag); rawDiagShiny(root='$HOME/Downloads', launch.browser=TRUE)"

and you can add it to your $HOME/.bashrc

alias rawDiag="R -e \"library(rawDiag); rawDiagShiny(root='$HOME/Downloads', launch.browser=TRUE)\""

supported std. peptides

So far rawDiag supports:

iRT (Biognosys)
6 x 5 LC-MS/MS Peptide Ref. Mix (Promega)
MSQC1 (Sigma)

Are there other peptide sets that could make sense?

PROCAL
Zolg, D. P., Wilhelm, M., Yu, P., Knaute, T., Zerweck, J., Wenschuh, H., et al. (2017). PROCAL: A Set of 40 Peptide Standards for Retention Time Indexing, Column Performance Monitoring, and Collision Energy Calibration. Proteomics, 17(21), 1700263. http://doi.org/10.1002/pmic.201700263

JPT product

`XICs.as.data.frame <- function(x)`

FUSION FTMS/ITMS and Centroided/Profile data.

https://fgcz-bfabric.uzh.ch/bfabric/userlab/show-workunit.html?id=170064&tab=details

4 experiments, each of type Full MS -> ddMS2 (1 s), but each experiment has a different combination of

FTMS/ITMS and Centroided/Profile data.

implement `MsBackendRawDiag.R` for `Spectra`

rawDiag/R/rawDiag.R

Line 496 in 0d3f5d4

readScans <- function(rawfile, scans = NULL){

https://github.com/rformassspectrometry/Spectra

XIC mass range option for shiny application

Hi Christian,

I was thinking: it would be really cool if in the shiny version of rawDiag you added an option for a custom mass XIC. For example if I wanted to see where a particular trypsin peptide was eluting I could type in the mass range (e.g. 421.74-421.76) and it would display those XICs.

Thanks for all your help; I’m really loving rawDiag.

Cheers,

Richard Hagan | PhD Student
Max Planck Institute for the Science of Human History
Kahlaische Straße 10 07745 Jena, Germany
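On the R side, such an option mostly needs a robust parser for the text the user types. A minimal sketch, assuming the range arrives as a single string such as "421.74-421.76"; `parseMassRange` is a hypothetical helper and not part of the shiny application.

```r
# Hypothetical helper: turn "421.74-421.76" into c(lower, upper)
parseMassRange <- function(txt) {
  parts <- strsplit(gsub("\\s", "", txt), "-", fixed = TRUE)[[1]]
  rng <- suppressWarnings(as.numeric(parts))
  if (length(rng) != 2 || anyNA(rng) || rng[1] > rng[2]) {
    stop("expected a range like '421.74-421.76'")
  }
  rng
}

rng <- parseMassRange("421.74 - 421.76")

# centre and half-width, e.g. for a reader that takes mass +/- tolerance
centre <- mean(rng)
halfWidth <- diff(rng) / 2
```

The centre and half-width form is convenient when the extraction function expects a list of masses plus a tolerance rather than explicit bounds.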

`plot.XIC` and `plot.XICs`

having implementations for three method options.

plot.xic <- function(x, method = 'trellis'){
    #x$fmass <- as.factor(x$mass)
    figure <- ggplot(x, aes_string(x = "time", y = "intensity")) +
      #geom_segment() +
      geom_line(stat='identity', size = 1, aes_string(group = "filename", colour = "filename")) +
      
      #scale_x_continuous(breaks = scales::pretty_breaks(8)) +
      #scale_y_continuous(breaks = scales::pretty_breaks(8)) +
      labs(title = "XIC plot") +
      labs(subtitle = "Plotting XIC intensity against retention time") +
      labs(x = "Retention Time [min]", y = "Intensity Counts [arb. unit]") +
      theme_light()
    
    
    if(input$XICmainPeak){
      figure <- figure + facet_wrap(~  x$mass  , scales = "free", ncol = 1) 
    }else{
      figure <- figure + facet_wrap(~  x$mass  , ncol = 1) 
    }
    return(figure)
  }

add PRM/SRM example data

also consider to load data as

data(rawDiag)

and having a man page.

add helpText on each plot tab

class raw

rawfile <- structure(list(path = "Downloads/Resource_642890/20180717_006_tSIM_demo.raw", header = ... ), class = "raw")

Accordingly, XIC() could be defined as

XIC(rawfile, mz, tol, ...)
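A minimal S3 sketch of that design follows. The `peaks` element, the `newRaw` constructor and the toy data are assumptions that let the example run without a real file; a real method would read the peaks from `rawfile$path` instead.

```r
# Constructor for the proposed class; `peaks` is an assumption made here
# (columns: time, mz, intensity) so that no raw file is needed
newRaw <- function(path, peaks) {
  structure(list(path = path, peaks = peaks), class = "raw")
}

# Generic and method following the proposed signature XIC(rawfile, mz, tol, ...)
XIC <- function(rawfile, mz, tol, ...) UseMethod("XIC")

XIC.raw <- function(rawfile, mz, tol, ...) {
  p <- rawfile$peaks
  p <- p[abs(p$mz - mz) <= tol, , drop = FALSE]
  # sum the intensities inside the m/z window for every time point
  aggregate(intensity ~ time, data = p, FUN = sum)
}

# Toy peak table (made up for illustration)
peaks <- data.frame(time = c(1, 1, 2, 2, 3),
                    mz = c(421.75, 421.755, 421.75, 500.00, 421.748),
                    intensity = c(10, 5, 20, 99, 7))
rawfile <- newRaw("demo.raw", peaks)
xic <- XIC(rawfile, mz = 421.75, tol = 0.01)
```

Note that `raw` is also the name of a base R vector type, so a more specific class name might be the safer choice.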

generalize bivariate histogram family - heatmap deconvolute checkbox?


PlotHeatmap <- function(x, deconvolute = TRUE, ...){

}
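One possible reading of the `deconvolute` switch is that the mass axis shows either the precursor m/z or the neutral mass computed from m/z and charge. The sketch below covers only the binning step in base R. `binHeatmap` is a hypothetical helper, and it assumes that `PrecursorMass` holds the precursor m/z and `ChargeState` the charge.

```r
PROTON <- 1.007276  # proton mass in Da

# Hypothetical helper: bin precursors over retention time and either m/z or
# the neutral (charge-deconvoluted) mass; returns a table of counts.
binHeatmap <- function(x, deconvolute = TRUE, bins = 4) {
  mass <- if (deconvolute) {
    (x$PrecursorMass - PROTON) * x$ChargeState
  } else {
    x$PrecursorMass
  }
  table(time = cut(x$StartTime, breaks = bins),
        mass = cut(mass, breaks = bins))
}

# Toy data (made up for illustration)
x <- data.frame(StartTime = c(1, 2, 3, 4),
                PrecursorMass = c(500.0, 500.0, 750.5, 400.2),
                ChargeState = c(2, 3, 2, 2))
countsMz <- binHeatmap(x, deconvolute = FALSE)
countsMass <- binHeatmap(x, deconvolute = TRUE)
```

The resulting count table could then be handed to whichever plotting layer the bivariate histogram family ends up using.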

enable the vignette file for the package

"About rawDiag" tab for shiny application

We should have a tab in the GUI that displays some important information about the software:

License/copyright infos (see About RStudio)
links to repositories (GitHub)
contact points for bug reporting/help
literature references (our publication)
