
Comments (9)

tobiasko commented on September 27, 2024

Please check #48 on how to generate tabular data from non standard scan attributes.

from rawrr.

tobiasko commented on September 27, 2024

Hi Vivian,

so what you would like to do is select specific scans from a raw file that correspond to a specific DIA window? For instance all MS2 scans that cover the mass range 400-425 m/z (quadrupole position 412.5 m/z, window size 25 m/z)?


videlc commented on September 27, 2024

Hi,

Thanks for answering quickly. Sorry, my description was inaccurate: the critical information is stored in highMZ/lowMZ, but you understood the idea.

This is exactly what I would like to do: parse the raw file to select the specific scans where my ion of interest has been selected/fragmented for PRM/DIA analysis. To do so, we need two pieces of information: the quadrupole isolation center m/z and the quadrupole isolation width.

Here is some chunk of code we use with MSnbase:

getChromato <- function(file, precursorMZ, RT [...]) {

  hdata = fData(file) ## or header(file), which is the older function

  [...]

  ## keep MS2 spectra whose isolation window (lowMZ..highMZ) contains precursorMZ
  tmp_data = data[hdata$spectrum[which(hdata$lowMZ < precursorMZ &
                                       hdata$highMZ > precursorMZ &
                                       hdata$polarity == pol & hdata$msLevel == 2)]]

  [...]

  ## start from an empty result so the function returns something even without matches
  chromato = MChromatograms()
  if (length(tmp_data)) {
    chromato = chromatogram(tmp_data,
                            rt_window, # RT +/- window
                            mz_window, # precursorMZ +/- mztoll in ppm
                            msLevel = 2)
  }

  chromato

  ggplot.......

}

I could produce a similar function with rawrr but unfortunately, you have to know the exact scan header text rather than scan parameters.


get_eic_overlay <- function(file, fragments, scanfilter, peptide, file_description) {

  [...]
  temp <- data.frame()

  for (frag in unique(fragments)) {
    ## extracted ion chromatogram of this fragment, restricted to scans matching scanfilter
    tmp <- readChromatogram(file, mass = frag, tol = 5, type = "xic",
                            filter = scanfilter)
    tmp <- data.frame(rt_min = tmp[[1]]$times,
                      intensity = tmp[[1]]$intensities,
                      mz = frag, header = scanfilter)
    temp <- rbind(temp, tmp)
  }
  [...]
  temp

  ggplot.....

}

Lastly, this is relatively slow because the file has to be loaded for each operation. MSnbase offers "onDisk" / "inMemory" file loading, which speeds up the process tremendously. Is this kind of operation available with rawrr?

Thanks


tobiasko commented on September 27, 2024

You don't need to know "the exact scan header text". The Thermo RawFileReader library offers pretty sophisticated filters. An easy example is "+ ms": you would get all scans recorded in positive mode at MS level 1 (see here). Our readChromatogram() function, for instance, accepts filter strings and passes them to the API. If they are valid, they are used and chromatogram extraction happens on the filtered scans.

We could of course squeeze all kinds of scan-related metadata into a data frame/table. But that is not efficient, and different people have different needs. What should be included and what not???


videlc commented on September 27, 2024

Thanks for your answer. I understand your point and totally agree: every user has specific needs, which cannot all be satisfied.
Let's say I have a 3-hour DDA file (several thousand, or even hundreds of thousands, of MS2 scans) and I'd like to know quickly whether my peptide of interest has been fragmented. Currently I'd have to know the exact m/z that was seen by the MS and selected by the quadrupole (as it appears as text inside the scan filter), rather than slicing/filtering the headers (a data frame with one row per scan) according to the quadrupole's center m/z and m/z width. With the latter approach, I'd only have to go through, say, 10 spectra rather than manually evaluate all scan headers.

Example (with a silly peptide of interest, the trypsin autolysis peptide VATVSLPR):
[image]

This particular MS2 scan has been recorded with an m/z (ppm) error (i.e. 421.7578 instead of 421.75840). To get that scan, I'd have to manually check the MS2 scans to find the one "closest" to 421.75840.

If I had a dataframe containing m/z center of quadrupole and m/z selection width, I could:

dataframe %>% 
mutate(mzdiff=abs(mz_selected_by_quad-mz_th)) %>% #Compute m/z deviation
filter(mzdiff < quad_selection_width/2) %>%  # only keep relevant scans
arrange(-tic) #optionally consider highest ms2 tics as top candidates
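
A minimal base R sketch of the same filtering step, for illustration only. The table `scans` and all of its values are made up; the column names simply follow the pipeline above, and rawrr does not necessarily return a table of this shape.

```r
# Hypothetical scan table: column names follow the snippet above,
# the values are invented for illustration.
scans <- data.frame(
  scan                 = c(101L, 102L, 103L, 104L),
  mz_selected_by_quad  = c(421.7578, 421.9000, 450.2500, 421.7600),
  quad_selection_width = c(1.4, 0.2, 1.4, 1.4),
  tic                  = c(2.5e6, 8.0e5, 3.1e6, 4.0e6)
)
mz_th <- 421.75840  # theoretical m/z of the ion of interest

# Absolute deviation between quadrupole center and theoretical m/z
scans$mzdiff <- abs(scans$mz_selected_by_quad - mz_th)

# Keep only scans whose isolation window contains mz_th
candidates <- scans[scans$mzdiff < scans$quad_selection_width / 2, ]

# Highest TIC first, so the most intense MS2 scans are the top candidates
candidates <- candidates[order(-candidates$tic), ]
print(candidates)
```

With these made-up values, scan 102 is dropped because its window is too narrow (0.2 wide) to contain the target, and scan 103 because its center is far away.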

In my opinion, this kind of feature, in addition to a "load in memory" mode, would speed up any kind of processing and would open up perspectives towards "rawrr as a rapid data processing tool".

I'd understand if you think this feature is not really relevant and does not fit the purpose of rawrr development. I could still find a workaround by adding a new column via grepl('ms2') // gsub("@.*","",...) // gsub('.* ','',...) // as.numeric(), but this would be particularly inelegant.
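
For reference, that string-parsing workaround could look roughly like the sketch below. The scan filter strings are hypothetical: in particular the activation part (`@hcd27.00`) is an assumption made for the example, not taken from a real file.

```r
# Hypothetical scan filter strings; the "@hcd27.00" part is assumed.
scan_type <- c(
  "FTMS + c NSI Full ms [350.0000-1800.0000]",
  "FTMS + c NSI Full ms2 487.2567@hcd27.00 [140.0000-1015.0000]",
  "FTMS + c NSI Full ms2 421.7578@hcd27.00 [140.0000-855.0000]"
)

# Only MS2 filter strings carry a precursor m/z
is_ms2 <- grepl("ms2", scan_type, fixed = TRUE)

quad_center <- rep(NA_real_, length(scan_type))
before_at <- gsub("@.*", "", scan_type[is_ms2])  # drop everything from "@" onwards
quad_center[is_ms2] <- as.numeric(gsub(".* ", "", before_at))  # keep the last token
print(quad_center)
```

This only recovers the quadrupole center m/z; the isolation width is not part of these strings, so it would still have to come from somewhere else.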

Vivian


tobiasko commented on September 27, 2024

I honestly don't understand your problem, @videlc. The precursor mass has ALWAYS been part of the data frame returned by readIndex():

> library(rawrr)
Package 'rawrr' version 1.2.0 using
RawFileReader reading tool. Copyright © 2016 by Thermo Fisher Scientific, Inc. All rights reserved.
> rawfile <- sampleFilePath()
> Idx <- readIndex(rawfile)
> str(Idx)
'data.frame':	573 obs. of  8 variables:
 $ scan          : int  1 2 3 4 5 6 7 8 9 10 ...
 $ scanType      : chr  "FTMS + c NSI Full ms [350.0000-1800.0000]" "FTMS + c NSI Full ms2 [email protected] [140.0000-1015.0000]" "FTMS + c NSI Full ms2 [email protected] [140.0000-1335.0000]" "FTMS + c NSI Full ms2 [email protected] [140.0000-1415.0000]" ...
 $ rtinseconds   : num  0.097 0.35 0.419 0.489 0.558 0.627 0.696 0.766 0.835 0.904 ...
 $ precursorMass : num  1075 487 645 684 547 ...
 $ MSOrder       : chr  "Ms" "Ms2" "Ms2" "Ms2" ...
 $ charge        : int  0 2 2 2 2 2 2 2 2 2 ...
 $ masterScan    : int  NA NA NA NA NA NA NA NA NA NA ...
 $ dependencyType: logi  NA NA NA NA NA NA ...
> Idx$precursorMass
  [1] 1075.0000  487.2567  644.8226  683.8279  547.2980  669.8381  683.8537  699.3384  726.8357  622.8535
 [11]  636.8691  776.9297  582.3190  488.5345  710.3505  517.7398  583.8923  722.3246  751.8105  653.3617
 [21]  756.4250  554.2606 1075.0000  487.2567  644.8226  683.8279  547.2980  669.8381  683.8537  699.3384
 [31]  726.8357  622.8535  636.8691  776.9297  582.3190  488.5345  710.3505  517.7398  583.8923  722.3246
 [41]  751.8105  653.3617  756.4250  554.2606 1075.0000  487.2567  644.8226  683.8279  547.2980  669.8381
 [51]  683.8537  699.3384  726.8357  622.8535  636.8691  776.9297  582.3190  488.5345  710.3505  517.7398
 [61]  583.8923  722.3246  751.8105  653.3617  756.4250  554.2606 1075.0000  487.2567  644.8226  683.8279
 [71]  547.2980  669.8381  683.8537  699.3384  726.8357  622.8535  636.8691  776.9297  582.3190  488.5345
 [81]  710.3505  517.7398  583.8923  722.3246  751.8105  653.3617  756.4250  554.2606 1075.0000  487.2567
 [91]  644.8226  683.8279  547.2980  669.8381  683.8537  699.3384  726.8357  622.8535  636.8691  776.9297
[101]  582.3190  488.5345  710.3505  517.7398  583.8923  722.3246  751.8105  653.3617  756.4250  554.2606
[111] 1075.0000  487.2567  644.8226  683.8279  547.2980  669.8381  683.8537  699.3384  726.8357  622.8535
[121]  636.8691  776.9297  582.3190  488.5345  710.3505  517.7398  583.8923  722.3246  751.8105  653.3617
[131]  756.4250  554.2606 1075.0000  487.2567  644.8226  683.8279  547.2980  669.8381  683.8537  699.3384
[141]  726.8357  622.8535  636.8691  776.9297  582.3190  488.5345  710.3505  517.7398  583.8923  722.3246
[151]  751.8105  653.3617  756.4250  554.2606 1075.0000  487.2567  644.8226  683.8279  547.2980  669.8381
[161]  683.8537  699.3384  726.8357  622.8535  636.8691  776.9297  582.3190  488.5345  710.3505  517.7398
[171]  583.8923  722.3246  751.8105  653.3617  756.4250  554.2606 1075.0000  487.2567  644.8226  683.8279
[181]  547.2980  669.8381  683.8537  699.3384  726.8357  622.8535  636.8691  776.9297  582.3190  488.5345
[191]  710.3505  517.7398  583.8923  722.3246  751.8105  653.3617  756.4250  554.2606 1075.0000  487.2567
[201]  644.8226  683.8279  547.2980  669.8381  683.8537  699.3384  726.8357  622.8535  636.8691  776.9297
[211]  582.3190  488.5345  710.3505  517.7398  583.8923  722.3246  751.8105  653.3617  756.4250  554.2606
[221] 1075.0000  487.2567  644.8226  683.8279  547.2980  669.8381  683.8537  699.3384  726.8357  622.8535
[231]  636.8691  776.9297  582.3190  488.5345  710.3505  517.7398  583.8923  722.3246  751.8105  653.3617
[241]  756.4250  554.2606 1075.0000  487.2567  644.8226  683.8279  547.2980  669.8381  683.8537  699.3384
[251]  726.8357  622.8535  636.8691  776.9297  582.3190  488.5345  710.3505  517.7398  583.8923  722.3246
[261]  751.8105  653.3617  756.4250  554.2606 1075.0000  487.2567  644.8226  683.8279  547.2980  669.8381
[271]  683.8537  699.3384  726.8357  622.8535  636.8691  776.9297  582.3190  488.5345  710.3505  517.7398
[281]  583.8923  722.3246  751.8105  653.3617  756.4250  554.2606 1075.0000  487.2567  644.8226  683.8279
[291]  547.2980  669.8381  683.8537  699.3384  726.8357  622.8535  636.8691  776.9297  582.3190  488.5345
[301]  710.3505  517.7398  583.8923  722.3246  751.8105  653.3617  756.4250  554.2606 1075.0000  487.2567
[311]  644.8226  683.8279  547.2980  669.8381  683.8537  699.3384  726.8357  622.8535  636.8691  776.9297
[321]  582.3190  488.5345  710.3505  517.7398  583.8923  722.3246  751.8105  653.3617  756.4250  554.2606
[331] 1075.0000  487.2567  644.8226  683.8279  547.2980  669.8381  683.8537  699.3384  726.8357  622.8535
[341]  636.8691  776.9297  582.3190  488.5345  710.3505  517.7398  583.8923  722.3246  751.8105  653.3617
[351]  756.4250  554.2606 1075.0000  487.2567  644.8226  683.8279  547.2980  669.8381  683.8537  699.3384
[361]  726.8357  622.8535  636.8691  776.9297  582.3190  488.5345  710.3505  517.7398  583.8923  722.3246
[371]  751.8105  653.3617  756.4250  554.2606 1075.0000  487.2567  644.8226  683.8279  547.2980  669.8381
[381]  683.8537  699.3384  726.8357  622.8535  636.8691  776.9297  582.3190  488.5345  710.3505  517.7398
[391]  583.8923  722.3246  751.8105  653.3617  756.4250  554.2606 1075.0000  487.2567  644.8226  683.8279
[401]  547.2980  669.8381  683.8537  699.3384  726.8357  622.8535  636.8691  776.9297  582.3190  488.5345
[411]  710.3505  517.7398  583.8923  722.3246  751.8105  653.3617  756.4250  554.2606 1075.0000  487.2567
[421]  644.8226  683.8279  547.2980  669.8381  683.8537  699.3384  726.8357  622.8535  636.8691  776.9297
[431]  582.3190  488.5345  710.3505  517.7398  583.8923  722.3246  751.8105  653.3617  756.4250  554.2606
[441] 1075.0000  487.2567  644.8226  683.8279  547.2980  669.8381  683.8537  699.3384  726.8357  622.8535
[451]  636.8691  776.9297  582.3190  488.5345  710.3505  517.7398  583.8923  722.3246  751.8105  653.3617
[461]  756.4250  554.2606 1075.0000  487.2567  644.8226  683.8279  547.2980  669.8381  683.8537  699.3384
[471]  726.8357  622.8535  636.8691  776.9297  582.3190  488.5345  710.3505  517.7398  583.8923  722.3246
[481]  751.8105  653.3617  756.4250  554.2606 1075.0000  487.2567  644.8226  683.8279  547.2980  669.8381
[491]  683.8537  699.3384  726.8357  622.8535  636.8691  776.9297  582.3190  488.5345  710.3505  517.7398
[501]  583.8923  722.3246  751.8105  653.3617  756.4250  554.2606 1075.0000  487.2567  644.8226  683.8279
[511]  547.2980  669.8381  683.8537  699.3384  726.8357  622.8535  636.8691  776.9297  582.3190  488.5345
[521]  710.3505  517.7398  583.8923  722.3246  751.8105  653.3617  756.4250  554.2606 1075.0000  487.2567
[531]  644.8226  683.8279  547.2980  669.8381  683.8537  699.3384  726.8357  622.8535  636.8691  776.9297
[541]  582.3190  488.5345  710.3505  517.7398  583.8923  722.3246  751.8105  653.3617  756.4250  554.2606
[551] 1075.0000  487.2567  644.8226  683.8279  547.2980  669.8381  683.8537  699.3384  726.8357  622.8535
[561]  636.8691  776.9297  582.3190  488.5345  710.3505  517.7398  583.8923  722.3246  751.8105  653.3617
[571]  756.4250  554.2606 1075.0000

In addition, there are the MS order and dependency type attributes. It is very easy to filter for data-dependent scans of level 2 whose precursors fall into a certain mass range. Am I missing something?
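
A minimal sketch of such a filter, using a small mock of the index so it runs without a raw file. The mock reuses the first six values printed above; the mass range `mz_low`/`mz_high` is a hypothetical choice. On a real file, `Idx` would be the data frame returned by readIndex().

```r
# Mock of the readIndex() result: only the columns used here,
# values copied from the first six rows of the output above.
Idx <- data.frame(
  scan          = 1:6,
  precursorMass = c(1075.0000, 487.2567, 644.8226, 683.8279, 547.2980, 669.8381),
  MSOrder       = c("Ms", "Ms2", "Ms2", "Ms2", "Ms2", "Ms2"),
  stringsAsFactors = FALSE
)

# Hypothetical precursor mass range of interest
mz_low  <- 640
mz_high <- 690

# MS2 scans whose precursor mass falls into [mz_low, mz_high]
hits <- Idx[Idx$MSOrder == "Ms2" &
            Idx$precursorMass >= mz_low &
            Idx$precursorMass <= mz_high, ]
print(hits)
```

The dependencyType column is left out of the condition here because it is NA throughout the sample output shown above.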


videlc commented on September 27, 2024

100% agreed. What's missing is the quadrupole boundaries or the quadrupole isolation width (highly variable between laboratories and between PRM/DIA methods); without it, we cannot know whether the m/z of interest is indeed in the scan.

Or maybe this information is somewhere else?

Sorry if my explanations were unclear.
Best regards,
Vivian


tobiasko commented on September 27, 2024

For a ddMS2 "experiment" as Thermo calls it you can do:

> head(Idx)
  scan                                                     scanType rtinseconds precursorMass MSOrder
1    1                    FTMS + c NSI Full ms [350.0000-1800.0000]       0.097     1075.0000      Ms
2    2 FTMS + c NSI Full ms2 [email protected] [140.0000-1015.0000]       0.350      487.2567     Ms2
3    3 FTMS + c NSI Full ms2 [email protected] [140.0000-1335.0000]       0.419      644.8226     Ms2
4    4 FTMS + c NSI Full ms2 [email protected] [140.0000-1415.0000]       0.489      683.8279     Ms2
5    5 FTMS + c NSI Full ms2 [email protected] [140.0000-1135.0000]       0.558      547.2980     Ms2
6    6 FTMS + c NSI Full ms2 [email protected] [140.0000-1385.0000]       0.627      669.8381     Ms2
  charge masterScan dependencyType
1      0         NA             NA
2      2         NA             NA
3      2         NA             NA
4      2         NA             NA
5      2         NA             NA
6      2         NA             NA
> S2 <- readSpectrum(rawfile = rawfile, scan = 2)
> S2[[1]]$`MS2 Isolation Width:`
[1] "1.40"
> S2[[1]]$`MS2 Isolation Offset:`
[1] "0.00"

So the isolation width of the quad was set to 1.4 Da with a zero offset for the 2nd scan.
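
From these values the isolation window of a scan can be worked out and tested against an m/z of interest. The sketch below hard-codes the values shown for scan 2. How the offset shifts the window is an assumption here (center = precursor mass + offset), and the helper `in_window()` is hypothetical, not part of rawrr.

```r
# Values as shown above for scan 2; the header entries come back as strings.
precursor_mass   <- 487.2567
isolation_width  <- as.numeric("1.40")
isolation_offset <- as.numeric("0.00")

# ASSUMPTION: the window is centered on precursor mass + offset.
center <- precursor_mass + isolation_offset
window <- c(low  = center - isolation_width / 2,
            high = center + isolation_width / 2)

# Hypothetical helper: is a given m/z inside the isolation window?
in_window <- function(mz, window) {
  mz >= window[["low"]] & mz <= window[["high"]]
}

print(window)
print(in_window(487.9, window))
print(in_window(488.5, window))
```

For scan 2 this gives a window of 486.5567 to 487.9567, so 487.9 falls inside it and 488.5 does not.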


videlc commented on September 27, 2024

Thanks a lot!

