Comments (9)
Please check #48 on how to generate tabular data from non-standard scan attributes.
from rawrr.
Hi Vivian,
So what you would like to do is select the scans in a raw file that correspond to a specific DIA window? For instance, all MS2 scans that cover the mass range 400-425 m/z (quadrupole position 412.5 m/z, window size 25 m/z)?
from rawrr.
Hi,
Thanks for answering so quickly. Sorry, my description was inaccurate: the critical information is stored in highMZ/lowMZ, but you understood the idea.
This is exactly what I would like to do: parse the raw file to select the scans in which my ion of interest was selected and fragmented for PRM/DIA analysis. To do so, we need two pieces of information: the quadrupole isolation center m/z and the quadrupole isolation width.
Here is a chunk of the code we use with MSnbase:
getChromato <- function(file, precursorMZ, RT [...]) {
  hdata = fData(file) ## or header(file), which is the older function
  [...]
  ## keep MS2 spectra whose isolation window (lowMZ to highMZ) contains precursorMZ
  tmp_data = data[hdata$spectrum[which(hdata$lowMZ < precursorMZ &
                                       hdata$highMZ > precursorMZ &
                                       hdata$polarity == pol & hdata$msLevel == 2)]]
  [...]
  chromato = MChromatograms()
  if (length(tmp_data))
  {
    chromato = chromatogram(tmp_data,
                            rt_window, # RT +/- window
                            mz_window, # precursorMZ +/- mztoll in ppm
                            msLevel = 2)
  }
  chromato
  ggplot.......
}
I could produce a similar function with rawrr, but unfortunately you have to know the exact scan header text rather than the scan parameters.
get_eic_overlay <- function(file, fragments, scanfilter, peptide, file_description) {
  [...]
  temp <- data.frame()
  for (frag in unique(fragments)) {
    tmp <- readChromatogram(file, mass = frag, tol = 5, type = "xic",
                            filter = scanfilter)
    tmp <- data.frame(rt_min = tmp[[1]]$times,
                      intensity = tmp[[1]]$intensities,
                      mz = frag, header = scanfilter)
    temp <- rbind(temp, tmp)
  }
  [...]
  temp
  ggplot.....
}
Lastly, this is relatively slow because the file has to be loaded for each operation. MSnbase offers "onDisk" / "inMemory" file loading, which speeds up the process tremendously. Is this kind of operation available with rawrr?
Thanks
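A minimal sketch of the "load once, reuse" idea raised above, in base R only. It is not a rawrr feature: `cached_index()` and `fake_reader()` are hypothetical helpers, and the assumption is that a function such as readIndex could be passed in as `reader`.

```r
# Cache one index per file path so the file is read only once per session.
# `reader` is any function(path) that returns a data frame.
.index_cache <- new.env(parent = emptyenv())

cached_index <- function(path, reader) {
  if (!exists(path, envir = .index_cache, inherits = FALSE)) {
    assign(path, reader(path), envir = .index_cache)
  }
  get(path, envir = .index_cache, inherits = FALSE)
}

# Stand-in reader (hypothetical) that counts how often it is actually called.
n_reads <- 0
fake_reader <- function(path) {
  n_reads <<- n_reads + 1
  data.frame(scan = 1:3, precursorMass = c(1075.0000, 487.2567, 644.8226))
}

idx1 <- cached_index("example.raw", fake_reader)
idx2 <- cached_index("example.raw", fake_reader)  # served from the cache
```

The second call does not invoke the reader again, so repeated filtering steps can work on the same in-memory data frame.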
from rawrr.
You don't need to know "the exact scan header text". The Thermo RawFileReader library offers pretty sophisticated filters. An easy example is "+ ms": you would get all scans recorded in positive mode at MS level 1. See here. Our readChromatogram() function, for instance, accepts filter strings and passes them to the API. If they are valid, they are used and chromatogram extraction happens on the filtered scans.
We could of course squeeze all kinds of scan-related metadata into a data frame/table. But that is not effective, and different people have different needs. What should be included and what not?
from rawrr.
Thanks for your answer. I understand your point and totally agree: every user has specific needs, and they cannot all be satisfied.
Let's say I have a 3-hour DDA file (several thousand, or even hundreds of thousands of, MS2 scans) and I'd like to know quickly whether my peptide of interest has been fragmented. Currently I'd have to know exactly the m/z that was seen by the MS and selected by the quadrupole (it sits inside the scan filter as text), rather than slicing/filtering the headers (a data frame with one row per scan) by quadrupole center m/z and isolation width. With the latter, I would only have to go through, say, 10 spectra rather than manually evaluate every scan header.
Example with a (silly) peptide of interest, the trypsin autolysis peptide VATVSLPR:
This particular MS2 scan was recorded with a small, ppm-level m/z error (i.e. 421.7578 instead of 421.75840). To get that scan, I'd have to manually check the MS2 scans to find the one "closest" to 421.75840.
If I had a data frame containing the quadrupole center m/z and the quadrupole isolation width, I could do:
dataframe %>%
mutate(mzdiff=abs(mz_selected_by_quad-mz_th)) %>% #Compute m/z deviation
filter(mzdiff < quad_selection_width/2) %>% # only keep relevant scans
arrange(-tic) #optionally consider highest ms2 tics as top candidates
In my opinion, this kind of feature, combined with a "load in memory" mode, would speed up any kind of processing and open perspectives towards "rawrr as a rapid data processing tool".
I'd understand if you think this feature is not really relevant and does not fit the purpose of rawrr development. I could still find a workaround by adding a new column via grepl('ms2') // gsub("@.*", "", ...) // gsub('.* ', '', ...) // as.numeric(), but this would be particularly inelegant.
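For illustration, the string-parsing workaround mentioned above could look roughly like this in base R. The scan filter strings are made up in the style of readIndex() output; in particular the activation part ("@hcd27.00") is an assumption, and real filters may need a different pattern.

```r
# Hypothetical scan filter strings; the "@hcd27.00" part is illustrative only.
scan_type <- c(
  "FTMS + c NSI Full ms [350.0000-1800.0000]",
  "FTMS + c NSI Full ms2 487.2567@hcd27.00 [140.0000-1015.0000]",
  "FTMS + c NSI Full ms2 644.8226@hcd27.00 [140.0000-1335.0000]"
)

# Capture the number that directly precedes "@" in ms2 filters.
pattern <- "^.* ms2 ([0-9.]+)@.*$"
is_ms2 <- grepl(pattern, scan_type)

# Scans that are not ms2 keep NA as their precursor m/z.
precursor_mz <- rep(NA_real_, length(scan_type))
precursor_mz[is_ms2] <- as.numeric(sub(pattern, "\\1", scan_type[is_ms2]))
```

This recovers the isolation center only; the isolation width is not part of these strings and has to come from elsewhere.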
Vivian
from rawrr.
I honestly don't understand your problem, @videlc. The precursor mass has ALWAYS been part of the data frame returned by readIndex():
> library(rawrr)
Package 'rawrr' version 1.2.0 using
RawFileReader reading tool. Copyright © 2016 by Thermo Fisher Scientific, Inc. All rights reserved.
> rawfile <- sampleFilePath()
> Idx <- readIndex(rawfile)
> str(Idx)
'data.frame': 573 obs. of 8 variables:
$ scan : int 1 2 3 4 5 6 7 8 9 10 ...
$ scanType : chr "FTMS + c NSI Full ms [350.0000-1800.0000]" "FTMS + c NSI Full ms2 487.2567@… [140.0000-1015.0000]" "FTMS + c NSI Full ms2 644.8226@… [140.0000-1335.0000]" "FTMS + c NSI Full ms2 683.8279@… [140.0000-1415.0000]" ...
$ rtinseconds : num 0.097 0.35 0.419 0.489 0.558 0.627 0.696 0.766 0.835 0.904 ...
$ precursorMass : num 1075 487 645 684 547 ...
$ MSOrder : chr "Ms" "Ms2" "Ms2" "Ms2" ...
$ charge : int 0 2 2 2 2 2 2 2 2 2 ...
$ masterScan : int NA NA NA NA NA NA NA NA NA NA ...
$ dependencyType: logi NA NA NA NA NA NA ...
> Idx$precursorMass
[1] 1075.0000 487.2567 644.8226 683.8279 547.2980 669.8381 683.8537 699.3384 726.8357 622.8535
[11] 636.8691 776.9297 582.3190 488.5345 710.3505 517.7398 583.8923 722.3246 751.8105 653.3617
[21] 756.4250 554.2606 1075.0000 487.2567 644.8226 683.8279 547.2980 669.8381 683.8537 699.3384
[31] 726.8357 622.8535 636.8691 776.9297 582.3190 488.5345 710.3505 517.7398 583.8923 722.3246
[41] 751.8105 653.3617 756.4250 554.2606 1075.0000 487.2567 644.8226 683.8279 547.2980 669.8381
[51] 683.8537 699.3384 726.8357 622.8535 636.8691 776.9297 582.3190 488.5345 710.3505 517.7398
[61] 583.8923 722.3246 751.8105 653.3617 756.4250 554.2606 1075.0000 487.2567 644.8226 683.8279
[71] 547.2980 669.8381 683.8537 699.3384 726.8357 622.8535 636.8691 776.9297 582.3190 488.5345
[81] 710.3505 517.7398 583.8923 722.3246 751.8105 653.3617 756.4250 554.2606 1075.0000 487.2567
[91] 644.8226 683.8279 547.2980 669.8381 683.8537 699.3384 726.8357 622.8535 636.8691 776.9297
[101] 582.3190 488.5345 710.3505 517.7398 583.8923 722.3246 751.8105 653.3617 756.4250 554.2606
[111] 1075.0000 487.2567 644.8226 683.8279 547.2980 669.8381 683.8537 699.3384 726.8357 622.8535
[121] 636.8691 776.9297 582.3190 488.5345 710.3505 517.7398 583.8923 722.3246 751.8105 653.3617
[131] 756.4250 554.2606 1075.0000 487.2567 644.8226 683.8279 547.2980 669.8381 683.8537 699.3384
[141] 726.8357 622.8535 636.8691 776.9297 582.3190 488.5345 710.3505 517.7398 583.8923 722.3246
[151] 751.8105 653.3617 756.4250 554.2606 1075.0000 487.2567 644.8226 683.8279 547.2980 669.8381
[161] 683.8537 699.3384 726.8357 622.8535 636.8691 776.9297 582.3190 488.5345 710.3505 517.7398
[171] 583.8923 722.3246 751.8105 653.3617 756.4250 554.2606 1075.0000 487.2567 644.8226 683.8279
[181] 547.2980 669.8381 683.8537 699.3384 726.8357 622.8535 636.8691 776.9297 582.3190 488.5345
[191] 710.3505 517.7398 583.8923 722.3246 751.8105 653.3617 756.4250 554.2606 1075.0000 487.2567
[201] 644.8226 683.8279 547.2980 669.8381 683.8537 699.3384 726.8357 622.8535 636.8691 776.9297
[211] 582.3190 488.5345 710.3505 517.7398 583.8923 722.3246 751.8105 653.3617 756.4250 554.2606
[221] 1075.0000 487.2567 644.8226 683.8279 547.2980 669.8381 683.8537 699.3384 726.8357 622.8535
[231] 636.8691 776.9297 582.3190 488.5345 710.3505 517.7398 583.8923 722.3246 751.8105 653.3617
[241] 756.4250 554.2606 1075.0000 487.2567 644.8226 683.8279 547.2980 669.8381 683.8537 699.3384
[251] 726.8357 622.8535 636.8691 776.9297 582.3190 488.5345 710.3505 517.7398 583.8923 722.3246
[261] 751.8105 653.3617 756.4250 554.2606 1075.0000 487.2567 644.8226 683.8279 547.2980 669.8381
[271] 683.8537 699.3384 726.8357 622.8535 636.8691 776.9297 582.3190 488.5345 710.3505 517.7398
[281] 583.8923 722.3246 751.8105 653.3617 756.4250 554.2606 1075.0000 487.2567 644.8226 683.8279
[291] 547.2980 669.8381 683.8537 699.3384 726.8357 622.8535 636.8691 776.9297 582.3190 488.5345
[301] 710.3505 517.7398 583.8923 722.3246 751.8105 653.3617 756.4250 554.2606 1075.0000 487.2567
[311] 644.8226 683.8279 547.2980 669.8381 683.8537 699.3384 726.8357 622.8535 636.8691 776.9297
[321] 582.3190 488.5345 710.3505 517.7398 583.8923 722.3246 751.8105 653.3617 756.4250 554.2606
[331] 1075.0000 487.2567 644.8226 683.8279 547.2980 669.8381 683.8537 699.3384 726.8357 622.8535
[341] 636.8691 776.9297 582.3190 488.5345 710.3505 517.7398 583.8923 722.3246 751.8105 653.3617
[351] 756.4250 554.2606 1075.0000 487.2567 644.8226 683.8279 547.2980 669.8381 683.8537 699.3384
[361] 726.8357 622.8535 636.8691 776.9297 582.3190 488.5345 710.3505 517.7398 583.8923 722.3246
[371] 751.8105 653.3617 756.4250 554.2606 1075.0000 487.2567 644.8226 683.8279 547.2980 669.8381
[381] 683.8537 699.3384 726.8357 622.8535 636.8691 776.9297 582.3190 488.5345 710.3505 517.7398
[391] 583.8923 722.3246 751.8105 653.3617 756.4250 554.2606 1075.0000 487.2567 644.8226 683.8279
[401] 547.2980 669.8381 683.8537 699.3384 726.8357 622.8535 636.8691 776.9297 582.3190 488.5345
[411] 710.3505 517.7398 583.8923 722.3246 751.8105 653.3617 756.4250 554.2606 1075.0000 487.2567
[421] 644.8226 683.8279 547.2980 669.8381 683.8537 699.3384 726.8357 622.8535 636.8691 776.9297
[431] 582.3190 488.5345 710.3505 517.7398 583.8923 722.3246 751.8105 653.3617 756.4250 554.2606
[441] 1075.0000 487.2567 644.8226 683.8279 547.2980 669.8381 683.8537 699.3384 726.8357 622.8535
[451] 636.8691 776.9297 582.3190 488.5345 710.3505 517.7398 583.8923 722.3246 751.8105 653.3617
[461] 756.4250 554.2606 1075.0000 487.2567 644.8226 683.8279 547.2980 669.8381 683.8537 699.3384
[471] 726.8357 622.8535 636.8691 776.9297 582.3190 488.5345 710.3505 517.7398 583.8923 722.3246
[481] 751.8105 653.3617 756.4250 554.2606 1075.0000 487.2567 644.8226 683.8279 547.2980 669.8381
[491] 683.8537 699.3384 726.8357 622.8535 636.8691 776.9297 582.3190 488.5345 710.3505 517.7398
[501] 583.8923 722.3246 751.8105 653.3617 756.4250 554.2606 1075.0000 487.2567 644.8226 683.8279
[511] 547.2980 669.8381 683.8537 699.3384 726.8357 622.8535 636.8691 776.9297 582.3190 488.5345
[521] 710.3505 517.7398 583.8923 722.3246 751.8105 653.3617 756.4250 554.2606 1075.0000 487.2567
[531] 644.8226 683.8279 547.2980 669.8381 683.8537 699.3384 726.8357 622.8535 636.8691 776.9297
[541] 582.3190 488.5345 710.3505 517.7398 583.8923 722.3246 751.8105 653.3617 756.4250 554.2606
[551] 1075.0000 487.2567 644.8226 683.8279 547.2980 669.8381 683.8537 699.3384 726.8357 622.8535
[561] 636.8691 776.9297 582.3190 488.5345 710.3505 517.7398 583.8923 722.3246 751.8105 653.3617
[571] 756.4250 554.2606 1075.0000
In addition, there are the MS order and dependency type attributes. It is very easy to filter for data-dependent scans of level 2 whose precursors fall into a certain mass range. Am I missing something?
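A short sketch of such a filter, using a small stand-in for the readIndex() data frame (first six rows, three of the columns shown above). The m/z range is an arbitrary example value.

```r
# Stand-in for readIndex() output; values copied from the first six scans above.
Idx <- data.frame(
  scan = 1:6,
  precursorMass = c(1075.0000, 487.2567, 644.8226, 683.8279, 547.2980, 669.8381),
  MSOrder = c("Ms", "Ms2", "Ms2", "Ms2", "Ms2", "Ms2"),
  stringsAsFactors = FALSE
)

# Keep MS2 scans whose precursor lies inside the chosen m/z range.
mz_range <- c(640, 690)  # illustrative range
hits <- Idx[Idx$MSOrder == "Ms2" &
              Idx$precursorMass >= mz_range[1] &
              Idx$precursorMass <= mz_range[2], ]
```

Here `hits` contains scans 3, 4 and 6.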
from rawrr.
100% agreed. What's missing is the quadrupole boundaries or the quadrupole isolation width (highly variable between laboratories and between PRM/DIA methods); without it we cannot know whether the m/z of interest is indeed in the scan.
Or maybe this information is somewhere else?
Sorry if my explanations were unclear.
Best regards,
Vivian
from rawrr.
For a ddMS2 "experiment", as Thermo calls it, you can do:
> head(Idx)
scan scanType rtinseconds precursorMass MSOrder
1 1 FTMS + c NSI Full ms [350.0000-1800.0000] 0.097 1075.0000 Ms
2 2 FTMS + c NSI Full ms2 487.2567@… [140.0000-1015.0000] 0.350 487.2567 Ms2
3 3 FTMS + c NSI Full ms2 644.8226@… [140.0000-1335.0000] 0.419 644.8226 Ms2
4 4 FTMS + c NSI Full ms2 683.8279@… [140.0000-1415.0000] 0.489 683.8279 Ms2
5 5 FTMS + c NSI Full ms2 547.2980@… [140.0000-1135.0000] 0.558 547.2980 Ms2
6 6 FTMS + c NSI Full ms2 669.8381@… [140.0000-1385.0000] 0.627 669.8381 Ms2
charge masterScan dependencyType
1 0 NA NA
2 2 NA NA
3 2 NA NA
4 2 NA NA
5 2 NA NA
6 2 NA NA
> S2 <- readSpectrum(rawfile = rawfile, scan = 2)
> S2[[1]]$`MS2 Isolation Width:`
[1] "1.40"
> S2[[1]]$`MS2 Isolation Offset:`
[1] "0.00"
So the isolation width of the quad was set to 1.4 Da with a zero offset for the 2nd scan.
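As a rough sketch, these two values can be turned into window boundaries and used to test whether an m/z of interest was inside the isolation window. The assumption here, not confirmed in this thread, is that the window is symmetric around precursor mass plus offset; `in_window()` is a hypothetical helper.

```r
# The values come back as character strings and need converting.
width  <- as.numeric("1.40")  # `MS2 Isolation Width:`
offset <- as.numeric("0.00")  # `MS2 Isolation Offset:`
center <- 487.2567            # precursorMass of scan 2

# Assumption: symmetric window around center + offset.
window <- c(low  = center + offset - width / 2,
            high = center + offset + width / 2)

# TRUE if mz falls inside the window (boundaries included).
in_window <- function(mz, window) {
  mz >= window[["low"]] & mz <= window[["high"]]
}

inside  <- in_window(487.5, window)
outside <- in_window(489.0, window)
```

With a width of 1.4 the window spans 486.5567 to 487.9567, so 487.5 is inside and 489.0 is not.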
from rawrr.
thanks a lot !
from rawrr.