fgcz / rawrr
Access Orbitrap data in R using a C# Mono assembly - Bioconductor package
Home Page: https://bioconductor.org/packages/rawrr/
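List the last two autoQC01 raw files of each available instrument:
PFILE=/srv/www/htdocs/Data2San/sync_LOGS/pfiles.txt
# field 4 of each record is the file path; its third path component selects the instrument
cut -d";" -f4 "${PFILE}" \
| cut -d"/" -f3 \
| sort -u \
| while read -r i ;
do
# keep the last two autoQC01 raw files recorded for this instrument
grep -E "/${i}/.*autoQC01.*raw$" "${PFILE}" \
| tail -n 2;
done \
| cut -d";" -f4 \
| while read -r raw ;
do
# print only files that still exist on disk
[[ -f "/srv/www/htdocs/${raw}" ]] && echo "${raw}" ;
done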
Dear all,
Is there a way of accessing the S/N values from a readSpectrum object? I can't seem to find the info in the list.
Thanks
no binary code would be contained in the source package.
all RawFileReader Assemblies need to be installed
if (Sys.which("msbuild") == "" && Sys.which("xbuild") == "")
{
warning ("could not find msbuild or xbuild in path; will not be able to use rDotNet unless corrected and rebuilt")
return()
}
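The check above can be wrapped into a small helper. This is only a sketch in base R: the function name findBuildTool is hypothetical and not part of rawrr or rDotNet, and the default tool names are the two from the snippet.

```r
# Hypothetical helper: report which of the given build tools are on the PATH.
findBuildTool <- function(candidates = c("msbuild", "xbuild")) {
  paths <- Sys.which(candidates)   # "" for every tool that is not found
  found <- paths[nzchar(paths)]
  if (length(found) == 0L) {
    warning("none of ", paste(candidates, collapse = ", "),
            " found in PATH")
    return(invisible(character(0)))
  }
  found
}

# TRUE if at least one of the tools is installed.
available <- length(suppressWarnings(findBuildTool())) > 0L
```

Returning the resolved paths instead of calling return() at top level makes the check reusable from a configure script or from a test.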
Hi,
I worked a lot with RawDiag in the past and really love this package, many thx! For certain reasons I need to switch to rawrr and I miss the information you already got from the index table in rawDiag. Especially the ScanDescription is missing and for QC many of the other values were of great help. Do you think there is a way to expand the exported column set similar to the one from rawDiag? It would speed up my code tremendously. My current solution is to read each spectrum and get the information from there, but it's very time consuming.
All the best
Henrik
Hi,
Very nice tool!
At the moment, sadly:
install.packages('http://fgcz-ms.uzh.ch/~cpanse/rawR_0.1.0.tar.gz')
Warning in install.packages :
package ‘http://fgcz-ms.uzh.ch/~cpanse/rawR_0.1.0.tar.gz’ is not available for this version of R
A version of this package for your version of R might be available elsewhere,
see the ideas at
https://cran.r-project.org/doc/manuals/r-patched/R-admin.html#Installing-packages
R.version
_
platform x86_64-apple-darwin17.0
arch x86_64
os darwin17.0
system x86_64, darwin17.0
status
major 4
minor 0.3
year 2020
month 10
day 10
svn rev 79318
language R
version.string R version 4.0.3 (2020-10-10)
nickname Bunny-Wunnies Freak Out
The Journal of Proteome Research is preparing to publish its Second Biennial Special Issue on Software Tools and Resources in February 2021.
Software tools and data resources are essential to research in all omics domains, including proteomics and metabolomics. The goal of this recurring special issue is to highlight the latest novel and significantly updated software tools, web applications, and databases scientists can use for data analysis and visualization in proteomics and related research.
For readers, this provides an easily identifiable source of tools specifically reviewed for their applicability and ease of adoption. For authors, this provides visibility for and wider adoption of their tools in the proteomics community through dissemination and documentation.
The same team that led the first Special Issue on Software Tools and Resources will also lead this one:
Journal of Proteome Research Associate Editor Susan Weintraub, The University of Texas Health Science Center at San Antonio
Michael Hoopmann, Institute for Systems Biology
Magnus Palmblad, Leids Universitair Medisch Centrum
They invite you to submit a manuscript by October 31, 2020.
What to Submit—Deadline: October 31, 2020
For inclusion in this special issue, authors must present either a complete description of a relevant novel tool, library, web application, or database (article) or a substantial and meaningful update of a previously published tool or resource (technical note). The full working tool or database must be available free-of-charge to editors and reviewers for evaluation at the time of manuscript submission.
Tools with a graphical or web browser interface are preferred, but the editors will also consider well-documented web service APIs or libraries of functional building blocks for custom data analysis pipelines.
Manuscript Requirements
Manuscripts must be submitted electronically through the ACS Paragon Plus Environment online submission system by October 31, 2020, and conform to the Journal of Proteome Research Author Guidelines.
Authors, please:
Indicate in your cover letter that the manuscript is for the Special Issue on Software Tools and Resources.
Remember, the full working tool or database must be available free-of-charge to editors and reviewers for evaluation at the time of manuscript submission.
Be concise and focus the manuscript on the unique or novel functionality of the tool. It should be clear to any reader what problem the system addresses and how it is used.
For tools and libraries, use the table form to describe the input, operations, and output of each tool or function. A screenshot of the interface may be included if this has novel or unusual features.
LEARN MORE: Read the first Special Issue on Software Tools and Resources, including the Editorial by Susan Weintraub, Michael Hoopmann, and Magnus Palmblad.
What about a rawR function that checks for the installed .NET framework? The C# code could be based on
const string subkey = @"SOFTWARE\Microsoft\NET Framework Setup\NDP\v4\Full\";
using (var ndpKey = RegistryKey.OpenBaseKey(RegistryHive.LocalMachine, RegistryView.Registry32).OpenSubKey(subkey))
{
if (ndpKey != null && ndpKey.GetValue("Release") != null)
{
Console.WriteLine($".NET Framework Version: {CheckFor45PlusVersion((int)ndpKey.GetValue("Release"))}");
}
else
{
Console.WriteLine(".NET Framework Version 4.5 or later is not detected.");
}
}
// Checking the version using >= enables forward compatibility.
static string CheckFor45PlusVersion(int releaseKey)
{
if (releaseKey >= 528040)
return "4.8 or later";
if (releaseKey >= 461808)
return "4.7.2";
if (releaseKey >= 461308)
return "4.7.1";
if (releaseKey >= 460798)
return "4.7";
if (releaseKey >= 394802)
return "4.6.2";
if (releaseKey >= 394254)
return "4.6.1";
if (releaseKey >= 393295)
return "4.6";
if (releaseKey >= 379893)
return "4.5.2";
if (releaseKey >= 378675)
return "4.5.1";
if (releaseKey >= 378389)
return "4.5";
// This code should never execute. A non-null release key should mean
// that 4.5 or later is installed.
return "No 4.5 or later version detected";
}
as shown here by Microsoft.
We would just need to write to a tmp file instead of to the console. The R function could be something like dotNetInfo(), aligned with the sessionInfo() output. Or we directly check if the version is >= 4.5.1 and return a logical, but then the function should be named is.NET() and take a parameter named release, which we would set to 4.5.1 by default. This would also be very useful for CI, since we probably can't request a specific .NET version on the test infrastructure.
What do you think, @cpanse ?
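On the R side, the mapping from release key to version could look roughly like the sketch below. The thresholds are copied from the C# snippet; the function names dotNetVersion and isDotNet are hypothetical, and the sketch assumes the release key has already been read (for example from the tmp file mentioned above).

```r
# Release-key thresholds taken from the C# snippet, in increasing order.
dotNetReleases <- c("4.5" = 378389, "4.5.1" = 378675, "4.5.2" = 379893,
                    "4.6" = 393295, "4.6.1" = 394254, "4.6.2" = 394802,
                    "4.7" = 460798, "4.7.1" = 461308, "4.7.2" = 461808,
                    "4.8" = 528040)

# Hypothetical helper: highest known version whose threshold is reached.
dotNetVersion <- function(releaseKey) {
  idx <- findInterval(releaseKey, dotNetReleases)
  if (idx == 0L) return(NA_character_)   # older than 4.5
  names(dotNetReleases)[idx]
}

# Hypothetical helper: logical check against a minimum release.
isDotNet <- function(releaseKey, release = "4.5.1") {
  stopifnot(release %in% names(dotNetReleases))
  isTRUE(releaseKey >= dotNetReleases[[release]])
}

dotNetVersion(461808)
isDotNet(378389)
```

Comparing with >= keeps the forward compatibility noted in the C# comment: unknown future keys map to the newest known entry.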
Line 147 in f9974c1
maybe one of the apply functions will do the job
runs on fgcz-c-073
docker run -a stdin -a stdout -i -t rocker/verse:4.0.5 R
install.packages('http://fgcz-ms.uzh.ch/~cpanse/rawrr_0.99.13_19.tar.gz', repo=NULL)
rawfile <- rawrr::sampleFilePath()
h <- rawrr::readFileHeader(rawfile)
i <- rawrr::readIndex(rawfile)
x <- rawrr::readChromatogram(rawfile=rawfile, type="tic")
s <- rawrr::readSpectrum(rawfile, 1:9)
docker run -a stdin -a stdout -i -t c95c10872a5d
install.packages('http://fgcz-ms.uzh.ch/~cpanse/rawrr_0.99.13_19.tar.gz', repo=NULL)
rawfile <- rawrr::sampleFilePath()
h <- rawrr::readFileHeader(rawfile)
i <- rawrr::readIndex(rawfile)
x <- rawrr::readChromatogram(rawfile=rawfile, type="tic")
s <- rawrr::readSpectrum(rawfile, 1:9)
Listing of the Dockerfile
FROM rocker/verse:4.0.5
RUN apt-get update \
&& sudo apt-get install mono-runtime -y
CMD ["R"]
docker run -a stdin -a stdout -i -t f53000645fca
install.packages('http://fgcz-ms.uzh.ch/~cpanse/rawrr_0.99.13_19.tar.gz', repo=NULL)
rawfile <- rawrr::sampleFilePath()
h <- rawrr::readFileHeader(rawfile)
i <- rawrr::readIndex(rawfile)
x <- rawrr::readChromatogram(rawfile=rawfile, type="tic")
s <- rawrr::readSpectrum(rawfile, 1:9)
Listing of the Dockerfile
FROM rocker/verse:4.0.5
RUN apt-get update \
&& sudo apt-get install mono-mcs mono-xbuild -y
CMD ["R"]
docker run -a stdin -a stdout -i -t -v /usr/local/lib/RawFileReader/:/usr/local/lib/RawFileReader/ d6cec6026a70
docker run -i -v /usr/local/lib/RawFileReader/:/usr/local/lib/RawFileReader/ d6cec6026a70 R --no-save << EOF
install.packages('http://fgcz-ms.uzh.ch/~cpanse/rawrr_0.99.13_19.tar.gz', repo=NULL)
Sys.getenv("MONO_PATH")
rawfile <- rawrr::sampleFilePath()
h <- rawrr::readFileHeader(rawfile)
i <- rawrr::readIndex(rawfile)
x <- rawrr::readChromatogram(rawfile=rawfile, type="tic")
s <- rawrr::readSpectrum(rawfile, 1:9)
EOF
Listing of the Dockerfile
FROM rocker/verse:4.0.5
RUN apt-get update \
&& sudo apt-get install mono-mcs mono-xbuild -y
CMD ["R"]
Hi guys,
When plotting multiple raw files with iRT peptides, I'm using the function .plotChromatogramAndFit that you showed.
I want to add the name of each raw file as the title of its plot, but I have not been successful. Any ideas?
plot(x, main=???); legend("topright", legend=i, title='Instrument Model', bty = "n", cex=0.75)
Thanks a lot for the great library :)
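One possible approach, sketched with base graphics only: derive the title from the file path with basename() and pass it as main. The file paths and the synthetic chromatogram below are made up for illustration, and whether the plot method used in the question forwards main through its ... argument is an assumption that would need checking.

```r
# Hypothetical raw file paths; replace with your own files.
rawfiles <- c("/data/QEXACTIVEHF_2/20200714_001_autoQC01.raw",
              "/data/QEXACTIVEHF_2/20200715_002_autoQC01.raw")

# basename() strips the directory part, leaving the file name for the title.
titles <- basename(rawfiles)

pdf(NULL)  # null device, so the sketch also runs non-interactively
for (k in seq_along(rawfiles)) {
  # Synthetic chromatogram standing in for the real object.
  rt <- seq(0, 10, by = 0.5)
  intensity <- dnorm(rt, mean = 5) * 1e6
  plot(rt, intensity, type = "l", xlab = "RT [min]", ylab = "intensity",
       main = titles[k])
}
invisible(dev.off())
```

If main is not forwarded by the plot method, calling title(main = titles[k]) after the plot is a base graphics alternative.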
We should update the package so that citation("rawrr") returns the desired information. The current state is:
> citation(package = "rawrr")
To cite package ‘rawrr’ in publications use:
Christian Panse and Tobias Kockmann (NA). rawrr: Access to Thermo Fisher Scientific raw
files from R. R package version 0.1.7. https://github.com/fgcz/rawR/
A BibTeX entry for LaTeX users is
@Manual{,
title = {rawrr: Access to Thermo Fisher Scientific raw files from R},
author = {Christian Panse and Tobias Kockmann},
note = {R package version 0.1.7},
url = {https://github.com/fgcz/rawR/},
}
Warning messages:
1: In citation(package = "rawrr") :
no date field in DESCRIPTION file of package ‘rawrr’
2: In citation(package = "rawrr") :
could not determine year for ‘rawrr’ from package DESCRIPTION file
I would suggest referencing our bioRxiv manuscript for now.
Hi, I really appreciated your efforts in making this package.
I have a few questions about the readChromatogram function. I extract the XIC for an analyte, for example m/z 836.07492 with tol = 100.
The XIC contains equal numbers of retention times and intensities. When I look at individual values, for example at RT 33.03 min, the intensity returned by this function is 23020686, whereas the raw data shows an NL of 7.11E6. How is the value 23020686 calculated?
My other question: is there a way to get the area under the curve of the XIC? I want to do quantitative analysis.
Thank you.
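For the area under the curve, a trapezoidal approximation over the extracted points is one simple option. The sketch below uses base R only; the function name aucTrapezoid is hypothetical and the input is assumed to be two numeric vectors of equal length (retention times and intensities), not a specific rawrr object.

```r
# Hypothetical helper: trapezoidal area of an XIC within an RT window.
aucTrapezoid <- function(times, intensities, rtMin = -Inf, rtMax = Inf) {
  stopifnot(length(times) == length(intensities))
  keep <- times >= rtMin & times <= rtMax
  t <- times[keep]
  y <- intensities[keep]
  if (length(t) < 2L) return(0)
  o <- order(t)            # make sure points are sorted by time
  t <- t[o]
  y <- y[o]
  # width of each interval times the mean height of its two end points
  sum(diff(t) * (y[-length(y)] + y[-1]) / 2)
}

# Toy triangle peak: base 2 min, height 2, so the area is 2.
area <- aucTrapezoid(c(0, 1, 2), c(0, 2, 0))
```

With retention times in minutes, the result is in intensity times minutes; background subtraction and peak boundary detection are left out of this sketch.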
Hi @cpanse,
I had a look at the return values of readIndex() and readFileHeader() and I think it would make sense to combine them into a single object. The object would be structured into a data portion, which is the data.frame returned by readIndex. All items in the list returned by readFileHeader would become attributes of the object. The object class could be something like rawRindex.
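A minimal sketch of such an object in base R, assuming the index is a data.frame and the header is a named list. The constructor name newIndexSketch and the toy values are made up; only the class name rawRindex comes from the proposal.

```r
# Hypothetical constructor: data.frame payload, header items as attributes.
newIndexSketch <- function(index, header) {
  stopifnot(is.data.frame(index), is.list(header), !is.null(names(header)))
  # Header keys must not clash with attributes a data.frame already uses.
  stopifnot(!any(names(header) %in% c("names", "row.names", "class")))
  for (key in names(header)) attr(index, key) <- header[[key]]
  class(index) <- c("rawRindex", class(index))
  index
}

# Toy stand-ins for the two return values.
index <- data.frame(scan = 1:3, MSOrder = c("Ms", "Ms2", "Ms2"))
header <- list(`Instrument Model` = "Q Exactive HF Orbitrap",
               `Sample Name` = "autoQC01")

x <- newIndexSketch(index, header)
attr(x, "Instrument Model", exact = TRUE)
```

One caveat of this design: base R subsetting of a data.frame does not generally carry custom attributes along, so a dedicated `[` method would probably be needed.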
refactor rawDiag::readXICs(rawfile, masses=unique(RAW$PrecursorMass), tol=1000)
returns a nested S3 list
[[22]]
$mass
[1] 554.2606
$times
[1] 0.1216408 0.1516450 0.4810452 0.5409557 0.7801059
$intensities
[1] 3005.061 4328.104 3658.515 3862.011 4992.357
$filename
[1] "sample.raw"
attr(,"class")
[1] "list" "XIC"
attr(,"class")
[1] "list" "XICs"
> X[[20]]
$mass
[1] 653.3617
$times
[1] 0.001619751 0.031642766 0.061663615 0.091651065 0.121640750 0.151644970
[7] 0.181667770 0.211526280 0.241284530 0.271307600 0.301222200 0.331145000
[13] 0.361147270 0.391168050 0.421057700 0.450970250 0.481045180 0.510989030
[19] 0.540955750 0.570893130 0.600724580 0.630620300 0.660428770 0.690318400
[25] 0.720320350 0.750197620 0.780105920
$intensities
[1] 374171.6 405717.2 350914.7 373948.4 328768.2 425965.4 360327.9 453483.1
[9] 445894.1 430538.9 422901.3 545305.5 433117.8 357588.1 435593.4 351018.2
[17] 407768.5 406468.8 446027.8 385148.5 579871.8 461409.0 390769.3 458988.6
[25] 378339.6 480078.4 467780.0
$filename
[1] "sample.raw"
attr(,"class")
with additional attributes:
input:
type BPC, TIC need no additional parameters. XIC requires mz and tolerance in addition.
output:
Hi everyone,
to test this package I wanted to load the .raw file and follow the provided example code.
However, R gives me an error message saying that the "x" and "y" coordinates do not match:
Now when I run I get:
> plot(S[[1]], centroid=TRUE)
Error in xy.coords(x, y, xlabel, ylabel, log) :
Length of 'x' and 'y' do not match
I have absolutely no idea what I'm doing wrong and am super lost.
I would appreciate some help here!
Also, I am new to R and to working with bioinformatics data, so if anyone could explain how to come up with the numbers in the scan vector (the paper just mentions some database search?), that would be awesome as well.
Thanks in advance!
Hi,
Thanks for making a great tool! I have found it quite useful so far 👍
I have an issue with XIC values when I want to plot certain peptides.
Firstly, I can successfully extract and plot the XICs using your inbuilt functions, but cannot figure out how to constrain the retention times plotted/extracted.
I did manage to access the S3 elements in the chromatogram object and plot them myself in ggplot, but then had an issue where rawR does not report the 0 values for M/Zs at certain times. This is useful to see the shape of the eluting peptide, though I acknowledge it will likely increase the object size...
Is it possible to clarify (1) how to constrain the XIC for a certain retention time range, and (2) how to access (or at least impute from RT of MS1 scans) the 0 values of XICs.
Thanks again for making this tool,
Tara
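Both points can be approximated outside the package once the XIC elements are extracted. The sketch below assumes an XIC is a list with times and intensities (as in the nested list shown earlier) and that the MS1 scan times are available as a numeric vector, for example from the index; the function names and the toy numbers are hypothetical.

```r
# Hypothetical helper: put a 0 at every MS1 time where the XIC has no point.
zeroFillXIC <- function(xic, ms1Times) {
  filled <- numeric(length(ms1Times))    # starts as all zeros
  pos <- match(xic$times, ms1Times)      # exact matching of time stamps
  ok <- !is.na(pos)
  filled[pos[ok]] <- xic$intensities[ok]
  list(times = ms1Times, intensities = filled)
}

# Hypothetical helper: keep only points inside a retention time window.
restrictRT <- function(xic, rtMin, rtMax) {
  keep <- xic$times >= rtMin & xic$times <= rtMax
  list(times = xic$times[keep], intensities = xic$intensities[keep])
}

xic <- list(times = c(0.12, 0.15, 0.48), intensities = c(3005, 4328, 3658))
ms1Times <- c(0.12, 0.15, 0.18, 0.21, 0.48, 0.54)

full <- zeroFillXIC(xic, ms1Times)
win <- restrictRT(full, rtMin = 0.15, rtMax = 0.5)
```

Exact matching works when both time vectors come from the same file with the same precision; otherwise a nearest-neighbour match within a small tolerance would be the safer choice.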
function reads file header information. In Freestyle key:value pairs
Sample Name: autoQC01
Comment:
Seq Row: 10
Sample Type: Unknown
Path: D:\Data2San\p2469\Proteomics\QEXACTIVEHF_2\bpfister_20200714
Cal Level:
Cal File:
Inj Volume: 2
Sample Weight: 0
Sample Volume: 0
Sample Id: NA
Istd Amount: 0
CD Factor: 0
Bar Code:
Bar Code Status: 0
Inst Method: C:\Xcalibur\methods\__autoQC\trap\autoQC01.meth
Proc Method:
User Text1: 2469
User Text2:
User Text3: FGCZ
User Text4:
User Text5:
Tray Index: 80
Tray Name: ANSI-48Vial2mLHolder/ANSI-48Vial2mLHolder
Tray Shape: Rectangular
Vial Index: 48
Vials Per Tray: 48
Vials Per TrayX: 8
Vials Per TrayY: 6
Instrument Name: Q Exactive HF Orbitrap
Instrument Model: Q Exactive HF Orbitrap
Instrument Number: Exactive Series slot #2496
Instrument SoftWare: 2.9-290204/2.9.3.2948
Instrument Hardware: rev. 1
Flags:
Mass Tolerance: 0.5 amu
Created by: Administrator
returns S3 object, (nested) list
The goal would be that Spectra could not only be read from local raw files, but also public repositories like ProteomicsDB and prediction services like Prosit. A REST endpoint is already available and used by USE. This REST interface should also work for queries from R.
solve the naming conflict with
https://CRAN.R-project.org/package=rawr
R CMD check rawR_0.1.1.tar.gz
produces
* package encoding: UTF-8
* checking CRAN incoming feasibility ... ERROR
Maintainer: 'Christian Panse <[email protected]>'
New submission
Conflicting package names (submitted: rawR, existing: rawr [https://CRAN.R-project.org])
Conflicting package names (submitted: rawR, existing: rawr [CRAN archive])
The Title field should be in title case. Current version is:
'Access to Thermo Fisher Scientific raw files from R'
In title case that is:
R> BiocCheck("rawR_0.1.1.tar.gz")
This is BiocCheck version 1.26.0. BiocCheck is a work in
progress. Output and severity of issues may change. Installing
package...
* Checking Package Dependencies...
* Checking if other packages can import this one...
* Checking to see if we understand object initialization...
* Checking for deprecated package usage...
* Checking for remote package usage...
* Checking version number...
* Checking for version number mismatch...
* Checking version number validity...
Package version 0.1.1; pre-release
* Checking R Version dependency...
* Checking package size...
* Checking individual file sizes...
* WARNING: The following files are over 5MB in size:
'rawRcolor.tif'
* Checking biocViews...
* Checking that biocViews are present...
* ERROR: No biocViews terms found.
See http://bioconductor.org/developers/how-to/biocViews/
* Checking build system compatibility...
* Checking for blank lines in DESCRIPTION...
* Checking if DESCRIPTION is well formatted...
* Checking for proper Description: field...
* Checking for whitespace in DESCRIPTION field names...
* Checking that Package field matches directory/tarball
name...
* Checking for Version field...
* Checking for valid maintainer...
* Checking DESCRIPTION/NAMESPACE consistency...
* WARNING: Import grDevices, graphics, utils in
DESCRIPTION as well as NAMESPACE.
* Checking vignette directory...
This is an unknown type of package
* ERROR: No 'vignettes' directory.
* Checking library calls...
* Checking for library/require of rawR...
* Checking coding practice...
* NOTE: Avoid sapply(); use vapply()
Found in files:
rawR.R (line 1011, column 29)
* NOTE: Avoid 1:...; use seq_len() or seq_along()
Found in files:
rawR.R (line 600, column 36)
rawR.R (line 745, column 70)
Warning in readLines(infile) :
incomplete final line found on '/tmp/RtmpLJg2l6/filedebd713f5408/rawR/tests/testthat/test-header.R'
* WARNING: Avoid class() == or class() != ; use is() or
!is()
Found in files:
R/rawR.R (line 68)
* Checking parsed R code in R directory, examples,
vignettes...
* Checking function lengths..........
* NOTE: Recommended function length <= 50 lines.
There are 5 functions > 50 lines.
The longest 5 functions are:
plot.rawRspectrum() (R/rawR.R, line 711): 108 lines
readChromatogram() (R/rawR.R, line 429): 105 lines
print.rawRspectrum() (R/rawR.R, line 839): 84 lines
readFileHeader() (R/rawR.R, line 106): 61 lines
validate_rawRspectrum() (R/rawR.R, line 635): 52 lines
* Checking man page documentation...
* WARNING: Add non-empty \value sections to the following
man pages: man/plot.rawRchromatogram.Rd,
man/plot.rawRchromatogramSet.Rd,
man/plot.rawRspectrum.Rd, man/print.rawRspectrum.Rd,
man/summary.rawRspectrum.Rd
* ERROR: At least 80% of man pages documenting exported
objects must have runnable examples. The following pages
do not:
new_rawRspectrum.Rd, plot.rawRchromatogramSet.Rd,
validate_rawRspectrum.Rd
* NOTE: Usage of dontrun{} / donttest{} found in man page
examples.
14% of man pages use one of these cases.
Found in the following files:
readChromatogram.Rd
readSpectrum.Rd
* NOTE: Use donttest{} instead of dontrun{}.
Found in the following files:
readChromatogram.Rd
readSpectrum.Rd
* Checking package NEWS...
* NOTE: Consider adding a NEWS file, so your package news
will be included in Bioconductor release announcements.
* Checking unit tests...
* Checking skip_on_bioc() in tests...
* Checking formatting of DESCRIPTION, NAMESPACE, man pages, R
source, and vignette source...
* NOTE: Consider shorter lines; 32 lines (2%) are > 80
characters long.
First 6 lines:
R/rawR.R:7 .writeRData <- function(rawfile, outputfile=paste0...
R/rawR.R:14 list(scanType=rv$scanType, mZ=rv$mZ, inte...
R/rawR.R:28 warning("Can not find Mono JIT co...
R/rawR.R:41 rvs <- system2(Sys.which('mono'), arg...
R/rawR.R:64 #' pathToRawFile <- file.path(path.package(packag...
R/rawR.R:154 e$info$`Instrument method` <- ba...
* NOTE: Consider 4 spaces instead of tabs; 5 lines (0%)
contain tabs.
First 5 lines:
R/zzz.R:5 if(interactive()){
R/zzz.R:6 version <- packageVersion('rawR')
R/zzz.R:7 packageStartupMessage("Package 'rawR' version ", ...
R/zzz.R:8 invisible()
R/zzz.R:9 }
* NOTE: Consider multiples of 4 spaces for line indents,
233 lines(14%) are not.
First 6 lines:
R/rawR.R:107 mono = if(Sys.info()['sysname'] %in% c("Darwi...
R/rawR.R:108 exe = system.file('exec/rawR.exe',package = '...
R/rawR.R:109 mono_path = "",
R/rawR.R:110 argv = "infoR",
R/rawR.R:111 system2_call = TRUE,
R/rawR.R:112 method = "thermo"){
See
http://bioconductor.org/developers/how-to/coding-style/
See styler package:
https://cran.r-project.org/package=styler as described
in the BiocCheck vignette.
* Checking if package already exists in CRAN...
* ERROR: Package must be removed from CRAN.
* Checking for bioc-devel mailing list subscription...
* NOTE: Cannot determine whether maintainer is subscribed
to the bioc-devel mailing list (requires admin
credentials). Subscribe here:
https://stat.ethz.ch/mailman/listinfo/bioc-devel
* Checking for support site registration...
Maintainer is registered at support site.
Summary:
ERROR count: 4
WARNING count: 4
NOTE count: 10
For detailed information about these checks, see the BiocCheck
vignette, available at
https://bioconductor.org/packages/3.12/bioc/vignettes/BiocCheck/inst/doc/BiocCheck.html#interpreting-bioccheck-output
BiocCheck FAILED.
$error
[1] "No biocViews terms found."
[2] "No 'vignettes' directory."
[3] "At least 80% of man pages documenting exported objects must have runnable examples. The following pages do not:"
[4] "Package must be removed from CRAN."
$warning
[1] "The following files are over 5MB in size: 'rawRcolor.tif'"
[2] "Import grDevices, graphics, utils in DESCRIPTION as well as NAMESPACE."
[3] " Avoid class() == or class() != ; use is() or !is()"
[4] "Add non-empty \\value sections to the following man pages: man/plot.rawRchromatogram.Rd, man/plot.rawRchromatogramSet.Rd, man/plot.rawRspectrum.Rd, man/print.rawRspectrum.Rd, man/summary.rawRspectrum.Rd"
$note
[1] " Avoid sapply(); use vapply()"
[2] " Avoid 1:...; use seq_len() or seq_along()"
[3] "Recommended function length <= 50 lines."
[4] "Usage of dontrun{} / donttest{} found in man page examples."
[5] "Use donttest{} instead of dontrun{}."
[6] "Consider adding a NEWS file, so your package news will be included in Bioconductor release announcements."
[7] "Consider shorter lines; 32 lines (2%) are > 80 characters long."
[8] "Consider 4 spaces instead of tabs; 5 lines (0%) contain tabs."
[9] "Consider multiples of 4 spaces for line indents, 233 lines(14%) are not."
R>
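Three of the coding-practice notes above are easy to illustrate in isolation. The toy objects below are made up; inherits() is used here as the base R counterpart of the is() check that BiocCheck recommends.

```r
spectra <- list(a = c(100.1, 200.2, 300.3), b = numeric(0))

# vapply() declares the expected return type, unlike sapply().
n <- vapply(spectra, length, integer(1))

# 1:length(x) counts down to c(1, 0) for an empty vector; seq_along() does not.
empty <- spectra$b
risky <- 1:length(empty)
safe <- seq_along(empty)

# class(x) == "..." yields one logical per class entry; inherits() yields one.
obj <- structure(list(), class = c("rawRspectrum", "list"))
perEntry <- class(obj) == "rawRspectrum"
single <- inherits(obj, "rawRspectrum")
```

In a loop, for (i in risky) would run twice on the empty vector, while for (i in safe) would not run at all.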
Hello again :-)
Unfortunately I have a little problem, which I don't know how to solve...
readSpectrum() gives the following error message:
Error in source(tfo, local = TRUE) : negative length vectors are not allowed
A little example how my code looks:
library(rawrr)
library(tidyverse)
#reading Index and selecting scans with ms_order = "Ms"
ms_order <- "Ms"
IDX <- as_tibble(readIndex(path))
scans <- IDX %>% filter(MSOrder == ms_order) %>% pull(scan)
SPC <- readSpectrum(path, scan = scans)
I have about 12000 scans in total and about 2500 MS1 scans in my raw file. It works fine as long as I only read about 2000 scans. After that I receive the error message. I don't know if it is due to memory limits on my machine.
Thanks for your help in advance!
kaempfro
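Since reading about 2000 scans works, one possible workaround is to read the scans in smaller chunks and concatenate the results. This is an untested assumption about the cause, not a confirmed fix. The sketch is self-contained: readInChunks and fakeReader are hypothetical, and fakeReader only stands in for a call such as function(s) readSpectrum(path, scan = s).

```r
# Hypothetical helper: apply a reader to chunks of scan numbers and
# concatenate the per-chunk lists into one list.
readInChunks <- function(scans, reader, chunkSize = 500L) {
  chunks <- split(scans, ceiling(seq_along(scans) / chunkSize))
  do.call(c, lapply(unname(chunks), reader))
}

# Stand-in reader returning one small list per scan number.
fakeReader <- function(scan) lapply(scan, function(s) list(scan = s))

res <- readInChunks(seq_len(2500), fakeReader, chunkSize = 500L)
length(res)
```

Note that c() returns a plain list, so any class attribute set by the real reader on its return value would have to be restored afterwards.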
Hi,
I really like the package. Thank you for that! I was just going through the vignette, and when I call readChromatogram it gives me the following error. I guess I will not be the only one with that. How do you solve that?
plot(rawR::readChromatogram(rawfile = rawfile, type = "tic"))
Error in read.table(file = file, header = header, sep = sep, quote = quote, :
duplicate 'row.names' are not allowed
Also:
C <- rawR::readChromatogram(rawfile, mass = iRTmZ, tol = 10, type = "xic", filter = "ms")
plot(C, diagnostic = TRUE)
Error in plot.window(...) : need finite 'xlim' values
In addition: Warning messages:
1: In min(x, na.rm = na.rm) :
no non-missing arguments to min; returning Inf
2: In max(x, na.rm = na.rm) :
no non-missing arguments to max; returning -Inf
3: In min(x, na.rm = na.rm) :
no non-missing arguments to min; returning Inf
4: In max(x, na.rm = na.rm) :
no non-missing arguments to max; returning -Inf
Cheers!
refactor rawDiag::readScans()
rawfile : path to raw file
scans : numeric vector for selection based on scan index
filter : scan filter for logical selection of scans (e.g. MS, MS2, +, HCD, ...)
S3 object, nested list of type Spectrum (not peaklist)
for FTMS scans the list should contain vectors for: mz, intensity, resolution, noise, charge
In addition header information is needed:
I have been using this package for ~1 year now, specifically the readChromatogram function for extracting "tic" and "base peak". It works nicely and has been very useful. However, I have just for the first time tried to extract an "xic" for some masses of interest, and here is what I get:
XICs <- lapply(Raws, function(raw) { readChromatogram(raw, masses, tols) })
Error in .rawrrSystem2Source(rawfile, input = mass, rawrrArgs = sprintf("xic %f %s", :
**Rcode file to parse does not exist. 'C:\Users\MyUserName\AppData\Local/R/cache/R/rawrr/rawrrassembly/rawrr.exe' failed for an unknown reason.
Please check the debug files:
C:\Users\MyUserName\AppData\Local\Temp\2\RtmpeQSRv0\file62986016698d.stderr
C:\Users\MyUserName\AppData\Local\Temp\2\RtmpeQSRv0\file629849cf6634.stdout
and the System Requirements
Called from: .rawrrSystem2Source(rawfile, input = mass, rawrrArgs = sprintf("xic %f %s",
tol, shQuote(filter)))**
This is on a Windows 2019 Server machine, using R version 4.1.0 (2021-05-18) in RStudio 1.4.1717.
File "C:/Users/MyUserName/AppData/Local/R/cache/R/rawrr/rawrrassembly/rawrr.exe" does exist, but maybe this is an issue with slashes in Windows, since the error uses inconsistently backwards (Windows) and forward (Linux) slashes? In which case, including normalizePath(..., winslash = "/") would probably be enough to fix it?
Each raw file header contains a detector list. C# methods are:
int GetInstrumentCountOfType (Device type)
Device GetInstrumentType(int index);
int InstrumentCount { get; }
see page 17 of UsingRawFileReader.
details are available through InstrumentData GetInstrumentData();
Implementation of plot.rawRspectrum(...)
assuming $class = "rawRspectrum"
Hi developers,
Thanks for the really great package! Saves me so much file conversion time.
I have a question regarding viewing RAW-files during acquisition. When I load those (also after making a copy of the file to 'fix' it), I get an error that the 'RAW file still being acquired', which of course is indeed the case.
Is there any possibility to view it anyway, like you would do in vendor software for some quick checks? Or is critical info saved at the end of the run?
> RAW_chrom <- readChromatogram(rawfile = rawfile, tol = 3, mass = masses_of_intest)
RAW file still being acquired
Thanks!
Is it possible to use rawrr to access the noise values for a raw file collected in reduced profile mode? From reading the code in rawrr.cs, it seems that the noise is only read for centroided data, but I wanted to be sure.
e.g., sudo apt-get install mono ...
Basic idea
Spectra are 2D data items (x, y data):
x : position (m/z)
y : intensity
All other information can be assumed to be meta data for the moment. The most basic idea to represent this data in R is to use two numeric vectors and pair according to the vector indices, so (xi, yi) are corresponding values in the 2D space generated by the vectors.
Collections of scans that are connected by a further dimension, for instance RT, could be handled as lists of vector tuples.
L
| - (xi, yi)
| - (xi, yi)
| - (xi, yi)
But there are some problems with this: if we use the RT to generate an index for L (let's use j here), then we can only select scans according to index position, but this Lj may not be equal to the original scan number, nor does it allow for RT-based access.
But we could add RT as data dimension directly and arrive at:
x : position (m/z)
y : intensity
z : RT
A 3D data type for numeric is simply an array. Arrays have nice properties, since they can be sliced along all dimensions as needed. The only problem left to solve is that centroided data generates vectors of unequal length, but transforming these to sparse vectors/matrices would solve the problem.
Two steps:
rawRspectrum object as input and returns a sparse matrix. The first would also be handy if one would like to compute dot products or the like.
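To make the sparse idea concrete, here is a sketch in base R that bins centroided peaks onto a fixed m/z grid and stores them as (scan, bin, intensity) triplets, which is the usual coordinate form of a sparse matrix. All function names, the bin width and the m/z range are assumptions for illustration; a real implementation might use a sparse matrix class instead.

```r
# Hypothetical helper: sum intensities per m/z bin for one spectrum.
binSpectrum <- function(mz, intensity, mzMin = 100, mzMax = 2000, binWidth = 1) {
  stopifnot(length(mz) == length(intensity))
  keep <- mz >= mzMin & mz < mzMax
  bin <- floor((mz[keep] - mzMin) / binWidth) + 1
  agg <- tapply(intensity[keep], bin, sum)
  data.frame(bin = as.integer(names(agg)), intensity = as.numeric(agg))
}

# Hypothetical helper: stack all scans into one triplet table.
toTriplets <- function(spectra, ...) {
  do.call(rbind, lapply(seq_along(spectra), function(j) {
    b <- binSpectrum(spectra[[j]]$mz, spectra[[j]]$intensity, ...)
    cbind(scan = rep(j, nrow(b)), b)
  }))
}

# Dot product of two scans: only bins present in both contribute.
dotProduct <- function(triplets, a, b) {
  m <- merge(triplets[triplets$scan == a, ], triplets[triplets$scan == b, ],
             by = "bin")
  sum(m$intensity.x * m$intensity.y)
}

spectra <- list(list(mz = c(100.2, 100.7, 500.5), intensity = c(1, 2, 3)),
                list(mz = c(100.1, 800.3), intensity = c(4, 5)))
trip <- toTriplets(spectra)
dp <- dotProduct(trip, 1, 2)
```

Scans of unequal length fit into the same table because empty bins are simply not stored.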
rawrr::readSpectrum is very slow, making it unusable for reading files with tens of thousands of spectra
By slow I mean it takes ~1 second on my one-year-old MacBook Pro to read a spectrum.
(I do call the function once, with a list of spectrum IDs.)
It would take 3 hours just to read a single file. That renders the package unusable by some two orders of magnitude.
I will be investigating to figure out what is the culprit. It might be necessary to add switches that remove some "advanced" functionality from spectrum reads to get the performance back (?).
How nice would that be?
The following should be displayed after loading the package:
RawFileReader reading tool. Copyright © 2016 by Thermo Fisher Scientific, Inc. All rights reserved
Hi, thank you for your great work on this useful R package. However, sample() is an important function in base R, and many R scripts and packages rely on it. If you could rename the example data function to something like "example()", it would help researchers use the package more fluently.
Thanks for your great work on the R package "rawR".
Hi everyone,
I have a problem while completing XIC graphic, here is the code:
iRT.mZ <- c(487.2571, 547.2984, 622.8539, 636.8695, 644.8230, 669.8384,
683.8282, 683.8541, 699.3388, 726.8361, 776.9301)
c<- rawrr::readChromatogram(rawfile, mass = iRT.mZ, tol = 10, type = 'xic', filter = 'ms')
#Extracted Ion Chromatogram
plot(c, diagnostic = TRUE)
The problem is --> Error in xy.coords(x, y) : 'x' and 'y' lengths differ
Can anyone help me?
Thanks all!
Hi:
I'm using "rawrr" to read and plot 2 spectrum scans. With the first scan, I had no problems using the plot function with the "spectrum.scan$centroid.mZ" values on the x axis and "spectrum.scan$centroid.intensity" and "spectrum.scan$noises" as a rate on the y axis: the plotted spectrum matches the expected spectrum, judging by the results obtained with other software. However, for the second scan, the plotted image differs from the expected one. I think this is because the mZ, intensity and noises vectors differ in length for this second scan. How can I approach this situation? How is it possible that these vector lengths differ? In the first scan, this wasn't the case.
Thank you for your time.
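Before plotting, the vector lengths can be compared directly. The sketch below uses the element names from the question on a made-up scan; checkScanLengths is a hypothetical helper, not a rawrr function, and it does not explain why the lengths differ.

```r
# Hypothetical helper: report the length of each field and whether they agree.
checkScanLengths <- function(scan,
                             fields = c("centroid.mZ", "centroid.intensity",
                                        "noises")) {
  n <- lengths(scan[fields])
  names(n) <- fields
  list(lengths = n, consistent = length(unique(n)) == 1L)
}

# Toy scan with a noise vector that is one element short.
scan <- list(centroid.mZ = c(100.1, 200.2, 300.3),
             centroid.intensity = c(10, 20, 30),
             noises = c(1, 2))

chk <- checkScanLengths(scan)
chk$lengths
chk$consistent
```

When the check fails, plotting centroid.mZ against centroid.intensity alone still works as long as those two vectors agree in length.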
Hi,
First, thank you for developing such a nice package. I've been using it for a few days and have been really amazed by it so far! For some context, I'm a big fan of MSnbase, but it requires data conversion, which can be inconvenient... One handy function of MSnbase that is partly missing in rawrr is header, which gives access to a myriad of useful information in a data.frame as follows:
print(names(header(msfile)))
[1] "seqNum" "acquisitionNum"
[3] "msLevel" "polarity"
[5] "peaksCount" "totIonCurrent"
[7] "retentionTime" "basePeakMZ"
[9] "basePeakIntensity" "collisionEnergy"
[11] "ionisationEnergy" "lowMZ"
[13] "highMZ" "precursorScanNum"
[15] "precursorMZ" "precursorCharge"
[17] "precursorIntensity" "mergedScan"
[19] "mergedResultScanNum" "mergedResultStartScanNum"
[21] "mergedResultEndScanNum" "injectionTime"
[23] "filterString" "spectrumId"
[25] "centroided" "ionMobilityDriftTime"
[27] "isolationWindowTargetMZ" "isolationWindowLowerOffset"
[29] "isolationWindowUpperOffset" "scanWindowLowerLimit"
[31] "scanWindowUpperLimit"
I don't know if this information is available, but readIndex() could ideally contain (some of) these data, which would allow broader application of the package (for example, "isolationWindowLowerOffset" and "isolationWindowUpperOffset" give critical information for DIA applications).
Thanks again,
Vivian
"If you are using just a few functions from another package, the recommended option is to note the package name in the Imports: field of the DESCRIPTION file and call the function(s) explicitly using ::, e.g., pkg::fun(). Alternatively, though no longer recommended due to its poorer readability, use @importFrom, e.g., @importFrom pgk fun, and call the function(s) without ::."
taken from https://roxygen2.r-lib.org/articles/namespace.html#imports
Example found in rawrr.R:
#' Plot \code{rawrrChromatogramSet} objects
#'
#' @param x A \code{rawrrChromatogramSet} object to be plotted.
#' @param ... Passes additional arguments.
#' @param diagnostic Show diagnostic legend?
#' @author Tobias Kockmann, 2020.
#' @export
#' @importFrom grDevices hcl.colors
#' @importFrom graphics lines text
and many many more!
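For comparison, the recommended style could look roughly like the sketch below. The function plot_lines_sketch() and its arguments are hypothetical and not taken from rawrr.R; only grDevices::hcl.colors, graphics::lines and graphics::text are the functions named in the tags above. It assumes grDevices and graphics are listed under Imports: in DESCRIPTION.

```r
#' Plot several intensity traces (hypothetical example, not from rawrr.R)
#'
#' @param traces A list of numeric vectors of equal length.
#' @param ... Passed on to \code{graphics::lines}.
#' @return The colors used, invisibly.
plot_lines_sketch <- function(traces, ...) {
  # Explicit pkg::fun() calls, so no @importFrom tags are needed here.
  cols <- grDevices::hcl.colors(length(traces))
  n <- length(traces[[1]])
  # Empty plot frame sized to hold all traces.
  graphics::plot(NULL, xlim = c(1, n), ylim = range(unlist(traces)),
                 xlab = "index", ylab = "intensity")
  for (i in seq_along(traces)) {
    graphics::lines(traces[[i]], col = cols[i], ...)
    # Label each trace at its last point.
    graphics::text(n, traces[[i]][n], labels = i, pos = 2, col = cols[i])
  }
  invisible(cols)
}

# Usage with made-up values:
plot_lines_sketch(list(c(1, 3, 2), c(2, 1, 4)))
```

The explicit prefix makes it obvious at the call site where each function comes from, which is the readability argument made in the quoted roxygen2 article.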
Hi cpanse and tobiakso
I've noticed that some parameters are missing in the rawRspectrum after importing a raw file.
Not all of those parameters may be set in our experiment, however. But I also get wrong readings for Base Peak Intensity and Base Peak Mass.
Thanks for helping!
RawRspectrum:
> Total Ion Current: 4870947
> Scan Low Mass: 50
> Scan High Mass: 250
> Scan Start Time (Min): 0
> Scan Number: 1
> Base Peak Intensity: -1
> Base Peak Mass: -1
> Scan Mode: FTMS + p NSI Full ms [50.00-250.00]
> ======= Instrument data ===== :
> Multiple Injection:
> Multi Inject Info:
> AGC: On
> Micro Scan Count: 1
> Scan Segment: 0
> Scan Event: 0
> Master Index: 0
> Charge State: 1
> Monoisotopic M/Z: 78.0468
> Ion Injection Time (ms): 100.000
> Max. Ion Time (ms):
> FT Resolution: 30000
> MS2 Isolation Width: 0.0
> MS2 Isolation Offset:
> AGC Target:
> HCD Energy:
> Analyzer Temperature:
> === Mass Calibration:
> Conversion Parameter B: 47557789.235
> Conversion Parameter C: -2547049.695
> Temperature Comp. (ppm):
> RF Comp. (ppm):
> Space Charge Comp. (ppm):
> Resolution Comp. (ppm):
> Number of Lock Masses:
> Lock Mass #1 (m/z):
> Lock Mass #2 (m/z):
> Lock Mass #3 (m/z):
> LM Search Window (ppm):
> LM Search Window (mmu):
> Number of LM Found:
> Last Locking (sec):
> LM m/z-Correction (ppm):
> === Ion Optics Settings:
> S-Lens RF Level:
> S-Lens Voltage (V):
> Skimmer Voltage (V):
> Inject Flatapole Offset (V):
> Bent Flatapole DC (V):
> MP2 and MP3 RF (V):
> Gate Lens Voltage (V):
> C-Trap RF (V):
> ==== Diagnostic Data:
> Dynamic RT Shift (min):
> Intens Comp Factor:
> Res. Dep. Intens:
> CTCD NumF:
> CTCD Comp:
> CTCD ScScr:
> RawOvFtT:
> LC FWHM parameter:
> Rod:
> PS Inj. Time (ms):
> AGC PS Mode:
> AGC PS Diag:
> HCD Energy eV:
> AGC Fill:
> Injection t0:
> t0 FLP:
> Access Id:
> Analog Input 1 (V):
> Analog Input 2 (V):
Just came across this when running R CMD check:
checking for executable files ...
Found the following executable files:
exec/ThermoFisher.CommonCore.BackgroundSubtraction.dll
exec/ThermoFisher.CommonCore.Data.dll
exec/ThermoFisher.CommonCore.MassPrecisionEstimator.dll
exec/ThermoFisher.CommonCore.RawFileReader.dll
exec/rawR.exe
Source packages should not contain undeclared executable files.
See section ‘Package structure’ in the ‘Writing R Extensions’ manual.
Did that and found:
1.1.7 Non-R scripts in packages
Code which needs to be compiled (C, C++, Fortran …) is included in the src subdirectory and discussed elsewhere in this document.
Subdirectory exec could be used for scripts for interpreters such as the shell, BUGS, JavaScript, Matlab, Perl, php (amap), Python or Tcl (Simile), or even R. However, it seems more common to use the inst directory, for example WriteXLS/inst/Perl, NMF/inst/m-files, RnavGraph/inst/tcl, RProtoBuf/inst/python and emdbook/inst/BUGS and gridSVG/inst/js.
So shouldn't we put the rawR.exe and the DLLs in src instead of exec, @cpanse?
Hello,
Really great package! I was wondering if it is possible to also get information on the chromatography via your package? For example, getting the LC gradient or the LC pressure curve would be great!
Thank you in advance!
Yasin
Hello, and thank you for your package!
I was wondering if it is somehow possible to get the Noise value that is reported for every single mass peak in Thermo's .raw files.
Thanks!
Dear all,
thanks for this very useful package!
Is there a way to extract the peak "charges" of MS1 spectra with rawrr::readSpectrum? It seems to only work for MS2 spectra at the moment.
Thanks a lot.
Should we start using CI services like GitHub actions?
https://ropensci.org/technotes/2020/11/19/moving-away-travis/
I think it is a nice way of making sure rawR works on different OS platforms and different R versions.