compomics / ms2rescore

Modular and user-friendly platform for AI-assisted rescoring of peptide identifications

Home Page: https://ms2rescore.readthedocs.io

License: Apache License 2.0

Python 95.23% Dockerfile 0.20% HTML 4.00% Inno Setup 0.57%
proteomics ms2pip percolator deeplc peptide-identification

ms2rescore's Introduction

MS²Rescore



⚠️ Note: This is the documentation for the fully redeveloped version 3.0 of MS²Rescore. While MS²Rescore 3.0 has been drastically improved over the previous version, you might run into some unforeseen issues. Please report any issues you encounter on the issue tracker or post your questions on the GitHub Discussions forum.

About MS²Rescore

MS²Rescore performs ultra-sensitive peptide identification rescoring with LC-MS predictors such as MS²PIP and DeepLC, and with ML-driven rescoring engines Percolator or Mokapot. This results in more confident peptide identifications, which allows you to get more peptide IDs at the same false discovery rate (FDR) threshold, or to set a more stringent FDR threshold while still retaining a similar number of peptide IDs. MS²Rescore is ideal for challenging proteomics identification workflows, such as proteogenomics, metaproteomics, or immunopeptidomics.
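
To make the FDR trade-off concrete, here is a minimal sketch of target-decoy FDR estimation. It is a generic illustration and not MS²Rescore's own implementation; the function name `count_targets_at_fdr` and the toy scores are hypothetical.

```python
def count_targets_at_fdr(psms, threshold=0.01):
    """Count target PSMs accepted at an estimated FDR threshold.

    psms: iterable of (score, is_decoy) tuples, where a higher score is better.
    The FDR at each score cutoff is estimated as decoys / targets.
    """
    ranked = sorted(psms, key=lambda psm: psm[0], reverse=True)
    targets = decoys = accepted = 0
    for _score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        # Remember the deepest cutoff that still satisfies the threshold.
        if targets and decoys / targets <= threshold:
            accepted = targets
    return accepted


# Toy example: the same five PSMs before and after a (hypothetical) rescoring
# step that pushes the decoy to the bottom of the ranking.
before = [(10, False), (9, False), (8, True), (7, False), (6, False)]
after = [(10, False), (9, False), (8, False), (7, False), (3, True)]
print(count_targets_at_fdr(before, threshold=0.0))
print(count_targets_at_fdr(after, threshold=0.0))
```

When rescoring separates targets from decoys more cleanly, more targets sit above the first decoy, so more of them pass the same threshold.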

MS²Rescore can read peptide identifications in any format supported by psm_utils (see Supported file formats) and has been tested with output files from various search engines.

MS²Rescore is available as a desktop application, a command line tool, and a modular Python API.

TIMS²Rescore: Direct support for DDA-PASEF data

MS²Rescore v3.1+ includes TIMS²Rescore, a usage mode with specialized default configurations for DDA-PASEF data from timsTOF instruments. TIMS²Rescore makes use of new MS²PIP prediction models for timsTOF fragmentation and IM2Deep for ion mobility separation. Bruker .d and miniTDF spectrum files are directly supported through the timsrust library.

Check out our preprint for more information and the TIMS²Rescore documentation to get started.

Citing

Latest MS²Rescore publication:

MS²Rescore 3.0 is a modular, flexible, and user-friendly platform to boost peptide identifications, as showcased with MS Amanda 3.0. Louise Marie Buur*, Arthur Declercq*, Marina Strobl, Robbin Bouwmeester, Sven Degroeve, Lennart Martens, Viktoria Dorfer*, and Ralf Gabriels*. Journal of Proteome Research (2024) doi:10.1021/acs.jproteome.3c00785
*contributed equally

MS²Rescore for immunopeptidomics:

MS²Rescore: Data-driven rescoring dramatically boosts immunopeptide identification rates. Arthur Declercq, Robbin Bouwmeester, Aurélie Hirschler, Christine Carapito, Sven Degroeve, Lennart Martens, and Ralf Gabriels. Molecular & Cellular Proteomics (2022) doi:10.1016/j.mcpro.2022.100266

MS²Rescore for timsTOF DDA-PASEF data:

TIMS²Rescore: A DDA-PASEF optimized data-driven rescoring pipeline based on MS²Rescore. Arthur Declercq*, Robbe Devreese*, Jonas Scheid, Caroline Jachmann, Tim Van Den Bossche, Annica Preikschat, David Gomez-Zepeda, Jeewan Babu Rijal, Aurélie Hirschler, Jonathan R Krieger, Tharan Srikumar, George Rosenberger, Dennis Trede, Christine Carapito, Stefan Tenzer, Juliane S Walz, Sven Degroeve, Robbin Bouwmeester, Lennart Martens, and Ralf Gabriels. bioRxiv (2024) doi:10.1101/2024.05.29.596400

Original publication describing the concept of rescoring with predicted spectra:

Accurate peptide fragmentation predictions allow data driven approaches to replace and improve upon proteomics search engine scoring functions. Ana S C Silva, Robbin Bouwmeester, Lennart Martens, and Sven Degroeve. Bioinformatics (2019) doi:10.1093/bioinformatics/btz383

To replicate the experiments described in this article, check out the publication branch of the repository.

Getting started

The desktop application can be installed on Windows with a one-click installer. The Python package and command line interface can be installed with pip, conda, or docker. Check out the full documentation to get started.

Questions or issues?

Do you have questions about applying MS²Rescore to your data, or have you run into issues while using it? Post them on the GitHub Discussions forum and we will be happy to help!

How to contribute

Bugs, questions or suggestions? Feel free to post an issue in the issue tracker or to make a pull request!

ms2rescore's People

Contributors

anasilviacs, arthurdeclercq, paretje, ralfg, rodvrees, tivdnbos

ms2rescore's Issues

GUI Windows installation error

Hello, I'm trying to install the Windows GUI, but I get the following error:

ERROR: Could not install packages due to an OSError: [Errno 2] No such file or directory: 'C:\\Users\\hieutran1985\\Downloads\\ms2rescore-gui-windows\\gui-windows\\Miniconda3\\envs\\ms2rescore\\Lib\\site-packages\\tensorflow\\include\\external\\cudnn_frontend_archive\\_virtual_includes\\cudnn_frontend\\third_party\\cudnn_frontend\\include\\cudnn_frontend_EngineConfig.h'

It seems related to tensorflow and cudnn.

Implementation of get_search_engine_features for PeptideShaker input

Hello,

I tried to install and run ms2rescore together with Percolator on PeptideShaker output.
I used a conda environment and installed Percolator using conda and ms2rescore using pip.
When I call the tool, the help is printed, so I'd guess my installation is correct, but I'm not entirely sure.

I tried to run ms2rescore using the command:
ms2rescore -c config_defaut.json -m /scratch/users/bkunath/IMP_MetaP/MGF_LAO/D01 vi/scratch/users/bkunath/IMP_MetaP/OUT_LAO/D01.out/D01_1_1_Extended_PSM_Report.txt

and obtained this error:

2021-05-10 15:25:10 // INFO // ms2rescore // Using PeptideShakerPipeline.
2021-05-10 15:25:12 // ERROR // ms2rescore.__main__ // Critical error occured in MS2ReScore
Traceback (most recent call last):
  File "/home/users/bkunath/miniconda3/envs/ms2rescore/lib/python3.9/site-packages/ms2rescore/__main__.py", line 15, in main
    rescore.run()
  File "/home/users/bkunath/miniconda3/envs/ms2rescore/lib/python3.9/site-packages/ms2rescore/__init__.py", line 207, in run
    search_engine_features = self.pipeline.get_search_engine_features()
  File "/home/users/bkunath/miniconda3/envs/ms2rescore/lib/python3.9/site-packages/ms2rescore/id_file_parser.py", line 439, in get_search_engine_features
    return self.extended_psm_report.ext_psm_report.get_search_engine_features()
  File "/home/users/bkunath/miniconda3/envs/ms2rescore/lib/python3.9/site-packages/ms2rescore/peptideshaker.py", line 187, in get_search_engine_features
    raise NotImplementedError
NotImplementedError

Here is a copy of the config file if needed.
Thanks a lot for your help!
Ben

{
    "$schema": "./config_schema.json",
    "general":{
        "pipeline":"peptideshaker",
        "feature_sets":["all"],
        "run_percolator":true,
        "id_decoy_pattern": "REVERSED",
        "num_cpu":-1,
        "config_file":"/scratch/users/bkunath/IMP_MetaP/config_defaut.json",
        "tmp_path":"/scratch/users/bkunath/IMP_MetaP/MGF_LAO/ms2",
        "mgf_path":"/scratch/users/bkunath/IMP_MetaP/MGF_LAO/D01",
        "output_filename":"/scratch/users/bkunath/IMP_MetaP/MGF_LAO/ms2/D01",
        "log_level": "info"
    },
    "ms2pip": {
        "model": "HCD",
        "frag_error": 0.02,
        "modifications": [
            {"name":"Acetyl", "unimod_accession":1, "mass_shift":42.010565, "amino_acid":null, "n_term":true, "c_term": false},
            {"name":"Carbamidomethyl", "unimod_accession":4, "mass_shift":57.021464, "amino_acid":"C", "n_term":false, "c_term": false},
            {"name":"Deamidated", "unimod_accession":7, "mass_shift":0.984016, "amino_acid":"N", "n_term":false, "c_term": false},
            {"name":"PhosphoS", "unimod_accession":21, "mass_shift":79.966331, "amino_acid":"S", "n_term":false, "c_term": false},
            {"name":"PhosphoT", "unimod_accession":21, "mass_shift":79.966331, "amino_acid":"T", "n_term":false, "c_term": false},
            {"name":"PhosphoY", "unimod_accession":21, "mass_shift":79.966331, "amino_acid":"Y", "n_term":false, "c_term": false},
            {"name":"Pyro-carbamidomethyl", "unimod_accession":26, "mass_shift":39.994915, "amino_acid":"C", "n_term":false, "c_term": false},
            {"name":"Glu->pyro-Glu", "unimod_accession":27, "mass_shift":-18.010565, "amino_acid":"E", "n_term":true, "c_term": false},
            {"name":"Gln->pyro-Glu", "unimod_accession":28, "mass_shift":-17.026549, "amino_acid":"Q", "n_term":true, "c_term": false},
            {"name":"Oxidation", "unimod_accession":35, "mass_shift":15.994915, "amino_acid":"M", "n_term":false, "c_term": false},
            {"name":"iTRAQ", "unimod_accession":214, "mass_shift":144.102063, "amino_acid":null, "n_term":true, "c_term": false},
            {"name":"Ammonia-loss", "unimod_accession":385, "mass_shift":-17.026549, "amino_acid":"C", "n_term":true, "c_term": false},
            {"name":"TMT6plexN", "unimod_accession":737, "mass_shift":229.162932, "amino_acid":"N", "n_term":false, "c_term": false},
            {"name":"TMT6plex", "unimod_accession":737, "mass_shift":229.162932, "amino_acid":null, "n_term":true, "c_term": false},
            {"name":"Amidated", "unimod_accession": 2, "mass_shift": -0.984016, "amino_acid":null, "n_term": false, "c_term": true}
        ]
    },
    "maxquant_to_rescore": {
        "mgf_dir": "",
        "modification_mapping":{
            "ox":"Oxidation",
            "ac":"Acetyl",
            "cm":"Carbamidomethyl",
            "de":"Deamidated",
            "gl":"Gln->pyro-Glu"
        },
        "fixed_modifications":{
            "C":"Carbamidomethyl"
        }
    },
    "percolator": {}
}

Error parsing scan number from mgf file

Hi, I'm trying to run ms2rescore on MaxQuant output, and here's the error I keep getting:

2022-10-25 08:34:42 // INFO // ms2rescore // Using MaxQuantPipeline.
2022-10-25 08:34:44 // WARNING // ms2rescore.maxquant // Removed 17537 non-rank 1 PSMs.
2022-10-25 08:34:45 // INFO // ms2rescore.parse_mgf // Parsing 2 MGF files to single MGF containing all PSMs.
  0%|                                                                                                                                                                               | 0/2 [00:00<?, ?it/s]
2022-10-25 08:34:45 // ERROR // ms2rescore.__main__ // Critical error occured in MS2ReScore
Traceback (most recent call last):
  File "/ms2rescore/__main__.py", line 15, in main
    rescore.run()
  File "/ms2rescore/__init__.py", line 233, in run
    peprec = self.pipeline.get_peprec()
  File "/ms2rescore/id_file_parser.py", line 402, in get_peprec
    self.parse_mgf_files(peprec)
  File "/ms2rescore/id_file_parser.py", line 390, in parse_mgf_files
    mgf_title_pattern=self.mgf_title_pattern
  File "/ms2rescore/parse_mgf.py", line 109, in parse_mgf
    title = title_parser(line, mgf_title_pattern=mgf_title_pattern, method=title_parsing_method, run=run)
  File "/ms2rescore/parse_mgf.py", line 66, in title_parser
    f"Could not extract scan number from TITLE field: `{line.strip()}`"
ms2rescore.parse_mgf.ParseMGFError: Could not extract scan number from TITLE field: `TITLE=expt1.2.2.3`

I checked my mgf file, and the TITLE lines look like this:

TITLE=expt1.2.2.3
TITLE=expt1.4.4.3
TITLE=expt1.5.5.3
...

I.e., for each scan, there are three numbers separated by dots after the experiment title. I converted the raw files to mgf with msconvert. I understand parse_mgf.py script expects a different format.

Would be grateful for any help!
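
For titles shaped like the ones quoted above, a regular expression can pull out the scan number. This is a minimal sketch, not the parser in parse_mgf.py: the pattern and the `parse_title` helper are hypothetical, and reading the three numbers as first scan, last scan, and charge is an assumption about this msconvert output.

```python
import re

# Hypothetical pattern for titles shaped like `TITLE=<run>.<scan>.<scan>.<charge>`.
TITLE_PATTERN = re.compile(
    r"^TITLE=(?P<run>.+)\.(?P<scan>\d+)\.\d+\.(?P<charge>\d+)\s*$"
)


def parse_title(line):
    """Return (run, scan, charge) from an MGF TITLE line, or raise ValueError."""
    match = TITLE_PATTERN.match(line)
    if match is None:
        raise ValueError(
            f"Could not extract scan number from TITLE field: `{line.strip()}`"
        )
    return match["run"], int(match["scan"]), int(match["charge"])


print(parse_title("TITLE=expt1.2.2.3"))
```

The traceback shows that an `mgf_title_pattern` argument is passed to the title parser; how that setting is exposed to users is not shown in this thread.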

install without gui?

Hello,
I used to be able to install without issues, but now there are various failures because of wxPython and/or Gooey. I need to be able to install without the GUI on virtual machines (e.g. Docker) and on HPCs. Do you have instructions for doing that?
I tried removing the requirement for Gooey, but it doesn't seem to work (I tried removing gui.py and the reference to gooey in setup.py before running python setup.py install).
Thanks!

MaxQuant can have long-format modification notations; does not work in current implementation

Hi,

I am running ms2pip through ms2rescore, and I get the following IndexError.

Merging results
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
  File "/home/noeguill/anaconda3/lib/python3.7/multiprocessing/pool.py", line 121, in worker
    result = (True, func(*args, **kwds))
  File "/home/noeguill/anaconda3/lib/python3.7/site-packages/ms2pip/ms2pipC.py", line 316, in process_spectra
    modpeptide = apply_mods(peptide, mods, PTMmap)
  File "/home/noeguill/anaconda3/lib/python3.7/site-packages/ms2pip/ms2pipC.py", line 631, in apply_mods
    modpeptide[pos] = mod
IndexError: index 23 is out of bounds for axis 0 with size 19
"""

Any ideas ?

Thanks a lot,
Noé

Proteome Discoverer node / PSM file support

Hello! I would like to ask whether you are planning to make a Proteome Discoverer node for your tool. It would be very useful for our facility and, I think, for the entire community. Thanks!

Allow rescoring of multiple hits per spectrum

Dear all,

Thank you for creating this nice tool.

I have recently tried to rescore some Comet results (pin files). However, during the searches I usually set "num_output_lines" to a value greater than 1 to also export lower-than-best scoring results. This often improves score adjustment by the TPP/Prophets pipeline. Unfortunately, it seems that these lower-ranked hits are also written to the Percolator files and result in the error below. The same might be true for other search engines that can output such hits.
The error traces back to the _get_spectrum_index_column method in the percolator.py file, where the pattern string discards the charge and hit-rank information from the spectrum identifier (e.g. there are scans like ..._623_2_1; ..._623_2_2; ... which all become spec id 623).
Would it be possible to discard these lower-ranking hits automatically and just throw a warning instead? I guess this would be the cleanest solution, as I am not sure whether Percolator can handle the information properly.

Best,
Juergen

The error is:
Traceback (most recent call last):
  File "C:\Programs\Python310\lib\site-packages\ms2rescore\__main__.py", line 15, in main
    rescore.run()
  File "C:\Programs\Python310\lib\site-packages\ms2rescore\__init__.py", line 233, in run
    peprec = self.pipeline.get_peprec()
  File "C:\Programs\Python310\lib\site-packages\ms2rescore\id_file_parser.py", line 224, in get_peprec
    return self.peprec_from_pin()
  File "C:\Programs\Python310\lib\site-packages\ms2rescore\id_file_parser.py", line 179, in peprec_from_pin
    peprec = self.original_pin.to_peptide_record(
  File "C:\Programs\Python310\lib\site-packages\ms2rescore\percolator.py", line 470, in to_peptide_record
    peprec_df["spec_id"] = self._get_spectrum_index_column(
  File "C:\Programs\Python310\lib\site-packages\ms2rescore\percolator.py", line 270, in _get_spectrum_index_column
    raise PercolatorInError("Issue in matching spectrum IDs, duplicates found.")
ms2rescore.percolator.PercolatorInError: Issue in matching spectrum IDs, duplicates found.
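
The filtering requested in this issue could look roughly like the sketch below. It assumes SpecIds end in `_<scan>_<charge>_<rank>`, as in the `..._623_2_1` examples quoted above. The pattern and the `keep_rank_one` helper are hypothetical and not part of ms2rescore.

```python
import re

# Assumed SpecId layout: `<prefix>_<scan>_<charge>_<rank>`, e.g. `run_623_2_1`.
SPEC_ID_PATTERN = re.compile(
    r"^(?P<prefix>.*)_(?P<scan>\d+)_(?P<charge>\d+)_(?P<rank>\d+)$"
)


def keep_rank_one(spec_ids):
    """Return (kept_ids, dropped_count), keeping only rank-1 hits."""
    kept = []
    dropped = 0
    for spec_id in spec_ids:
        match = SPEC_ID_PATTERN.match(spec_id)
        if match is None:
            raise ValueError(f"Unexpected SpecId format: {spec_id!r}")
        if int(match["rank"]) == 1:
            kept.append(spec_id)
        else:
            dropped += 1
    return kept, dropped


kept, dropped = keep_rank_one(["run_623_2_1", "run_623_2_2", "run_700_3_1"])
print(kept, dropped)
```

A caller could log a warning whenever `dropped` is non-zero, in the spirit of the "Removed ... non-rank 1 PSMs" message that the MaxQuant pipeline prints in other logs on this page.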

Suggest correction for comet pin file

Hello,

First of all, thank you for maintaining this nice software.
I would like to leave some comments that could make the manual more comprehensive (for Comet pin files).

  1. Change the code below. With this change applied, it also works well for Comet pin files.

id_file_parser.py
line number 46~48
change "retention_times[index] = float(line[12:].strip())" to "retention_times[titles[index]] = float(line[12:].strip())"

  2. Please add an explicit explanation that the column name "lnExpect" must be changed to "COMET:lnExpect" to avoid the "psm_score" null point error.

I am glad to share this information with future users ^^*.
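
The first suggestion above amounts to keying retention times by spectrum title rather than by running index. A minimal, self-contained sketch of that idea follows. It is not the code in id_file_parser.py: the `read_retention_times` helper is hypothetical and assumes that TITLE appears before RTINSECONDS within each spectrum block.

```python
def read_retention_times(mgf_lines):
    """Map each spectrum TITLE to its RTINSECONDS value.

    Assumes TITLE precedes RTINSECONDS inside every BEGIN IONS / END IONS block.
    """
    retention_times = {}
    title = None
    for line in mgf_lines:
        if line.startswith("TITLE="):
            title = line[len("TITLE="):].strip()
        elif line.startswith("RTINSECONDS=") and title is not None:
            # len("RTINSECONDS=") == 12, matching the `line[12:]` slice quoted above.
            retention_times[title] = float(line[len("RTINSECONDS="):].strip())
        elif line.startswith("END IONS"):
            title = None
    return retention_times


example = [
    "BEGIN IONS", "TITLE=a.1.1.2", "RTINSECONDS=12.5", "100.0 1.0", "END IONS",
    "BEGIN IONS", "TITLE=a.2.2.2", "RTINSECONDS=13.0", "END IONS",
]
print(read_retention_times(example))
```

Keying by title means a lookup no longer depends on the order or numbering of spectra in the file.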

Error while running command line ms2rescore

Hi,

I'm trying to run the command line version of ms2rescore on my local PC with Ubuntu 20.04.5 LTS and Python 3.7.0. I installed Percolator and copied the default config_default.json file from this repo. My file is a QC HeLa run that I converted to MGF with ThermoRawFileParser and searched with OpenMS+Mascot. I converted the output idXML to mzID with IDFileConverter.

Here are the files I used: https://www.dropbox.com/sh/tz6mmmtnqk9mqd5/AACF8Nk6BFbF5M9VHwQB8Rrza?dl=0

The line for running the program:

ms2rescore -c config_default.json -m 2022MQ903_QC02_001_01_100ng.raw.mgf 2022MQ903_QC02_001_01_100ng.raw_mascot_idfilter_aaa_peptideindexer_fdr_idfilter_score.mzid

And the result:

2022-09-05 15:58:50 // INFO // ms2rescore // Using MSGFPipeline.
2022-09-05 15:58:50 // INFO // ms2rescore.percolator // Running Percolator PIN converter
2022-09-05 15:58:50 // ERROR // ms2rescore.__main__ // Critical error occured in MS2ReScore
Traceback (most recent call last):
  File "/home/rolivella/venv/lib/python3.7/site-packages/ms2rescore/__main__.py", line 15, in main
    rescore.run()
  File "/home/rolivella/venv/lib/python3.7/site-packages/ms2rescore/__init__.py", line 233, in run
    peprec = self.pipeline.get_peprec()
  File "/home/rolivella/venv/lib/python3.7/site-packages/ms2rescore/id_file_parser.py", line 245, in get_peprec
    return self.peprec_from_pin()
  File "/home/rolivella/venv/lib/python3.7/site-packages/ms2rescore/id_file_parser.py", line 179, in peprec_from_pin
    peprec = self.original_pin.to_peptide_record(
  File "/home/rolivella/venv/lib/python3.7/site-packages/ms2rescore/id_file_parser.py", line 169, in original_pin
    self._run_percolator_converter()
  File "/home/rolivella/venv/lib/python3.7/site-packages/ms2rescore/id_file_parser.py", line 149, in _run_percolator_converter
    log_level=self.log_level,
  File "/home/rolivella/venv/lib/python3.7/site-packages/ms2rescore/percolator.py", line 532, in run_percolator_converter
    subprocess.run(command, capture_output=log_level == "debug", check=True)
  File "/home/rolivella/anaconda3/lib/python3.7/subprocess.py", line 453, in run
    with Popen(*popenargs, **kwargs) as process:
  File "/home/rolivella/anaconda3/lib/python3.7/subprocess.py", line 756, in __init__
    restore_signals, start_new_session)
  File "/home/rolivella/anaconda3/lib/python3.7/subprocess.py", line 1499, in _execute_child
    raise child_exception_type(errno_num, err_msg, err_filename)
FileNotFoundError: [Errno 2] No such file or directory: 'msgf2pin': 'msgf2pin'

Am I missing something?

Thank you
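
The `FileNotFoundError` above is raised by `subprocess.run` when the `msgf2pin` executable cannot be found. A small pre-flight check, sketched below with the standard library's `shutil.which`, makes that situation explicit. The `find_converter` helper is hypothetical, and whether a given Percolator installation ships the converter tools is not confirmed in this thread.

```python
import shutil


def find_converter(name="msgf2pin"):
    """Return the full path of a converter executable, or raise if it is not on PATH."""
    path = shutil.which(name)
    if path is None:
        raise FileNotFoundError(
            f"`{name}` was not found on PATH; check that the Percolator "
            "converter tools are installed and visible to this environment."
        )
    return path


try:
    print(find_converter())
except FileNotFoundError as error:
    print(error)
```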

Suggestions for Ion mobility MS data

Hello @RalfG,

First, thanks very much for developing such a wonderful tool. I have really enjoyed working with it!

I have a question about how to apply MS2Rescore to Bruker (.d) data or, more broadly, to ion mobility MS data (TIMS), where features with the same RT are further separated in the gas phase.

In practice, the issue I am facing is that when using MS2PIP to generate features, I cannot match the MaxQuant scan IDs to the raw scan IDs from the Bruker raw data (.d). It seems that MaxQuant performs some sort of accumulation along the ion mobility axis to turn the MS/MS data into conventional spectra, which are then submitted to the search engine (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7261821/). Because of that, my impression is that the current implementation of MS2Rescore cannot handle these data well, as it relies on a one-to-one correspondence between a PSM and the raw MS/MS spectrum.

There are other conceptual challenges, such as the lack of a trained model specifically for timsTOF data. Technically, the ion mobility should also be a predictable feature that may help with rescoring. In light of that, I would like to get your thoughts on:

[1] Whether my understanding is correct that the current implementation of MS2Rescore is focused on Thermo data and may not be applicable to other data, such as Bruker TIMS data.

[2] Do you foresee any difficulties in adapting the model to ion mobility data?

[3] If I still want to use it, can I skip the MS2PIP step and only use DeepLC and other features, or even additional features from my own custom function, to assist with rescoring? Do you think there would still be an increase in the identification rate? It seems that the MS2PIP features contribute a lot to the prediction.

Thanks very much in advance,
Frank

MS-GF+ms2rescore pipeline doesn't run

Dear Concern,

When I try to run ms2rescore on .mzid files generated by MS-GF+, I get the following error:

ms2rescore -c ms2rescore_config.json -m test.mgf test.mzid
2023-08-21 14:35:35 // INFO // ms2rescore // Using MSGFPipeline.
2023-08-21 14:36:28 // ERROR // ms2rescore.__main__ // Critical error occured in MS2ReScore
Traceback (most recent call last):
  File "/path/ms2rescore/__main__.py", line 15, in main
    rescore.run()
  File "/path/ms2rescore/__init__.py", line 233, in run
    peprec = self.pipeline.get_peprec()
  File "/path/ms2rescore/id_file_parser.py", line 245, in get_peprec
    return self.peprec_from_pin()
  File "/path/id_file_parser.py", line 191, in peprec_from_pin
    raise IDFileParserError(
ms2rescore.id_file_parser.IDFileParserError: Could not map all MGF retention times to spectrum indices.

I understand this topic has been previously discussed on here, with the solution being to downgrade to ms2rescore version 2.0.0. But when I run pip install ms2rescore, I get this error:

DEPRECATION: Python 2.7 reached the end of its life on January 1st, 2020. Please upgrade your Python as Python 2.7 is no longer maintained. pip 21.0 will drop support for Python 2.7 in January 2021. More details about Python 2 support in pip can be found at https://pip.pypa.io/en/latest/development/release-process/#python-2-support pip 21.0 will remove support for this functionality.
Defaulting to user installation because normal site-packages is not writeable
ERROR: Could not find a version that satisfies the requirement ms2rescore==2.0.0 (from versions: none)
ERROR: No matching distribution found for ms2rescore==2.0.0

The same error occurs when I run just "pip install ms2rescore" as well.

I would be grateful for your help in this regard!

Thanks and best regards,
Hassan

Install v3.0.0b4 from Windows .exe, failed to run with "Cannot find XGBoost Library in the candidate path"

I downloaded MS2Rescore-v3.0.0b4-Windows64bit.exe and installed in the default path. When trying to run ms2rescore.exe, I get:

Failed to execute script '__main__' due to unhandled exception:
Cannot find XGBoost Library in the candidate path. List of candidates:
- C:\Program Files (x86)\MS2Rescore\_internal\xgboost\lib\xgboost.dll
- C:\Program Files (x86)\MS2Rescore\_internal\xgboost\..\..\lib\xgboost.dll
- C:\Program Files (x86)\MS2Rescore\_internal\lib\xgboost.dll
- C:\Program Files (x86)\MS2Rescore\_internal\xgboost\../../windows/x64/Release/xgboost.dll
- C:\Program Files (x86)\MS2Rescore\_internal\xgboost\./windows/x64/Release/xgboost.dll
XGBoost Python package path: C:\Program Files (x86)\MS2Rescore\_internal\xgboost
sys.prefix: C:\Program Files (x86)\MS2Rescore\_internal
See: https://xgboost.readthedocs.io/en/latest/build.html for installing XGBoost.

Indeed, I can find no files in _internal that match *xgboost*, and no obvious way to install it.

ms2rescore error: ms2rescore/tandem_to_rescore.py", line 71, in make_pepfile ; KeyError: '57.02147'

I am trying to analyze my mass spec data for the presence of small proteins. I started by searching my MGF files against UniProt annotated and reviewed proteins. I then take the unmatched spectra and search them against a small protein database. This reduces the chance of false positives.

I did the search using X!Tandem in R, which outputs an XML file that I then use as input for ms2rescore with the following command:

ms2rescore -m /gpfs/eplab/Nathaniel/ms-analysis/unmatched_spectra_from_uniprot_search/b014p014_P_56h_M5_rep1_non_validated_PSMs.mgf -c /gpfs/eplab/Nathaniel/ms2rescore-master/config_tandem.json /gpfs/eplab/Nathaniel/b014p014_P_56h_M5_rep1_non_validated_PSMs.mgf_xtandem.xml

However, the following error message comes up:

Pin-converter version 3.04.0, Build Date Mar 11 2020 14:07:14
Copyright (c) 2013 Lukas Käll. All rights reserved.
Written by Lukas Käll ([email protected]) in the
School of Biotechnology at KTH - Royal Institute of Technology, Stockholm.
Issued command:
tandem2pin -P DECOY /gpfs/eplab/Nathaniel/b014p014_P_56h_M5_rep1_non_validated_PSMs.mgf_xtandem.xml
on compute-9-18
Reading /gpfs/eplab/Nathaniel/b014p014_P_56h_M5_rep1_non_validated_PSMs.mgf_xtandem.xml
2020-07-08 18:11:24 - INFO - Fixing tabs on pin file
2020-07-08 18:11:24 - INFO - Adding mgf TITLE to pin file
No match found in mzid file for SpecId b014p014_P_56h_M5_rep1_non_validated_PSMs_4460_3_1
No match found in mzid file for SpecId b014p014_P_56h_M5_rep1_non_validated_PSMs_5373_3_1
No match found in mzid file for SpecId b014p014_P_56h_M5_rep1_non_validated_PSMs_8981_3_1
No match found in mzid file for SpecId b014p014_P_56h_M5_rep1_non_validated_PSMs_10890_3_1
No match found in mzid file for SpecId b014p014_P_56h_M5_rep1_non_validated_PSMs_12455_3_1
No match found in mzid file for SpecId b014p014_P_56h_M5_rep1_non_validated_PSMs_13836_3_1
No match found in mzid file for SpecId b014p014_P_56h_M5_rep1_non_validated_PSMs_20088_3_1
No match found in mzid file for SpecId b014p014_P_56h_M5_rep1_non_validated_PSMs_24356_3_1
2020-07-08 18:11:24 - INFO - Writing PEPREC file
Traceback (most recent call last):
  File "/home/e0470749/miniconda2/envs/ms2rescore/bin/ms2rescore", line 8, in <module>
    sys.exit(main())
  File "/home/e0470749/miniconda2/envs/ms2rescore/lib/python3.6/site-packages/ms2rescore/__init__.py", line 124, in main
    peprec_filename, mgf_filename = tandem_to_rescore.tandem_pipeline(config)
  File "/home/e0470749/miniconda2/envs/ms2rescore/lib/python3.6/site-packages/ms2rescore/tandem_to_rescore.py", line 152, in tandem_pipeline
    make_pepfile(outname + "_edited.pin", config)
  File "/home/e0470749/miniconda2/envs/ms2rescore/lib/python3.6/site-packages/ms2rescore/tandem_to_rescore.py", line 71, in make_pepfile
    mod.append("{}|{}".format(m.start(), modifications[str(float(m.group(1)))]))
KeyError: '57.02147'

In addition, the following files are created after running ms2rescore:

b014p014_P_56h_M5_rep1_non_validated_PSMs.mgf_xtandem_original.pin
b014p014_P_56h_M5_rep1_non_validated_PSMs.mgf_xtandem_edited.pin
b014p014_P_56h_M5_rep1_non_validated_PSMs.mgf_xtandem_edited.peprecpnomod

Would be great if someone could point out where I've gone wrong.
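
The `KeyError: '57.02147'` shows a modification mass being used as an exact string key. One possible cause, which is an assumption because config_tandem.json is not shown, is that the configured mass is written with a different number of decimals. A lookup with a small mass tolerance sidesteps that kind of mismatch. The `match_modification` helper below is hypothetical; the dictionary keys mirror the `name` and `mass_shift` fields used in the configuration files elsewhere on this page.

```python
def match_modification(mass_shift, modifications, tolerance=0.001):
    """Return the name of the configured modification closest to `mass_shift`.

    Raises KeyError if no modification lies within `tolerance` (in Da).
    """
    if not modifications:
        raise KeyError(f"No modifications configured for mass shift {mass_shift}")
    best = min(modifications, key=lambda mod: abs(mod["mass_shift"] - mass_shift))
    if abs(best["mass_shift"] - mass_shift) > tolerance:
        raise KeyError(f"No configured modification within {tolerance} Da of {mass_shift}")
    return best["name"]


configured = [
    {"name": "Carbamidomethyl", "mass_shift": 57.021464},
    {"name": "Oxidation", "mass_shift": 15.994915},
]
print(match_modification(57.02147, configured))
```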

Support for C-terminal modifications

Add support for C-terminal modifications. This requires correct parsing of C-terminal modifications from the various search engine result file formats to the PEPREC format.

  • PIN (PR #20)
  • MSGF+ (PR #20)
  • MaxQuant
  • PeptideShaker
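
In the PEPREC format, modifications are written as `location|name` pairs joined by `|`, with location 0 reserved for the N-terminus, -1 for the C-terminus, and `-` for an unmodified peptide. The sketch below shows how a parser might emit that notation. The `to_peprec_modifications` helper and its input convention are hypothetical and not taken from the ms2rescore code base.

```python
def to_peprec_modifications(mods):
    """Build a PEPREC modifications string.

    mods: list of (position, name) pairs, where position is a 1-based residue
    index, or the string "N-term" or "C-term" for terminal modifications.
    """
    if not mods:
        return "-"
    terminal_locations = {"N-term": 0, "C-term": -1}
    parts = []
    for position, name in mods:
        location = terminal_locations.get(position, position)
        parts.append(f"{location}|{name}")
    return "|".join(parts)


print(to_peprec_modifications([("N-term", "Acetyl"), (3, "Oxidation"), ("C-term", "Amidated")]))
```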

IDFileParserError: Could not map all MGF retention times to spectrum indices (using MSGF+ pipeline)

I am running MSGF+ using command:

java -Xmx32g -jar MSGFPlus.jar -s Yeast_trypsin.mgf -d database.fasta -tda 1 -ti 0,2 -minLength 7 -maxLength 50 -mod Mods.txt -o Yeast_trypsin.mzid -addFeatures 1

Then I run ms2rescore with command:

ms2rescore -c config.json -m Yeast_trypsin.mgf Yeast_trypsin.mzid

The config file is default except specifying msgfplus as the pipeline.

This gets me the following error:

2022-09-22 16:25:32 // INFO // ms2rescore // Using MSGFPipeline.
2022-09-22 16:25:32 // INFO // ms2rescore.percolator // Running Percolator PIN converter

Pin-converter version 3.05.0, Build Date Aug 31 2020 19:06:15
Copyright (c) 2013 Lukas Käll. All rights reserved.
Written by Lukas Käll ([email protected]) in the
School of Biotechnology at KTH - Royal Institute of Technology, Stockholm.
Issued command:
msgf2pin -P XXX -o /tmp/tmpj3fvb69r/Yeast_trypsin_original.pin Yeast_trypsin.mzid

Uses features for fragment spectra mass errors
Reading Yeast_trypsin.mzid
2022-09-22 16:25:47 // ERROR // ms2rescore.__main__ // Critical error occured in MS2ReScore
Traceback (most recent call last):
  File "/home/aawa/.local/lib/python3.8/site-packages/ms2rescore/__main__.py", line 15, in main
    rescore.run()
  File "/home/aawa/.local/lib/python3.8/site-packages/ms2rescore/__init__.py", line 233, in run
    peprec = self.pipeline.get_peprec()
  File "/home/aawa/.local/lib/python3.8/site-packages/ms2rescore/id_file_parser.py", line 245, in get_peprec
    return self.peprec_from_pin()
  File "/home/aawa/.local/lib/python3.8/site-packages/ms2rescore/id_file_parser.py", line 191, in peprec_from_pin
    raise IDFileParserError(
ms2rescore.id_file_parser.IDFileParserError: Could not map all MGF retention times to spectrum indices.

Any idea what could be causing this? I have tried multiple mgf files from different studies all with MSGF+ and all give the same error.

Thanks for any help.

Couldn't parse MaxQuant modifications in "Modified Sequence" column in CLI

Hey there,

I am trying to run MS2Rescore from the command line with MaxQuant results (msms.txt).

  • I used the default modification settings (Oxidation (M) and Acetyl (Protein N-term) as variable, Carbamidomethylation (C) as fixed).
  • I added the modifications to the modification_mapping configuration in the .json file:

    "ms2pip": {
        "model": "Immuno-HCD",
        "frag_error": 0.02,
        "modifications": [
            {"name":"Acetyl", "unimod_accession":1, "mass_shift":42.010565, "amino_acid":null, "n_term":true, "c_term": false},
            {"name":"Carbamidomethyl", "unimod_accession":4, "mass_shift":57.021464, "amino_acid":"C", "n_term":false, "c_term": false},
            {"name":"Oxidation", "unimod_accession":35, "mass_shift":15.994915, "amino_acid":"M", "n_term":false, "c_term": false}
        ]
    },
    "maxquant_to_rescore": {
        "modification_mapping":{
            "ox":"Oxidation",
            "ac":"Acetyl (Protein N-term)"
        },
        "fixed_modifications":{
            "C":"Carbamidomethyl"
        }
    },
    "percolator": {}
}

I believe there is an error with the parsing of the files, when I try to run the rescoring I get this:

2023-08-07 13:30:41 // INFO // ms2rescore // Using MaxQuantPipeline.
2023-08-07 13:31:14 // WARNING // ms2rescore.maxquant // Removed 105008 non-rank 1 PSMs.
2023-08-07 13:31:15 // WARNING // ms2rescore.maxquant // Removed 32 PSMs with invalid amino acids.
2023-08-07 13:31:18 // ERROR // ms2rescore.__main__ // Critical error occured in MS2ReScore
Traceback (most recent call last):
  File "/opt/anaconda3/lib/python3.9/site-packages/ms2rescore/__main__.py", line 15, in main
    rescore.run()
  File "/opt/anaconda3/lib/python3.9/site-packages/ms2rescore/__init__.py", line 233, in run
    peprec = self.pipeline.get_peprec()
  File "/opt/anaconda3/lib/python3.9/site-packages/ms2rescore/id_file_parser.py", line 397, in get_peprec
    peprec = self.msms_df.msms.to_peprec(
  File "/opt/anaconda3/lib/python3.9/site-packages/ms2rescore/maxquant.py", line 434, in to_peprec
    peprec["modifications"] = self.get_peprec_modifications(
  File "/opt/anaconda3/lib/python3.9/site-packages/ms2rescore/maxquant.py", line 314, in get_peprec_modifications
    peprec_mods.append(self._get_single_peprec_modification(
  File "/opt/anaconda3/lib/python3.9/site-packages/ms2rescore/maxquant.py", line 250, in _get_single_peprec_modification
    mod_list = MSMSAccessor._find_mods_recursively(
  File "//opt/anaconda3/lib/python3.9/site-packages/ms2rescore/maxquant.py", line 225, in _find_mods_recursively
    raise ModificationParsingError(
ms2rescore._exceptions.ModificationParsingError: Coud not match remaining modification labels in sequence (Acetyl (Protein N-term))AAAAAAAAAAGAAGGR. Ensure that all modifications are configured in the MaxQuant modification_mapping setting.

I saw that there was a bug fix in the GUI version of the software, and I hope that the error was made on my end.

Thanks a lot in advance!
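
The error message above asks that every modification label in the sequence be configured in modification_mapping. As a minimal sketch for checking this before running, the helper below lists the parenthesised labels in a MaxQuant-style modified sequence, including nested parentheses such as "(Acetyl (Protein N-term))", and reports those that are not keys of the mapping. The function names extract_mod_labels and find_unmapped_labels are hypothetical and not part of ms2rescore; that labels are matched against the mapping keys is an assumption based on the error message, not on the ms2rescore source.

```python
def extract_mod_labels(modified_sequence):
    """Return the top-level parenthesised labels in a modified sequence.

    Nested parentheses are kept as part of the enclosing label.
    """
    labels = []
    current = []
    depth = 0
    for char in modified_sequence:
        if char == "(":
            if depth > 0:
                current.append(char)  # inner parenthesis belongs to the label
            depth += 1
        elif char == ")":
            depth -= 1
            if depth < 0:
                raise ValueError("Unbalanced ')' in " + modified_sequence)
            if depth == 0:
                labels.append("".join(current))
                current = []
            else:
                current.append(char)
        elif depth > 0:
            current.append(char)
    if depth != 0:
        raise ValueError("Unbalanced '(' in " + modified_sequence)
    return labels


def find_unmapped_labels(modified_sequence, modification_mapping):
    """Return labels from the sequence that are not keys of the mapping (assumed lookup)."""
    return [
        label
        for label in extract_mod_labels(modified_sequence)
        if label not in modification_mapping
    ]


# Mapping as given in the configuration above.
mapping = {"ox": "Oxidation", "ac": "Acetyl (Protein N-term)"}
print(find_unmapped_labels("(Acetyl (Protein N-term))AAAAAAAAAAGAAGGR", mapping))
```

If the assumption holds, any label printed here would need its own entry in modification_mapping.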

Processing PEAKS XPro output using ms2rescore

Hi MS2ReScore team,
Thanks for creating MS2ReScore and for the recent implementation of PEAKS results compatibility.

We are trying to use MS2ReScore for immunopeptidomics to rescore timsTOF Pro data processed with PEAKS Xpro, but it doesn't seem to recognize the MGF files: it shows the error "Not all PSMs could be found in the provided MGF files".
Full description:
I am working with IP-enriched MHC-ligandome samples analyzed on a nanoElute coupled to a timsTOF Pro in DDA-PASEF mode. Several files were acquired and each one was processed individually in PEAKS Xpro to identify possible immunopeptides (unspecific cleavage).
The .mgf and .mzid files were exported by selecting Export / For Third Party / for PRIDE / Scaffold; the de novo only spectra.mgf was also exported from Export / Text Formats. All the files were placed in the same folder. Then MS2Rescore was configured (details below). Already when selecting "Spectrum file directory", the .mgf files weren't shown. When the process is started regardless, the following error is shown: "Not all PSMs could be found in the provided MGF files".
MS2Rescore was run using the GUI, with the version downloaded on 23/02/2022, on Windows 10 Pro, with the following parameters: model = Immuno-HCD, MS2 error = 0.03, pipeline = peaks, logging level = info, identification file = the .mzid file, spectrum file directory = directory containing the .mgf files, temporary file directory = same, configuration = (empty), and output = (empty).

Could you please tell me how to properly process PEAKS XPro data using ms2rescore?

Best,
David Gomez-Zepeda

PTM information is lost in pin and pout

Hi,

It looks like the PTM information is lost in the PIN file produced by ms2rescore. PEPREC has a column with PTMs, but the PIN file already contains only plain (stripped) peptide sequences, and so (obviously) does the output. Is there a reason to strip the PTMs from the sequences, or is it just a bug?

I believe PIN preserves the SpecId from PEPREC, so one can "restore" PTMs if necessary, but it would be much better if PTMs were preserved in the output by default.
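
As a rough sketch of the "restore" workaround described above, the function below re-applies a PEPREC modifications string to a stripped peptide sequence. It assumes the MS²PIP PEPREC convention of `location|name` pairs joined by `|`, with location 0 for the N-terminus, -1 for the C-terminus, and `-` for an unmodified peptide. The function name annotate_peptide and the bracketed output notation are illustrative choices, not an ms2rescore feature.

```python
def annotate_peptide(sequence, modifications):
    """Insert PEPREC-style modifications into a stripped peptide sequence."""
    if modifications in ("", "-"):
        return sequence

    fields = modifications.split("|")
    if len(fields) % 2 != 0:
        raise ValueError("Expected location|name pairs, got: " + modifications)

    n_term = ""
    c_term = ""
    by_position = {}
    for location, name in zip(fields[0::2], fields[1::2]):
        position = int(location)
        if position == 0:
            n_term = "[" + name + "]-"      # N-terminal modification
        elif position == -1:
            c_term = "-[" + name + "]"      # C-terminal modification
        elif 1 <= position <= len(sequence):
            by_position[position] = name    # 1-based residue position
        else:
            raise ValueError("Location out of range: " + location)

    parts = []
    for index, amino_acid in enumerate(sequence, start=1):
        parts.append(amino_acid)
        if index in by_position:
            parts.append("[" + by_position[index] + "]")
    return n_term + "".join(parts) + c_term


# Hypothetical PEPREC rows keyed by spec_id: (stripped peptide, modifications).
peprec = {"spec_1": ("ACDMK", "2|Carbamidomethyl|4|Oxidation")}
peptide, mods = peprec["spec_1"]
print(annotate_peptide(peptide, mods))
```

Joining on the spectrum identifier, as the post suggests, then amounts to looking up each output row's ID in such a dictionary.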

Issue in selecting the mgf file of the spectrum file in GUI

Hi, I've got a quick question about selecting the mgf file in the Spectrum file directory.

I'm currently using the GUI by running [python -m ms2rescore.gui] in the command window, which opens a window where I can browse files and folders directly. I tried to add my MGF spectrum files, but when I click the browse button and try to select a file, I get the message "No items match your search."
And when I select the folder and try running, I get an error, "FileNotFoundError: [WinError 2] The system cannot find the file specified."

I am wondering why I'm not able to see and select my files in the Spectrum file directory browse.

Thank you so much.

Filtering of PSMs from MaxQuant Search

Hello,

I noticed that MS2Rescore filters PSMs with duplicate scan IDs from MaxQuant search results as "non-rank 1 PSMs" but to my understanding MaxQuant reports multiple PSMs for chimeric spectra and so filtering these PSMs seems incorrect. Is it possible to configure MS2Rescore to avoid this?

Just wanted to clarify this, thank you for your help.

Best wishes,
John.

Wrong indexing during parsing MGF titles and retention times

Hello,

The retention_times dictionary returned by the parse_mgf_title_rt function uses the scan-order index as keys, so the mapping in the peprec_from_pin function fails with the error message "Could not map all MGF retention times to spectrum indices."

Indexing with spectrum titles seems to solve the problem:

#current code
            if line[0] == "R":
                if line.startswith("RTINSECONDS="):
                    retention_times[index] = float(line[12:].strip())

#new code
            if line[0] == "R":
                if line.startswith("RTINSECONDS="):
                    retention_times[titles[index]] = float(line[12:].strip())

This change was introduced at some point after v2.0.0 (see #79). Earlier, the mapping in peprec_from_pin was performed in the opposite order and was therefore still successful.
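
To illustrate the title-keyed indexing proposed above, here is a self-contained sketch that reads TITLE and RTINSECONDS from MGF text and returns a dictionary keyed by spectrum title. It is not the ms2rescore implementation: the function name read_title_to_rt is hypothetical, and the sketch stores each pair at END IONS so that the order of the two fields within a spectrum block does not matter.

```python
def read_title_to_rt(mgf_lines):
    """Map each spectrum title to its retention time in seconds."""
    retention_times = {}
    title = None
    rt = None
    for raw_line in mgf_lines:
        line = raw_line.strip()
        if line == "BEGIN IONS":
            title, rt = None, None          # reset state for a new spectrum
        elif line.startswith("TITLE="):
            title = line[len("TITLE="):]
        elif line.startswith("RTINSECONDS="):
            rt = float(line[len("RTINSECONDS="):])
        elif line == "END IONS":
            # Only spectra that carry both fields are recorded.
            if title is not None and rt is not None:
                retention_times[title] = rt
    return retention_times


# Hypothetical two-spectrum MGF excerpt.
example_mgf = """BEGIN IONS
TITLE=run1.scan=7
RTINSECONDS=12.5
100.0 1.0
END IONS
BEGIN IONS
RTINSECONDS=30.25
TITLE=run1.scan=3
200.0 2.0
END IONS
"""
print(read_title_to_rt(example_mgf.splitlines()))
```

Keying by title keeps the lookup independent of the order in which spectra appear in the file, which is the property the proposed change relies on.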

Not able to run ms2rescore through command line for this repo examples

Hello

I'm trying to run ms2rescore through the command line with the example dataset available at the ms2rescore repo:

MGF file: https://github.com/compomics/ms2rescore/blob/master/examples/mgf/20161213_NGHF_DBJ_SA_Exp3A_HeLa_1ug_7min_15000_02.mgf.zip

MZID file: https://github.com/compomics/ms2rescore/blob/master/examples/id/msgfplus.mzid

And this config file: config.zip

If I run it on Ubuntu 22.04 with this command:
ms2rescore -c config.json -m 20161213_NGHF_DBJ_SA_Exp3A_HeLa_1ug_7min_15000_02.mgf msgfplus.mzid

I get this error. Am I missing some file? Thanks.

2023-07-04 14:37:49 // INFO // ms2rescore // Using MSGFPipeline.
2023-07-04 14:37:49 // INFO // ms2rescore.percolator // Running Percolator PIN converter

Pin-converter version 3.06.1, Build Date Jun 15 2023 14:57:04
Copyright (c) 2013 Lukas Käll. All rights reserved.
Written by Lukas Käll ([email protected]) in the
School of Biotechnology at KTH - Royal Institute of Technology, Stockholm.
Issued command:
msgf2pin -P XXX -o /tmp/tmp3sn1zv2g/20161213_NGHF_DBJ_SA_Exp3A_HeLa_1ug_7min_15000_02_original.pin /home/proteomics/mysoftware/compomics/mgf/20161213_NGHF_DBJ_SA_Exp3A_HeLa_1ug_7min_15000_02.mzid

Uses features for fragment spectra mass errors
Reading /home/proteomics/mysoftware/compomics/mgf/20161213_NGHF_DBJ_SA_Exp3A_HeLa_1ug_7min_15000_02.mzid
No scan number was found for a PSM (or it equaled 0), scans are ranked from 1 and up
2023-07-04 14:38:01 // ERROR // ms2rescore.__main__ // Critical error occured in MS2ReScore
Traceback (most recent call last):
  File "/home/proteomics/.local/lib/python3.10/site-packages/ms2rescore/__main__.py", line 15, in main
    rescore.run()
  File "/home/proteomics/.local/lib/python3.10/site-packages/ms2rescore/__init__.py", line 233, in run
    peprec = self.pipeline.get_peprec()
  File "/home/proteomics/.local/lib/python3.10/site-packages/ms2rescore/id_file_parser.py", line 245, in get_peprec
    return self.peprec_from_pin()
  File "/home/proteomics/.local/lib/python3.10/site-packages/ms2rescore/id_file_parser.py", line 191, in peprec_from_pin
    raise IDFileParserError(
ms2rescore.id_file_parser.IDFileParserError: Could not map all MGF retention times to spectrum indices.

Error with PEAKS mzid file

Hello,

I'm trying to run MS2Rescore on the mzid and mgf files exported from PEAKS (using the export option for third party PRIDE / Scaffold). However, I got the error below. I would really appreciate your help! I need to run MS2Rescore on PEAKS results, but the GUI Windows version gave me an installation error (posted in another issue), and the Linux command-line version gave the error below.

ms2rescore -m mgf/ peptides_1_1_0.mzid
2022-03-11 07:53:45 // INFO // ms2rescore // Using MSGFPipeline.
2022-03-11 07:53:45 // INFO // ms2rescore.percolator // Running Percolator PIN converter

Pin-converter version 3.05.0, Build Date Aug 31 2020 19:06:15
Copyright (c) 2013 Lukas Käll. All rights reserved.
Written by Lukas Käll ([email protected]) in the
School of Biotechnology at KTH - Royal Institute of Technology, Stockholm.
Issued command:
msgf2pin -P XXX -o /tmp/tmpsb1le34g/peptides_1_1_0_original.pin /data/nh2tran/DeepNovo/DeepDB/PEAKS_Online/ms2rescore/peptides_1_1_0.mzid

Error : the input file is not MzIdentML - MSGF+ format /data/nh2tran/DeepNovo/DeepDB/PEAKS_Online/ms2rescore/peptides_1_1_0.mzid

2022-03-11 07:53:45 // ERROR // ms2rescore.__main__ // Critical error occured in MS2ReScore
Traceback (most recent call last):
  File "/data/nh2tran/python3_tf2/lib/python3.8/site-packages/ms2rescore/__main__.py", line 15, in main
    rescore.run()
  File "/data/nh2tran/python3_tf2/lib/python3.8/site-packages/ms2rescore/__init__.py", line 233, in run
    peprec = self.pipeline.get_peprec()
  File "/data/nh2tran/python3_tf2/lib/python3.8/site-packages/ms2rescore/id_file_parser.py", line 245, in get_peprec
    return self.peprec_from_pin()
  File "/data/nh2tran/python3_tf2/lib/python3.8/site-packages/ms2rescore/id_file_parser.py", line 179, in peprec_from_pin
    peprec = self.original_pin.to_peptide_record(
  File "/data/nh2tran/python3_tf2/lib/python3.8/site-packages/ms2rescore/id_file_parser.py", line 169, in original_pin
    self._run_percolator_converter()
  File "/data/nh2tran/python3_tf2/lib/python3.8/site-packages/ms2rescore/id_file_parser.py", line 144, in _run_percolator_converter
    run_percolator_converter(
  File "/data/nh2tran/python3_tf2/lib/python3.8/site-packages/ms2rescore/percolator.py", line 532, in run_percolator_converter
    subprocess.run(command, capture_output=log_level == "debug", check=True)
  File "/usr/lib/python3.8/subprocess.py", line 516, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['msgf2pin', '-P', 'XXX', '-o', '/tmp/tmpsb1le34g/peptides_1_1_0_original.pin', '/data/nh2tran/DeepNovo/DeepDB/PEAKS_Online/ms2rescore/peptides_1_1_0.mzid']' returned non-zero exit status 1.

Failed to install ms2rescore

Dear Ralf,
I would love to test ms2rescore in conjunction with FragPipe. However, I failed to install it with the Anaconda prompt. The following error code is shown. Maybe you have an idea how to solve this issue. Thanks for your help.

Best, Peer

C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.29.30133\bin\HostX86\x64\cl.exe /c /nologo /Ox /W3 /GL /DNDEBUG /MD -IC:\Users\gu68nur\AppData\Local\Temp\pip-build-env-9_7v_9p0\overlay\Lib\site-packages\numpy\core\include -IC:\01_ScientificSoftware\Anaconda\include -IC:\01_ScientificSoftware\Anaconda\include "-IC:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.29.30133\include" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.18362.0\ucrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.18362.0\shared" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.18362.0\um" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.18362.0\winrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.18362.0\cppwinrt" /Tcms2pip/cython_modules\ms2pip_pyx.c /Fobuild\temp.win-amd64-3.8\Release\ms2pip/cython_modules\ms2pip_pyx.obj -fno-var-tracking -Og -Wno-unused-result -Wno-cpp -Wno-unused-function
cl : Befehlszeile warning D9035 : Die Option "Og" ist veraltet und wird in einer der nächsten Versionen entfernt.
cl : Befehlszeile error D8021 : Ungültiges numerisches Argument /Wno-unused-result.
error: command 'C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.29.30133\bin\HostX86\x64\cl.exe' failed with exit status 2

ERROR: Failed building wheel for ms2pip
Failed to build ms2pip
ERROR: Could not build wheels for ms2pip which use PEP 517 and cannot be installed directly

issue with installation

Hi, I am trying to install ms2rescore, but the installation fails. I get these errors, and when I try to start the GUI, the command window appears for 0.2 seconds and then disappears without anything starting. My system already had Anaconda; I also tried installing Miniconda separately and uninstalling everything. Nothing changed the outcome!
[Screenshot 2023-08-23 083509]

Modification and flanking amino acid information absent from final output (pout) file

Dear Concern,

I'm running MS-GF+ms2rescore, and as the title suggests, the final .pout file doesn't have flanking amino acid or modification information in the peptide column.

This is the config file I'm using:

{
    "$schema": "./config_schema.json",
    "general":{
        "pipeline":"infer",
        "feature_sets":[["searchengine", "ms2pip", "rt"]],
        "run_percolator":true,
        "id_decoy_pattern": "XXX_",
        "num_cpu":12,
        "config_file":null,
        "tmp_path":"path/to/tmp/",
        "mgf_path":null,
        "output_filename":"path/to/output",
        "log_level": "info",
        "plotting": false
    },
    "ms2pip": {
        "model": "HCD",
        "frag_error": 0.02,
        "modifications": [
            {"name":"Acetyl", "unimod_accession":1, "mass_shift":42.010565, "amino_acid":null, "n_term":true, "c_term": false},
            {"name":"Carbamidomethyl", "unimod_accession":4, "mass_shift":57.021464, "amino_acid":"C", "n_term":false, "c_term": false},
            {"name":"Deamidated", "unimod_accession":7, "mass_shift":0.984016, "amino_acid":"N", "n_term":false, "c_term": false},
            {"name":"PhosphoS", "unimod_accession":21, "mass_shift":79.966331, "amino_acid":"S", "n_term":false, "c_term": false},
            {"name":"PhosphoT", "unimod_accession":21, "mass_shift":79.966331, "amino_acid":"T", "n_term":false, "c_term": false},
            {"name":"PhosphoY", "unimod_accession":21, "mass_shift":79.966331, "amino_acid":"Y", "n_term":false, "c_term": false},
            {"name":"Pyro-carbamidomethyl", "unimod_accession":26, "mass_shift":39.994915, "amino_acid":"C", "n_term":false, "c_term": false},
            {"name":"Glu->pyro-Glu", "unimod_accession":27, "mass_shift":-18.010565, "amino_acid":"E", "n_term":true, "c_term": false},
            {"name":"Gln->pyro-Glu", "unimod_accession":28, "mass_shift":-17.026549, "amino_acid":"Q", "n_term":true, "c_term": false},
            {"name":"Oxidation", "unimod_accession":35, "mass_shift":15.994915, "amino_acid":"M", "n_term":false, "c_term": false},
            {"name":"iTRAQ", "unimod_accession":214, "mass_shift":144.102063, "amino_acid":null, "n_term":true, "c_term": false},
            {"name":"Ammonia-loss", "unimod_accession":385, "mass_shift":-17.026549, "amino_acid":"C", "n_term":true, "c_term": false},
            {"name":"TMT6plexN", "unimod_accession":737, "mass_shift":229.162932, "amino_acid":"N", "n_term":false, "c_term": false},
            {"name":"TMT6plex", "unimod_accession":737, "mass_shift":229.162932, "amino_acid":null, "n_term":true, "c_term": false},
            {"name":"Amidated", "unimod_accession": 2, "mass_shift": -0.984016, "amino_acid":null, "n_term": false, "c_term": true}
        ]
    },
    "maxquant_to_rescore": {
        "mgf_title_pattern": "TITLE=.*scan=([0-9]+).*$",
        "modification_mapping":{
            "ox":"Oxidation",
            "cm":"Carbamidomethyl"
        },
        "fixed_modifications":{
            "C":"Carbamidomethyl"
        }
    },
    "percolator": {}
}

I'm running these on a pair of mzid and mgf files, as mentioned in the documentation. When I run MS-GF+, I specify carbamidomethylation of C as fixed modification and oxidation of M as variable modification.

I'd be grateful for your help in this regard. Even short of an actual fix, if you can suggest a way to extract this information from the generated files, that would be great.

Thanks again and very best regards,
Hassan

num_cpus used by MS2PIP

MS2PIP uses the default of 8 CPUs instead of the user-defined option in the config file:

2021-02-05 08:57:40 // INFO // numexpr.utils // Note: NumExpr detected 32 cores but "NUMEXPR_MAX_THREADS" not set, so enforcing safe limit of 8.
2021-02-05 08:57:40 // INFO // numexpr.utils // NumExpr defaulting to 8 threads.
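
The two log lines above are emitted by numexpr.utils and refer to the NUMEXPR_MAX_THREADS environment variable. As a minimal sketch, that variable can be set from Python before numexpr (or any package importing it) is loaded. The helper name set_numexpr_threads is hypothetical, and this only affects the NumExpr thread limit named in the log; it makes no claim about how MS2PIP chooses its own worker count.

```python
import os


def set_numexpr_threads(num_cpu):
    """Set the NumExpr thread limit via the environment.

    Must be called before numexpr is imported, because the variable is
    read when the library initialises.
    """
    if num_cpu < 1:
        raise ValueError("num_cpu must be at least 1")
    os.environ["NUMEXPR_MAX_THREADS"] = str(num_cpu)


# Example: match a user-defined CPU count of 12.
set_numexpr_threads(12)
```

The same effect can be had by exporting the variable in the shell before launching the program.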

PIN pipeline title matching: `ValueError: cannot convert float NaN to integer`

ms2rescore -m /mnt/d/protome/P20210803783/P20210803783-P1_Slot1-73_1_4257_uncalibrated.mgf -t ./ms2r -o ms2r_P1_MH-40G-300-50 -n 12 P20210803783-P1_Slot1-73_1_4257.pin

/home/llt/.local/lib/python3.8/site-packages/xgboost/compat.py:36: FutureWarning: pandas.Int64Index is deprecated and will be removed from pandas in a future version. Use pandas.Index with the appropriate dtype instead.

from pandas import MultiIndex, Int64Index
2022-03-18 16:29:40 // INFO // ms2rescore // Using PinPipeline.
/home/llt/miniconda3/envs/ms2r/lib/python3.8/site-packages/ms2rescore/percolator.py:154: UserWarning: This pattern is interpreted as a regular expression, and has match groups. To actually get the groups, use str.extract.
self.df["Peptide"].str.contains(r"[([^\[^\]]*)]", regex=True)
2022-03-18 16:29:43 // ERROR // ms2rescore.__main__ // Critical error occured in MS2ReScore
Traceback (most recent call last):
  File "/home/llt/miniconda3/envs/ms2r/lib/python3.8/site-packages/ms2rescore/__main__.py", line 15, in main
    rescore.run()
  File "/home/llt/miniconda3/envs/ms2r/lib/python3.8/site-packages/ms2rescore/__init__.py", line 233, in run
    peprec = self.pipeline.get_peprec()
  File "/home/llt/miniconda3/envs/ms2r/lib/python3.8/site-packages/ms2rescore/id_file_parser.py", line 224, in get_peprec
    return self.peprec_from_pin()
  File "/home/llt/miniconda3/envs/ms2r/lib/python3.8/site-packages/ms2rescore/id_file_parser.py", line 179, in peprec_from_pin
    peprec = self.original_pin.to_peptide_record(
  File "/home/llt/miniconda3/envs/ms2r/lib/python3.8/site-packages/ms2rescore/percolator.py", line 470, in to_peptide_record
    peprec_df["spec_id"] = self._get_spectrum_index_column(
  File "/home/llt/miniconda3/envs/ms2r/lib/python3.8/site-packages/ms2rescore/percolator.py", line 268, in _get_spectrum_index_column
    id_col = self.df["SpecId"].str.extract(pattern, expand=False).astype(int)
  File "/home/llt/.local/lib/python3.8/site-packages/pandas/core/generic.py", line 5920, in astype
    new_data = self._mgr.astype(dtype=dtype, copy=copy, errors=errors)
  File "/home/llt/.local/lib/python3.8/site-packages/pandas/core/internals/managers.py", line 419, in astype
    return self.apply("astype", dtype=dtype, copy=copy, errors=errors)
  File "/home/llt/.local/lib/python3.8/site-packages/pandas/core/internals/managers.py", line 304, in apply
    applied = getattr(b, f)(**kwargs)
  File "/home/llt/.local/lib/python3.8/site-packages/pandas/core/internals/blocks.py", line 580, in astype
    new_values = astype_array_safe(values, dtype, copy=copy, errors=errors)
  File "/home/llt/.local/lib/python3.8/site-packages/pandas/core/dtypes/cast.py", line 1292, in astype_array_safe
    new_values = astype_array(values, dtype, copy=copy)
  File "/home/llt/.local/lib/python3.8/site-packages/pandas/core/dtypes/cast.py", line 1237, in astype_array
    values = astype_nansafe(values, dtype, copy=copy)
  File "/home/llt/.local/lib/python3.8/site-packages/pandas/core/dtypes/cast.py", line 1154, in astype_nansafe
    return lib.astype_intsafe(arr, dtype)
  File "pandas/_libs/lib.pyx", line 668, in pandas._libs.lib.astype_intsafe
ValueError: cannot convert float NaN to integer

I created a Python 3.8 conda environment, installed wxPython with conda, and installed MS²Rescore with pip.
The “pandas.Int64Index is deprecated” warning appears as soon as the software is run.
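
The traceback above ends in `self.df["SpecId"].str.extract(pattern, expand=False).astype(int)`. In pandas, `str.extract` yields NaN for every row where the pattern does not match, and NaN cannot be cast to int, which produces this ValueError. The sketch below uses only the standard library to show the same extraction while collecting the identifiers that fail to match, so they can be inspected instead of crashing. The function name, the example pattern, and the sample SpecId values are all hypothetical.

```python
import re


def extract_spectrum_indices(spec_ids, pattern):
    """Extract an integer index from each SpecId using the first capture group.

    Returns (indices, unmatched): a dict of SpecId -> int, and the list of
    SpecIds for which the pattern found nothing.
    """
    regex = re.compile(pattern)
    indices = {}
    unmatched = []
    for spec_id in spec_ids:
        match = regex.search(spec_id)
        if match is None:
            unmatched.append(spec_id)   # would become NaN in pandas
        else:
            indices[spec_id] = int(match.group(1))
    return indices, unmatched


# Hypothetical SpecId values; the second does not follow the pattern.
spec_ids = ["run1_scan=101_2", "run1_index_55"]
indices, unmatched = extract_spectrum_indices(spec_ids, r"scan=(\d+)")
print(indices)
print(unmatched)
```

A non-empty unmatched list suggests that the title pattern does not fit the SpecId format of the PIN file.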

Add flanking amino acids to PIN output

Percolator --picked-protein/-f option requires flanking amino acids in PIN peptide column: e.g. R.RNVIDKVAK.Y. See #24.

Implementation progress:

  • MaxQuant
  • MSGFPlus
  • X!Tandem
  • PeptideShaker
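
As a small illustration of the peptide notation mentioned above (R.RNVIDKVAK.Y), the sketch below adds and removes flanking amino acids around a peptide string. The helper names are hypothetical, and using "-" as the placeholder for a missing flanking residue is an assumption of this sketch rather than something stated in the issue.

```python
def add_flanking(peptide, previous_aa="-", next_aa="-"):
    """Wrap a peptide in flanking residues, e.g. R.RNVIDKVAK.Y."""
    return previous_aa + "." + peptide + "." + next_aa


def strip_flanking(peptide):
    """Remove single-character flanking residues if both are present."""
    if len(peptide) >= 5 and peptide[1] == "." and peptide[-2] == ".":
        return peptide[2:-2]
    return peptide


print(add_flanking("RNVIDKVAK", "R", "Y"))
print(strip_flanking("R.RNVIDKVAK.Y"))
```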
