
process_maxquant's People

Contributors

arielkomen, joerivstrien

process_maxquant's Issues

Sample names are not always fully detected (resulting in unexpected behaviour)

The file '20210128_BHM_DDM_XL-CP_all_proteinGroups' contains the sample names "iBAQ BHM_DDM_DMTMM_01" and "iBAQ BHM_DDM_PhoX_01", but the program detects only "iBAQ BHM" as the sample name, which results in an error. Sample names should therefore be detected with a pattern such as "iBAQ {sample_name}_[0-9]{2}" or something similar.
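A minimal sketch of such a pattern, assuming every sample column looks like "iBAQ <sample_name>_<digits>" and that the sample name itself may contain underscores. The function name `detect_sample_names` is hypothetical and not taken from the project.

```python
import re

# Assumed layout: "iBAQ <sample_name>_<fraction>". The greedy ".+" keeps
# underscores inside the sample name and only splits off the final "_<digits>".
IBAQ_COLUMN = re.compile(r"^iBAQ (?P<sample>.+)_(?P<fraction>\d+)$")


def detect_sample_names(columns):
    """Return the sorted, unique sample names found in the iBAQ columns."""
    samples = set()
    for column in columns:
        match = IBAQ_COLUMN.match(column)
        if match:
            samples.add(match.group("sample"))
    return sorted(samples)


columns = [
    "Protein IDs",
    "iBAQ",
    "iBAQ BHM_DDM_DMTMM_01",
    "iBAQ BHM_DDM_DMTMM_02",
    "iBAQ BHM_DDM_PhoX_01",
]
detected = detect_sample_names(columns)
```

With the column names from this issue, `detected` holds "BHM_DDM_DMTMM" and "BHM_DDM_PhoX" rather than the truncated "BHM".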

Validate the settings file

Currently, I assume that the settings file, which the user can change, is always correct. Everyone makes mistakes, so I should write code that checks whether the settings file is valid.
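One way this check could look, assuming the settings are loaded into a plain dictionary first. Only the key "uniprot_step" is known from the tracebacks in these issues; the key "column_order" and both expected types are assumptions made for illustration.

```python
# Hypothetical schema: setting name -> expected Python type.
REQUIRED_SETTINGS = {
    "uniprot_step": dict,   # type is a guess, the real structure is not documented here
    "column_order": list,   # hypothetical setting
}


def validate_settings(settings, required=REQUIRED_SETTINGS):
    """Return a list of readable problems; an empty list means no problem was found."""
    problems = []
    for key, expected_type in required.items():
        if key not in settings:
            problems.append(f"missing setting: {key}")
        elif not isinstance(settings[key], expected_type):
            problems.append(
                f"setting {key} should be a {expected_type.__name__}, "
                f"got {type(settings[key]).__name__}"
            )
    for key in settings:
        if key not in required:
            problems.append(f"unknown setting: {key}")
    return problems


problems = validate_settings({"uniprot_step": {}, "colum_order": []})
```

Collecting all problems instead of stopping at the first one lets the user fix the whole file in one pass.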

Enable the user to dictate the column output order and sort the samples alphabetically.

In the settings file, the user should be able to dictate the order of the columns in the Excel file. This also means that functions are needed to check whether the user has entered valid column names. Additionally, the samples should be inserted into the Excel file in alphabetical order instead of in a random order.

The columns in the Excel file have the following order:

  • the original columns; the user can dictate the order in which these appear in the Excel file
  • the samples, which should be sorted alphabetically
  • the new columns, which appear in arbitrary order
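The ordering described here could be sketched as follows, assuming the column names are available as plain lists. The function `order_columns` and the example column names are hypothetical.

```python
def order_columns(all_columns, user_order, sample_columns):
    """Return the columns as: user-ordered originals, sorted samples, the rest."""
    # Reject column names from the settings file that do not exist in the data.
    unknown = [column for column in user_order if column not in all_columns]
    if unknown:
        raise ValueError(f"unknown column names in settings: {unknown}")

    samples = sorted(sample_columns)
    placed = set(user_order) | set(samples)
    # Everything not placed yet is treated as a 'new' column and keeps its current order.
    remaining = [column for column in all_columns if column not in placed]
    return list(user_order) + samples + remaining


ordered = order_columns(
    all_columns=["iBAQ T2_01", "Protein IDs", "mitocarta", "iBAQ T1_01", "Fasta headers"],
    user_order=["Fasta headers", "Protein IDs"],
    sample_columns=["iBAQ T2_01", "iBAQ T1_01"],
)
```

Raising a `ValueError` for unknown names is one design choice; the names could also be reported through the settings validation instead.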

Processing comments about how the script works

  • Change the order of the Excel sheet columns. The original order should be retained, followed by the iBAQ samples and clustered columns, and lastly the 'new' columns from UniProt and MitoCarta.
  • Change the order of the Excel sheets: the applying protein sheet first, then the non-applying protein sheet.
  • Allow the columns that are searched in the MitoCarta step to be changed through user arguments.
  • Per sample, calculate the total iBAQ protein abundance value per protein and store it in a new column. Additionally, calculate the global protein abundance value and put it in a new column.
  • To increase user-friendliness, create a simple PyQt5 GUI in which the input and settings files can be selected.
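The abundance calculation from the list above could look roughly like this for a single protein row. It assumes that iBAQ columns are named "iBAQ <sample>_<fraction>" and that the global abundance is the sum over all samples, which is an interpretation and not confirmed here. The output column names are hypothetical.

```python
import re
from collections import defaultdict

# Assumed column layout: "iBAQ <sample_name>_<fraction>".
IBAQ_COLUMN = re.compile(r"^iBAQ (?P<sample>.+)_\d+$")


def add_abundance_totals(row):
    """Return a copy of one protein row with per-sample and global iBAQ totals added."""
    totals = defaultdict(float)
    for column, value in row.items():
        match = IBAQ_COLUMN.match(column)
        if match:
            totals[match.group("sample")] += value

    result = dict(row)
    for sample, total in totals.items():
        result[f"total iBAQ {sample}"] = total       # hypothetical column name
    result["global iBAQ"] = sum(totals.values())     # assumed: sum over all samples
    return result


row = {"Protein IDs": "P12345", "iBAQ T1_01": 1.0, "iBAQ T1_02": 2.0, "iBAQ T2_01": 4.0}
with_totals = add_abundance_totals(row)
```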

Evaluate whether identifiers were found

When the program fetches data from UniProt, it uses (hopefully) UniProt identifiers. To obtain these identifiers, the FASTA headers are examined and the identifier is extracted from each header. A function is needed that evaluates whether the input identifiers are present. If not, fetching data from UniProt should be skipped.
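A possible shape for that function, assuming the identifiers arrive as a list of strings. The regular expression follows the accession number format published by UniProt; the function name `split_identifiers` is hypothetical.

```python
import re

# Accession number format as documented by UniProt.
UNIPROT_ACCESSION = re.compile(
    r"[OPQ][0-9][A-Z0-9]{3}[0-9]|[A-NR-Z][0-9](?:[A-Z][A-Z0-9]{2}[0-9]){1,2}"
)


def split_identifiers(identifiers):
    """Split identifiers into those that look like UniProt accessions and the rest."""
    valid, invalid = [], []
    for identifier in identifiers:
        if UNIPROT_ACCESSION.fullmatch(str(identifier)):
            valid.append(identifier)
        else:
            invalid.append(identifier)
    return valid, invalid


valid, invalid = split_identifiers(["P12345", "A0A024R161", "10048432", ""])
# Skip the UniProt step when nothing usable was found.
should_fetch = bool(valid)
```

The check only tests the format, so an identifier can pass here and still be unknown to UniProt.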

Program crashes instead of displaying error in GUI.

When running the tool on a dataset with gi numbers instead of UniProt IDs while leaving the option "uniprot_step" on, the tool and GUI shut down/crash instead of displaying the error in the GUI.

Traceback:
Traceback (most recent call last):
  File "/home/joeri/Documents/coding_projects/process_maxquant/gui_file_acceptor.py", line 149, in execute_process_maxquant_script
    protein_groups_dataframe = fetch_uniprot_annotation_step(self, protein_groups_dataframe, settings_dict)
  File "/home/joeri/Documents/coding_projects/process_maxquant/process_maxquant.py", line 948, in fetch_uniprot_annotation_step
    protein_data_dict = fetch_uniprot_annotation(gui_object, protein_groups_dataframe["identifier"], settings_dict["uniprot_step"])
  File "/home/joeri/Documents/coding_projects/process_maxquant/process_maxquant.py", line 305, in fetch_uniprot_annotation
    request.raise_for_status()
  File "/home/joeri/miniconda3/envs/py3new/lib/python3.9/site-packages/requests/models.py", line 943, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 400 Client Error: Bad Request for url: https://www.ebi.ac.uk/proteins/api/proteins?offset=0&size=100&accession=10048432,...110625963,110625975,110625979
Aborted (core dumped)
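The "Aborted (core dumped)" line suggests the exception escaped into the GUI event loop without being caught. A minimal sketch of a guard, assuming each pipeline step is a callable and the GUI offers some function that displays a message: `run_step_safely`, `failing_fetch` and the list used as a message sink are all stand-ins, not project code.

```python
def run_step_safely(step, report_error, *args, **kwargs):
    """Run one pipeline step and report any exception instead of letting it escape."""
    try:
        return True, step(*args, **kwargs)
    except Exception as error:  # broad on purpose: nothing should reach the event loop
        report_error(f"{type(error).__name__}: {error}")
        return False, None


# Stand-in for the UniProt step, failing in a similar way to the traceback above.
def failing_fetch(identifiers):
    raise RuntimeError("400 Client Error: Bad Request")


messages = []  # stand-in for the GUI's error display
succeeded, result = run_step_safely(failing_fetch, messages.append, ["10048432"])
```

Returning a success flag lets the caller decide whether to continue with the next step or stop the run.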

Fix the error message output

Whenever an error occurs, its message is printed in the GUI. The program then resumes, at which point the error message should go away. Currently it stays visible, which should be changed.
This is a minor issue, but it should be fixed to prevent confusion.

Maximum recursion error during clustering step

When running the tool on a protein_groups file with 6 samples (each having 60 slices), an error occurs during the clustering step.

The GUI displays the following error:
An exception occurred while applying clustering on a sample
Maximum recursion depth exceeded in comparison.

Relevant lines in the log file before the error occurs:
INFO:root:Step 4, cluster the fractions per sample using hierarchical clustering.
INFO:root:Start hierarchical clustering for sample T5

(Not sure why it starts clustering with T5. Maybe something went wrong with parsing the other samples? The samples are T1 - T6.)
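The cause of the recursion error is not established in this issue. If it turns out to come from a deeply recursive traversal of a large clustering tree, one possible workaround is to raise Python's recursion limit only around the clustering call. The context manager below is a sketch under that assumption, and `recursion_limit` is a hypothetical helper.

```python
import sys
from contextlib import contextmanager


@contextmanager
def recursion_limit(limit):
    """Temporarily raise the recursion limit and always restore the old value."""
    previous = sys.getrecursionlimit()
    # Never lower the limit below what is already configured.
    sys.setrecursionlimit(max(previous, limit))
    try:
        yield
    finally:
        sys.setrecursionlimit(previous)


before = sys.getrecursionlimit()
with recursion_limit(before + 500):
    inside = sys.getrecursionlimit()  # the clustering call would go here
after = sys.getrecursionlimit()
```

A very high limit can crash the interpreter instead of raising an error, so the value should be increased cautiously.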
