Windows Download here
Linux run from Source
Todo
pip install -r requirements.txt
python main.py
Add as many MPI-SHH MAscot Report GEnerator HTML files as you like. The tool will scrape the relevant information from them.
The scraper expects the table inside the HTML file to have the following layout:
The Excel sheet should be in this style for the scraper to work.
Speciem | Accession | Protein Description (GN=) | Number of Peptides | Link |
---|---|---|---|---|
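
As a rough illustration of what this layout check could look like, the sketch below reads an HTML table with Python's standard-library `html.parser` and rejects it if the header row differs from the columns listed above. It is not taken from `main.py`: the names `TableRowParser`, `read_report_rows` and `EXPECTED_HEADER` are hypothetical, and it assumes the header is the first row of the table and that only the visible text of each cell is needed (the URL behind `Link` is not extracted).

```python
from html.parser import HTMLParser

# Column names copied from the layout above (spelling kept as in the table).
EXPECTED_HEADER = [
    "Speciem",
    "Accession",
    "Protein Description (GN=)",
    "Number of Peptides",
    "Link",
]


class TableRowParser(HTMLParser):
    """Hypothetical helper: collect the text of every cell, grouped by row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row being read, None outside <tr>
        self._cell = None  # text chunks of the cell being read, None outside

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


def read_report_rows(html_text):
    """Return the data rows as dicts; raise ValueError on a different header."""
    parser = TableRowParser()
    parser.feed(html_text)
    parser.close()
    if not parser.rows or parser.rows[0] != EXPECTED_HEADER:
        raise ValueError("table header does not match the expected layout")
    return [dict(zip(EXPECTED_HEADER, row)) for row in parser.rows[1:]]
```

Failing early on an unexpected header is a design choice of this sketch: it makes a changed report format visible instead of silently producing shifted columns.
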
Speciem | Accession | Protein Description (GN=) | Number of Peptides | Link | Tissue specificity | Tissue expression cluster | specific | status |
---|---|---|---|---|---|---|---|---|
| | | | | | | | "yes" if Tissue specificity or Tissue expression cluster contains the word "brain", otherwise "no" | 0 = not crawled, 1 = successfully crawled, 2 = not found |
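
The rules for the `specific` and `status` columns can be sketched in a few lines of Python. This is an illustration of the rule as described in the table, not the code from `main.py`: the names `CrawlStatus` and `brain_specific` are hypothetical, and the case-insensitive substring match is an assumption (it would also match words such as "forebrain").

```python
from enum import IntEnum


class CrawlStatus(IntEnum):
    """Values of the `status` column (names are hypothetical)."""
    NOT_CRAWLED = 0
    CRAWLED = 1
    NOT_FOUND = 2


def brain_specific(tissue_specificity, tissue_expression_cluster):
    """Return "yes" if either field mentions "brain", otherwise "no".

    Assumption: the match ignores case; empty or missing fields count as
    not mentioning "brain".
    """
    text = f"{tissue_specificity or ''} {tissue_expression_cluster or ''}"
    return "yes" if "brain" in text.lower() else "no"
```

For example, `brain_specific("Group enriched (brain, retina)", None)` returns `"yes"`, while `brain_specific("Low tissue specificity", "Liver - Metabolism")` returns `"no"`.
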
Insert DOI here