Comments (10)
Okay, so what I would do is submit the request, then run a loop that checks the status of the job until it finishes, e.g.:
import time
from unipressed import IdMappingClient

request = IdMappingClient.submit(
    source="UniProtKB_AC-ID", dest="Gene_Name", ids={"A1L190", "A0JP26", "A0PK11"}
)
# Poll until UniProt reports that the job has reached a final state
while True:
    status = request.get_status()
    if status in {"FINISHED", "ERROR"}:
        break
    else:
        time.sleep(5)
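The loop above never gives up and treats "ERROR" the same as "FINISHED". A minimal sketch of a bounded version follows. The helper name `wait_for_status` and its `poll_interval` and `timeout` parameters are made up for illustration and are not part of unipressed; the helper only assumes it is handed a zero-argument callable that returns the status string.

```python
import time
from typing import Callable


def wait_for_status(
    get_status: Callable[[], str],
    poll_interval: float = 5.0,
    timeout: float = 300.0,
) -> str:
    """Poll get_status() until it returns FINISHED or ERROR, or time runs out."""
    deadline = time.monotonic() + timeout
    while True:
        status = get_status()
        if status in {"FINISHED", "ERROR"}:
            return status  # final state: the caller decides what to do with it
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job still {status!r} after {timeout} seconds")
        time.sleep(poll_interval)
```

With the request from the example above, this could presumably be called as `wait_for_status(request.get_status)`, fetching results only when the returned status is "FINISHED".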
from unipressed.
Thank you - I didn't realize there was a get_status() method - should've looked harder.
I implemented that but still got the same error, although I may have looped through more request submissions before it happened this time.
Thank you,
Dan
from unipressed.
Can you please post a reproducible example?
from unipressed.
The attached .zip contains a JSON file holding a dictionary whose keys are integers and whose values are sets of UniProt IDs that I'm trying to get GI numbers for. In the loop, this dictionary is called "id_lists" and each dictionary key is a "chunk". I loop through the dictionary and submit each subset of UniProt IDs to the ID mapping client using the get_gi_numbers() function:
import time
from unipressed import IdMappingClient

def get_gi_numbers(uniprot_ids, delay=5):
    # Submit the mapping job, then poll until UniProt reports a final state
    request = IdMappingClient.submit(
        source="UniProtKB_AC-ID", dest="GI_number", ids=uniprot_ids
    )
    while True:
        status = request.get_status()
        if status in {"FINISHED", "ERROR"}:
            break
        else:
            time.sleep(delay)
    return list(request.each_result())

# id_lists is the dictionary loaded from the attached JSON file
uniprot_to_gi = {}
for chunk, id_list in id_lists.items():
    uniprot_to_gi[chunk] = get_gi_numbers(id_list, delay=5)
Using this, I still get:
IdMappingError: UniProt has not yet processed the results, consider using time.sleep() to wait until they are complete.
Thank you,
Dan
from unipressed.
I can't easily reproduce this. The only way I could see this happening is if UniProt is actually returning an invalid result which tricks my code into thinking it hasn't finished. If you could narrow down the IDs (or possibly the single ID) that cause this by catching the error unipressed throws, that would be great.
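One way to narrow the IDs down is to split a failing batch in half and retry each half until only single IDs remain. This is just a sketch under assumptions: `find_failing_ids` and `run_batch` are hypothetical names, and `run_batch` stands for any callable that submits a batch and raises when the batch cannot be processed.

```python
from typing import Callable, List, Sequence


def find_failing_ids(
    ids: Sequence[str],
    run_batch: Callable[[Sequence[str]], object],
) -> List[str]:
    """Return the IDs that still fail when submitted on their own.

    run_batch is a hypothetical callable that submits a batch and raises
    an exception if the batch cannot be processed.
    """
    if not ids:
        return []
    try:
        run_batch(ids)
        return []  # the whole batch succeeded, nothing to report
    except Exception:
        if len(ids) == 1:
            return list(ids)  # a single ID that fails by itself
    # The batch failed: split it in half and test each half separately.
    middle = len(ids) // 2
    left = find_failing_ids(ids[:middle], run_batch)
    right = find_failing_ids(ids[middle:], run_batch)
    return left + right
```

With the get_gi_numbers() function posted earlier in the thread, something like `find_failing_ids(sorted(id_list), get_gi_numbers)` might work. If the failure is intermittent rather than tied to specific IDs, the result will likely differ from run to run, which would itself be a useful observation.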
from unipressed.
I see, I was wondering if it might be a bad ID.
I'm not sure that is the case, though, since I can make it through a set of IDs on one attempt and then fail on that same set of IDs on another attempt.
I will look into it.
from unipressed.
It doesn't seem like a single bad ID would make it fail. I just tried it, and UniProt simply ignores invalid IDs but otherwise behaves reasonably.
from unipressed.
My guess is you're hitting an intermittent issue with the UniProt API itself, so you would get this same issue with any client library (not just unipressed). However, I would like to be able to smooth over that glitch in unipressed, which is why I want to catch it.
from unipressed.
I ran into the same "unstable return/connection/timeout" problem with this package because of the lack of exception handling. All three calls (submit, get_status, and each_result) can fail independently, and it is not easy to catch every possible exception.
In the end I came up with a solution that does not support pagination. I hope this example helps.
from retry import retry
from unipressed import IdMappingClient
from unipressed.id_mapping.core import IdMappingError
from unipressed.id_mapping.core import IdMappingJob


@retry(IdMappingError, delay=2, tries=5)
def submit_query(gene_ids: str) -> IdMappingJob:
    """
    Query UniProt DB with a string of Gene IDs
    Args:
        gene_ids: A string of NCBI Gene IDs separated by commas
    Returns:
        IdMappingJob object
    """
    try:
        job_request = IdMappingClient.submit(
            source="GeneID", dest="UniProtKB", ids={gene_ids}
        )
        return job_request
    except Exception as exc:
        # Re-raise any failure as IdMappingError so that @retry picks it up
        raise IdMappingError from exc


@retry(ValueError, delay=2, tries=5)
def check_status(job_request: IdMappingJob) -> str:
    """
    Obtain job status
    Args:
        job_request: an IdMappingJob object
    Returns:
        FINISHED or FAILED
    """
    try:
        job_status = job_request.get_status()
    except Exception:
        return "FAILED"
    if job_status == "FINISHED":
        return job_status
    if job_status == "RUNNING":
        # Raised outside the try block so that @retry sees it and polls again
        raise ValueError("job is still running")
    return "FAILED"


@retry(IdMappingError, delay=2, tries=25)
def get_results(job_request: IdMappingJob) -> list:
    """
    Retrieves individual results
    Args:
        job_request: an IdMappingJob object
    Returns:
        A list of ID mapping results in the format of [{'from': '1', 'to': 'P04217'}, {'from': '1', 'to': 'V9HWD8'}]
    """
    try:
        returned = list(job_request.each_result())
        return returned
    except Exception as exc:
        raise IdMappingError from exc


def get_uniprot_ids_from_gene_ids(gene_ids: str) -> list[dict[str, str]]:
    """
    By using NCBI Gene IDs, this function maps to UniProt IDs. One NCBI Gene ID can be mapped to one or many.
    Args:
        gene_ids: A string of NCBI Gene IDs separated by commas
    Returns:
        A list of dictionaries, each dictionary consists of {'from': 'NCBI Gene ID', 'to': 'UniProt ID'}
    """
    returned = None
    results_parsed = None
    job_request = submit_query(gene_ids)
    if job_request is not None:
        try:
            jstatus = check_status(job_request)
        except ValueError:
            jstatus = "FAILED"  # still running after all retries
        if jstatus != "FAILED":
            returned = get_results(job_request)
    if returned is not None:
        results_parsed = list(returned)
    return results_parsed
from unipressed.
Hi @yoonkihoon. If there really is an intermittent issue with the UniProt API, then I think your @retry solution is a good one. Feel free to submit it as a PR.
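For anyone who would rather not depend on the third-party retry package, here is a rough sketch of the same idea using only the standard library. It is an illustration, not how unipressed handles this: the decorator name `retry_on` and its parameters are invented here, and the exponential backoff is my own addition.

```python
import functools
import time
from typing import Callable, Tuple, Type


def retry_on(
    exceptions: Tuple[Type[BaseException], ...],
    tries: int = 5,
    delay: float = 2.0,
    backoff: float = 2.0,
) -> Callable:
    """Retry the decorated function when one of `exceptions` is raised."""

    def decorator(func: Callable) -> Callable:
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            wait = delay
            for attempt in range(1, tries + 1):
                try:
                    return func(*args, **kwargs)
                except exceptions:
                    if attempt == tries:
                        raise  # out of attempts: surface the last error
                    time.sleep(wait)
                    wait *= backoff  # wait longer before the next attempt

        return wrapper

    return decorator
```

It would be applied the same way as @retry above, for example `@retry_on((IdMappingError,), tries=5)` on a function that fetches the results, so that a transient failure is retried a few times before it reaches the caller.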
from unipressed.