chembl / chembl_webresource_client

Official Python client for accessing ChEMBL API

Home Page: https://www.ebi.ac.uk/chembl/api/data/docs

License: Other

Python 74.18% Jupyter Notebook 25.82%

python rest-client rest chemistry cheminformatics chemoinformatics chembl

chembl_webresource_client's Introduction

ChEMBL webresource client

This is the only official Python client library developed and supported by the ChEMBL group.

The library helps you access ChEMBL data and cheminformatics tools from Python. You don't need to know how to write SQL. You don't need to know how to interact with REST APIs. You don't need to compile or install any cheminformatics frameworks. Results are cached.

The client handles interaction with the HTTPS protocol and caches all results in the local file system for faster retrieval. By abstracting away all network-related tasks, the client provides a convenient interface that gives the impression of working with a local resource. The design is based on the Django QuerySet interface. The client also implements lazy evaluation of results, which means it only issues a request for data when a value is required. This approach reduces the number of network requests and improves performance.

Installation

pip install chembl_webresource_client

Live Jupyter notebook with examples

Click here

Available filters

The design of the client is based on Django QuerySet (https://docs.djangoproject.com/en/1.11/ref/models/querysets), and the most important lookup types are supported. These are:

exact
iexact
contains
icontains
in
gt
gte
lt
lte
startswith
istartswith
endswith
iendswith
range
isnull
regex
iregex
search
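As in Django, a lookup is appended to a field name with a double underscore. The sketch below assumes `resource` is an endpoint object such as `new_client.molecule`; the helper names are hypothetical, and the field names (`pref_name`, `molecule_properties`, `mw_freebase`) are taken from the molecule records shown elsewhere on this page.

```python
def names_ending_with(resource, suffix):
    """Case-insensitive suffix match on the preferred name (iendswith lookup)."""
    return resource.filter(pref_name__iendswith=suffix)


def light_molecules(resource, max_weight=300):
    """Molecules whose free-base weight is <= max_weight (lte lookup).

    Nested fields are joined with "__" before the lookup name.
    """
    return resource.filter(molecule_properties__mw_freebase__lte=max_weight)


# Usage (needs network access and the installed client):
#   from chembl_webresource_client.new_client import new_client
#   mols = light_molecules(new_client.molecule, max_weight=250)
```

Because evaluation is lazy, neither helper sends a request by itself; the request happens when the returned object is iterated or indexed.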

Only operator

only is a special method that limits the results to a selected set of fields. only takes a single argument: a list of fields that should be included in the result. The specified fields have to exist in the endpoint against which only is executed. Using only will usually make an API call faster because less information is returned, which saves bandwidth. The API logic will also check whether any SQL joins are necessary to return the specified fields and exclude unnecessary joins, which critically improves performance.

Please note that only has one limitation: the list of fields ignores nested fields, i.e. calling only(['molecule_properties__alogp']) is equivalent to only(['molecule_properties']).

For many-to-many relationships only will not make any SQL join optimisation.
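A minimal sketch of the two points above. `ids_and_names` assumes `resource` is an endpoint such as `new_client.molecule`; `top_level_fields` is a hypothetical helper that only mirrors the documented nested-field behaviour on the client side and is not part of the library.

```python
def ids_and_names(resource):
    """Restrict results to two fields; `only` takes one list argument."""
    return resource.only(["molecule_chembl_id", "pref_name"])


def top_level_fields(fields):
    """Collapse nested names to the top-level field that `only` would return."""
    collapsed = []
    for name in fields:
        top = name.split("__")[0]
        if top not in collapsed:
            collapsed.append(top)
    return collapsed


# Usage (needs network access and the installed client):
#   from chembl_webresource_client.new_client import new_client
#   slim = ids_and_names(new_client.molecule)
```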

Settings

In order to use settings you need to import them before using the client:

from chembl_webresource_client.settings import Settings

The Settings object is a singleton that exposes an Instance method, for example:

Settings.Instance().TIMEOUT = 10

Most important options:

CACHING: should results be cached locally (default is True)
CACHE_EXPIRE: cache expiry time in seconds (default 24 hours)
CACHE_NAME: name of the .sqlite file with cache
TOTAL_RETRIES: total number of retries per HTTP request (default is 3)
CONCURRENT_SIZE: total number of concurrent requests (default is 50)
FAST_SAVE: speeds up cache saving up to 50 times, but with the possibility of data loss (default is True)
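Several options can be set in one place before the first query. This is a sketch only: `apply_settings` is a hypothetical helper, and it assumes the object passed in is the singleton returned by `Settings.Instance()` with the attribute names listed above.

```python
def apply_settings(settings, timeout=10, caching=True, retries=3):
    """Set a few of the documented options on the Settings singleton."""
    settings.TIMEOUT = timeout          # seconds, as in the example above
    settings.CACHING = caching          # False disables the local cache
    settings.TOTAL_RETRIES = retries    # retries per HTTP request
    return settings


# Usage (needs the installed client):
#   from chembl_webresource_client.settings import Settings
#   apply_settings(Settings.Instance(), timeout=30, caching=False)
```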

Citing

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4489243/

chembl_webresource_client's Issues

sdf file output error

In [5]: with open('mols_2D.sdf', 'w') as output:
   ...:     for mol in mols:
   ...:         output.write(mol)
   ...:         output.write('$$$$\n')
   ...:

TypeError                                 Traceback (most recent call last)
in ()
      1 with open('mols_2D.sdf', 'w') as output:
      2     for mol in mols:
----> 3         output.write(mol)
      4         output.write('$$$$\n')
      5

TypeError: expected a string or other character buffer object
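The error says that at least one `mol` is not a string. One possible cause, which the report does not confirm, is that some entries are `None` or bytes. A defensive sketch for Python 3, with `write_sdf` as a hypothetical helper:

```python
def write_sdf(mols, path):
    """Write molblocks to an SD file, skipping entries that are not text.

    Returns the number of records written.
    """
    written = 0
    with open(path, "w") as output:
        for mol in mols:
            if isinstance(mol, bytes):
                mol = mol.decode("utf-8")
            if not isinstance(mol, str):
                continue  # e.g. None for a lookup that returned nothing
            output.write(mol.rstrip("\n") + "\n")
            output.write("$$$$\n")
            written += 1
    return written
```

Comparing the returned count with `len(mols)` shows how many entries were skipped.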

pip install fails (python 2.7.12)

Collecting unittest2py3k (from unittest2six->chembl_webresource_client)
  Using cached https://files.pythonhosted.org/packages/4e/3d/d44421e8d828af1399c1509c196db92e2a58f3764b01a0ee928d7025d1ca/unittest2py3k-0.5.1.tar.gz
    Complete output from command python setup.py egg_info:
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "/tmp/pip-install-NSvIsU/unittest2py3k/setup.py", line 12, in <module>
        from unittest2 import __version__ as VERSION
      File "unittest2/__init__.py", line 61, in <module>
        from .case import (TestCase, FunctionTestCase, SkipTest, skip, skipIf,
      File "unittest2/case.py", line 539
        def assertAlmostEqual(self, first, second, *, places=None, msg=None,
                                                    ^
    SyntaxError: invalid syntax
    
    ----------------------------------------
Command "python setup.py egg_info" failed with error code 1 in /tmp/pip-install-NSvIsU/unittest2py3k/

link github from pypi

Hi there,
Is there any documentation available except the blog post?
Would be nice to have a link from pypi to the github repo.
Cheers

Example usage?

Hi,

Are there examples of the usage of this python client?

E.g. query: approved drugs for lung cancer

cannot access TargetResource

I try to access TargetResource via chembl_webresource_client but some error occurs.

Clinical trial phase

I'm having trouble figuring out how one would get the clinical trial phase of a compound via chembl_webresource_client. Is this possible?
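Molecule records returned by the client carry a `max_phase` field (it appears in the `molecule.search('viagra')` output further down this page). A small sketch that collects it per compound; `max_phases` is a hypothetical helper and works on any iterable of molecule dictionaries.

```python
def max_phases(records):
    """Map molecule_chembl_id -> max_phase for an iterable of molecule records."""
    return {rec["molecule_chembl_id"]: rec.get("max_phase") for rec in records}


# Usage (needs network access and the installed client):
#   from chembl_webresource_client.new_client import new_client
#   print(max_phases(new_client.molecule.search('viagra')))
```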

Compatibility with Python 2.7.7+ and python 3.+

The client library is meant to be used by all Python users. There is no guarantee that they will use any specific version of Python or downgrade/upgrade just to be able to use the client library. Since Python 3 is becoming more and more popular, it's very important to support this version and add automated tests.

There is a well known issue with compatibility of the old client with python 2.7.7+ due to the problem with dependent libraries. Because new_client is not using them anymore it would be worth fixing this issue as well.

How to extract all the activities measured for a protein

I was trying to download all the measured activities for this protein (CHEMBL5455). I can download the csv file from the following link:
https://www.ebi.ac.uk/chembl/g/#browse/activities/filter/target_chembl_id%3ACHEMBL5455

Is there a way to automate this in Python?

Thanks,
Xiaokang
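One way to script this is to filter the activity endpoint by `target_chembl_id` (the same filter used in other reports on this page) and write the records to CSV. A sketch, assuming `activity_resource` is `new_client.activity`; the helper name and the choice of columns are hypothetical.

```python
import csv


def activities_to_csv(activity_resource, target_id, path, fields):
    """Write selected fields of all activities for one target to a CSV file."""
    records = activity_resource.filter(target_chembl_id=target_id)
    count = 0
    with open(path, "w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=fields)
        writer.writeheader()
        for rec in records:
            # Keep only the requested columns; missing keys become empty cells.
            writer.writerow({name: rec.get(name) for name in fields})
            count += 1
    return count


# Usage (needs network access and the installed client):
#   from chembl_webresource_client.new_client import new_client
#   activities_to_csv(new_client.activity, 'CHEMBL5455', 'CHEMBL5455.csv',
#                     ['parent_molecule_chembl_id', 'canonical_smiles', 'standard_value'])
```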

Conda install on python 3.6 fails (pip works though)

When I run the conda install line in a python 3.6 conda environment, I get the following error:

Fetching package metadata .............
Solving package specifications: .

UnsatisfiableError: The following specifications were found to be in conflict:
  - chembl_webresource_client -> easydict -> python >=2.7,<2.8.0a0
  - python 3.6*
Use "conda info <package>" to see the dependencies for each package.

Looks like the requirements are set to only include python 2. Since this library supports python 3, it's probably best to change that requirement.

How to access the experimental data from single published article by DOI or PMID?

Dear developer,
I cannot find a tutorial on the topic of searching compounds and corresponding bioactivities extracted from specified published articles (which could be queried/located by DOIs).
If chembl_webresource_client has this function, would you please demonstrate the proper python scripts?
Thank you!
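A possible two-step approach is to look up the document first and then fetch the activities attached to it. The sketch below rests on assumptions not stated on this page: that `new_client.document` can be filtered by `doi`, that its records carry `document_chembl_id`, and that `new_client.activity` accepts a `document_chembl_id` filter. The helper name is hypothetical.

```python
def activities_for_doi(document_resource, activity_resource, doi):
    """Collect activity records for every document matching the given DOI."""
    results = []
    for doc in document_resource.filter(doi=doi):
        doc_id = doc["document_chembl_id"]  # assumed field name
        results.extend(activity_resource.filter(document_chembl_id=doc_id))
    return results


# Usage (needs network access and the installed client):
#   from chembl_webresource_client.new_client import new_client
#   acts = activities_for_doi(new_client.document, new_client.activity, '<your DOI>')
```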

Extract ChEMBL version associated per compound

To whom it may concern,

is there a way to extract something like a publication/experimental date associated with a ChEMBL compound?

For benchmarking studies, we would like to use time-split validation. It would therefore be of tremendous help for us to get an association between a ChEMBL version (or publication/experimental date) and a ChEMBL compound.

Best regards,
Paul
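If a publication year is available per record, a time split can be done client-side. The sketch assumes that activity records expose a `document_year` key; that field name is an assumption here, so the key is a parameter. `time_split` is a hypothetical helper.

```python
def time_split(records, cutoff_year, year_key="document_year"):
    """Split records into (before cutoff, cutoff or later, no year available)."""
    train, test, undated = [], [], []
    for rec in records:
        year = rec.get(year_key)
        if year is None:
            undated.append(rec)
        elif int(year) < cutoff_year:
            train.append(rec)
        else:
            test.append(rec)
    return train, test, undated
```

Records without a year are returned separately so that they are not silently assigned to either side of the split.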

Checking service status returns TypeError

Doing:

from chembl_webresource_client import *
targets = TargetResource()
targets.status()

Returns:

TypeError                                 Traceback (most recent call last)
<ipython-input-8-fdc433ae91b9> in <module>()
----> 1 targets.status()

/usr/lib/python2.7/site-packages/chembl_webresource_client/web_resource.pyc in status(self)
     89     def status(self):
     90         service = self.get_service()
---> 91         if not 'status' in service:
     92             return False
     93         return service['status'] == 'UP'

TypeError: argument of type 'NoneType' is not iterable

If I check https://www.ebi.ac.uk/chemblws/status/ I get "UP", so ChEMBL seems to be working.
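The traceback shows that `get_service()` returned `None`, so the membership test `'status' in service` raised. A defensive variant of the check could look like the sketch below; it is an illustration and not the library's actual code, and `service_is_up` is a hypothetical name.

```python
def service_is_up(service):
    """Return True only when the service payload reports status 'UP'."""
    if not isinstance(service, dict):
        # get_service() gave None (or something unexpected): treat as down
        return False
    return service.get("status") == "UP"
```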

RecursionError: maximum recursion depth exceeded

Trying the first line of your "quick start" guide results in an error:

>>> from chembl_webresource_client.new_client import new_client
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.6/site-packages/chembl_webresource_client/new_client.py", line 70, in <module>
    new_client = client_from_url(Settings.Instance().NEW_CLIENT_URL + '/spore')
  File "/usr/lib/python3.6/site-packages/chembl_webresource_client/new_client.py", line 30, in client_from_url
    res = requests.get(url)
  File "/usr/lib/python3.6/site-packages/requests/api.py", line 65, in get
    return request('get', url, **kwargs)
  File "/usr/lib/python3.6/site-packages/requests/api.py", line 49, in request
    response = session.request(method=method, url=url, **kwargs)
  File "/usr/lib/python3.6/site-packages/requests/sessions.py", line 461, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/lib/python3.6/site-packages/requests/sessions.py", line 573, in send
    r = adapter.send(request, **kwargs)
  File "/usr/lib/python3.6/site-packages/requests/adapters.py", line 370, in send
    timeout=timeout
  File "/usr/lib/python3.6/site-packages/requests/packages/urllib3/connectionpool.py", line 544, in urlopen
    body=body, headers=headers)
  File "/usr/lib/python3.6/site-packages/requests/packages/urllib3/connectionpool.py", line 341, in _make_request
    self._validate_conn(conn)
  File "/usr/lib/python3.6/site-packages/requests/packages/urllib3/connectionpool.py", line 762, in _validate_conn
    conn.connect()
  File "/usr/lib/python3.6/site-packages/requests/packages/urllib3/connection.py", line 238, in connect
    ssl_version=resolved_ssl_version)
  File "/usr/lib/python3.6/site-packages/requests/packages/urllib3/util/ssl_.py", line 240, in ssl_wrap_socket
    ciphers=ciphers)
  File "/usr/lib/python3.6/site-packages/requests/packages/urllib3/util/ssl_.py", line 208, in create_urllib3_context
    context.options |= options
  File "/usr/lib64/python3.6/ssl.py", line 459, in options
    super(SSLContext, SSLContext).options.__set__(self, value)
  File "/usr/lib64/python3.6/ssl.py", line 459, in options
    super(SSLContext, SSLContext).options.__set__(self, value)
  File "/usr/lib64/python3.6/ssl.py", line 459, in options
    super(SSLContext, SSLContext).options.__set__(self, value)
  [Previous line repeated 320 more times]
RecursionError: maximum recursion depth exceeded

chembl_webresource_client.version()

It would be nice to have a convenient function:

chembl_webresource_client.__version__()
or
chembl_webresource_client.version()

to query the current version of the client.
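Until the package offers such a function, the installed version can be read from the package metadata with the standard library (Python 3.8+). A sketch; `installed_version` is a hypothetical helper and it assumes the distribution is registered under the name used for `pip install`.

```python
from importlib import metadata


def installed_version(package="chembl_webresource_client"):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None
```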

No connection without error message.

Hi,

thank you for this client, it's really helpful software!
The following comes up when I initiate the client from an activated anaconda environment. The connection works when the environment is deactivated. It seems that certain dependencies are not fulfilled, but the error message does not say what exactly goes wrong. Are you aware of specific Python packages (or environment conditions in general) needed to run the ChEMBL client?

Here is what happens:

import chembl_webresource_client as cwc
assays = cwc.AssayResource()
assays.status()

TypeError                                 Traceback (most recent call last)
in ()
      1 import chembl_webresource_client as cwc
      2 assays = cwc.AssayResource()
----> 3 targets.status()

/home/swacker/Programme/anaconda/lib/python2.7/site-packages/chembl_webresource_client/web_resource.pyc in status(self)
     95     def status(self):
     96         service = self.get_service()
---> 97         if not 'status' in service:
     98             return False
     99         return service['status'] == 'UP'

TypeError: argument of type 'NoneType' is not iterable

Querying molecules by common name depends on capitalization.

from chembl_webresource_client.new_client import new_client as cwc
cwc.molecule.search('viagra')
[{'atc_classifications': [], 'availability_type': '1', 'biotherapeutic': None, 'black_box_warning': '1', 'chebi_par_id': 58987, 'chirality': '2', 'cross_references': [{'xref_id': 'sildenafil%20citrate', 'xref_name': 'sildenafil citrate', 'xref_src': 'DailyMed'}, {'xref_id': '144205270', 'xref_name': 'SID: 144205270', 'xref_src': 'PubChem'}, {'xref_id': '170465285', 'xref_name': 'SID: 170465285', 'xref_src': 'PubChem'}, {'xref_id': 'Sildenafil', 'xref_name': None, 'xref_src': 'Wikipedia'}], 'dosed_ingredient': True, 'first_approval': 1998, 'first_in_class': '0', 'helm_notation': None, 'indication_class': 'Impotence Therapy', 'inorganic_flag': '0', 'max_phase': 4, 'molecule_chembl_id': 'CHEMBL1737', 'molecule_hierarchy': {'molecule_chembl_id': 'CHEMBL1737', 'parent_chembl_id': 'CHEMBL192'}, 'molecule_properties': {'acd_logd': '2.45', 'acd_logp': '2.47', 'acd_most_apka': '10.05', 'acd_most_bpka': '6.03', 'alogp': '1.61', 'aromatic_rings': 3, 'full_molformula': 'C28H38N6O11S', 'full_mwt': '666.71', 'hba': 8, 'hba_lipinski': 10, 'hbd': 1, 'hbd_lipinski': 1, 'heavy_atoms': 33, 'molecular_species': 'NEUTRAL', 'mw_freebase': '474.59', 'mw_monoisotopic': '474.2049', 'num_lipinski_ro5_violations': 0, 'num_ro5_violations': 0, 'psa': '113.42', 'qed_weighted': '0.55', 'ro3_pass': 'N', 'rtb': 7}, 'molecule_structures': {'canonical_smiles': 'CCCc1nn(C)c2C(=O)NC(=Nc12)c3cc(ccc3OCC)S(=O)(=O)N4CCN(C)CC4.OC(=O)CC(O)(CC(=O)O)C(=O)O', 'standard_inchi': 'InChI=1S/C22H30N6O4S.C6H8O7/c1-5-7-17-19-20(27(4)25-17)22(29)24-21(23-19)16-14-15(8-9-18(16)32-6-2)33(30,31)28-12-10-26(3)11-13-28;7-3(8)1-6(13,5(11)12)2-4(9)10/h8-9,14H,5-7,10-13H2,1-4H3,(H,23,24,29);13H,1-2H2,(H,7,8)(H,9,10)(H,11,12)', 'standard_inchi_key': 'DEIYFTQMQPDXOT-UHFFFAOYSA-N'}, 'molecule_synonyms': [{'molecule_synonym': 'Revatio', 'syn_type': 'TRADE_NAME', 'synonyms': 'REVATIO'}, {'molecule_synonym': 'Sildenafil Citrate', 'syn_type': 'FDA', 'synonyms': 'SILDENAFIL CITRATE'}, {'molecule_synonym': 'Sildenafil Citrate', 
'syn_type': 'OTHER', 'synonyms': 'SILDENAFIL CITRATE'}, {'molecule_synonym': 'Sildenafil Citrate', 'syn_type': 'TRADE_NAME', 'synonyms': 'SILDENAFIL CITRATE'}, {'molecule_synonym': 'Sildenafil Citrate', 'syn_type': 'USAN', 'synonyms': 'SILDENAFIL CITRATE'}, {'molecule_synonym': 'UK-9248010', 'syn_type': 'RESEARCH_CODE', 'synonyms': 'UK-92,480-10'}, {'molecule_synonym': 'UK-92480-10', 'syn_type': 'RESEARCH_CODE', 'synonyms': 'UK-92480-10'}, {'molecule_synonym': 'Viagra', 'syn_type': 'TRADE_NAME', 'synonyms': 'VIAGRA'}], 'molecule_type': 'Small molecule', 'natural_product': '0', 'oral': True, 'parenteral': True, 'polymer_flag': False, 'pref_name': 'SILDENAFIL CITRATE', 'prodrug': '0', 'score': 1.8465444, 'structure_type': 'MOL', 'therapeutic_flag': True, 'topical': False, 'usan_stem': '-afil', 'usan_stem_definition': 'PDE5 inhibitors', 'usan_substem': '-afil', 'usan_year': 1997, 'withdrawn_class': None, 'withdrawn_country': None, 'withdrawn_flag': False, 'withdrawn_reason': None, 'withdrawn_year': None}, {'atc_classifications': ['G04BE03'], 'availability_type': '1', 'biotherapeutic': None, 'black_box_warning': '1', 'chebi_par_id': 9139, 'chirality': '2', 'cross_references': [{'xref_id': 'sildenafil%20citrate', 'xref_name': 'sildenafil citrate', 'xref_src': 'DailyMed'}, {'xref_id': '26748898', 'xref_name': 'SID: 26748898', 'xref_src': 'PubChem'}, {'xref_id': '50085897', 'xref_name': 'SID: 50085897', 'xref_src': 'PubChem'}], 'dosed_ingredient': True, 'first_approval': 1998, 'first_in_class': '0', 'helm_notation': None, 'indication_class': 'Impotence Therapy', 'inorganic_flag': '0', 'max_phase': 4, 'molecule_chembl_id': 'CHEMBL192', 'molecule_hierarchy': {'molecule_chembl_id': 'CHEMBL192', 'parent_chembl_id': 'CHEMBL192'}, 'molecule_properties': {'acd_logd': '2.45', 'acd_logp': '2.47', 'acd_most_apka': '10.05', 'acd_most_bpka': '6.03', 'alogp': '1.61', 'aromatic_rings': 3, 'full_molformula': 'C22H30N6O4S', 'full_mwt': '474.59', 'hba': 8, 'hba_lipinski': 10, 'hbd': 1, 
'hbd_lipinski': 1, 'heavy_atoms': 33, 'molecular_species': 'NEUTRAL', 'mw_freebase': '474.59', 'mw_monoisotopic': '474.2049', 'num_lipinski_ro5_violations': 0, 'num_ro5_violations': 0, 'psa': '113.42', 'qed_weighted': '0.55', 'ro3_pass': 'N', 'rtb': 7}, 'molecule_structures': {'canonical_smiles': 'CCCc1nn(C)c2C(=O)NC(=Nc12)c3cc(ccc3OCC)S(=O)(=O)N4CCN(C)CC4', 'standard_inchi': 'InChI=1S/C22H30N6O4S/c1-5-7-17-19-20(27(4)25-17)22(29)24-21(23-19)16-14-15(8-9-18(16)32-6-2)33(30,31)28-12-10-26(3)11-13-28/h8-9,14H,5-7,10-13H2,1-4H3,(H,23,24,29)', 'standard_inchi_key': 'BNRNXUUZRGQAQC-UHFFFAOYSA-N'}, 'molecule_synonyms': [{'molecule_synonym': 'Nipatra', 'syn_type': 'TRADE_NAME', 'synonyms': 'NIPATRA'}, {'molecule_synonym': 'Revatio', 'syn_type': 'TRADE_NAME', 'synonyms': 'REVATIO'}, {'molecule_synonym': 'Sildenafil', 'syn_type': 'ATC', 'synonyms': 'SILDENAFIL'}, {'molecule_synonym': 'Sildenafil', 'syn_type': 'BAN', 'synonyms': 'SILDENAFIL'}, {'molecule_synonym': 'Sildenafil', 'syn_type': 'BNF', 'synonyms': 'SILDENAFIL'}, {'molecule_synonym': 'Sildenafil', 'syn_type': 'INN', 'synonyms': 'SILDENAFIL'}, {'molecule_synonym': 'Sildenafil', 'syn_type': 'FDA', 'synonyms': 'Sildenafil'}, {'molecule_synonym': 'UK-92480', 'syn_type': 'RESEARCH_CODE', 'synonyms': 'UK-92480'}, {'molecule_synonym': 'Viagra', 'syn_type': 'TRADE_NAME', 'synonyms': 'VIAGRA'}, {'molecule_synonym': 'Vizarsin', 'syn_type': 'TRADE_NAME', 'synonyms': 'VIZARSIN'}], 'molecule_type': 'Small molecule', 'natural_product': '0', 'oral': True, 'parenteral': True, 'polymer_flag': False, 'pref_name': 'SILDENAFIL', 'prodrug': '0', 'score': 0.25654137, 'structure_type': 'MOL', 'therapeutic_flag': True, 'topical': False, 'usan_stem': '-afil', 'usan_stem_definition': 'PDE5 inhibitors', 'usan_substem': '-afil', 'usan_year': 1997, 'withdrawn_class': None, 'withdrawn_country': None, 'withdrawn_flag': False, 'withdrawn_reason': None, 'withdrawn_year': None}]

while

cwc.molecule.search('Viagra')
[]

The query should not depend on capitalization, I think.
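Two possible workarounds, sketched with hypothetical helper names. The first relies only on the observation above that the lowercase query returns results. The second uses the `iexact` lookup on the synonym field visible in the output above, and assumes the endpoint accepts lookups on that nested field.

```python
def search_lowercase(molecule_resource, name):
    """Normalise the query to lowercase before a full-text search."""
    return molecule_resource.search(name.lower())


def find_by_synonym(molecule_resource, name):
    """Case-insensitive exact match on a molecule synonym."""
    return molecule_resource.filter(
        molecule_synonyms__molecule_synonym__iexact=name
    )


# Usage (needs network access and the installed client):
#   from chembl_webresource_client.new_client import new_client
#   hits = search_lowercase(new_client.molecule, 'Viagra')
```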

Old chembl_webresource_client.sqlite incompatible with new version

It seems that an old chembl_webresource_client.sqlite can be incompatible with the new version. I updated chembl_webresource_client from 07. to current (0.8.4) and experienced some weird behaviour. When I deleted the sqlite file everything started working normally.
It might be useful to show a warning notifying the user that an old sqlite db might be incompatible.

Improving time turnaround for activity requests

Hello

Is there a way to improve the time it takes to retrieve activities against a target? The relevant code I'm using:

for x in chembl_ids:
        if x not in existing_files:
                print("Checking CHEMBL id {}".format(x))
                res = activity.filter(target_chembl_id='{}'.format(x), relation='=', assay_type='B', standard_type="IC50")

                for sub in res:
                        if sub['parent_molecule_chembl_id'] and sub['canonical_smiles'] and sub['standard_value'] is not None:
                                ids.append(str(sub['parent_molecule_chembl_id']))
                                smiles.append(str(sub['canonical_smiles']))
                                ic50.append(float(sub['standard_value']))

Currently, it takes ~2 minutes and 45 seconds per ID/request--and given that I want to pull ~5000 targets, I'm hoping to improve that time :) Is there a SETTINGS that decreases ping time?
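Since only three fields are read from each record, one thing worth trying is to restrict the response with `only` and to drop records without a value on the server side. Whether this brings the time down enough is not established here; the sketch assumes `activity_resource` is `new_client.activity`, and `ic50_records` is a hypothetical helper.

```python
FIELDS = ["parent_molecule_chembl_id", "canonical_smiles", "standard_value"]


def ic50_records(activity_resource, target_id):
    """Binding IC50 activities for one target, limited to the fields used."""
    query = activity_resource.filter(
        target_chembl_id=target_id,
        relation="=",
        assay_type="B",
        standard_type="IC50",
        standard_value__isnull=False,  # skip records without a value
    )
    return query.only(FIELDS)
```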

Number of compounds from chembl web client is not equal to chembl website

I wonder why the number of compounds from the chembl web client is less than that shown on the chembl website?

Limit in search similarity

Hello, is it possible to put a limit on a similarity search, using this example:

from chembl_webresource_client.new_client import new_client
similarity = new_client.similarity
res = similarity.filter(smiles="COC@@HC(=O)", similarity=70)

The idea is to stop the search at 5 compounds and gain time.
Thanks,
Colin
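Because results are evaluated lazily, one option is to stop iterating after the first few items. This sketch uses only the standard library and works on any iterable; `first_n` is a hypothetical helper. The client may still fetch a full page of results behind the scenes, so the saving is not guaranteed.

```python
from itertools import islice


def first_n(results, n=5):
    """Return at most n items without consuming the rest of the iterable."""
    return list(islice(results, n))


# Usage, with `res` being the similarity query from the question:
#   top5 = first_n(res, 5)
```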

chembl_molecule_id to "mechanism of action" ?

Hi,

What's the best way to get the linkage between a chemical and its "mechanism of action"?

For example, under the Compounds report card for clozapine (https://www.ebi.ac.uk/chembl/compound/inspect/CHEMBL42) there is a "Mechanism of Action" heading that contains: Dopamine D2 receptor antagonist and Serotonin 2a (5-HT2a) receptor antagonist.

Cheers,

Imran
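A possible approach is to query a mechanism endpoint by molecule ID. The sketch assumes, without confirmation from this page, that the client exposes `new_client.mechanism` and that its records carry `mechanism_of_action` and `target_chembl_id`. The helper name is hypothetical.

```python
def mechanisms_for(mechanism_resource, molecule_id):
    """Return (mechanism_of_action, target_chembl_id) pairs for one molecule."""
    records = mechanism_resource.filter(molecule_chembl_id=molecule_id)
    return [
        (rec.get("mechanism_of_action"), rec.get("target_chembl_id"))
        for rec in records
    ]


# Usage (needs network access and the installed client):
#   from chembl_webresource_client.new_client import new_client
#   print(mechanisms_for(new_client.mechanism, 'CHEMBL42'))
```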

Failed to generate MACCS fingerprints using utils.sdf2fps()

Hi,
Thanks for this nice tool.
I tried the following things to generate MACCS-166 fingerprints for some of the compounds of my interest.

from chembl_webresource_client.new_client import new_client
from chembl_webresource_client.utils import utils

molecule = new_client.molecule
molecule.set_format('sdf')

cpd_list = ["CHEMBL268177","CHEMBL268439"]


morgan = []
for i in cpd_list:
    cmpnd = molecule.get(i)
    fps = utils.sdf2fps(cmpnd) #by default it is 'morgan'
    morgan.append(fps)

print(morgan[0])

"""
output
#FPS1
#num_bits=2048
#software=RDKit/2017.03.3
00000000000000000000000000000000000000000000000000000000000000000000000000000080000000400000000000000000000000000000000000000000000000000000000040000000000000000004200000000000000800000000000000001000000000000200000000080000004000002000080000000000000000000000000000000000000000000000000000000800002000000000000000000004000000000000000000000000100000000000000000000200100000000000000000000000100000008000000000000000000000000000000000004000000000000000000000000000000002000000000000000000008000000000000000000000    CHEMBL268177
"""
# but if i change it
maccs = []
for i in cpd_list:
    cmpnd = molecule.get(i)
    fps = utils.sdf2fps(cmpnd, 'maccs') #How to define that I want MACCS-166?
    maccs.append(fps)

print(maccs[0])

"""
output for this
#FPS1
#num_bits=2048
#software=RDKit/2017.03.3
"""

Even curl failed to return the result I wanted:

!curl -X POST -F "file=cmpnd" -F "type=maccs" https://www.ebi.ac.uk/chembl/api/utils/sdf2fps
#FPS1
#num_bits=2048
#software=RDKit/2017.03.3

Is it a bug, or did I misunderstand something?

No module named packages.urllib3.util

I installed the module for accessing ChEMBL data by following this website: http://chembl.blogspot.co.uk/2014/05/a-python-client-for-accessing-chembl.html. But when I try to run the example from that page to retrieve drug indications, I get the same error.

I've got the urllib3 library installed and if I import 'chembl_webresource_client' module it works fine. But with the new client it doesn't work.

The error says:
Traceback (most recent call last):
  File "test_ChemBLindication.py", line 1, in <module>
    from chembl_webresource_client.new_client import new_client
  File "/usr/local/lib/python2.7/dist-packages/chembl_webresource_client/new_client.py", line 10, in <module>
    from chembl_webresource_client.query_set import QuerySet
  File "/usr/local/lib/python2.7/dist-packages/gevent/builtins.py", line 93, in __import__
    result = _import(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/chembl_webresource_client/query_set.py", line 8, in <module>
    from chembl_webresource_client.url_query import UrlQuery
  File "/usr/local/lib/python2.7/dist-packages/gevent/builtins.py", line 93, in __import__
    result = _import(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/chembl_webresource_client/url_query.py", line 21, in <module>
    from requests.packages.urllib3.util import Retry
  File "/usr/local/lib/python2.7/dist-packages/gevent/builtins.py", line 93, in __import__
    result = _import(*args, **kwargs)
ImportError: No module named packages.urllib3.util

webresource client: TypeError: 'NoneType' object has no attribute '__getitem__'

I just installed the web client into a new Conda env and ran this piece of code (taken from https://github.com/chembl/chembl_webresource_client):

from chembl_webresource_client.new_client import new_client
target = new_client.target
activity = new_client.activity
herg = target.search('herg')[0]
herg_activities = activity.filter(target_chembl_id=herg['target_chembl_id']).filter(standard_type="Ki")

Based on the issue referenced below:
pip install --force-reinstall gevent==1.2.2
pip install --force-reinstall greenlet==0.4.12

I'm ending up with this error message:
TypeError: 'NoneType' object has no attribute '__getitem__'

Any clue what is going wrong?

Hi everyone, I have just been checking this in Python 2.7 and the main issue is the updates to the gevent and greenlet libraries, so I would suggest executing the following pip commands:

pip install --force-reinstall gevent==1.2.2
pip install --force-reinstall greenlet==0.4.12

Originally posted by @juanfmx2 in #38 (comment)

fail to load molecule by name from molecule.search()

Hi,
In January 2018, when I ran the following code I got some results.

from chembl_webresource_client.new_client import new_client
molecule = new_client.molecule
res = molecule.search('aspirin')

Now the res list is empty!

Do you have any idea why?

PS: For viagra there are two results ;)

Thanks

KeyError: '.json' When Running `molecule.get('CHEMBL25')`

First, great work here, I love the extensiveness of this API, its breadth is incredible.

I was just having a minor issue (which might just be me overlooking the obvious), but when I run one of the example queries:

from chembl_webresource_client.new_client import new_client
molecule = new_client.molecule
m1 = molecule.get('CHEMBL25')

I get the following stack on the line m1 = molecule.get('CHEMBL25'):

  File "C:\Users\User Name\Anaconda3\lib\site-packages\chembl_webresource_client\query_set.py", line 177, in get
    return self.query.get(*args, **kwargs)
  File "C:\Users\User Name\Anaconda3\lib\site-packages\chembl_webresource_client\url_query.py", line 267, in get
    return self._get_by_ids(args[0])
  File "C:\Users\User Name\Anaconda3\lib\site-packages\chembl_webresource_client\url_query.py", line 292, in _get_by_ids
    headers = {'Accept': mimetypes.types_map['.'+self.frmt]}
KeyError: '.json'

I'm not really sure what to try and would appreciate any pointers in the right direction! Thanks!
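The failing line looks up `mimetypes.types_map['.json']`, which suggests that the MIME table on this machine has no entry for `.json`. A possible workaround, not confirmed by the maintainers on this page, is to register the type with the standard library before calling the client. `ensure_json_mimetype` is a hypothetical helper.

```python
import mimetypes


def ensure_json_mimetype():
    """Register '.json' in the global MIME table and return the mapped type."""
    mimetypes.add_type("application/json", ".json")
    return mimetypes.types_map[".json"]


# Usage: call once at the top of the script, before molecule.get('CHEMBL25')
#   ensure_json_mimetype()
```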

Downloading bioactivities for cell lines

I'd like to download all the GI50/EC50 data for a particular cell line (e.g. CHEMBL3308372), but using the standard method doesn't seem to work:

activity = new_client.activity
for x in chembl_ids:
    res = activity.filter(target_chembl_id='{}'.format(x), assay_type='F', standard_type="EC50", standard_value__isnull=False)

This res seems to return a dataframe of cell lines, and not the bioactivities against that cell line. Is there a better way to approach this?

Warnings generated with Python 3.6

There seem to be some issues with Monkey Patching in gevent

/Users/pwalters/anaconda/envs/rdkit_2018_03/lib/python3.6/site-packages/grequests.py:22: MonkeyPatchWarning: Monkey-patching ssl after ssl has already been imported may lead to errors, including RecursionError on Python 3.6. It may also silently lead to incorrect behaviour on Python 3.7. Please monkey-patch earlier. See gevent/gevent#1016. Modules that had direct imports (NOT patched): ['urllib3.contrib.pyopenssl (/Users/pwalters/anaconda/envs/rdkit_2018_03/lib/python3.6/site-packages/urllib3/contrib/pyopenssl.py)', 'urllib3.util (/Users/pwalters/anaconda/envs/rdkit_2018_03/lib/python3.6/site-packages/urllib3/util/__init__.py)'].
curious_george.patch_all(thread=False, select=False)

Log all failures.

By design the client fails silently when any network problem is encountered. This can be helpful for less experienced users but it makes it harder to debug any serious issues, as we've learned in #1. This is why it's so important to add some logging when an error is detected. The standard logging library should be enough, as it is very flexible and well known.
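A minimal sketch of what such logging could look like from the caller's side, using only the standard `logging` module. The wrapper and the logger name are hypothetical and say nothing about how the client itself would implement it.

```python
import logging

logger = logging.getLogger("chembl_client_example")  # hypothetical logger name


def fetch_logged(fetch, *args, **kwargs):
    """Call `fetch`; on failure log the full traceback and return None."""
    try:
        return fetch(*args, **kwargs)
    except Exception:
        # logger.exception records the message at ERROR level with traceback
        logger.exception("request failed: %r args=%r", fetch, args)
        return None
```

The silent-failure behaviour is preserved (the caller still gets `None`), but the cause is now recorded.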

Queries take a long time.

I freshly installed the package in a python 3.7 conda environment together with rdkit. The first query 'viagra' was quite fast. But looking for antibiotics 'Amoxicillin' or 'amoxicillin' takes several minutes and returns something questionable.

502 Proxy Error

Hello,

I am working on extracting activity data for a list of targets. I keep getting errors for many of the targets. Here are two examples of the errors I get for two of the targets I am interested in:

Thank you for your help!

Error1: (target_chembl_id="CHEMBL1075094")

bash-4.1$ python
Python 3.6.1 |Anaconda custom (64-bit)| (default, May 11 2017, 13:09:58)
[GCC 4.4.7 20120313 (Red Hat 4.4.7-1)] on linux
Type "help", "copyright", "credits" or "license" for more information.

from chembl_webresource_client.new_client import new_client
available_resources = [resource for resource in dir(new_client) if not resource.startswith('_')]
activities = new_client.activity
act = activities.filter(target_chembl_id="CHEMBL1075094")
len(act)
96445
act[10980] #the number where it errors out changes everytime I rerun it

Traceback (most recent call last):
  File "uniprot_to_chemblassays.py", line 35, in <module>
    act[i]['molecule_chembl_id'],
  File "/home/zgaieb/anaconda3/lib/python3.6/site-packages/chembl_webresource_client/query_set.py", line 171, in __getitem__
    return self.query[k]
  File "/home/zgaieb/anaconda3/lib/python3.6/site-packages/chembl_webresource_client/url_query.py", line 188, in __getitem__
    page = self.get_page()
  File "/home/zgaieb/anaconda3/lib/python3.6/site-packages/chembl_webresource_client/url_query.py", line 395, in get_page
    handle_http_error(res)
  File "/home/zgaieb/anaconda3/lib/python3.6/site-packages/chembl_webresource_client/http_errors.py", line 113, in handle_http_error
    raise exception_class(request.url, request.text)
chembl_webresource_client.http_errors.HttpBadGateway: Error for url https://www.ebi.ac.uk/chembl/api/data/activity.json, server response:

<title>502 Proxy Error</title>

Proxy Error

The proxy server received an invalid response from an upstream server.
The proxy server could not handle the request POST /chembl/api/data/activity.json.

Reason: Error reading from remote server

Apache/2.2.15 (Red Hat) Server at www.ebi.ac.uk Port 80

Error 2: (target_chembl_id="CHEMBL2640")

from chembl_webresource_client.new_client import new_client
available_resources = [resource for resource in dir(new_client) if not resource.startswith('_')]
activities = new_client.activity
#activities.set_format('json')
...
act = activities.filter(target_chembl_id="CHEMBL2640")
len(act)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/zgaieb/anaconda3/lib/python3.6/site-packages/chembl_webresource_client/query_set.py", line 98, in __len__
    return len(self.query)
  File "/home/zgaieb/anaconda3/lib/python3.6/site-packages/chembl_webresource_client/url_query.py", line 152, in __len__
    self.get_page()
  File "/home/zgaieb/anaconda3/lib/python3.6/site-packages/chembl_webresource_client/url_query.py", line 395, in get_page
    handle_http_error(res)
  File "/home/zgaieb/anaconda3/lib/python3.6/site-packages/chembl_webresource_client/http_errors.py", line 113, in handle_http_error
    raise exception_class(request.url, request.text)
chembl_webresource_client.http_errors.HttpApplicationError: Error for url https://www.ebi.ac.uk/chembl/api/data/activity.json, server response:

The service you have requested is currently offline for essential maintenance. Please try again later.

[Errno 111] Connection refused

Hi,
I'm trying out the chembl_webresource_client for the first time, but I keep getting the same error every time I import new_client.
Can you tell from the error what is causing the problem?

---------------------------------------------------------------------------
ConnectionRefusedError                    Traceback (most recent call last)
/home/subjectAM/anaconda3/lib/python3.6/site-packages/urllib3/connection.py in _new_conn(self)
    140             conn = connection.create_connection(
--> 141                 (self.host, self.port), self.timeout, **extra_kw)
    142 

/home/subjectAM/anaconda3/lib/python3.6/site-packages/urllib3/util/connection.py in create_connection(address, timeout, source_address, socket_options)
     82     if err is not None:
---> 83         raise err
     84 

/home/subjectAM/anaconda3/lib/python3.6/site-packages/urllib3/util/connection.py in create_connection(address, timeout, source_address, socket_options)
     72                 sock.bind(source_address)
---> 73             sock.connect(sa)
     74             return sock

/home/subjectAM/anaconda3/lib/python3.6/site-packages/gevent/_socket3.py in connect(self, address)
    306                 else:
--> 307                     raise error(result, strerror(result))
    308         finally:

ConnectionRefusedError: [Errno 111] Connection refused

During handling of the above exception, another exception occurred:

NewConnectionError                        Traceback (most recent call last)
/home/subjectAM/anaconda3/lib/python3.6/site-packages/urllib3/connectionpool.py in urlopen(self, method, url, body, headers, retries, redirect, assert_same_host, timeout, pool_timeout, release_conn, chunked, body_pos, **response_kw)
    600                                                   body=body, headers=headers,
--> 601                                                   chunked=chunked)
    602 

/home/subjectAM/anaconda3/lib/python3.6/site-packages/urllib3/connectionpool.py in _make_request(self, conn, method, url, timeout, chunked, **httplib_request_kw)
    356         else:
--> 357             conn.request(method, url, **httplib_request_kw)
    358 

/home/subjectAM/anaconda3/lib/python3.6/http/client.py in request(self, method, url, body, headers, encode_chunked)
   1238         """Send a complete request to the server."""
-> 1239         self._send_request(method, url, body, headers, encode_chunked)
   1240 

/home/subjectAM/anaconda3/lib/python3.6/http/client.py in _send_request(self, method, url, body, headers, encode_chunked)
   1284             body = _encode(body, 'body')
-> 1285         self.endheaders(body, encode_chunked=encode_chunked)
   1286 

/home/subjectAM/anaconda3/lib/python3.6/http/client.py in endheaders(self, message_body, encode_chunked)
   1233             raise CannotSendHeader()
-> 1234         self._send_output(message_body, encode_chunked=encode_chunked)
   1235 

/home/subjectAM/anaconda3/lib/python3.6/http/client.py in _send_output(self, message_body, encode_chunked)
   1025         del self._buffer[:]
-> 1026         self.send(msg)
   1027 

/home/subjectAM/anaconda3/lib/python3.6/http/client.py in send(self, data)
    963             if self.auto_open:
--> 964                 self.connect()
    965             else:

/home/subjectAM/anaconda3/lib/python3.6/site-packages/urllib3/connection.py in connect(self)
    165     def connect(self):
--> 166         conn = self._new_conn()
    167         self._prepare_conn(conn)

/home/subjectAM/anaconda3/lib/python3.6/site-packages/urllib3/connection.py in _new_conn(self)
    149             raise NewConnectionError(
--> 150                 self, "Failed to establish a new connection: %s" % e)
    151 

NewConnectionError: <urllib3.connection.HTTPConnection object at 0x7f2dc11e1358>: Failed to establish a new connection: [Errno 111] Connection refused

During handling of the above exception, another exception occurred:

MaxRetryError                             Traceback (most recent call last)
/home/subjectAM/anaconda3/lib/python3.6/site-packages/requests/adapters.py in send(self, request, stream, timeout, verify, cert, proxies)
    439                     retries=self.max_retries,
--> 440                     timeout=timeout
    441                 )

/home/subjectAM/anaconda3/lib/python3.6/site-packages/urllib3/connectionpool.py in urlopen(self, method, url, body, headers, retries, redirect, assert_same_host, timeout, pool_timeout, release_conn, chunked, body_pos, **response_kw)
    638             retries = retries.increment(method, url, error=e, _pool=self,
--> 639                                         _stacktrace=sys.exc_info()[2])
    640             retries.sleep()

/home/subjectAM/anaconda3/lib/python3.6/site-packages/urllib3/util/retry.py in increment(self, method, url, response, error, _pool, _stacktrace)
    387         if new_retry.is_exhausted():
--> 388             raise MaxRetryError(_pool, url, error or ResponseError(cause))
    389 

MaxRetryError: HTTPConnectionPool(host='localhost', port=80): Max retries exceeded with url: /chemblws/spore (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f2dc11e1358>: Failed to establish a new connection: [Errno 111] Connection refused',))

During handling of the above exception, another exception occurred:

ConnectionError                           Traceback (most recent call last)
<ipython-input-13-2c837b71d511> in <module>()
----> 1 from chembl_webresource_client.new_client import new_client

/home/subjectAM/anaconda3/lib/python3.6/site-packages/chembl_webresource_client/new_client.py in <module>()
     68 #-----------------------------------------------------------------------------------------------------------------------
     69 
---> 70 new_client = client_from_url(Settings.Instance().NEW_CLIENT_URL + '/spore')
     71 
     72 #-----------------------------------------------------------------------------------------------------------------------

/home/subjectAM/anaconda3/lib/python3.6/site-packages/chembl_webresource_client/new_client.py in client_from_url(url, base_url)
     28 
     29     """
---> 30     res = requests.get(url)
     31     if not res.ok:
     32         raise Exception('Error getting schema from url {0} with status {1} and msg {2}'.format(url, res.status_code, res.text))

/home/subjectAM/anaconda3/lib/python3.6/site-packages/requests/api.py in get(url, params, **kwargs)
     70 
     71     kwargs.setdefault('allow_redirects', True)
---> 72     return request('get', url, params=params, **kwargs)
     73 
     74 

/home/subjectAM/anaconda3/lib/python3.6/site-packages/requests/api.py in request(method, url, **kwargs)
     56     # cases, and look like a memory leak in others.
     57     with sessions.Session() as session:
---> 58         return session.request(method=method, url=url, **kwargs)
     59 
     60 

/home/subjectAM/anaconda3/lib/python3.6/site-packages/requests/sessions.py in request(self, method, url, params, data, headers, cookies, files, auth, timeout, allow_redirects, proxies, hooks, stream, verify, cert, json)
    506         }
    507         send_kwargs.update(settings)
--> 508         resp = self.send(prep, **send_kwargs)
    509 
    510         return resp

/home/subjectAM/anaconda3/lib/python3.6/site-packages/requests/sessions.py in send(self, request, **kwargs)
    616 
    617         # Send the request
--> 618         r = adapter.send(request, **kwargs)
    619 
    620         # Total elapsed time of the request (approximately)

/home/subjectAM/anaconda3/lib/python3.6/site-packages/requests/adapters.py in send(self, request, stream, timeout, verify, cert, proxies)
    506                 raise SSLError(e, request=request)
    507 
--> 508             raise ConnectionError(e, request=request)
    509 
    510         except ClosedPoolError as e:

ConnectionError: HTTPConnectionPool(host='localhost', port=80): Max retries exceeded with url: /chemblws/spore (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f2dc11e1358>: Failed to establish a new connection: [Errno 111] Connection refused',))

Missing link to Travis-CI on home page

It's much easier to get quick feedback as a user on the stability of the project if there's a link/badge on the home page.

Markdown:

[![Build Status](https://travis-ci.org/chembl/chembl_webresource_client.svg?branch=master)](https://travis-ci.org/chembl/chembl_webresource_client)

RST:

.. image:: https://travis-ci.org/chembl/chembl_webresource_client.svg?branch=master
    :target: https://travis-ci.org/chembl/chembl_webresource_client

BaseException in import

With the import statement

from chembl_webresource_client.new_client import new_client

the exception below is raised in debug mode. Is there a solution for this?

Exception has occurred: BaseException
exception: no description

Details:

VS Code, latest version (1.30.2)
All requirements were installed with:

pip install chembl_webresource_client

I also tried:
pip install -U chembl_webresource_client

and even a conda environment:
conda install -c chembl chembl_webresource_client

cannot access

I tried to import chembl_webresource_client with

from chembl_webresource_client import *

but got this error:

TypeError                                 Traceback (most recent call last)
<ipython-input-18-4b4dfe9c8e8f> in <module>()
----> 1 from chembl_webresource_client import *
      2 
      3 targets = TargetResource()
      4 print targets.status()

/usr/local/lib/python2.7/dist-packages/chembl_webresource_client/__init__.py in <module>()
     14         pass
     15 
---> 16     gevent.greenlet.Greenlet._report_error = _greenlet_report_error
     17 
     18 import requests

TypeError: can't set attributes of built-in/extension type 'gevent._greenlet.Greenlet'
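
The error text says that `gevent._greenlet.Greenlet` is a built-in/extension type, and Python does not allow attributes to be assigned on such types. The sketch below reproduces the same kind of refusal with `int`, using only the standard library; `try_patch` and `PurePython` are hypothetical names for illustration, and the exact wording of the message varies between Python versions.

```python
def try_patch(cls, name, value):
    """Attempt to set a class attribute; report whether the type allowed it."""
    try:
        setattr(cls, name, value)
    except TypeError as exc:
        return "refused: {0}".format(exc)
    return "patched"


class PurePython:
    """An ordinary class defined in Python, which can be patched freely."""


# int is implemented in C, so assigning a new attribute to it is refused.
builtin_result = try_patch(int, "_report_error", None)
# A class written in Python accepts the same assignment.
python_result = try_patch(PurePython, "_report_error", None)
print(builtin_result)
print(python_result)
```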

Make Settings class print friendly

Again, as we learned from #1, it would be very helpful for debugging if the Settings class had a readable printable representation, so that all settings could be inspected with a single print statement.
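
One possible shape for such a representation is a `__repr__` that lists every public attribute. This is only a sketch: `DemoSettings` is a hypothetical stand-in, not the client's real `Settings` class, and the attribute names and values are placeholders chosen for illustration.

```python
class DemoSettings:
    """Hypothetical stand-in for a settings object (not the client's real class)."""

    def __init__(self):
        # Placeholder values, chosen only for illustration.
        self.CACHING = True
        self.TIMEOUT = 3.0
        self.NEW_CLIENT_URL = "https://www.ebi.ac.uk/chembl/api/data"

    def __repr__(self):
        # List every public attribute, one per line, sorted by name.
        items = sorted(
            (name, value)
            for name, value in vars(self).items()
            if not name.startswith("_")
        )
        body = "\n".join("  {0} = {1!r}".format(name, value) for name, value in items)
        return "{0}(\n{1}\n)".format(type(self).__name__, body)


settings_text = repr(DemoSettings())
print(settings_text)
```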

simplejson.scanner.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

Hi,

I get the above error message after successfully installing the ChEMBL web resource client package.
My code was only this:
from chembl_webresource_client import *
The whole message is as follows:
Traceback (most recent call last):
  File "C:/Working_projects/Cancer_related_fristeighbours/Experimental_validation/Python_project/Search_for_chemble_data_for_compounds.py", line 9, in <module>
    from chembl_webresource_client import *
  File "C:\Users\dm729\AppData\Local\Continuum\Anaconda2\lib\site-packages\chembl_webresource_client\__init__.py", line 23, in <module>
    from chembl_webresource_client.utils import utils
  File "C:\Users\dm729\AppData\Local\Continuum\Anaconda2\lib\site-packages\chembl_webresource_client\utils.py", line 8, in <module>
    utils = client_from_url(Settings.Instance().utils_spore_url)
  File "C:\Users\dm729\AppData\Local\Continuum\Anaconda2\lib\site-packages\chembl_webresource_client\spore_client.py", line 17, in client_from_url
    schema = session.get(url).json()
  File "C:\Users\dm729\AppData\Local\Continuum\Anaconda2\lib\site-packages\requests\models.py", line 826, in json
    return complexjson.loads(self.text, **kwargs)
  File "C:\Users\dm729\AppData\Local\Continuum\Anaconda2\lib\site-packages\simplejson\__init__.py", line 516, in loads
    return _default_decoder.decode(s)
  File "C:\Users\dm729\AppData\Local\Continuum\Anaconda2\lib\site-packages\simplejson\decoder.py", line 370, in decode
    obj, end = self.raw_decode(s)
  File "C:\Users\dm729\AppData\Local\Continuum\Anaconda2\lib\site-packages\simplejson\decoder.py", line 400, in raw_decode
    return self.scan_once(s, idx=_w(s, idx).end())
simplejson.scanner.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
I have reinstalled the simplejson package for Anaconda. I think it may be some kind of file encoding issue, either within simplejson or in ChEMBL. I am not sure what encoding is used for the files requested from the URL.

I am using Anaconda on a Windows machine. All of my package versions are listed below.
alabaster==0.7.7 anaconda-client==1.2.2 argcomplete==1.0.0 astropy==1.1.1 Babel==2.2.0 backports-abc==0.4 backports.ssl-match-hostname==3.4.0.2 beautifulsoup4==4.4.1 bitarray==0.8.1 blaze==0.9.0 bokeh==0.11.0 boto==2.39.0 Bottleneck==1.0.0 cdecimal==2.3 cffi==1.2.1 chembl-webresource-client==0.6.3 clyent==1.2.0 colorama==0.3.6 comtypes==1.1.2 conda==4.1.11 conda-build==1.19.0 conda-env==2.5.0a0 configobj==5.0.6 cryptography==0.9.1 cycler==0.9.0 Cython==0.23.4 cytoolz==0.7.5 datashape==0.5.0 decorator==4.0.6 docutils==0.12 easydict==1.5 enum34==1.1.2 et-xmlfile==1.0.1 fastcache==1.0.2 Flask==0.10.1 funcsigs==0.4 futures==3.0.3 gevent==1.0.2 gevent-websocket==0.9.5 greenlet==0.4.9 grequests==0.2.0 grin==1.2.1 h5py==2.5.0 idna==2.0 ipaddress==1.0.14 ipykernel==4.2.2 ipython==4.1.1 ipython-genutils==0.1.0 ipywidgets==4.1.1 itsdangerous==0.24 jdcal==1.2 jedi==0.9.0 Jinja2==2.8 jsonschema==2.4.0 jupyter==1.0.0 jupyter-client==4.1.1 jupyter-console==4.1.0 jupyter-core==4.0.6 llvmlite==0.8.0 lxml==3.5.0 MarkupSafe==0.23 matplotlib==1.5.1 menuinst==1.4.1 mistune==0.7.1 multipledispatch==0.4.8 nbconvert==4.1.0 nbformat==4.0.1 networkx==1.11 nltk==3.1 nose==1.3.7 notebook==4.1.0 numba==0.23.1 numexpr==2.4.6 numpy==1.11.1 odo==0.4.0 openpyxl==2.3.2 pandas==0.17.1 path.py==0.0.0 patsy==0.4.0 pep8==1.7.0 pickleshare==0.5 Pillow==3.1.0 ply==3.8 psutil==3.4.2 py==1.4.31 pyasn1==0.1.9 pycairo==1.10.0 pycosat==0.6.1 pycparser==2.14 pycrypto==2.6.1 pyflakes==1.0.0 Pygments==2.1.1 pyOpenSSL==0.15.1 pyparsing==2.0.3 pyreadline==2.1 pytest==2.8.5 python-dateutil==2.4.2 python-igraph==0.7.1.post6 pytz==2015.7 pywin32==219 PyYAML==3.11 pyzmq==15.2.0 qtconsole==4.1.1 requests==2.11.1 requests-cache==0.4.4 rope==0.9.4 ruamel-yaml===-VERSION scikit-image==0.11.3 scikit-learn==0.17.1 scipy==0.18.0 simplegeneric==0.8.1 simplejson==3.8.2 singledispatch==3.4.0.3 six==1.10.0 snowballstemmer==1.2.1 sockjs-tornado==1.0.1 sphinx==1.3.5 sphinx-rtd-theme==0.1.9 spyder==2.3.8 SQLAlchemy==1.0.11 
statsmodels==0.6.1 sympy==0.7.6.1 tables==3.2.2 toolz==0.7.4 tornado==4.3 traitlets==4.1.0 unicodecsv==0.14.1 uniprot-tools==0.4.1 urllib3==1.16 Werkzeug==0.11.3 xlrd==0.9.4 XlsxWriter==0.8.4 xlwings==0.6.4 xlwt==1.0.0
Thanks for any help,

Dezso
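
For context, "Expecting value: line 1 column 1 (char 0)" is what a JSON decoder reports when the text it receives does not start with a JSON value, for example an empty body or an HTML error page. The sketch below reproduces the message with the standard-library `json` module rather than simplejson. Whether this was the cause in the report above is an assumption, and `describe_body` is a hypothetical helper.

```python
import json


def describe_body(text):
    """Return the decoded JSON, or a short diagnostic if the body is not JSON."""
    try:
        return json.loads(text)
    except ValueError as exc:  # json.JSONDecodeError is a subclass of ValueError
        preview = text[:40]
        return "not JSON ({0}); body starts with {1!r}".format(exc, preview)


# An empty body and an HTML page both fail at the very first character.
empty_result = describe_body("")
html_result = describe_body("<html><title>502 Proxy Error</title></html>")
# A well-formed JSON body decodes normally.
json_result = describe_body('{"status": "UP"}')
print(empty_result)
print(html_result)
print(json_result)
```

Printing the first characters of the response body before decoding is usually enough to tell an outage page from a genuine encoding problem.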

create a higher level utils module

In the https://github.com/mgalardini/chembl_tools repository, @mgalardini provides a set of useful methods and command-line utilities. It would make sense to merge them into the client to simplify some of the most common tasks.

Proxy server receives invalid response

When I run the following code to get the SMILES/IDs/IC50s for a particular ChEMBL target:

for x in chembl_ids:
    if x not in existing_files:
        print "Checking kinase id {}".format(x)
        res = activity.filter(target_chembl_id='{}'.format(x), relation='=', assay_type='B', standard_type="IC50")

        for sub in res:
            if sub['parent_molecule_chembl_id'] and sub['canonical_smiles'] and sub['standard_value'] is not None:
                ids.append(str(sub['parent_molecule_chembl_id']))
                smiles.append(str(sub['canonical_smiles']))
                ic50.append(float(sub['standard_value']))
        temp = map(lambda x, y, z: [x,y,z], ids, smiles, ic50)
        clean = []
        for z in temp:
            if z[0] not in [m[0] for m in clean]:
                clean.append(z)

I get the following error:

Traceback (most recent call last):
  File "fetch-data.py", line 48, in <module>
    for sub in res:
  File "/home/spadavec/miniconda2/envs/rdkit/lib/python2.7/site-packages/chembl_webresource_client/query_set.py", line 116, in next
    self.chunk = self.query.next_page()
  File "/home/spadavec/miniconda2/envs/rdkit/lib/python2.7/site-packages/chembl_webresource_client/url_query.py", line 445, in next_page
    return self.get_page()
  File "/home/spadavec/miniconda2/envs/rdkit/lib/python2.7/site-packages/chembl_webresource_client/url_query.py", line 406, in get_page
    handle_http_error(res)
  File "/home/spadavec/miniconda2/envs/rdkit/lib/python2.7/site-packages/chembl_webresource_client/http_errors.py", line 113, in handle_http_error
    raise exception_class(request.url, request.text)
chembl_webresource_client.http_errors.HttpBadGateway: Error for url https://www.ebi.ac.uk/chembl/api/data/activity.json, server response: <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>502 Proxy Error</title>
</head><body>
<h1>Proxy Error</h1>
<p>The proxy server received an invalid
response from an upstream server.<br />
The proxy server could not handle the request <em><a href="/chembl/api/data/activity.json">POST&nbsp;/chembl/api/data/activity.json</a></em>.<p>
Reason: <strong>Error reading from remote server</strong></p></p>
<hr>
<address>Apache/2.2.15 (Red Hat) Server at www.ebi.ac.uk Port 80</address>
</body></html>

Am I doing something wrong, or is this an error caused by too many requests being made? (I want to make ~500 sequential requests.)
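
A 502 from a proxy is often transient, so one common workaround for long sequential runs is to retry a failed call after an increasing pause. The sketch below is generic and uses only the standard library. `call_with_retries` and `flaky_fetch` are hypothetical names, and the retry count and delays are arbitrary assumptions, not limits documented by ChEMBL.

```python
import time


def call_with_retries(func, retries=4, base_delay=1.0, retry_on=(Exception,)):
    """Call func(); on failure wait base_delay * 2**attempt and try again."""
    for attempt in range(retries + 1):
        try:
            return func()
        except retry_on:
            if attempt == retries:
                raise  # out of attempts: surface the last error
            time.sleep(base_delay * (2 ** attempt))


# Demonstration with a stand-in that fails twice before succeeding.
calls = {"count": 0}


def flaky_fetch():
    calls["count"] += 1
    if calls["count"] < 3:
        raise IOError("502 Proxy Error (simulated)")
    return ["row-1", "row-2"]


# A tiny delay keeps the demonstration fast; a real run would wait longer.
rows = call_with_retries(flaky_fetch, base_delay=0.01, retry_on=(IOError,))
print(rows, "after", calls["count"], "attempts")
```

In the loop above, the part that iterates over `res` could be wrapped in such a function, with `retry_on` narrowed to the `HttpBadGateway` exception shown in the traceback.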

possibility to do BLAST search

Hello !

I am trying to retrieve binding data for a small set of ~300 proteins of interest (that is, all molecules for each protein). ChEMBL offers a BLAST search, but I could not find it in the Python API. Which approach would you recommend?

Thank you,

MaxRetryError

I was trying to get the descriptors for 968 chemicals from ChEMBL. I ran into this error:
MaxRetryError: HTTPSConnectionPool(host='www.ebi.ac.uk', port=443): Max retries exceeded with url: /chembl/api/data/molecule/BQJRUJTZSGYBEZ-KTKRFUPESA-N (Caused by ResponseError('too many 404 error responses',)). How long do I have to wait after each request?

Problem in getting bioactivity data for specific ChEMBL molecules

Hello,

I am using the Python client to collect all bioactivity data from ChEMBL. I am trying to collect activity data for each of the ~1.5 million compounds, using the compound ChEMBL ID as the query. I managed to get data for more than 1 million compounds; however, the client returns error 500 for a specific list of molecules. There are more than 80 thousand compounds for which this error is reproducible. Some of the molecule IDs are listed here: CHEMBL1090938, CHEMBL1094414, CHEMBL1094415...

This is the line which throws error 500.

activity = compounds.bioactivities('CHEMBL1090938', frmt='json')

Any suggestions or help will be greatly appreciated!

Thanks

new_client.similarity.filter returns 502 errors with low similarity threshold

I keep running into 502 errors when searching for similar molecules based on SMILES strings.
I'm not sure if it's just a bad couple of days for the EMBL servers, or if there's something wrong with the way I'm querying this.

It doesn't seem to fail at a particular SMILES string, and because results are cached, re-running does make progress.

from chembl_webresource_client.new_client import new_client

# active_smiles = list of roughly 1,000 smile strings

similarity_query = new_client.similarity                               
dark_smiles = []                                                       
for smile in active_smiles:
    res = similarity_query.filter(smiles=smile, similarity=70)
    if len(res) == 0:
        dark_smiles.append(smile)

Raises:

HttpBadGateway: Error for url https://www.ebi.ac.uk/chembl/api/data/similarity.json, server response: <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>502 Proxy Error</title>
</head><body>
<h1>Proxy Error</h1>
<p>The proxy server received an invalid
response from an upstream server.<br />
The proxy server could not handle the request <em><a href="/chembl/api/data/similarity.json">POST&nbsp;/chembl/api/data/similarity.json</a></em>.<p>
Reason: <strong>Error reading from remote server</strong></p></p>
<hr>
<address>Apache/2.2.15 (Red Hat) Server at www.ebi.ac.uk Port 80</address>
</body></html>

Error trying to grab the activities of a specific assay

new_client.activity.filter(assay_chembl_id='CHEMBL730848')

I get:

TypeError: 'NoneType' object cannot be interpreted as an integer

In: chembl_webresource_client/query_set.py in __len__(self)
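
Python raises this TypeError whenever a `__len__` method returns `None` instead of an integer. The sketch below shows the mechanism with a hypothetical `BrokenCount` class. It does not confirm what happens inside the client; that the underlying count was missing for this assay is only an assumption.

```python
class BrokenCount:
    """Stand-in container whose size comes from a value that may be missing."""

    def __init__(self, total):
        self.total = total  # e.g. a count field that was absent from a response

    def __len__(self):
        return self.total


try:
    len(BrokenCount(None))
    message = "no error"
except TypeError as exc:
    message = str(exc)

print(message)
```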

new_client columns filtering

Hi,
I am trying to use new_client instead of the old client to get activities for a specific target.
new_client.activity.filter(target_chembl_id="CHEMBL4040")
Unfortunately, the resulting data is very large, since new_client returns 30 columns instead of the 10 returned by the old client.
How can I restrict my query to a list of specific columns to reduce both download time and data size?

Sincerely

Fabrice Carles

PhD student

Some print statements left in that print to console.

I noticed that in the file url_query.py, lines 290, 291, 292, 296, 297, and 298 print to the console by default. They don't appear to be necessary for the code to function, so perhaps a "verbose" option could be added to suppress the print statements?

(Not necessarily an issue, just something I noticed)
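
One conventional way to get an on/off switch for such output is the standard `logging` module, where messages are emitted at DEBUG level and shown only when the caller opts in. The sketch below is illustrative; the logger name `demo_url_query` and the function `fetch_page` are hypothetical and are not taken from the client.

```python
import io
import logging

# Hypothetical logger name; the client does not necessarily define one.
logger = logging.getLogger("demo_url_query")
logger.propagate = False  # keep the demo isolated from the root logger

stream = io.StringIO()
logger.addHandler(logging.StreamHandler(stream))


def fetch_page(page_number):
    """Pretend to fetch a page, reporting progress at DEBUG level."""
    logger.debug("fetching page %d", page_number)
    return page_number


logger.setLevel(logging.WARNING)  # quiet by default
fetch_page(1)
quiet_output = stream.getvalue()

logger.setLevel(logging.DEBUG)  # opt in to verbose output
fetch_page(2)
verbose_output = stream.getvalue()
print(repr(quiet_output), repr(verbose_output))
```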

Searching by synonym ?

Is searching molecules/targets by synonym supported? If so, can you provide sample usage?

Thanks

Not compatible with Python 3.7 - async is now a keyword

There are several places where the name async has been used as a variable name. Since this is now a keyword in Python 3.7, the interpreter will raise a SyntaxError.

@jmarinllao will send a PR soon :)
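
The problem can be checked with the standard library alone. The sketch below assumes Python 3.7 or later; `compiles` is a hypothetical helper, and `async_` is just one possible replacement name (the trailing underscore follows the usual PEP 8 convention), not necessarily the one the fix will use.

```python
import keyword


def compiles(source):
    """Return True if the source text compiles, False on SyntaxError."""
    try:
        compile(source, "<demo>", "exec")
    except SyntaxError:
        return False
    return True


is_reserved = keyword.iskeyword("async")
old_name_ok = compiles("async = True")   # the name as previously used
renamed_ok = compiles("async_ = True")   # a renamed variable
print(is_reserved, old_name_ok, renamed_ok)
```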

Filtering with confidence score

Hi,

I'm not sure whether it is possible to filter substructure results by confidence score.
Is that available as an option somewhere?

Thanks,
George

Getting data from chembl 21 rather than 22?

Hi,
I'm running the simple script below to retrieve all targets from ChEMBL, but the number of targets I'm getting matches the total number of targets in ChEMBL 21 and not 22. Am I missing something?
I'm using the latest version and have removed all cache files.

Thanks,

Izzy

from chembl_webresource_client import *
targets = TargetResource()
all_targets = targets.get_all()
print "Retrieved", len( all_targets ), "targets"

python ./TEST.py
Retrieved 11019 targets

