
zhiyzuo / python-scopus


PyScopus

Home Page: http://zhiyzuo.github.io/python-scopus/

License: MIT License

Python 14.73% Jupyter Notebook 85.27%
python-scopus scopus python api-wrapper api apis

python-scopus's Introduction

Python-Scopus


PyScopus is free for anyone's academic use. Please kindly cite our paper when you use this package for your own research:

Zuo, Z., Zhao, K. and Eichmann, D. (2017), The state and evolution of U.S. iSchools: From talent acquisitions to research outcome. Journal of the Association for Information Science and Technology, 68: 1266–1277. doi:10.1002/asi.23751


A Python wrapper for the Scopus API

Please refer to the documentation page for more details.

python-scopus's People

Contributors

zhiyzuo

python-scopus's Issues

Keywords

Is it possible to display the keywords for a given paper using pyscopus?

UnicodeEncodeError in retrieve_author() with non-ascii names

I was searching for "Augustín Carstens" with Scopus ID 6603722641:

>>> info = scopus.retrieve_author("6603722641")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "build/bdist.linux-x86_64/egg/pyscopus/pyscopus.py", line 101, in retrieve_author
UnicodeEncodeError: 'ascii' codec can't encode character u'\xed' in position 2793: ordinal not in range(128)

I'm on Ubuntu 14.04 with Python 2.7. However, I don't know which version of pyscopus I am on, as __version__ is not defined.
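
For context, this is the kind of error Python raises when text containing a non-ASCII character such as u'\xed' is encoded with the ASCII codec, which Python 2 does implicitly in several places. The sketch below is written in Python 3 syntax and is not pyscopus's actual code; the helper name to_utf8_bytes is hypothetical and only illustrates encoding explicitly as UTF-8.

```python
def to_utf8_bytes(text):
    """Encode text explicitly as UTF-8 instead of relying on the ASCII codec."""
    return text.encode("utf-8")


# The name from the report; it contains u'\xed', the character in the traceback.
name = "August\xedn"

try:
    name.encode("ascii")  # reproduces the failure mode from the traceback
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True

encoded = to_utf8_bytes(name)  # succeeds, because UTF-8 can represent u'\xed'
```

Whether the fix belongs in the library or in the caller depends on where the implicit encoding happens, which the traceback alone does not show.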

Get _all_ articles from query

Thanks for writing this nice piece of software!

I am wondering how to use it to get all possible articles for a given query. For instance, if I use the example from the documentation

search_df = scopus.search("KEY(interdisciplinary collaboration)", count=20, view='STANDARD')

and leave out count, I only get 100 articles.

If I raise count to something very large, such as 1e10, I get the following error:

KeyError                                  Traceback (most recent call last)
<timed exec> in <module>

/usr/local/lib/python3.8/site-packages/pyscopus/scopus.py in search(self, query, count, type_, view)
     67         while True:
     68             index = 25*i
---> 69             result_df = result_df.append(_search_scopus(self.apikey, query, type_, view=view, index=index),
     70                                          ignore_index=True)
     71             if result_df.shape[0] >= count:

/usr/local/lib/python3.8/site-packages/pyscopus/utils.py in _search_scopus(key, query, type_, view, index)
    386     js = r.json()
    387     #print(r.url)
--> 388     total_count = int(js['search-results']['opensearch:totalResults'])
    389     entries = js['search-results']['entry']
    390 

KeyError: 'search-results'

Is there a way to find out the maximum number of results and use this as count?
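
One possible approach, sketched under assumptions: the traceback shows that the parsed response carries the total hit count under js['search-results']['opensearch:totalResults'] and that pages are requested at index = 25*i. The helpers below (total_results and page_starts are hypothetical names, not part of pyscopus) read that total and cap the requested count with it. The response dict is a made-up stand-in, and any additional limit the Scopus API itself imposes is not modeled here.

```python
def total_results(js):
    """Read the total hit count from a parsed search response."""
    return int(js["search-results"]["opensearch:totalResults"])


def page_starts(total, wanted, page_size=25):
    """Return the start index of every page needed for min(total, wanted) records."""
    limit = int(min(total, wanted))
    return list(range(0, limit, page_size))


# Made-up response used only for illustration.
fake_response = {"search-results": {"opensearch:totalResults": "60", "entry": []}}

total = total_results(fake_response)
starts = page_starts(total, wanted=1e10)  # a huge count is capped at the total
```

Capping the count this way means the loop never asks for a page beyond the last one, which is one plausible way to avoid a response without 'search-results'.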

ValueError in simple retrieve_author() call

I wanted to retrieve author details for Akiko Fujimoto with ID 7102932229:

>>> info = scopus.retrieve_author("7102932229")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "build/bdist.linux-x86_64/egg/pyscopus/pyscopus.py", line 103, in retrieve_author
  File "build/bdist.linux-x86_64/egg/pyscopus/utils.py", line 203, in _parse_author_retrieval
ValueError: invalid literal for int() with base 10: ''

I'm on Ubuntu 14.04 with Python 2.7. However, I don't know which version of pyscopus I am on, as __version__ is not defined.
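
The message "invalid literal for int() with base 10: ''" means int() was given an empty string, presumably because a numeric field was empty for this author. A minimal sketch of a tolerant conversion follows; to_int is a hypothetical helper, not the code in _parse_author_retrieval.

```python
def to_int(value, default=None):
    """Convert value to int, returning `default` for empty or non-numeric input."""
    try:
        return int(value)
    except (TypeError, ValueError):
        return default


empty_field = to_int("")      # an empty string no longer raises
count_field = to_int("42")    # a normal numeric string still converts
missing_field = to_int(None, default=0)
```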

Key error : 'Search results'

author_result_df = scopus.search_author("AUTHLASTNAME(Zuo) and AUTHFIRST(Zhiya) and AFFIL(Iowa)")

The code above should produce the following output:
Screenshot (804)
But I am getting a KeyError. Why?
Screenshot (805)

IndexError in simple retrieve_author() query

I wanted to retrieve author details for Ahmet K. Karagozoglu with ID 6507550943:

>>> info = scopus.retrieve_author("6507550943")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "build/bdist.linux-x86_64/egg/pyscopus/pyscopus.py", line 103, in retrieve_author
  File "build/bdist.linux-x86_64/egg/pyscopus/utils.py", line 244, in _parse_author_retrieval
IndexError: list index out of range

I'm on Ubuntu 14.04 with Python 2.7. However, I don't know which version of pyscopus I am on, as __version__ is not defined.

Include insttoken

I would like to add the option to add the insttoken to the requests since I have one to increase the number of requests I can do in one week.
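
A rough sketch of what this could look like when calling the API directly. It assumes the header names X-ELS-APIKey and X-ELS-Insttoken from Elsevier's developer documentation; build_headers is a hypothetical helper, and nothing here shows how pyscopus builds its requests internally.

```python
def build_headers(apikey, insttoken=None):
    """Build request headers, adding the institutional token only when given."""
    headers = {
        "X-ELS-APIKey": apikey,          # assumed header name for the API key
        "Accept": "application/json",
    }
    if insttoken:
        headers["X-ELS-Insttoken"] = insttoken  # assumed header name for the token
    return headers


# Placeholder credentials for illustration only.
with_token = build_headers("my-api-key", insttoken="my-inst-token")
without_token = build_headers("my-api-key")
```

The resulting dict could then be passed as the headers argument of requests.get.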

KeyError: 'search-results'

When I try to run the example given on the website for the search function ( scopus.search("KEY(topic modeling)", count=30) ), I encounter the following error:

KeyError: 'search-results'
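
A KeyError on 'search-results' only says that the parsed response lacks that key; it does not say what the API sent instead. The sketch below (require_key is a hypothetical helper, and the payload is made up) reports the keys that were actually present, which is usually the quickest way to see whether the response was an error message rather than search results.

```python
def require_key(js, key):
    """Return js[key], or raise an error that lists the keys actually received."""
    if key not in js:
        raise RuntimeError(
            "Expected %r in the response, got keys: %s" % (key, sorted(js))
        )
    return js[key]


# Made-up payload standing in for a response that is not a search result.
fake_error_payload = {"error": "example payload, not a real API response"}

try:
    require_key(fake_error_payload, "search-results")
    message = ""
except RuntimeError as exc:
    message = str(exc)
```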

Unable to retrieve abstract

I am using pyscopus and it works fine, but I am having trouble identifying the authors of articles and, especially, retrieving the abstract of an article after entering its scopus_id. How can I do this?

Screenshot 2023-11-03 alle 16 35 18

'Scopus' object has no attribute 'search'

The method .search() that you implemented two months ago is not working. Trying to use it I get an AttributeError: 'Scopus' object has no attribute 'search'.

print(dir(scopus))
['__class__', '__delattr__', '__dict__', '__doc__', '__format__', '__getattribute__', '__hash__', '__init__', '__module__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', '__weakref__', '_abstract_url_base', '_author_retrieve_url_base', '_author_url_base', '_citation_overview_url_base', '_search_url_base', 'apikey', 'authenticate', 'retrieve_abstract', 'retrieve_author', 'retrieve_citation', 'search_author', 'search_author_publication', 'search_venue']

search_author and retrieve_author work but not search_author_publication

Following the quick start notebook (and using subscriber VPN), I am able to successfully run these two lines:
author_result_df = scopus.search_author("AUTHLASTNAME(Zhao) and AUTHFIRST(Kang) and AFFIL(Iowa)")
kang_info_dict = scopus.retrieve_author('36635367700')

but receive a KeyError: 'search-results' when I try
kang_pub_df = scopus.search_author_publication('36635367700')

Could not retrieve Author ID:

I am using PyScopus to extract information from the Scopus website.
When I run this code: search_df = scopus.search("KEY(topic modelling)", count=30, view='STANDARD')
This should be the output:

Screenshot (800)
Screenshot (801)

Screenshot (802)

As the screenshots above show, there is an "authors" column that holds the author IDs of each article as a list.

But this is what I get when I try it in Google Colab:
Screenshot (803)

I get empty lists.

Why?

retrieve_citation return KeyError: 'abstract-citations-response'

I'm getting the following error when trying to use the retrieve_citation method.
I'm using python 3.
I know my API key is still valid since other calls work. I'm basically using the code from the examples in the documentation.

KeyError                                  Traceback (most recent call last)
/tmp/ipykernel_194/189473782.py in <module>
----> 1 pub_citations_df = scopus.retrieve_citation(scopus_id_array=['84905286162', '0141607824'],
      2                                             year_range=[2010, 2014])
      3 pub_citations_df

~/.local/lib/python3.8/site-packages/pyscopus/scopus.py in retrieve_citation(self, scopus_id_array, year_range)
    209         js = r.json()
    210 
--> 211         return _parse_citation(js, year_range)
    212 
    213     def retrieve_full_text(self, full_text_link):

~/.local/lib/python3.8/site-packages/pyscopus/utils.py in _parse_citation(js_citation, year_range)
     95 
     96 def _parse_citation(js_citation, year_range):
---> 97     resp = js_citation['abstract-citations-response']
     98     cite_info_list = resp['citeInfoMatrix']['citeInfoMatrixXML']['citationMatrix']['citeInfo']
     99 

KeyError: 'abstract-citations-response'

error doing a general search from documentation

Hello,

I downloaded and installed PyScopus as instructed and assigned a Scopus API key, but when I run the example below I get an error:

search_df = scopus.search("KEY(interdisciplinary collaboration)", count=20)

Traceback (most recent call last):
  File "", line 1, in <module>
  File "...\lib\site-packages\pyscopus\scopus.py", line 56, in search
    result_df, total_count = _search_scopus(self.apikey, query, type_, view=view)
  File "...\lib\site-packages\pyscopus\utils.py", line 388, in _search_scopus
    total_count = int(js['search-results']['opensearch:totalResults'])
KeyError: 'search-results'

KeyError : search-results

err

Hello!
I am sorry, but I get this error and I don't know whether it is due to the API key or something else.
Unfortunately, I cannot connect through my institution because I am out of the country.
I don't know whether this plays an important role.
Any help is much appreciated!

Problems with author retrieval

Thanks for providing this library!

I had some problems with author retrieval, I think when there is only one publication associated and/or a single affiliation. In any case I've added a check for an affiliation information dictionary being passed as a string (line 100-101), and a type check to make sure the pandas dataframe gets a list (line 247-249), see attached file.

utils.py.gz

Abstract retrieval JSONDecodeError

First of all, thank you!

I am trying to run the notebook with the examples, but retrieve_abstract() keeps throwing this error. This is the traceback:

JSONDecodeError Traceback (most recent call last)
in ()
----> 1 pub_info = scopus.retrieve_abstract('84905286162')

~\AppData\Local\Continuum\anaconda3\lib\site-packages\pyscopus\scopus.py in retrieve_abstract(self, scopus_id)
163 r = requests.get('%s/%s'%(APIURI.ABSTRACT, scopus_id), params=par)
164
--> 165 js = r.json()
166
167 try:

~\AppData\Local\Continuum\anaconda3\lib\site-packages\requests\models.py in json(self, **kwargs)
894 # used.
895 pass
--> 896 return complexjson.loads(self.text, **kwargs)
897
898 @property

~\AppData\Local\Continuum\anaconda3\lib\json\__init__.py in loads(s, encoding, cls, object_hook, parse_float, parse_int, parse_constant, object_pairs_hook, **kw)
346 parse_int is None and parse_float is None and
347 parse_constant is None and object_pairs_hook is None and not kw):
--> 348 return _default_decoder.decode(s)
349 if cls is None:
350 cls = JSONDecoder

~\AppData\Local\Continuum\anaconda3\lib\json\decoder.py in decode(self, s, _w)
335
336 """
--> 337 obj, end = self.raw_decode(s, idx=_w(s, 0).end())
338 end = _w(s, end).end()
339 if end != len(s):

~\AppData\Local\Continuum\anaconda3\lib\json\decoder.py in raw_decode(self, s, idx)
353 obj, end = self.scan_once(s, idx)
354 except StopIteration as err:
--> 355 raise JSONDecodeError("Expecting value", s, err.value) from None
356 return obj, end

JSONDecodeError: Expecting value: line 1 column 1 (char 0)

I don't know how to solve it.

Kind regards,
Luis
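
"Expecting value: line 1 column 1 (char 0)" means the response body was empty or was not JSON at all, so the parser failed on the very first character. A minimal sketch of a more informative parse step follows; parse_body is a hypothetical helper that takes the status code and body text, for example from the r.status_code and r.text attributes of a requests response, and it is not how pyscopus handles this.

```python
import json


def parse_body(status_code, text):
    """Parse a response body as JSON, reporting the status and a snippet on failure."""
    try:
        return json.loads(text)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        raise RuntimeError(
            "HTTP %s returned a non-JSON body starting with %r"
            % (status_code, text[:80])
        )


ok = parse_body(200, '{"a": 1}')

try:
    parse_body(401, "")  # an empty body, as in the reported traceback
    failure = ""
except RuntimeError as exc:
    failure = str(exc)
```

Seeing the status code and the first characters of the body usually makes it clear whether the problem is the API key, the network, or the requested resource.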

HTTP Error

papers_count = scopus.search_venue('Computers and Chemical Engineering', count=7048, year_range=(1977,2016), show=False)

I get an error saying:
raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
urllib2.HTTPError: HTTP Error 400: Bad Request

KeyError: 'search-results'

Thank you for making PyScopus! I have an API key. However when I try to run the code
author_result_df = scopus.search_author("AUTHLASTNAME(Zuo) and AUTHFIRST(Zhiya) and AFFIL(Iowa)")

I get the following error message:
  File "searchSCOPUS.py", line 44, in <module>
    setup()
  File "searchSCOPUS.py", line 25, in setup
    author_result_df = scopus.search_author("AUTHLASTNAME(Zuo) and AUTHFIRST(Zhiya) and AFFIL(Iowa)")
  File "/miniconda3/lib/python3.7/site-packages/pyscopus/scopus.py", line 96, in search_author
    return self.search(query, count, type_=2, view=view)
  File "/miniconda3/lib/python3.7/site-packages/pyscopus/scopus.py", line 56, in search
    result_df, total_count = _search_scopus(self.apikey, query, type_, view=view)
  File "/miniconda3/lib/python3.7/site-packages/pyscopus/utils.py", line 388, in _search_scopus
    total_count = int(js['search-results']['opensearch:totalResults'])
KeyError: 'search-results'

Is there a way to fix this keyError?

Quick Start example on .retrieve_citation() not working

I use a working Scopus key in variable KEY which is enabled for the Citation Overview API. Trying to reproduce the Quick Start example I get the following error message:

>>> scopus = Scopus(KEY)
>>> pub_citations = scopus.retrieve_citation('84905286162')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python2.7/dist-packages/pyscopus/scopus.py", line 306, in retrieve_citation
    soup = bs(urlopen(citation_url).read(), 'lxml')
  File "/usr/lib/python2.7/urllib2.py", line 127, in urlopen
    return _opener.open(url, data, timeout)
  File "/usr/lib/python2.7/urllib2.py", line 410, in open
    response = meth(req, response)
  File "/usr/lib/python2.7/urllib2.py", line 523, in http_response
    'http', request, response, code, msg, hdrs)
  File "/usr/lib/python2.7/urllib2.py", line 448, in error
    return self._call_chain(*args)
  File "/usr/lib/python2.7/urllib2.py", line 382, in _call_chain
    result = func(*args)
  File "/usr/lib/python2.7/urllib2.py", line 531, in http_error_default
    raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
urllib2.HTTPError: HTTP Error 404: Not Found

Author search error

From [email protected]:

Hi, thank you for this wonderful package. I'm pretty new to Python, and I have a problem with the "scopus.search_author" part. It worked with your sample code, but it doesn't work for some of the professors I am trying to search for. For example, for Prof. Black, Dan A. (AU-ID:7402568800), the code goes like this. Is there any solution to this? Thank you in advance.

>>> author_result_df = scopus.search_author("AUTHLASTNAME(Black) and AUTHFIRST(Dan)")
Traceback (most recent call last):
  File "", line 1, in <module>
  File "C:\Program Files (x86)\Python36-32\lib\site-packages\pyscopus\scopus.py", line 103, in search_author
    return self.search(query, count, type_=2, view=view)
  File "C:\Program Files (x86)\Python36-32\lib\site-packages\pyscopus\scopus.py", line 63, in search
    result_df, total_count = _search_scopus(self.apikey, query, type_, view=view)
  File "C:\Program Files (x86)\Python36-32\lib\site-packages\pyscopus\utils.py", line 301, in _search_scopus
    result_df = pd.DataFrame([_parse_entry(entry, type_) for entry in entries])
  File "C:\Program Files (x86)\Python36-32\lib\site-packages\pyscopus\utils.py", line 301, in <listcomp>
    result_df = pd.DataFrame([_parse_entry(entry, type_) for entry in entries])
  File "C:\Program Files (x86)\Python36-32\lib\site-packages\pyscopus\utils.py", line 208, in _parse_entry
    return _parse_author(entry)
  File "C:\Program Files (x86)\Python36-32\lib\site-packages\pyscopus\utils.py", line 111, in _parse_author
    affil = entry['affiliation-current']
KeyError: 'affiliation-current'
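
The last frame shows the lookup entry['affiliation-current'] failing, which suggests that some author entries simply have no current affiliation. A sketch of a tolerant lookup follows; current_affiliation_name is a hypothetical helper and the inner field name 'affiliation-name' is an assumption made for illustration, not something shown in the traceback.

```python
def current_affiliation_name(entry):
    """Return the current affiliation name, or None when the entry has none."""
    affil = entry.get("affiliation-current") or {}
    return affil.get("affiliation-name")  # assumed field name, for illustration


# Made-up entries: one with a current affiliation, one without.
with_affil = {"affiliation-current": {"affiliation-name": "Example University"}}
without_affil = {"dc:identifier": "AUTHOR_ID:0000000000"}

name_present = current_affiliation_name(with_affil)
name_missing = current_affiliation_name(without_affil)
```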

AttributeError for simple search_author() query

I had a strange problem with a query for "KEVIN KORDANA":

>>> first = "kevin"
>>> last = "kordana"
>>> query_dict = {'authfirst': first, 'authlast': last}
>>> author_results = scopus.search_author(query_dict)
A total number of  2  records for the query.
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "build/bdist.linux-x86_64/egg/pyscopus/pyscopus.py", line 67, in search_author
  File "build/bdist.linux-x86_64/egg/pyscopus/utils.py", line 38, in _parse_author
AttributeError: 'NoneType' object has no attribute 'find'

I'm on Ubuntu 14.04 with Python 2.7. However, I don't know which version of pyscopus I am on, as __version__ is not defined.

Installation fails

On Ubuntu 14.04 with sudo rights, pip install pyscopus results in:

Collecting pyscopus
  Could not find a version that satisfies the requirement pyscopus (from versions: )
No matching distribution found for pyscopus

problem with example query

Hi,
I am having a problem with your example query.

from pyscopus import Scopus
key='xxx_mykeyhere_xxxx'
scopus = Scopus(key)
search_df = scopus.search("KEY(interdisciplinary collaboration)", count=30)
Traceback (most recent call last):
  File "", line 1, in <module>
  File "\Programs\Python\Python36-32\lib\site-packages\pyscopus\scopus.py", line 56, in search
    result_df, total_count = _search_scopus(self.apikey, query, type_, view=view)
  File "\Programs\Python\Python36-32\lib\site-packages\pyscopus\utils.py", line 303, in _search_scopus
    total_count = int(js['search-results']['opensearch:totalResults'])
KeyError: 'search-results'

Empty Author List Retrieval

Hi, I used this code
search_df = scopus.search("AFFIL(uninamehere)", count=30, view='STANDARD')

but it seems it does not return the authors. May I know if there is any way I can fix this in my code?

image

Thanks for pyscopus!
