Comments (9)
Thanks - I can't really work around that quota for now. Also, make sure you have your API registered for research if that's the purpose as Scopus mandates this I believe.
Can you show me an example of the query or what you want to download? I can't debug or help without specifics.
from rscopus.
Here's some basic code that I think gets you most of the way:
library(rscopus)
au_ids = c(23480260200, 8708052900, 54896131300,
           55570070100, 55479219200, 7409391345,
           55500593700, 39362440900)
# get all the data for the authors (including all co-authors)
res = lapply(au_ids, author_data)
names(res) = au_ids
# get co-authors
all_authors = lapply(res, function(x) {
  x$full_data$author
})
# get unique IDs for those authors
unique_authors = lapply(all_authors, function(x) {
  unique(x$authid)
})
# collapse all authors together
combined_authors = unlist(unique_authors)
combined_authors = unique(combined_authors)
# don't need the original authors in there
combined_authors = setdiff(combined_authors, au_ids)
# just doing first 5 due to API limits (but you can run these in chunks)
run_authors = combined_authors[1:5]
all_author_res = lapply(
  run_authors,
  author_data,
  count = 200, view = "STANDARD")
names(all_author_res) = run_authors
all_author_res[[1]]$df
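The "run these in chunks" step can be made concrete with a small helper (a sketch in plain R, no API call; the helper name `chunk_ids` is my own and the chunk size of 5 is just an illustration, not an rscopus requirement):

```r
# Split a vector of IDs into chunks of a given size, so each chunk can be
# passed to author_data() in a separate run to stay under the API quota.
chunk_ids <- function(ids, size) {
  split(ids, ceiling(seq_along(ids) / size))
}

chunks <- chunk_ids(seq_len(12), 5)
length(chunks)  # 3 chunks: 1:5, 6:10, 11:12
```

Each chunk could then be run as, e.g., `lapply(chunks[[1]], author_data, count = 200, view = "STANDARD")`.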
from rscopus.
Please follow up with the duplicates with Elsevier/Scopus.
You seem to have changed your goal. I gave the solution I believe you requested. I have provided the tools, but I don't have any other information on these things. You can open another issue for the quota limits, but otherwise this is a scripting question, not a development question, so I am closing.
from rscopus.
Ok, hope that they will provide this in the future, but I am not that optimistic.
For my case I was trying to retrieve all publications of a small country via the scopus_search API.
For now I have changed my strategy and am planning to use PubmedIDs to circumvent this issue.
from rscopus.
Thanks a lot for this suggestion.
For now I am trying the following approach, starting with a csv file containing PubMed IDs:
# Cut pubmed_list into small parts (large search requests are not handled by
# the API, and smaller ones also maximize the use of the weekly quotas).
# Then a search string is created in the format "PMID(123456 OR 123457)",
# which scopus_search can handle. The output objects are stored in a list.
chunk <- 5
n <- nrow(pubmed_list)
r <- rep(1:ceiling(n/chunk),each=chunk)[1:n]
d <- split(pubmed_list,r)
res_list <- list()
for (number in 1:4) {  # note: range(1:4) would only loop over 1 and 4
  names(d)[names(d) == number] <- "string"
  string = data.frame(d$string[[1]])
  string$OR = " OR "
  names(string) = c("pmid", "OR")
  string = paste0(string$pmid, string$OR)
  string = paste(string, collapse = "")  # collapse the chunk into one string
  string = substr(string, 1, nchar(string) - 4)
  string = paste0("PMID(", string, ")")
  res = scopus_search(query = string, view = "COMPLETE", max_count = 1)
  res_list[[number]] <- res
  names(d)[names(d) == "string"] <- number
}
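The PMID string construction in the loop can also be written in one step with `paste(collapse = " OR ")` (a sketch; the helper name `make_pmid_query` is my own):

```r
# Build a query like "PMID(123456 OR 123457)" from a vector of PubMed IDs.
make_pmid_query <- function(pmids) {
  paste0("PMID(", paste(pmids, collapse = " OR "), ")")
}

make_pmid_query(c(123456, 123457))
# "PMID(123456 OR 123457)"
```

This avoids the intermediate data frame and the `substr()` trick for trimming the trailing " OR ".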
OT: Not related to this package, but nevertheless worth mentioning, some articles are included twice in Elsevier. For example: PMID(30428293)
from rscopus.
OK - where are you seeing the 80k limit?
from rscopus.
I think PubMed IDs may cause some problems: I've seen them return no results, given the permission on my API key ("API key in this example was setup with authorized CORS domains"), when trying the interactive APIs: https://dev.elsevier.com/interactive.html
from rscopus.
Such as PMID(30391859): https://www.ncbi.nlm.nih.gov/pubmed/30391859, but PMID(30391859) in Scopus search gets nothing.
from rscopus.
Just as clarification, since I am not encountering issues anymore (so this can remain closed): my current strategy is to search PubMed for a particular disease, for example "Heart Defects, Congenital"[Mesh], download the list of PubMed IDs, save them in a csv, and cut them into chunks. A for loop then transforms the IDs into a search string, and the output data frames are stored in lists, which are combined with rbind into data frames. The for loop breaks automatically when the quota is reached.
pubmed_list <- read.csv("pubmed_list_diabetes.csv")
rm(d)
chunk <- 1
n <- nrow(pubmed_list)
r <- rep(1:ceiling(n/chunk), each = chunk)[1:n]
d <- split(pubmed_list, r)
publications_list = list()
affiliations_list = list()
authors_list = list()
remaining = 10  # placeholder until the first response reports the real quota
for (number in 1:20) {
  if (remaining < chunk) {
    break
  }
  names(d)[names(d) == number] <- "string"
  string = data.frame(d$string[[1]])
  string$OR = " OR "
  names(string) = c("pmid", "OR")
  string = paste0(string$pmid, string$OR)
  string = paste(string, collapse = "")  # collapse the chunk into one string
  string = substr(string, 1, nchar(string) - 4)
  string = paste0("PMID(", string, ")")
  res = scopus_search(query = string, view = "COMPLETE", max_count = 1)
  entries = gen_entries_to_df(res$entries)
  entries$df$entry_number2 = paste0(number, ".", entries$df$entry_number)
  publications_list[[number]] = entries$df
  entries$affiliation$entry_number2 = paste0(number, ".", entries$affiliation$entry_number)
  affiliations_list[[number]] = entries$affiliation
  entries$author$entry_number2 = paste0(number, ".", entries$author$entry_number)
  authors_list[[number]] = entries$author
  names(d)[names(d) == "string"] <- number
  # the header value is a string, so convert before comparing to chunk
  remaining = as.numeric(res$get_statements$headers$`x-ratelimit-remaining`)
}
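The rbind step mentioned above (combining the per-chunk lists into data frames) could look like this after the loop finishes; the toy data here merely stands in for the real `entries$df` chunks:

```r
# Toy stand-ins for two chunks of search results:
publications_list <- list(
  data.frame(pmid = 1:2, title = c("a", "b")),
  data.frame(pmid = 3:4, title = c("c", "d"))
)
# Stack all chunks into one data frame:
publications <- do.call(rbind, publications_list)
nrow(publications)  # 4
```

The same `do.call(rbind, ...)` works for `affiliations_list` and `authors_list`, assuming each list element has identical columns.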
Such as PMID(30391859): https://www.ncbi.nlm.nih.gov/pubmed/30391859, but PMID(30391859) in Scopus search gets nothing.
The searches returning no articles can be due to two reasons:
- PubMed covers more medical journals than Scopus
- Some PubMed articles are in Scopus, but without their PubMed ID, probably because indexing at PubMed and Scopus does not happen at the same time (for example, you will find the article PMID(30391859) in Scopus by searching on its title "A dual modeling approach to automatic segmentation of cerebral T2 hyperintensities and T1 black holes in multiple sclerosis").
The first reason is not solvable, but the second one can be corrected quite easily by downloading from PubMed a csv linking titles to PubMed IDs, which can then be used to search by title for the articles that return no result when searching by PubMed ID.
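The title fallback could be scripted along these lines (a sketch: the `lookup` data frame and `missing_pmids` are assumptions about how the PubMed export is stored, and the resulting queries would still be passed to scopus_search()):

```r
# PubMed export linking PMIDs to titles (column names are an assumption):
lookup <- data.frame(
  pmid  = c(30391859),
  title = c("A dual modeling approach to automatic segmentation of cerebral T2 hyperintensities and T1 black holes in multiple sclerosis"),
  stringsAsFactors = FALSE
)
# PMIDs whose PMID() search returned nothing:
missing_pmids <- c(30391859)
# Look up each missing PMID's title and build a TITLE() query for it:
titles  <- lookup$title[match(missing_pmids, lookup$pmid)]
queries <- paste0('TITLE("', titles, '")')
# then, e.g.: lapply(queries, scopus_search, view = "COMPLETE")
```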
from rscopus.