
Comments (9)

muschellij2 commented on July 21, 2024

Thanks - I can't really work around that quota for now. Also, if you are using the API for research, make sure it is registered for that purpose, as I believe Scopus mandates this.

Can you show me an example of the query or what you want to download? I can't debug or help without specifics.

from rscopus.

muschellij2 commented on July 21, 2024

Here's some basic code that I think gets you most of the way:

```r
library(rscopus)
au_ids = c(23480260200, 8708052900, 54896131300,
           55570070100, 55479219200, 7409391345,
           55500593700, 39362440900)
# get all the data for the authors (including all co-authors)
res = lapply(au_ids, author_data)
names(res) = au_ids

# get co-authors
all_authors = lapply(res, function(x) {
  x$full_data$author
})

# get unique IDs for those authors
unique_authors = lapply(all_authors, function(x) {
  unique(x$authid)
})

# collapse all authors together
combined_authors = unlist(unique_authors)
combined_authors = unique(combined_authors)
# don't need the original authors in there
combined_authors = setdiff(combined_authors, au_ids)

# just doing the first 5 due to API limits (but you can run these in chunks)
run_authors = combined_authors[1:5]
all_author_res = lapply(
  run_authors,
  author_data,
  count = 200, view = "STANDARD")
names(all_author_res) = run_authors
all_author_res[[1]]$df
```
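Once the chunks have been run, the per-author tables can be stitched into a single data frame. A minimal base-R sketch using toy stand-ins for the `$df` components (the column names here are made up for illustration, not taken from the real `author_data()` output):

```r
# Toy stand-ins shaped like all_author_res[[i]]$df; on real output,
# build this list with lapply(all_author_res, function(x) x$df).
per_author_dfs <- list(
  "23480260200" = data.frame(title = "Paper A", year = 2019),
  "8708052900"  = data.frame(title = "Paper B", year = 2020))

# tag each row with the queried author ID, then stack the tables
for (id in names(per_author_dfs)) {
  per_author_dfs[[id]]$queried_auth_id <- id
}
combined <- do.call(rbind, per_author_dfs)
```

If the per-author data frames have differing columns, `dplyr::bind_rows(per_author_dfs, .id = "queried_auth_id")` does the tagging and stacking in one call and fills missing columns with NA.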


muschellij2 commented on July 21, 2024

Please follow up about the duplicates with Elsevier/Scopus.

You seem to have changed your goal; I provided the solution for what I believe you originally requested. I have given you the tools, but I don't have any additional information on these things. You can open another issue for quota limits, but otherwise this is a scripting question rather than a development question, so I am closing.


Gerwi commented on July 21, 2024

OK, I hope they will provide this in the future, but I am not that optimistic.

In my case, I was trying to retrieve all publications of a small country via the scopus_search API.

For now I have changed my strategy and plan to use PubMed IDs to circumvent this issue.


Gerwi commented on July 21, 2024

Thanks a lot for this suggestion.

For now I am trying the following approach, starting with a CSV file containing PubMed IDs:

```r
# Cut pubmed_list into small chunks (large search requests are not
# handled by the API, and smaller ones also maximize use of the weekly
# quotas). For each chunk, a search string is built in the format
# "PMID(123456 OR 123457)", which scopus_search can handle. The output
# objects are then stored in a list.
chunk <- 5
n <- nrow(pubmed_list)
r <- rep(1:ceiling(n / chunk), each = chunk)[1:n]
d <- split(pubmed_list, r)

res_list <- list()
for (number in 1:4) {  # not range(1:4), which is just c(1, 4)
  names(d)[names(d) == number] <- "string"
  string <- data.frame(d$string[[1]])
  string$OR <- " OR "
  names(string) <- c("pmid", "OR")
  string <- paste(paste0(string$pmid, string$OR), collapse = "")
  string <- substr(string, 1, nchar(string) - 4)  # drop trailing " OR "
  string <- paste0("PMID(", string, ")")
  res <- scopus_search(query = string, view = "COMPLETE", max_count = 1)
  res_list[[number]] <- res
  names(d)[names(d) == "string"] <- number
}
```

OT: Not related to this package, but nevertheless worth mentioning: some articles are included twice in Scopus, for example PMID(30428293).


muschellij2 commented on July 21, 2024

OK - where are you seeing the 80k limit?


muschellij2 commented on July 21, 2024

I think PubMed IDs may cause some problems: I've seen them not return results, given the permissions on my API key ("API key in this example was setup with authorized CORS domains"), when trying the interactive APIs: https://dev.elsevier.com/interactive.html


muschellij2 commented on July 21, 2024

For example, PMID 30391859 exists in PubMed (https://www.ncbi.nlm.nih.gov/pubmed/30391859), but searching PMID(30391859) in Scopus returns nothing.


Gerwi commented on July 21, 2024

Just as a clarification, since I am no longer encountering issues (so this can remain closed): my current strategy is to search PubMed for a particular disease, for example "Heart Defects, Congenital"[Mesh], download the list of PubMed IDs, save them to a CSV, and cut them into chunks. A for loop then transforms the IDs into a search string; the output data frames are stored in lists, which are combined with rbind into data frames. The loop breaks automatically when the quota is reached.

```r
pubmed_list <- read.csv("pubmed_list_diabetes.csv")

chunk <- 1
n <- nrow(pubmed_list)
r <- rep(1:ceiling(n / chunk), each = chunk)[1:n]
d <- split(pubmed_list, r)

publications_list <- list()
affiliations_list <- list()
authors_list <- list()
remaining <- 10
for (number in 1:20) {
  if (remaining < chunk) {
    break
  }
  names(d)[names(d) == number] <- "string"
  string <- data.frame(d$string[[1]])
  string$OR <- " OR "
  names(string) <- c("pmid", "OR")
  string <- paste(paste0(string$pmid, string$OR), collapse = "")
  string <- substr(string, 1, nchar(string) - 4)  # drop trailing " OR "
  string <- paste0("PMID(", string, ")")
  res <- scopus_search(query = string, view = "COMPLETE", max_count = 1)
  entries <- gen_entries_to_df(res$entries)
  entries$df$entry_number2 <- paste0(number, ".", entries$df$entry_number)
  publications_list[[number]] <- entries$df
  entries$affiliation$entry_number2 <- paste0(number, ".", entries$affiliation$entry_number)
  affiliations_list[[number]] <- entries$affiliation
  entries$author$entry_number2 <- paste0(number, ".", entries$author$entry_number)
  authors_list[[number]] <- entries$author
  names(d)[names(d) == "string"] <- number
  # the rate-limit header is a character string; convert before comparing
  remaining <- as.numeric(res$get_statements$headers$`x-ratelimit-remaining`)
}
```

> Such as PMID(30391859): https://www.ncbi.nlm.nih.gov/pubmed/30391859, but PMID(30391859) in scopus search gets nothing.

The searches returning no articles can be due to two reasons:

  • PubMed covers more medical journals than Scopus.
  • Some PubMed articles are in Scopus, but without their PubMed ID, probably because indexing at PubMed and at Scopus does not happen at the same time. For example, you will find the article PMID(30391859) in Scopus by searching on its title, "A dual modeling approach to automatic segmentation of cerebral T2 hyperintensities and T1 black holes in multiple sclerosis".

The first reason is not solvable, but the second can be corrected quite easily by downloading from PubMed a CSV linking titles to PubMed IDs, which can then be used to search by title for the articles that return no result when searching by PubMed ID.
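That title fallback can be scripted. A minimal sketch, where `make_title_query` is my own hypothetical helper (TITLE() is a standard Scopus search field, but the exact quoting/escaping behaviour is an assumption):

```r
# library(rscopus)  # needed only for the actual search call at the end

make_title_query <- function(title) {
  # Wrap the full title in quotes so Scopus searches the exact phrase.
  # Assumption: the titles contain no embedded double quotes; if they
  # do, they would need to be stripped or escaped first.
  paste0('TITLE("', title, '")')
}

# The article missed by PMID(30391859), found via its title instead:
q <- make_title_query(paste(
  "A dual modeling approach to automatic segmentation of cerebral",
  "T2 hyperintensities and T1 black holes in multiple sclerosis"))
# res <- scopus_search(query = q, view = "COMPLETE", max_count = 1)
```

Looping this over the PubMed title/ID CSV for the zero-result PMIDs would follow the same chunked pattern as the PMID loop above.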

