
Comments (9)

muschellij2 commented on July 21, 2024

Thanks - I can't really work around that quota for now. Also, if you are using the API for research, make sure it is registered for that purpose, as I believe Scopus mandates this.

Can you show me an example of the query or what you want to download? I can't debug or help without specifics.

from rscopus.

muschellij2 commented on July 21, 2024

Here's some basic code that I think gets you most of the way:

```r
library(rscopus)
au_ids = c(23480260200, 8708052900, 54896131300,
           55570070100, 55479219200, 7409391345,
           55500593700, 39362440900)
# get all the data for the authors (including all co-authors)
res = lapply(au_ids, author_data)
names(res) = au_ids

# get co-authors
all_authors = lapply(res, function(x) {
  x$full_data$author
})

# get unique IDs for those authors
unique_authors = lapply(all_authors, function(x) {
  unique(x$authid)
})

# collapse all authors together
combined_authors = unlist(unique_authors)
combined_authors = unique(combined_authors)
# don't need the original authors in there
combined_authors = setdiff(combined_authors, au_ids)

# just doing the first 5 due to API limits (but you can run these in chunks)
run_authors = combined_authors[1:5]
all_author_res = lapply(
  run_authors,
  author_data,
  count = 200, view = "STANDARD")
names(all_author_res) = run_authors
all_author_res[[1]]$df
```
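Once the chunks have been run, the per-author tables can be stitched into a single data frame. A minimal base-R sketch using toy stand-ins for the `$df` components (the column names here are made up for illustration, not taken from the real `author_data()` output):

```r
# Toy stand-ins shaped like all_author_res[[i]]$df; on real output,
# build this list with lapply(all_author_res, function(x) x$df).
per_author_dfs <- list(
  "23480260200" = data.frame(title = "Paper A", year = 2019),
  "8708052900"  = data.frame(title = "Paper B", year = 2020))

# tag each row with the queried author ID, then stack the tables
for (id in names(per_author_dfs)) {
  per_author_dfs[[id]]$queried_auth_id <- id
}
combined <- do.call(rbind, per_author_dfs)
```

If the per-author data frames have differing columns, `dplyr::bind_rows(per_author_dfs, .id = "queried_auth_id")` does the tagging and stacking in one call and fills missing columns with NA.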


muschellij2 commented on July 21, 2024

Please follow up about the duplicates with Elsevier/Scopus.

You seem to have changed your goal; I provided the solution for what I believe you originally requested. I have given you the tools, but I don't have any additional information on these things. You can open another issue for quota limits, but otherwise this is a scripting question rather than a development question, so I am closing.


Gerwi commented on July 21, 2024

OK, I hope they will provide this in the future, but I am not that optimistic.

In my case, I was trying to retrieve all publications of a small country via the scopus_search API.

For now I have changed my strategy and plan to use PubMed IDs to circumvent this issue.


Gerwi commented on July 21, 2024

Thanks a lot for this suggestion.

For now I am trying the following approach, starting with a CSV file containing PubMed IDs:

```r
# Cut pubmed_list into small chunks (large search requests are not
# handled by the API, and smaller ones also maximize use of the weekly
# quotas). For each chunk, a search string is built in the format
# "PMID(123456 OR 123457)", which scopus_search can handle. The output
# objects are then stored in a list.
chunk <- 5
n <- nrow(pubmed_list)
r <- rep(1:ceiling(n / chunk), each = chunk)[1:n]
d <- split(pubmed_list, r)

res_list <- list()
for (number in 1:4) {  # not range(1:4), which is just c(1, 4)
  names(d)[names(d) == number] <- "string"
  string <- data.frame(d$string[[1]])
  string$OR <- " OR "
  names(string) <- c("pmid", "OR")
  string <- paste(paste0(string$pmid, string$OR), collapse = "")
  string <- substr(string, 1, nchar(string) - 4)  # drop trailing " OR "
  string <- paste0("PMID(", string, ")")
  res <- scopus_search(query = string, view = "COMPLETE", max_count = 1)
  res_list[[number]] <- res
  names(d)[names(d) == "string"] <- number
}
```

OT: Not related to this package, but nevertheless worth mentioning: some articles are included twice in Scopus, for example PMID(30428293).


muschellij2 commented on July 21, 2024

OK - where are you seeing the 80k limit?


muschellij2 commented on July 21, 2024

I think PubMed IDs may cause some problems: I've seen them not return results, given the permissions on my API key ("API key in this example was setup with authorized CORS domains"), when trying the interactive APIs: https://dev.elsevier.com/interactive.html


muschellij2 commented on July 21, 2024

For example, PMID 30391859 exists in PubMed (https://www.ncbi.nlm.nih.gov/pubmed/30391859), but searching PMID(30391859) in Scopus returns nothing.


Gerwi commented on July 21, 2024

Just as a clarification, since I am no longer encountering issues (so this can remain closed): my current strategy is to search PubMed for a particular disease, for example "Heart Defects, Congenital"[Mesh], download the list of PubMed IDs, save them to a CSV, and cut them into chunks. A for loop then transforms the IDs into a search string; the output data frames are stored in lists, which are combined with rbind into data frames. The loop breaks automatically when the quota is reached.

```r
pubmed_list <- read.csv("pubmed_list_diabetes.csv")

chunk <- 1
n <- nrow(pubmed_list)
r <- rep(1:ceiling(n / chunk), each = chunk)[1:n]
d <- split(pubmed_list, r)

publications_list <- list()
affiliations_list <- list()
authors_list <- list()
remaining <- 10
for (number in 1:20) {
  if (remaining < chunk) {
    break
  }
  names(d)[names(d) == number] <- "string"
  string <- data.frame(d$string[[1]])
  string$OR <- " OR "
  names(string) <- c("pmid", "OR")
  string <- paste(paste0(string$pmid, string$OR), collapse = "")
  string <- substr(string, 1, nchar(string) - 4)  # drop trailing " OR "
  string <- paste0("PMID(", string, ")")
  res <- scopus_search(query = string, view = "COMPLETE", max_count = 1)
  entries <- gen_entries_to_df(res$entries)
  entries$df$entry_number2 <- paste0(number, ".", entries$df$entry_number)
  publications_list[[number]] <- entries$df
  entries$affiliation$entry_number2 <- paste0(number, ".", entries$affiliation$entry_number)
  affiliations_list[[number]] <- entries$affiliation
  entries$author$entry_number2 <- paste0(number, ".", entries$author$entry_number)
  authors_list[[number]] <- entries$author
  names(d)[names(d) == "string"] <- number
  # the rate-limit header is a character string; convert before comparing
  remaining <- as.numeric(res$get_statements$headers$`x-ratelimit-remaining`)
}
```

> Such as PMID(30391859): https://www.ncbi.nlm.nih.gov/pubmed/30391859, but PMID(30391859) in scopus search gets nothing.

The searches returning no articles can be due to two reasons:

  • PubMed covers more medical journals than Scopus.
  • Some PubMed articles are in Scopus, but without their PubMed ID, probably because indexing at PubMed and at Scopus does not happen at the same time. For example, you will find the article PMID(30391859) in Scopus by searching on its title, "A dual modeling approach to automatic segmentation of cerebral T2 hyperintensities and T1 black holes in multiple sclerosis".

The first reason is not solvable, but the second can be corrected quite easily by downloading from PubMed a CSV linking titles to PubMed IDs, which can then be used to search by title for the articles that return no result when searching by PubMed ID.
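That title fallback can be scripted. A minimal sketch, where `make_title_query` is my own hypothetical helper (TITLE() is a standard Scopus search field, but the exact quoting/escaping behaviour is an assumption):

```r
# library(rscopus)  # needed only for the actual search call at the end

make_title_query <- function(title) {
  # Wrap the full title in quotes so Scopus searches the exact phrase.
  # Assumption: the titles contain no embedded double quotes; if they
  # do, they would need to be stripped or escaped first.
  paste0('TITLE("', title, '")')
}

# The article missed by PMID(30391859), found via its title instead:
q <- make_title_query(paste(
  "A dual modeling approach to automatic segmentation of cerebral",
  "T2 hyperintensities and T1 black holes in multiple sclerosis"))
# res <- scopus_search(query = q, view = "COMPLETE", max_count = 1)
```

Looping this over the PubMed title/ID CSV for the zero-result PMIDs would follow the same chunked pattern as the PMID loop above.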

