cjabradshaw / epsilonindex

A function to assess the ε-index of a researcher's relative citation performance

License: GNU General Public License v3.0

ε-index

R function to calculate the ε-index of a researcher's relative citation performance

Prof Corey J. A. Bradshaw
Global Ecology, Flinders University, Adelaide, Australia
September 2021

Existing citation-based indices used to rank research performance do not permit a fair comparison of researchers among career stages or disciplines, nor do they treat women and men equally. We designed the ε-index, which is simple to calculate, based on open-access data, corrects for disciplinary variation, can be adjusted for career breaks, and sets a sample-specific threshold above and below which a researcher is deemed to be performing above or below expectation.

Code accompanies the article:

BRADSHAW, CJA, JM CHALKER, SA CRABTREE, BA EIJKELKAMP, JA LONG, JR SMITH, K TRINAJSTIC, V WEISBECKER. 2021. A fairer way to compare researchers at any career stage and in any discipline using open-access citation data. PLoS One 16(9): e0257141. doi:10.1371/journal.pone.0257141

--
DIRECTIONS

  1. Create a .csv file of exactly the same format as the example file in this repository ('datasample.csv'):
  • COLUMN 1: personID — any character identification of an individual researcher (can be a name)
  • COLUMN 2: gender — researcher's gender ("F" or "M")
  • COLUMN 3: i10 — researcher's i10 index (# papers with ≥ 10 citations); must be > 0
  • COLUMN 4: h — researcher's h-index
  • COLUMN 5: maxcit — number of citations of researcher's most cited peer-reviewed paper
  • COLUMN 6: firstyrpub — the year of the researcher's first published peer-reviewed paper
  2. Import the sample .csv file, or your own following the format indicated above (first specify the directory in which 'datasample.csv' resides using the 'setwd()' command):

     setwd("/path") # where /path is the directory path on your machine
     example.dat <- read.csv("datasample.csv", header=TRUE)
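
Before running the function, it can help to confirm that the imported data follow the six-column layout described in step 1. The sketch below is not part of the repository: 'check_input' is a hypothetical helper, the two rows in 'toy.dat' are made up for illustration, and it assumes that the .csv header uses exactly the column names listed above.

```r
# Hypothetical helper (not part of the repository): report problems with
# the six-column input layout described in step 1.
check_input <- function(dat) {
  expected <- c("personID", "gender", "i10", "h", "maxcit", "firstyrpub")
  problems <- character(0)
  if (!identical(names(dat), expected)) {
    # Assumes the header matches the documented names in the documented order
    problems <- c(problems, "column names or order differ from the expected layout")
  } else {
    if (!all(dat$gender %in% c("F", "M"))) {
      problems <- c(problems, "gender must be 'F' or 'M'")
    }
    if (any(is.na(dat$i10) | dat$i10 <= 0)) {
      problems <- c(problems, "i10 must be > 0")
    }
  }
  problems
}

# Two made-up rows for illustration only; the second has an invalid i10
toy.dat <- data.frame(personID = c("A", "B"), gender = c("F", "M"),
                      i10 = c(12, 0), h = c(10, 3),
                      maxcit = c(150, 20), firstyrpub = c(2010, 2018))
check_input(toy.dat)
```

An empty result means none of these checks failed; it does not guarantee that the values themselves are correct.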
    
  2. Alternatively, you can automatically harvest the necessary citation data from Google Scholar using the function in 'get.profile.func.R', which produces output that can be passed directly to the function in 'epsilon.index.func.R':

    i. Define a vector of Google Scholar IDs (the 12-character user ID from scholar.google.com), e.g.,

      ids <- c("1sO0O3wAAAAJ","ZBUju2QAAAAJ","oGAui-IAAAAJ","cpJnEYIAAAAJ","ptDEg44AAAAJ","PJYrOvQAAAAJ","4UxbBYIAAAAJ") 
    

    ii. Then define a 'genders' vector of the same length, e.g.,

      genders <- c("M","M","F","M","M","F","F")
    

    iii. Load get.profile.func (it relies on 'get_profile' from the 'scholar' package, which must be installed)

    iv. Define an input file that the epsilon.index.func will use, e.g.,

      example.dat <- getProfiledatFunc(ids, genders)
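
Because the 'ids' and 'genders' vectors are matched by position, a few quick checks before harvesting can catch mismatches. This is only a sketch using the example vectors above; the 'id.table' name is introduced here for illustration and is not used by the repository's functions.

```r
ids <- c("1sO0O3wAAAAJ", "ZBUju2QAAAAJ", "oGAui-IAAAAJ", "cpJnEYIAAAAJ",
         "ptDEg44AAAAJ", "PJYrOvQAAAAJ", "4UxbBYIAAAAJ")
genders <- c("M", "M", "F", "M", "M", "F", "F")

# The two vectors are paired by position, so they must be the same length
stopifnot(length(ids) == length(genders))
# Google Scholar user IDs are described above as 12 characters long
stopifnot(all(nchar(ids) == 12))
# Only "F" and "M" are accepted gender codes
stopifnot(all(genders %in% c("F", "M")))
# A repeated ID would harvest the same researcher twice
stopifnot(anyDuplicated(ids) == 0)

# Side-by-side view (illustrative only) for checking the pairing by eye
id.table <- data.frame(id = ids, gender = genders)
id.table
```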
    

    Note: The estimation of the first year of publication (Y1) can return errors because the function does not differentiate peer-reviewed and non-peer-reviewed entries in Google Scholar, nor can it avoid clearly erroneous entries in a researcher's publication history. We recommend that all harvested values for the year of first publication be checked manually for each researcher in the sample. A case in point is id=ptDEg44AAAAJ that returns Y1 = 1791, but the true year of first publication for this researcher is 1982.
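
One way to support this manual check is to flag harvested years that fall outside a plausible window and then overwrite them with verified values. The sketch below is an assumption-laden illustration: 'flag_first_year' and its 'earliest' cut-off of 1950 are hypothetical and not part of the repository, and the second row ("researcherX", 2005) is made up. Only the 1791 and 1982 values come from the note above.

```r
# Hypothetical helper: mark first-publication years outside a plausible window.
# The default 'earliest' cut-off is an arbitrary assumption; adjust it to suit
# the sample.
flag_first_year <- function(dat, earliest = 1950,
                            latest = as.integer(format(Sys.Date(), "%Y"))) {
  dat$check.Y1 <- is.na(dat$firstyrpub) |
    dat$firstyrpub < earliest | dat$firstyrpub > latest
  dat
}

# First row reproduces the case in the note; second row is made up
harvested <- data.frame(personID = c("ptDEg44AAAAJ", "researcherX"),
                        firstyrpub = c(1791, 2005))

flagged <- flag_first_year(harvested)
flagged[flagged$check.Y1, ]   # rows that need a manual check

# After verifying manually, overwrite the erroneous year
harvested$firstyrpub[harvested$personID == "ptDEg44AAAAJ"] <- 1982
```

A flag only catches obviously wrong years; an incorrect but plausible year (for example, a non-peer-reviewed entry from a few years earlier) still needs to be checked by hand.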

  3. Load the function ('epsilon.index.func') in R by submitting the entire function code (lines 20 to 212) to the R console.

  4. Simply run the function as follows:

     epsilonIndexFunc(dat.samp=example.dat, bygender=c('no','yes'), sort.index=c('e', 'd', 'ep', 'dp'))
    

where 'bygender' ('no' or 'yes') indicates whether you want to calculate the gender-debiased index, and 'sort.index' is a sorting option for the final results table based on the desired index (default = 'e').

Possible values for 'sort.index': 'e' = pooled; 'ep' = normalised; 'd' = gender-debiased; 'dp' = normalised gender-debiased

If there are insufficient individuals per gender to estimate a gender-specific index, we recommend selecting bygender='no' and not using or sorting based on the gender-debiased index (option 'd'). If the individuals in the sample are not all in the same approximate discipline, we recommend not using or sorting based on either of the two normalised indices (options 'ep' or 'dp').
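
The choice of 'bygender' can be guided by counting researchers per gender in the sample. The text above does not state how many individuals count as sufficient, so the 'min.per.gender' threshold in this sketch is purely a placeholder, and 'choose_bygender' is a hypothetical helper rather than part of the repository.

```r
# Hypothetical helper: suggest a 'bygender' value from per-gender sample sizes.
# 'min.per.gender' is an arbitrary placeholder, not a recommended value.
choose_bygender <- function(dat, min.per.gender = 5) {
  # Fixing the factor levels ensures a gender with zero members is counted
  counts <- table(factor(dat$gender, levels = c("F", "M")))
  if (all(counts >= min.per.gender)) "yes" else "no"
}

# Made-up sample with 3 women and 4 men
toy.dat <- data.frame(gender = c("M", "M", "F", "M", "M", "F", "F"))
table(toy.dat$gender)
choose_bygender(toy.dat)                      # uses the placeholder threshold
choose_bygender(toy.dat, min.per.gender = 3)  # a looser threshold
```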

The output includes the following columns:

  • person: researcher's ID (specified by user)
  • gender: F=female; M=male
  • yrs.publ: number of years since first peer-reviewed article
  • gender.eindex: ε-index relative to others of the same gender in the sample
  • expectation: whether above or below expectation based on chosen index (default is 'e' = pooled index)
  • m-quotient: h-index ÷ yrs.publ
  • h-index: h-index
  • debiased.e.prime.index: scaled gender.eindex (gender ε′-index)
  • gender.rank: rank from gender.eindex (1 = highest)
  • rnk.debiased: gender-debiased rank (1 = highest)
  • pooled.eindex: ε-index generated from the entire sample (not gender-specific)
  • e.prime.index: scaled pooled.eindex (ε′-index)
  • pooled.rnk: rank from pooled.eindex (1 = highest)

and

if sort.index = 'ep':

  • eprime.rnk: rank from scaled pooled.eindex (ε′-index)

or if sort.index = 'dp':

  • eprime.debiased.rnk: rank from scaled gender.eindex (gender ε′-index)
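
As a worked illustration of the 'yrs.publ' and 'm-quotient' columns, the arithmetic can be reproduced by hand. This sketch uses made-up values and assumes that 'yrs.publ' is the reference year minus the year of first publication; whether the function counts the first year inclusively is not stated here, so treat the result as approximate.

```r
# Made-up values for two researchers
toy <- data.frame(person = c("A", "B"),
                  h = c(30, 8),
                  firstyrpub = c(2001, 2013))

ref.year <- 2021  # assumed reference year for this illustration

# Years since first peer-reviewed article (assumed: simple difference)
toy$yrs.publ <- ref.year - toy$firstyrpub

# m-quotient = h-index divided by years publishing
toy$m.quotient <- toy$h / toy$yrs.publ
toy
```
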
  5. You can easily export the output to a file like this:

     out <- epsilonIndexFunc(dat.samp=example.dat, sort.index=c('e', 'd', 'ep', 'dp'))
     write.table(out, file="rank.output.csv", sep=",", dec=".", row.names=FALSE, col.names=TRUE)
    

epsilonindex's Issues

Unable to follow example code?

Hi,

I'm trying to use this approach, including getting profile data from Google Scholar. I'm struggling to follow the code and have run into a few issues:

  1. Step four has a different function name to the one defined in the uploaded code: 'getProfiledatFunc' vs 'getProfileFunc'.

     iv. Define an input file that the epsilon.index.func will use, e.g.,

       example.dat <- getProfiledatFunc(ids, genders)

  2. The package "scholar" is required for the function 'get_profile' defined in the function. It may be worth adding that package to the example code.

  3. When addressing the above two, I still get the error: Error in getProfileFunc(ids, genders) : unused argument (genders). When I pass a vector of only ids to the function I get the error: Error in gsdata[, 1] : incorrect number of dimensions

That's where my R knowledge ends!

Not calculating

I have a simple 24 row list with the requisite 6 columns uploaded and the data seems to import correctly. When I click on "calculate e-index" it says "An error has occurred. Check your logs or contact the app author for clarification."

Any ideas?
