paper-gender-analysis's Introduction

CS Conference Gender Analysis

Overview

This repository attempts to infer the gender of first authors of papers at various conferences. There are several caveats. Inferring gender from a name is never exact, and the accuracy of this method has not been evaluated, so any results should be treated as suspect. Short of manually labelling the gender of each author (itself a difficult and error-prone task), several approaches could improve accuracy. For example, taking the country of the author's affiliation into account could yield a more accurate prediction.

Dependencies

Gender inference uses the genderComputer library, which is included as a Git submodule. Fetch the submodules in this repository before anything else:

git submodule update --init

Dependencies are managed with Pipenv, so Pipenv must be installed first. To install the remaining dependencies, run:

pipenv install

Running

The downloaded files can be analyzed by running the following command:

pipenv run python analyze_genders.py

This prints CSV output with the inferred counts of first authors by gender. You can also use this notebook for further analysis.
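To make the shape of that output concrete, here is a minimal sketch of tallying first authors by inferred gender and rendering the tally as CSV. It is not the code in analyze_genders.py. The function names, the `gender,count` header, and the `infer_gender` callable (a stand-in for whatever the genderComputer library provides) are all assumptions made for illustration.

```python
import csv
import io
from collections import Counter
from typing import Callable, Iterable, List


def count_first_authors(
    papers: Iterable[List[str]],
    infer_gender: Callable[[str], str],
) -> Counter:
    """Tally the inferred gender of the first author of each paper.

    `papers` is an iterable of author-name lists, one list per paper.
    `infer_gender` is a hypothetical callable mapping a name to a label.
    """
    counts: Counter = Counter()
    for authors in papers:
        if not authors:
            continue  # skip papers with no listed authors
        counts[infer_gender(authors[0])] += 1
    return counts


def counts_to_csv(counts: Counter) -> str:
    """Render the tally as CSV text with an assumed gender,count header."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["gender", "count"])
    for gender in sorted(counts):
        writer.writerow([gender, counts[gender]])
    return buffer.getvalue()
```

Only the first name in each author list is consulted, which mirrors the first-author focus described above. Labels the inference step cannot resolve would simply appear as their own row.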

Adding a new conference

To add a new conference, edit fetch-papers.sh so that it retrieves the new JSON data files. Each file should be named CONF-xx.json, where CONF is the name of the conference and xx is the year. The link to each JSON file can be found by opening the table of contents for the proceedings in DBLP and selecting the JSON export link. Since DBLP data is released under CC0 and can be freely shared, any new data files should be committed to this repository.
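The naming convention above can be checked mechanically before committing a new file. The sketch below is not part of this repository. It assumes that CONF consists only of letters and that xx is a two-digit year, and the helper name `parse_data_filename` is hypothetical.

```python
import re
from typing import NamedTuple, Optional

# Assumed pattern: letters for the conference name, a hyphen,
# a two-digit year, and the .json extension.
FILENAME_RE = re.compile(r"^(?P<conf>[A-Za-z]+)-(?P<year>\d{2})\.json$")


class ConferenceFile(NamedTuple):
    conference: str
    year: str  # kept as written in the filename, e.g. "19"


def parse_data_filename(filename: str) -> Optional[ConferenceFile]:
    """Return the conference and year encoded in a CONF-xx.json filename.

    Returns None when the filename does not follow the convention.
    """
    match = FILENAME_RE.match(filename)
    if match is None:
        return None
    return ConferenceFile(match.group("conf"), match.group("year"))
```

For example, `parse_data_filename("SIGMOD-19.json")` yields the conference `SIGMOD` and year `19`, while a name such as `sigmod2019.json` is rejected.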

Fetching data from Scopus

To fetch data from Scopus, you will need an API key, which should be set in the .env file as SCOPUS_API_KEY. Data can then be fetched by running fetch-scopus.sh. The script fetches data from Scopus for every paper at the DB conferences that has a DOI available from DBLP, and saves the results to scopus.json. It requires jq to be installed in order to process the JSON from DBLP.
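A missing key is easier to diagnose if it is checked up front. The following sketch assumes SCOPUS_API_KEY has already been exported into the process environment (Pipenv, for instance, loads .env automatically for commands started with pipenv run). The helper name `get_scopus_api_key` is hypothetical and is not taken from fetch-scopus.sh.

```python
import os
from typing import Mapping


def get_scopus_api_key(env: Mapping[str, str] = os.environ) -> str:
    """Return the Scopus API key from the environment.

    Raises RuntimeError with a pointer to the .env file when the
    variable is missing or blank.
    """
    key = env.get("SCOPUS_API_KEY", "").strip()
    if not key:
        raise RuntimeError("SCOPUS_API_KEY is not set; add it to the .env file")
    return key
```

Accepting the environment as a parameter keeps the check easy to exercise with a plain dictionary.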

Contributors

michaelmior
