
fagan2888 / nber


This project forked from ledwindra/nber


🎓 The NBER website has a new design, so let's scrape and analyze each paper, with automatic daily updates. 🚀

License: Do What The F*ck You Want To Public License

Jupyter Notebook 97.83% Python 2.07% Shell 0.10%

nber's Introduction


About ✌🏽

Hello world 🌍! Are you an economist, an economics student, or just a random person like me who is interested in economics? Do you want to write a paper or a thesis, or just ramble about something, but don't have any fresh ideas for a topic? Worry no more, because this repository is for you!

Warning! ⚠️

This repository uses a cron job from GitHub Actions to update the data, so the .git directory keeps growing and will eat up disk space. Hence, cloning this repository to your local machine is not advisable. If you are interested in doing something similar, just download this repository as a zip file. You can do the following:

# download repository from main branch
wget https://github.com/ledwindra/nber/archive/main.zip

This won't include the .git directory, and you can play around with the programs and data on your local machine.
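For anyone who prefers to do the same thing from Python, here is a minimal sketch that uses only the standard library. The helper names `archive_url` and `download_snapshot` and the target folder `nber-data` are made up for this illustration; only the URL pattern comes from the `wget` command above.

```python
import urllib.request
import zipfile
from pathlib import Path


def archive_url(owner: str, repo: str, branch: str = "main") -> str:
    """Build the GitHub URL of a branch snapshot as a zip archive."""
    return f"https://github.com/{owner}/{repo}/archive/{branch}.zip"


def download_snapshot(owner: str, repo: str, dest: Path, branch: str = "main") -> Path:
    """Download the zip snapshot (no .git directory) and unpack it into dest."""
    dest.mkdir(parents=True, exist_ok=True)
    zip_path = dest / f"{repo}-{branch}.zip"
    # Same URL as the wget command, fetched with the standard library
    urllib.request.urlretrieve(archive_url(owner, repo, branch), zip_path)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
    return dest


# Example call (hypothetical target folder; not run here because the archive is large):
# download_snapshot("ledwindra", "nber", Path("nber-data"))
```

The download is left as a commented-out call on purpose, since the snapshot still contains all the data files.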

Download data

If you don't want to run this locally and just want to get straight to the data, just chill, relax, and download it. Enjoy! 🌞 ⛱ 🥥 🌴 😎

  1. NBER

| column_name | data_type | description |
| --- | --- | --- |
| id | integer | NBER working paper ID |
| citation_title | string | Paper title |
| citation_author | string | Paper author(s). There can be more than one, so they are stored as an array |
| citation_publication_date | date | Date the paper was published |
| issue_date | date | Paper's issue date |
| revision_date | date | Paper's revision date |
| topics | string | Paper topic(s). There can be more than one, so they are stored as an array |
| program | string | Paper program(s). There can be more than one, so they are stored as an array |
| projects | string | Paper project(s). There can be more than one, so they are stored as an array |
| working_groups | string | Paper working group(s). There can be more than one, so they are stored as an array |
| abstract | string | Paper's abstract |
| acknowledgement | string | Paper's acknowledgement (in paragraph) |

  2. NBER citations (from RePEc)

| column_name | data_type | description |
| --- | --- | --- |
| id | integer | NBER working paper ID |
| cites | integer | Total cites for each paper |
| cited_by | integer | Number of times each paper has been cited by other researchers |
| reference | string | A list of references for each paper |

  3. Wikipedia

The columns are not fixed, because the amount of information available differs from one economist to another.
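Since the NBER table and the RePEc citations table share the `id` column, they can be combined on it. The sketch below shows one way to do that with plain Python dictionaries. The rows are made up for illustration and the helper `join_on_id` is hypothetical; the actual file names and formats in the repository may call for a different loading step.

```python
# Made-up rows for illustration only; they are not taken from the real data.
papers = [
    {"id": 1, "citation_title": "Example paper A",
     "citation_author": ["Author One", "Author Two"]},  # several authors, stored as an array
    {"id": 2, "citation_title": "Example paper B",
     "citation_author": ["Author Three"]},
]
citations = [
    {"id": 1, "cites": 12, "cited_by": 3},
]


def join_on_id(papers, citations):
    """Left-join papers with their citation counts on the shared `id` column."""
    by_id = {row["id"]: row for row in citations}
    joined = []
    for paper in papers:
        extra = by_id.get(paper["id"], {})
        joined.append({
            **paper,
            # Papers without a citation record keep None in both columns
            "cites": extra.get("cites"),
            "cited_by": extra.get("cited_by"),
        })
    return joined


merged = join_on_id(papers, citations)
```

A left join keeps every paper, including those with no citation record yet.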

Use case

What can be done with this dataset? Well, let's take a look at index.ipynb. 📙

Permission

  1. NBER: check its robots.txt. No user agent is disallowed from fetching the /papers/ path.

  2. RePEc: the data comes from its open API: http://citec.repec.org/api.html

  3. Wikipedia: check its robots.txt:

User-agent: *
Allow: /w/api.php?action=mobileview&
Allow: /w/load.php?
Allow: /api/rest_v1/?doc
Disallow: /w/
Disallow: /api/
Disallow: /trap/
Disallow: /wiki/Special:
Disallow: /wiki/Spezial:
Disallow: /wiki/Spesial:
Disallow: /wiki/Special%3A
Disallow: /wiki/Spezial%3A
Disallow: /wiki/Spesial%3A

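We only request article pages under https://en.wikipedia.org/wiki/, which none of the Disallow rules above match (they cover /w/, /api/, /trap/, and the Special: pages), so it's safe.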
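The same check can be done programmatically. Here is a small sketch that feeds the rules quoted above to Python's standard `urllib.robotparser`. The example URLs are only illustrations, and the rules are pasted in as a string rather than fetched live, so the result reflects this snapshot and not whatever Wikipedia currently serves.

```python
from urllib.robotparser import RobotFileParser

# The rules quoted above, pasted in as a snapshot (not fetched live)
ROBOTS_TXT = """\
User-agent: *
Allow: /w/api.php?action=mobileview&
Allow: /w/load.php?
Allow: /api/rest_v1/?doc
Disallow: /w/
Disallow: /api/
Disallow: /trap/
Disallow: /wiki/Special:
Disallow: /wiki/Spezial:
Disallow: /wiki/Spesial:
Disallow: /wiki/Special%3A
Disallow: /wiki/Spezial%3A
Disallow: /wiki/Spesial%3A
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt content that is already in memory."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# Illustrative URLs: an article page, a Special: page, and a script under /w/
article_ok = parser.can_fetch("*", "https://en.wikipedia.org/wiki/Economics")
special_ok = parser.can_fetch("*", "https://en.wikipedia.org/wiki/Special:Random")
script_ok = parser.can_fetch("*", "https://en.wikipedia.org/w/index.php")

print(article_ok, special_ok, script_ok)
```

For a live check, `RobotFileParser` can also be given a URL with `set_url()` and loaded with `read()` instead of parsing a pasted string.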

Closing

If you have read up to this line, thank you for bearing with me. I hope this is useful for your purposes! 😎 🍻

