
Basic API to return NAICS codes and information

License: BSD 3-Clause "New" or "Revised" License

JavaScript 46.22% Python 53.78%

naics-api's Introduction

NAICS-API

NAICS (North American Industry Classification System) is the standard used by United States federal statistical agencies to classify business establishments. It is used for collecting, aggregating, presenting, and analyzing data and trends in the US economy.

The classification system is currently hosted by the Census Bureau and provided in various Excel and PDF documents, with some rudimentary HTML output and a not-so-great search tool. Our goal is to improve on the Census Bureau's offerings by providing an API to make information machine-readable, with better search functionality, to assist with developing applications that depend on understanding or collecting information about businesses.

The Product

NAICS API is currently a Node.js server that returns NAICS data in JSON format. Information stored on the server has been scraped or collected from files on the Census.gov web site. Most of the information for 2007 and 2012 has now been scraped, thanks to the addition of a Python scraper by Mike Migurski (see ./data/scrape-examples-xrefs).

API documentation

Latest API documentation is hosted at Apiary.io.

API example requests

Example request

http://api.naics.us/v0/q?year=2012&code=519120

To get NAICS codes above a given code

http://api.naics.us/v0/q?year=2012&code=519120&above=1

To get NAICS codes below a given code

http://api.naics.us/v0/q?year=2012&code=51&below=1

To get all NAICS codes for a given year (only 2007 and 2012 are available right now)

http://api.naics.us/v0/q?year=2012

To get all NAICS codes for given search terms (searches only title and index right now)

http://api.naics.us/v0/s?year=2012&terms=libraries

Warning! The URL (server and/or structure) is likely to change in the very near future. Do not use for production (yet).
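
The example requests above can be assembled programmatically. The following is a minimal Python sketch, not part of the project: the helper names `build_query_url` and `build_search_url` are hypothetical, and the base URL is copied from the examples, so it is subject to the same warning about change. The sketch only builds URLs and does not contact the server.

```python
from urllib.parse import urlencode

# Base URL copied from the README examples; it may change or be unavailable.
BASE_URL = "http://api.naics.us/v0"


def build_query_url(year, code=None, above=False, below=False):
    """Build a URL for the code lookup endpoint (/q). Hypothetical helper."""
    params = {"year": year}
    if code is not None:
        params["code"] = code
    if above:
        params["above"] = 1
    if below:
        params["below"] = 1
    return f"{BASE_URL}/q?{urlencode(params)}"


def build_search_url(year, terms):
    """Build a URL for the search endpoint (/s). Hypothetical helper."""
    return f"{BASE_URL}/s?{urlencode({'year': year, 'terms': terms})}"


if __name__ == "__main__":
    print(build_query_url(2012, code=519120))
    print(build_query_url(2012, code=51, below=True))
    print(build_search_url(2012, "libraries"))
```

Using `urlencode` rather than string concatenation means multi-word search terms are escaped correctly without extra work.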

Usage

Additional information

Development setup (on Mac OS X 10.8)

First-time setup

  1. Download and install Node.js.

  2. Clone this repository to a folder on your computer. The rest of this document will refer to this folder as $PROJECT_ROOT.

  3. Install project dependencies.

    cd $PROJECT_ROOT
    npm install

Every time you sync $PROJECT_ROOT with the remote GitHub repo

  1. Update the project dependencies.

    cd $PROJECT_ROOT
    npm install

To start the REST API server

  1. Start the REST API server.

    cd $PROJECT_ROOT
    npm start

Contributing

Help Needed

There is other data that could be included in the API. Not all of it is within the scope of the scraper, however.

  • Illustrative examples from 2007 NAICS
  • Information from NAICS prior to 2007 (2002, 1997 - low priority)
  • Data for converting between different NAICS codes and other systems, like SIC or NIGP

On the API side:

  • The API should perform searches on all the available data and return relevant results to the requester (e.g. a business type lookup application)
  • Close existing issues.

Submitting an Issue

We use the GitHub issue tracker to track bugs and features. Before submitting a bug report or feature request, check to make sure it hasn't already been submitted. When submitting a bug report, please include a Gist that includes a stack trace and any details that may be necessary to reproduce the bug, including your Node.js version and operating system. Ideally, a bug report should include a pull request with failing specs.

Submitting a Pull Request

  1. Fork the repository.
  2. Create a topic branch.
  3. Add specs for your unimplemented feature or bug fix.
  4. Implement your feature or bug fix.
  5. Add, commit, and push your changes.
  6. Submit a pull request.

naics-api's Issues

Search results should include entries that contain search terms that are separated by other words

(10/5 EDIT: Reworded for clarity, with some more examples.)

Currently, a search like "space research" returns only results where the two words are not separated by other words. In our case, this functionality is good for returning more relevant results (e.g. NAICS code 927110 “Space Research and Technology”) that the official Census search result does not have. (This can be checked on this search form).

In comparison, the official Census search result also returns 541712, “Research and Development in the Physical, Engineering, and Life Sciences (except Biotechnology)”, because the words "space research" appear, separated by other words, in the list of index entries (aka, the list of alternate titles). We should do the same.

Another situation that might be more prevalent is a search for "pizza shop". The current search returns no results, while the official Census search properly returns 722513 (pizza delivery shops).

cc @rclosner
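
One way to get the behavior requested in this issue is to require that every search word appears somewhere in an entry, regardless of order or distance. The Python sketch below illustrates that idea only; it is not the project's search code. The function names are hypothetical, matching by word prefix (so "shop" matches "shops") is an assumption, and the sample index entry for 541712 is made up for illustration rather than copied from the Census data.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def matches_all_terms(query, text):
    """Return True if every query word is a prefix of some word in text.

    Word order and the distance between words are ignored.
    """
    terms = tokenize(query)
    words = tokenize(text)
    return bool(terms) and all(
        any(word.startswith(term) for word in words) for term in terms
    )


def search(query, entries):
    """Return codes whose title or index entries match the query.

    `entries` maps a NAICS code to a list of strings (title, index entries).
    """
    return [
        code
        for code, texts in entries.items()
        if any(matches_all_terms(query, text) for text in texts)
    ]


# Sample data: titles come from the issue text; the second string for
# 541712 is a made-up index entry used only to demonstrate the match.
SAMPLE = {
    "927110": ["Space Research and Technology"],
    "541712": [
        "Research and Development in the Physical, Engineering, and "
        "Life Sciences (except Biotechnology)",
        "Space vehicles research and development",
    ],
    "722513": ["Pizza delivery shops"],
}

if __name__ == "__main__":
    print(search("space research", SAMPLE))
    print(search("pizza shop", SAMPLE))
```

Prefix matching is a crude stand-in for stemming; it keeps the sketch dependency-free but will also match unrelated words that share a prefix.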

Service Unavailable

I previously used this API by hitting the address http://naics.codeforamerica.org/. Trying to use it today, I see the address changed at some point to http://api.naics.us/. In any case, neither of those addresses is accessible at the moment. Is this a temporary issue, or is the service no longer available? If the latter, is there a suggested replacement?

Thanks for your help!

Consider being helpful by defaulting to last NAICS code set when a user tries to retrieve codes for non-NAICS years (e.g. 2008)

From @ycombinator: Trying to get 2008 codes will return 2007, or trying to get 2013 codes will return 2012, with a helpful (?) warning that NAICS codes don't actually exist for the requested year.

My comment: feature usefulness might depend on actual behavior. The hypothesis is that most people who use NAICS just know what years are available, and will not bother looking for a 2013 code.
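
The fallback described above could look like the following Python sketch. It is an illustration under stated assumptions, not the project's implementation: the function name `resolve_year` and the warning text are hypothetical, and rejecting years earlier than the first data set is an assumption, since the issue does not say what should happen in that case.

```python
from bisect import bisect_right

# Years with NAICS data in the API, per the README (2007 and 2012).
AVAILABLE_YEARS = [2007, 2012]


def resolve_year(requested):
    """Return (year, warning) for a requested year.

    Falls back to the most recent available year at or before the request.
    The warning is None when the requested year exists as-is.
    """
    index = bisect_right(AVAILABLE_YEARS, requested)
    if index == 0:
        # Assumption: nothing sensible to fall back to before the first set.
        raise ValueError(f"No NAICS data available for {requested} or earlier")
    year = AVAILABLE_YEARS[index - 1]
    if year == requested:
        return year, None
    warning = (
        f"NAICS codes do not exist for {requested}; "
        f"returning {year} codes instead"
    )
    return year, warning


if __name__ == "__main__":
    print(resolve_year(2008))
    print(resolve_year(2012))
    print(resolve_year(2013))
```

Returning the warning alongside the resolved year lets the server include it in the JSON response without changing the status of the request.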

2007 Illustrative Examples are not included

They are in the PDF definition file, but not on the census.gov online search tool, so they haven't been scraped. (Maybe they can never be scraped without accessing the PDF?)
