kgudel / hmda-census

This project forked from cfpb/hmda-census

ETL for geographic and Census data used by the HMDA Platform

Jupyter Notebook 83.44% Python 16.56%

HMDA Census Geographic and Demographic Data

Repository Purpose

  • Provide an ETL for geographic and Census data used by the HMDA Platform
  • Check to ensure the accuracy of Census data in the HMDA Platform

Requirements, Setup, and Running the Code

Install Requirements: The code is written in Python 3. The items below are required and can be installed using the commands listed.

  • Python 3.6 or greater
  • set up a virtual environment if desired: virtualenv venv
    • turn on virtual environment: source venv/bin/activate
    • turn off virtual environment: deactivate
  • install required packages: pip install -r requirements.txt
  • Note: to load data files to a database, you must have one installed locally. This code has been tested with PostgreSQL.

Creating Yearly Census File for the HMDA Platform

  1. Update python/census_config.yaml to include the relevant year's Census file in the msa_md_delinations section.
  2. Update the year variable in the python/create_ffiec_census_file.py file to be the year for which you want to generate the platform census file.
  3. Run the python/create_ffiec_census_file.py file.
  4. The file will be created in output/ as ffiec_census_msamd_names_<year>.txt
  5. Move the file into the HMDA-Platform repo as common/src/main/resources/ffiec_census_<year>.txt
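
Steps 4 and 5 can be scripted. The following is a minimal sketch, assuming both repositories are checked out locally. The function name `publish_census_file` and its two root-directory arguments are hypothetical and not part of this repository; only the two relative paths come from the steps above.

```python
import shutil
from pathlib import Path


def publish_census_file(year, census_repo, platform_repo):
    """Copy the generated census file into the platform repo under its new name.

    Hypothetical helper: census_repo and platform_repo are local checkout roots.
    """
    source = Path(census_repo) / "output" / f"ffiec_census_msamd_names_{year}.txt"
    target = (
        Path(platform_repo)
        / "common" / "src" / "main" / "resources"
        / f"ffiec_census_{year}.txt"
    )
    if not source.is_file():
        raise FileNotFoundError(f"Run create_ffiec_census_file.py first: {source}")
    target.parent.mkdir(parents=True, exist_ok=True)
    # Copy instead of move so the original stays in output/ for later checks.
    shutil.copy2(source, target)
    return target
```

The sketch copies rather than moves the file; replace `shutil.copy2` with `shutil.move` if the output copy should not be kept.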

Working With the Scripts

Configuration: Determines which years of data to use, allows selection of fields in both data files, and contains the relevant data specifications and URLs.

The configuration is used by the class in census_functions.py. The test.py script contains examples that use the class to download, cut, merge, and load the resulting census data to a database.
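
To illustrate the "cut" step, here is a minimal sketch of selecting a subset of columns from a wide delimited file. The function name `cut_fields`, the comma delimiter, and the column positions are assumptions for illustration; the class in census_functions.py reads its field selection from the configuration and may work differently.

```python
import csv
import io


def cut_fields(lines, positions, delimiter=","):
    """Yield only the columns at the given 0-based positions from each row."""
    for row in csv.reader(lines, delimiter=delimiter):
        yield [row[i] for i in positions]


# Made-up sample rows, not real FFIEC data.
sample = io.StringIO(
    "2020,99999,01,001,020100,1500\n"
    "2020,99999,01,001,020200,2300\n"
)
subset = list(cut_fields(sample, positions=[1, 4, 5]))
```

Because `cut_fields` is a generator, a large flat file can be processed row by row without loading it fully into memory.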

Current issues:

  • MSA to tract mapping verification needs to be updated for the new codebase
  • MSA delineation files pre-2000 are in a different format that is yet to be parsed

Sources of Data

The HMDA Platform uses data that combines elements of the FFIEC Census Flat File and the OMB MSA delineation files. The FFIEC Census file contains over 1,000 data elements, of which the HMDA Platform uses a small subset. The OMB MSA bulletins are primarily used for names.

The Office of Management and Budget produces MSA data. Updates can include changes to an MSA's boundaries or the creation of new MSAs. These data have no regular publication cycle. HMDA Operations uses the MSA definitions in effect on 12/31 of the year preceding collection; this aligns with other Regulation C criteria.
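
The 12/31 rule can be sketched as a date comparison. This is an illustration only: the function name `delineation_in_effect` is hypothetical, the release dates are made up, and it assumes a delineation takes effect on its release date and that "year preceding collection" means `collection_year - 1`.

```python
from datetime import date


def delineation_in_effect(release_dates, collection_year):
    """Return the latest release date on or before 12/31 of the prior year."""
    cutoff = date(collection_year - 1, 12, 31)
    eligible = [d for d in release_dates if d <= cutoff]
    if not eligible:
        raise ValueError(f"No delineation released on or before {cutoff}")
    return max(eligible)


# Made-up release dates for illustration only.
releases = [date(2015, 7, 1), date(2017, 8, 1), date(2018, 9, 1)]
chosen = delineation_in_effect(releases, collection_year=2018)
```

With these sample dates the cutoff is 2017-12-31, so the 2018 release is skipped and the 2017 one is chosen.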

The Census delineation files are used to map names to MSA/MD geographies.

The FFIEC produces an annual Census Flat File containing demographic data and a mapping of MSA data to Census tract.

Additional Census data are available but not used in this project: the Census reference files contain MSA/MD and micropolitan statistical area definitions, names, and mappings to county and tract codes.

Uses of Data

The HMDA Platform uses Census data during data submission and publication.

During submission, Census data are used to verify the relationship between reported geographic identifiers for loans and applications.
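
A minimal sketch of such a check follows, assuming each record carries `state`, `county`, and `tract` keys and that the valid combinations have been loaded into a set. The function name, the key names, and the sample values are hypothetical; the HMDA Platform's actual validation logic is not shown in this repository description.

```python
def find_invalid_geographies(records, known_tracts):
    """Return the records whose state/county/tract combination is not in known_tracts."""
    return [
        record
        for record in records
        if (record["state"], record["county"], record["tract"]) not in known_tracts
    ]


# Made-up identifiers for illustration only.
known_tracts = {("01", "001", "020100"), ("01", "001", "020200")}
submitted = [
    {"state": "01", "county": "001", "tract": "020100"},
    {"state": "01", "county": "003", "tract": "020100"},  # tract not in this county
]
invalid = find_invalid_geographies(submitted, known_tracts)
```

Keeping the valid combinations in a set makes each lookup constant time, which matters when checking large submissions.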

In publication, the Census demographic and geographic data are used to add demographic information to LAR datasets. The variables added include:

  • Total Population
  • Minority Population Percentage
  • FFIEC Median Family Income
  • Tract to MSA/MD Income Percentage
  • Number of Owner Occupied Units
  • Number of 1 to 4 Family Units
  • MSA (new in 2018, was previously submitted by FIs)
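
Adding these variables amounts to a lookup from each LAR row to its tract's census record. The sketch below assumes rows are dictionaries keyed by a `tract` identifier; the function name, field names, and sample values are hypothetical and do not reflect the HMDA Platform's actual schema.

```python
def add_census_fields(lar_rows, census_by_tract, fields):
    """Return copies of the LAR rows with the chosen census fields appended."""
    enriched = []
    for row in lar_rows:
        census = census_by_tract.get(row["tract"], {})
        new_row = dict(row)  # copy so the input rows are left unchanged
        for field in fields:
            new_row[field] = census.get(field)  # None when the tract is unknown
        enriched.append(new_row)
    return enriched


# Made-up values for illustration only.
census_by_tract = {
    "01001020100": {"total_population": 1500, "minority_population_pct": 12.5},
}
lar_rows = [
    {"loan_id": "A1", "tract": "01001020100"},
    {"loan_id": "B2", "tract": "00000000000"},
]
enriched = add_census_fields(
    lar_rows, census_by_tract, ["total_population", "minority_population_pct"]
)
```

Rows with an unknown tract are kept with empty census fields rather than dropped, so the published row count still matches the submitted one.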

Census geographic data are used to map MSAs to county and tract areas in the Aggregate and Disclosure reports and for geographic lookup features in HMDA data tools web interfaces.

See here for the HMDA-Platform logic mapping Census to LAR data.

HMDA Publication Products

  • Aggregate Reports: contain MSA level data on application and lending activity for all institutions reporting HMDA data.
  • Disclosure Reports: contain MSA level data on application and lending activity for a single institution.
  • LAR snapshot publication: contains the entire dataset of loans and applications submitted in accordance with Regulation C.

HMDA Platform Census Files

Contributors

kgudel, kibrael, patrickgoraft
