speoc-pt-1's Introduction

This is a website built using the template Dimension by HTML5 UP from html5up.net by @ajlkn. Many thanks!

speoc-pt-1's People

Contributors

alicezg2, davidch2020, jasmine-m-garcia, liaochris, mariad18, petergaotx, realjiachengli, snapwhiz914

speoc-pt-1's Issues

Occupations

Tasklist

Categorize occupations the same way for all states. Produce 3 tables:

  • Table 1: 13* tables like the current CT one, one for each state. In this table (or in a separate one), also include how much debt was held by each occupation in a county in a state (like the current CT table, but with more information)
  • Table 2: 1 table for all states, containing amount of debt held by each occupation. When calculating amount, add up using the 6p_Dollar and 6p_Cent columns.

Each table should include the number of individuals in each occupation and the average amount held by each occupation. For people w/o an occupation, include the number of people, the total amount held, and the average amount held. Next to the total amount column, we should also add a percentage column.
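A minimal pandas sketch of the all-states occupation table, reusable per state. Only the 6p_Dollar and 6p_Cent column names come from the task list; the sample rows, the "state" and "occupation" column names, the "(none)" label, and the helper name summarize_by_occupation are hypothetical. It also assumes that 6p_Cent holds hundredths of a dollar, that each row is one individual, and that the percentage column means share of the total amount; adjust if any of those differ in the real data.

```python
import pandas as pd

# Hypothetical sample rows; only 6p_Dollar and 6p_Cent are names from the task list.
debt = pd.DataFrame({
    "state": ["CT", "CT", "CT", "NY", "NY"],
    "occupation": ["merchant", "merchant", None, "farmer", None],
    "6p_Dollar": [100, 50, 20, 10, 20],
    "6p_Cent": [50, 25, 0, 75, 50],
})


def summarize_by_occupation(df: pd.DataFrame) -> pd.DataFrame:
    """One row per occupation: head count, total, average, and share of total."""
    out = df.copy()
    # Assumption: cents are hundredths of a dollar. Change the divisor otherwise.
    out["amount"] = out["6p_Dollar"] + out["6p_Cent"] / 100
    # Give people without an occupation their own row instead of dropping them.
    out["occupation"] = out["occupation"].fillna("(none)")
    summary = (
        out.groupby("occupation")["amount"]
        .agg(n_individuals="count", total_amount="sum", average_amount="mean")
        .reset_index()
    )
    summary["pct_of_total"] = (
        100 * summary["total_amount"] / summary["total_amount"].sum()
    )
    return summary


# Table 2: one table for all states.
all_states = summarize_by_occupation(debt)

# Table 1: one table per state, built with the same function.
per_state = {state: summarize_by_occupation(g) for state, g in debt.groupby("state")}
```

A county-level breakdown could reuse the same function by grouping on a county column as well, once that column exists in the data.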

Histogram of occupation vs no occupation debt distribution

  • One histogram for all the states
  • 13* histograms, one for each state (like the CT one you made already)
  • When I say 13, I mean however many states we have data for.
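One way to make the occupation and no-occupation histograms comparable is to bin both groups on the same edges and compare shares rather than raw counts. The sketch below only computes the binned shares with NumPy; the sample amounts, the "has_occupation" column, the bin edges, and the helper name binned_share are all hypothetical. The resulting arrays could be passed to whatever plotting code the CT histogram already uses.

```python
import numpy as np
import pandas as pd

# Hypothetical amounts; in practice this would be the per-person debt totals.
amounts = pd.DataFrame({
    "has_occupation": [True, True, True, False, False],
    "amount": [5.0, 15.0, 95.0, 25.0, 45.0],
})

# Shared bin edges so both groups are counted on the same scale: 0, 20, ..., 100.
edges = np.linspace(0, 100, 6)


def binned_share(values, bin_edges):
    """Fraction of observations in each bin (assumes at least one value in range)."""
    counts, _ = np.histogram(values, bins=bin_edges)
    return counts / counts.sum()


with_occ = binned_share(amounts.loc[amounts["has_occupation"], "amount"], edges)
without_occ = binned_share(amounts.loc[~amounts["has_occupation"], "amount"], edges)
```

Running the same two calls inside a loop over states would give the per-state versions.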

Town-County-State Asset table

Let's see if we can impute county or state for certain towns when they don't have a state listed.

  • Create table of the amount of assets in each town, with town county and state columns. Create separate table for "weird" missing values.
  • Sanity check: make sure the towns we categorize as being from the same county are actually in the same county, according to our county crosswalk
  • Check each of the fuzzy string matches
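A rough sketch of how the imputation and the fuzzy-match review could fit together, using difflib from the standard library. The crosswalk contents, the 0.8 cutoff, and the function name impute_location are hypothetical; the real county crosswalk and matching method may differ. Each result is tagged so that fuzzy matches can be listed and checked by hand, and unmatched towns can go into the separate "weird" table.

```python
from difflib import get_close_matches

# Hypothetical crosswalk: normalized town name -> (county, state).
crosswalk = {
    "hartford": ("Hartford", "CT"),
    "new haven": ("New Haven", "CT"),
    "albany": ("Albany", "NY"),
}


def impute_location(town, cutoff=0.8):
    """Return (matched_town, county, state, match_type) for a raw town string."""
    key = town.strip().lower()
    if key in crosswalk:
        county, state = crosswalk[key]
        return (key, county, state, "exact")
    # Fall back to the single closest crosswalk name above the similarity cutoff.
    close = get_close_matches(key, list(crosswalk), n=1, cutoff=cutoff)
    if close:
        county, state = crosswalk[close[0]]
        return (close[0], county, state, "fuzzy")
    return (key, None, None, "unmatched")


raw_towns = ["Hartfrod", " Albany ", "Springfield"]
results = [impute_location(t) for t in raw_towns]

# Rows to review by hand: every fuzzy match, paired with the original spelling.
to_review = [(raw, res) for raw, res in zip(raw_towns, results) if res[3] == "fuzzy"]
```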

Maps

Maria, please add to the task list if you have any ideas or lmk if any of these are infeasible. Others, feel free to jump in if you have questions or are interested in working on the maps.

Different Types of Maps

  • One map with all 13 colonies (or however many we have data for)
  • ~13 maps, one for each state
  • Add county names to maps (for maps that only contain one state)

Different Types of Debt Aggregations (I'll get you the data and reference the issue when I do)

  • Map with average amount of debt held, per debt holder
  • Map with per capita amount of debt held, per county population
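The two aggregations could be computed per county along these lines before being joined to the map geometry. All table contents and column names here ("county", "holder", "amount", "population") are hypothetical placeholders, since the actual data has not been shared yet.

```python
import pandas as pd

# Hypothetical certificate-level holdings and county populations.
holdings = pd.DataFrame({
    "county": ["Hartford", "Hartford", "Hartford", "New Haven"],
    "holder": ["a", "b", "a", "c"],
    "amount": [100.0, 50.0, 30.0, 60.0],
})
population = pd.DataFrame({
    "county": ["Hartford", "New Haven"],
    "population": [1000, 500],
})

by_county = (
    holdings.groupby("county")
    # Count distinct holders so a person with several certificates counts once.
    .agg(total_debt=("amount", "sum"), n_holders=("holder", "nunique"))
    .reset_index()
    # validate raises if a county appears more than once on either side.
    .merge(population, on="county", how="left", validate="one_to_one")
)
by_county["avg_per_holder"] = by_county["total_debt"] / by_county["n_holders"]
by_county["per_capita"] = by_county["total_debt"] / by_county["population"]
```

Using a left merge keeps counties that have debt data but no population figure; their per capita value comes out as NaN, which makes gaps easy to spot before mapping.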

Proofing Maps

  • On the report, add links to old and new county maps for each state so we're confident boundaries haven't changed much, if at all

Main Improvements

These are the problems the code currently has. You can read the notes above each function for more details.

  1. Get rid of the SettingWithCopyWarning: the way I wrote the original code was not optimized for pandas so it gave a warning when run. This probably requires reading documentation and finding a smarter way to implement merging/replacing rows.

  2. Deal with sheets that have two names on certificates: the standardize function now forces each sheet to have four name columns (two full names), but the rest of the code does not handle that properly. I think the NaN values in the empty name 2 columns are interfering with something in the original code.

  3. Add other functions to clean cell content: Chris has a lot of code he wrote for specific situations that could be applied to the data before it undergoes simplification. I have written a simple function to lowercase everything.

  4. Determine what other information to include in the clean data: I added a row called 'Cert Count' to keep track of how many rows are merged. I also added a sanity check in the bookkeeping function that adds all the Cert Counts together and compares it to the original number of rows in the data. The check seems to work, but the actual simplify code is broken so I don't know for sure.
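As a starting point for items 1, 2, and 4, here is a hedged sketch rather than a fix for the existing functions. It assumes the simplify step amounts to merging rows with identical names; the column names, sample data, and function names (standardize_names, simplify, cert_counts_match) are hypothetical. Two pandas behaviors it relies on: working on an explicit .copy() (or assigning through .loc on the original frame) avoids SettingWithCopyWarning, and groupby drops rows whose key is NaN by default, which is one way empty name 2 columns could make rows disappear.

```python
import pandas as pd

# Hypothetical sheet with four name columns, as the standardize step produces.
raw = pd.DataFrame({
    "First Name 1": ["John", "john", "Mary", "Mary"],
    "Last Name 1": ["Smith", "SMITH", "Jones", "Jones"],
    "First Name 2": [None, None, "Ann", None],
    "Last Name 2": [None, None, "Lee", None],
    "amount": [10.0, 5.0, 7.0, 3.0],
})
name_cols = ["First Name 1", "Last Name 1", "First Name 2", "Last Name 2"]


def standardize_names(df: pd.DataFrame) -> pd.DataFrame:
    # Explicit copy: later assignments never touch a view of another frame,
    # so pandas has no reason to emit SettingWithCopyWarning.
    out = df.copy()
    for col in name_cols:
        # Empty strings instead of NaN, so groupby keeps rows with no second name.
        out[col] = out[col].fillna("").str.lower()
    return out


def simplify(df: pd.DataFrame) -> pd.DataFrame:
    """Merge rows with identical names; Cert Count records how many were merged."""
    return (
        df.groupby(name_cols)
        .agg(amount=("amount", "sum"), **{"Cert Count": ("amount", "size")})
        .reset_index()
    )


def cert_counts_match(clean: pd.DataFrame, original: pd.DataFrame) -> bool:
    # Bookkeeping check: merged rows must add back up to the original row count.
    return int(clean["Cert Count"].sum()) == len(original)


clean = simplify(standardize_names(raw))
```

Passing dropna=False to groupby is an alternative to filling the empty name 2 columns, if keeping NaN in the output is preferred.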
