Summer 2022 General Report
Report Link: https://docs.google.com/document/d/1d2tXolnwAempscJsL5v80vjU9GEMAhRLjn-C637vma0/edit
The idea is to keep track of our work: for each section, show a few figures/tables and describe how we made them. The rest of the figures/tables can either be linked (include the GitHub URL) or put in an appendix.
We should include links to the code and to the figures when we're referencing them, in addition to actually including the picture.
Occupations
Tasklist
Categorize occupations in a similar way for all states. Produce 3 tables:
- Table 1: 13* tables like the current CT one, one for each state. In this table (or in a separate one), also include how much debt was held by each occupation in a county in a state (like the current CT table, but with more information)
- Table 2: 1 table for all states, containing amount of debt held by each occupation. When calculating amount, add up using the 6p_Dollar and 6p_Cent columns.
Include number of individuals in occupation for each table, average amount held by each occupation, average amount held by people w/o occupation, total amount held by people w/o occupation and number of people w/o occupation. For the total amount column, we should also add a percentage column.
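A minimal sketch of how the per-occupation summary could be computed with pandas. Only `6p_Dollar` and `6p_Cent` come from the task list; the sample rows, the `state` and `occupation` column names, and the `summarize_by_occupation` helper are hypothetical. It assumes cents are hundredths of a dollar and counts one row per individual, so duplicates would need to be merged first.

```python
import pandas as pd

# Hypothetical sample data; real column names other than 6p_Dollar/6p_Cent may differ.
debt = pd.DataFrame({
    "state": ["CT", "CT", "CT", "PA", "PA"],
    "occupation": ["merchant", None, "farmer", "merchant", None],
    "6p_Dollar": [100, 50, 20, 200, 30],
    "6p_Cent": [50, 25, 0, 75, 50],
})

def summarize_by_occupation(df):
    """Count, total, average and share of debt per occupation."""
    out = df.copy()
    # Assumption: cents are hundredths of a dollar; missing values count as 0.
    out["amount"] = out["6p_Dollar"].fillna(0) + out["6p_Cent"].fillna(0) / 100
    # People without a listed occupation get their own row in the table.
    out["occupation"] = out["occupation"].fillna("no occupation")
    summary = out.groupby("occupation")["amount"].agg(
        n_individuals="count", total_amount="sum", average_amount="mean"
    )
    summary["pct_of_total"] = 100 * summary["total_amount"] / summary["total_amount"].sum()
    return summary.sort_values("total_amount", ascending=False)

# Table 2: one table for all states.
all_states = summarize_by_occupation(debt)
# Table 1: one table per state, for however many states we have data for.
by_state = {state: summarize_by_occupation(group) for state, group in debt.groupby("state")}
```

The same helper serves both tables, so the occupation categories and the percentage column stay consistent between the all-states view and the per-state views.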
Histogram of occupation vs no occupation debt distribution
- One histogram for all the states
- 13* histograms, one for each state (like the CT one you made already)
* When I say 13, I mean however many states we have data for.
Town-County-State Asset table
Let's see if we can impute county or state for certain towns when they don't have a state listed.
- Create table of the amount of assets in each town, with town county and state columns. Create separate table for "weird" missing values.
- Sanity check to make sure the towns we categorize as being the same are in the same county, according to our county crosswalk
- Check each of the fuzzy string matches
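One possible way to do the imputation, sketched with the standard library's `difflib`. The crosswalk contents, the `impute_location` helper, and the 0.85 cutoff are all hypothetical and not taken from our code. The matched town name is kept next to the original so each fuzzy match can be checked by hand, and anything below the cutoff goes to a separate list for the "weird" values table.

```python
from difflib import get_close_matches

# Hypothetical crosswalk: lowercase town name -> (county, state).
crosswalk = {
    "hartford": ("Hartford", "CT"),
    "new haven": ("New Haven", "CT"),
    "philadelphia": ("Philadelphia", "PA"),
}

def impute_location(town, crosswalk, cutoff=0.85):
    """Return (matched_town, county, state), or None if nothing is close enough."""
    key = town.strip().lower()
    matches = get_close_matches(key, list(crosswalk), n=1, cutoff=cutoff)
    if not matches:
        return None
    county, state = crosswalk[matches[0]]
    return matches[0], county, state

towns_missing_state = ["Hartfort", "New Havan", "Philadelphia ", "Springfield"]
imputed, unmatched = {}, []
for town in towns_missing_state:
    result = impute_location(town, crosswalk)
    if result is None:
        unmatched.append(town)   # candidates for the separate "weird" table
    else:
        imputed[town] = result   # keep the matched name for manual review
```

A stricter cutoff gives fewer wrong matches but more unmatched towns, so the threshold is worth tuning against a sample we have checked manually.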
Maps
Maria, please add to the task list if you have any ideas or lmk if any of these are infeasible. Others, feel free to jump in if you have questions or are interested in working on the maps.
Different Types of Maps
- One map with all 13 colonies (or however many we have data for)
- ~13 maps, one for each state
- add county names to maps (for maps that only contain one state)
Different Types of Debt Aggregations (I'll get you the data and reference the issue when I do)
- Map with average amount of debt held, per debt holder
- Map with per capita amount of debt held, per county population
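The two aggregations could be computed per county roughly like this before being joined to the map shapes. This is a sketch with made-up numbers: the `holders` and `population` frames and all column names are hypothetical placeholders for the data referenced above.

```python
import pandas as pd

# Hypothetical inputs: one row per debt holder, and one row per county.
holders = pd.DataFrame({
    "state": ["CT", "CT", "CT", "PA"],
    "county": ["Hartford", "Hartford", "New Haven", "Philadelphia"],
    "holder": ["a", "b", "c", "d"],
    "amount": [100.0, 300.0, 50.0, 500.0],
})
population = pd.DataFrame({
    "state": ["CT", "CT", "PA"],
    "county": ["Hartford", "New Haven", "Philadelphia"],
    "population": [40000, 30000, 50000],
})

# Total debt and number of distinct holders in each county.
county_debt = holders.groupby(["state", "county"], as_index=False).agg(
    total_debt=("amount", "sum"),
    n_holders=("holder", "nunique"),
)
# Aggregation 1: average amount of debt held, per debt holder.
county_debt["avg_per_holder"] = county_debt["total_debt"] / county_debt["n_holders"]

# Aggregation 2: debt per capita, using county population.
# validate="one_to_one" raises if a county appears twice in either table.
county_debt = county_debt.merge(
    population, on=["state", "county"], how="left", validate="one_to_one"
)
county_debt["debt_per_capita"] = county_debt["total_debt"] / county_debt["population"]
```

Grouping on both `state` and `county` matters because county names repeat across states.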
Proofing Maps
- On the report, add links to old and new county maps for each state so we're confident boundaries haven't changed much, if at all
Main Improvements
These are the problems the code currently has; you can read the notes above each function for more details.
- Get rid of the SettingWithCopyWarning: the way I wrote the original code was not optimized for pandas so it gave a warning when run. This probably requires reading documentation and finding a smarter way to implement merging/replacing rows.
- Deal with sheets that have two names on certificates: the standardize function now forces each sheet to have four name columns (two full names) but the rest of the code does not properly handle that. I think the NaN values in the empty name 2 columns seem to be interfering with something in the original code.
- Add other functions to clean cell content: Chris has a lot of code he wrote for specific situations that could be applied to the data before it undergoes simplification. I have written a simple function to lowercase everything.
- Determine what other information to include in the clean data: I added a row called 'Cert Count' to keep track of how many rows are merged. I also added a sanity check in the bookkeeping function that adds all the Cert Counts together and compares it to the original number of rows in the data. The check seems to work, but the actual simplify code is broken so I don't know for sure.
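As a starting point, here is a sketch of one way the merge step could be restructured. It is not the existing simplify code: the `simplify` signature, the `name_1`/`name_2`/`amount` columns, and the sample rows are hypothetical. It works on an explicit copy and assigns whole columns, which is the usual way to avoid SettingWithCopyWarning, and it passes `dropna=False` to `groupby`, because by default pandas drops groups whose key is NaN. That default is one possible reason the empty second-name columns cause trouble, though I have not confirmed it against the real data.

```python
import pandas as pd

# Hypothetical sample: name_2 is empty when a certificate has only one name.
raw = pd.DataFrame({
    "name_1": ["John Smith ", "john smith", "Mary Jones", "MARY JONES", "Ann Lee"],
    "name_2": [None, None, "Tom Jones", "tom jones", None],
    "amount": [10.0, 5.0, 7.5, 2.5, 1.0],
})

def simplify(df, name_cols):
    """Merge rows that share the same cleaned names, tracking how many were merged."""
    clean = df.copy()  # explicit copy: later assignments never touch a slice of df
    for col in name_cols:
        # Whole-column assignment; missing names stay missing.
        clean[col] = clean[col].str.strip().str.lower()
    clean["Cert Count"] = 1  # each original row is one certificate
    merged = clean.groupby(name_cols, as_index=False, dropna=False)[
        ["amount", "Cert Count"]
    ].sum()
    # Bookkeeping check: no certificate may be lost or double counted.
    if merged["Cert Count"].sum() != len(df):
        raise ValueError("Cert Count total does not match the original row count")
    return merged

merged = simplify(raw, ["name_1", "name_2"])
```

With the default `dropna=True`, every single-name row in this sample would silently disappear from the result, and the Cert Count check would catch it.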