Cancer Mortality Regression


We built a linear regression model to predict cancer death rates for counties across the U.S. The data was collected from the American Community Survey (census.gov), clinicaltrials.gov, and cancer.gov (see https://data.world/nrippner/ols-regression-challenge for more detail).


Summary

Cancer has a large impact on society, affecting people from all walks of life across the United States. In this report, we build a regression model that predicts the cancer mortality rate for counties across the U.S. We chose this topic because of a shared interest in national healthcare and healthcare policy. The research objective of this project was twofold:

  1. Identify key characteristics that are associated with cancer death rates
  2. Build a model to predict the cancer death rate in each county (death rates normalized according to population)

Through building a regression model and analyzing the predictions of our model, we hoped to glean some insight into the characteristics of cancer mortality rates in the United States (see "Discussion and Future Improvements" section of final report for further discussion of model applications).
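As a rough illustration of the second objective, the following is a minimal sketch of an ordinary least squares fit. It is not the code from the final report (which is written in R Markdown); the function names, the toy predictors, and the per-100,000 scaling in `deaths_per_100k` are assumptions made for this example, and the data is synthetic.

```python
import numpy as np


def deaths_per_100k(deaths, population):
    """Normalize raw death counts by population (assumed scale: per 100,000 residents)."""
    deaths = np.asarray(deaths, dtype=float)
    population = np.asarray(population, dtype=float)
    return 1e5 * deaths / population


def fit_ols(X, y):
    """Fit ordinary least squares with an intercept; returns [intercept, slopes...]."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Prepend a column of ones so the first coefficient is the intercept.
    design = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return beta


def predict(beta, X):
    """Predict the response for new rows of predictors."""
    X = np.asarray(X, dtype=float)
    return beta[0] + X @ beta[1:]


# Synthetic, noiseless toy data (NOT the county dataset): two hypothetical
# standardized predictors and a response generated from known coefficients.
rng = np.random.default_rng(0)
X_toy = rng.normal(size=(200, 2))
y_toy = 180.0 + 5.0 * X_toy[:, 0] - 3.0 * X_toy[:, 1]

beta_hat = fit_ols(X_toy, y_toy)
y_hat = predict(beta_hat, X_toy)
```

Because the toy response has no noise, the fitted coefficients should match the generating ones up to floating-point error; on real county data the coefficients would instead be read as associations, in line with the first objective.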

File Structure

  • Final Report.Rmd, Final-Report.pdf: All code from start to finish, final report submission
  • Project Proposal.Rmd, Project-Proposal.pdf: Proposal for the final project

Data

  • cancer_reg.csv: original dataset downloaded from the OLS challenge linked above (see Appendix A of the report for the full data dictionary)
  • regions.csv: dataset mapping states to their geographic region and division
  • State_Abbreviation_Mapping.csv: dataset mapping states to their abbreviations
  • Other datasets: intermediate datasets produced during cleaning and preprocessing (can be ignored if you are only reading the final report)
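One plausible way the two mapping files could be combined with county-level rows is sketched below. This is an assumption about the preprocessing, not the authors' actual code: the column names (`state`, `abbreviation`, `region`, `division`, `county`) and the inline CSV text are hypothetical stand-ins for the real files.

```python
import csv
import io


def load_rows(csv_text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def attach_state_info(county_rows, abbrev_rows, region_rows):
    """Add abbreviation, region, and division to each county row via its state name."""
    state_to_abbrev = {r["state"]: r["abbreviation"] for r in abbrev_rows}
    state_to_region = {r["state"]: (r["region"], r["division"]) for r in region_rows}
    enriched = []
    for row in county_rows:
        # Unknown states map to None instead of raising, so gaps stay visible.
        region, division = state_to_region.get(row["state"], (None, None))
        enriched.append({
            **row,
            "abbreviation": state_to_abbrev.get(row["state"]),
            "region": region,
            "division": division,
        })
    return enriched


# Tiny inline stand-ins for the real CSV files (hypothetical column names).
abbrev_csv = "state,abbreviation\nWashington,WA\nCalifornia,CA\n"
region_csv = "state,region,division\nWashington,West,Pacific\nCalifornia,West,Pacific\n"
county_csv = "county,state\nKitsap County,Washington\nExample County,Atlantis\n"

merged = attach_state_info(
    load_rows(county_csv), load_rows(abbrev_csv), load_rows(region_csv)
)
```

Returning `None` for unmatched states, rather than dropping the row, makes it easier to spot spelling mismatches between files before any modeling step.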

Archive

Contains intermediate files that represent different components of the final report


Kendall Kikkawa, Jonathan Luo, and Andre Sha's final project for UC Berkeley's Stat 151A (Linear Modeling: Theory and Applications), Fall 2020.
