ccao-data / model-condo-avm

Automated valuation model for all class 299 and 399 residential condominiums in Cook County

License: GNU Affero General Public License v3.0

Languages: R 95.06%, HCL 3.96%, Dockerfile 0.99%
Topics: assessment, condo, data-science, machine-learning, model, property-taxes, r, tidymodels

model-condo-avm's People

Contributors

dfsnow, jeancochrane, wrridgeway

model-condo-avm's Issues

Remove common area specific valuation

Valuations believes we are likely perpetuating inaccurate common area designations for condo parcels, and that common areas should not be designated as condo parcels in the first place. We should ignore common area entirely as a category when valuing condos in the pipeline.

Fix write_csv issue in export script

For reasons that are not yet clear, DESCRIPTION and setup.R do not load readr when the export script runs, so the unqualified write_csv call fails. Namespacing the call (readr::write_csv) resolves the issue.

Improve handling of new construction and divisions

The condo model currently handles new construction the same way it handles any other property: by using sales to predict the value of unsold properties. This method doesn't work well for new properties because their sale prices are typically higher than those of similar non-new properties.

We should add flags and/or a separate valuation methodology for new construction condos. Specifically, we should look for new 299 PINs resulting from divisions and 297 PINs that become 299s. We should also flag large YoY drops in building price, as this can be an additional indicator of misvalued new construction.

Note that the 297 or 299 PINs assigned while a building is under construction may not be the same as the final 299 PINs.
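
As a rough illustration of the "flag large YoY drops in building price" idea, here is a minimal Python sketch (the pipeline itself is written in R). The function name, the input layout, and the 25% threshold are all assumptions made for the example and do not come from the repo.

```python
def flag_yoy_drops(building_prices, max_drop=0.25):
    """Flag buildings whose price fell by at least `max_drop` year over year.

    building_prices: {building_id: {year: representative price}} (hypothetical layout).
    max_drop: hypothetical threshold; 0.25 means a drop of 25% or more.
    Returns a list of (building_id, year, fractional_change) tuples.
    """
    flagged = []
    for building_id, by_year in building_prices.items():
        years = sorted(by_year)
        for prev, curr in zip(years, years[1:]):
            # Only compare consecutive years with a usable prior price
            if curr - prev != 1 or by_year[prev] <= 0:
                continue
            change = (by_year[curr] - by_year[prev]) / by_year[prev]
            if change <= -max_drop:
                flagged.append((building_id, curr, round(change, 4)))
    return flagged


# Made-up example data
prices = {
    "bldg_a": {2021: 400_000, 2022: 280_000},  # 30% drop, flagged
    "bldg_b": {2021: 300_000, 2022: 310_000},  # increase, not flagged
}
flags = flag_yoy_drops(prices)
```

A flag like this would only mark a building for review; it says nothing on its own about whether the building is actually new construction.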

Revisit condo strata imputation

The condo model uses recipes-based imputation for condo strata. Currently, it uses KNN with Gower's distance over a few of the most salient condo features (year built, distance, etc.). We should revisit this method, since the strata features do most of the work in the condo model and there are many condos in the City.
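
For readers unfamiliar with the technique, the following is a minimal Python sketch of KNN imputation under Gower's distance. It is not the pipeline's implementation (which is recipes-based, in R); the feature names, the value of k, and the majority-vote rule are assumptions chosen for illustration.

```python
from collections import Counter


def gower_distance(a, b, numeric_ranges):
    """Gower's distance: range-scaled absolute difference for numeric
    features, 0/1 mismatch for categorical features, averaged over features."""
    total = 0.0
    for key in a:
        if key in numeric_ranges:
            rng = numeric_ranges[key]
            total += abs(a[key] - b[key]) / rng if rng else 0.0
        else:
            total += 0.0 if a[key] == b[key] else 1.0
    return total / len(a)


def impute_strata(target, labeled, numeric_ranges, k=3):
    """Assign the most common strata among the k nearest labeled buildings."""
    nearest = sorted(
        labeled, key=lambda row: gower_distance(target, row[0], numeric_ranges)
    )[:k]
    return Counter(strata for _, strata in nearest).most_common(1)[0][0]


# Hypothetical features: year built, a distance measure, and a categorical area
labeled = [
    ({"year_built": 1920, "dist_km": 1.0, "area": "north"}, 8),
    ({"year_built": 1930, "dist_km": 1.5, "area": "north"}, 8),
    ({"year_built": 2005, "dist_km": 12.0, "area": "south"}, 3),
    ({"year_built": 2010, "dist_km": 11.0, "area": "south"}, 3),
    ({"year_built": 1925, "dist_km": 2.0, "area": "north"}, 9),
]
# Observed range of each numeric feature in the labeled data
ranges = {"year_built": 2010 - 1920, "dist_km": 12.0 - 1.0}

target = {"year_built": 1928, "dist_km": 1.2, "area": "north"}
imputed = impute_strata(target, labeled, ranges)
```

With only a handful of features, the neighbors are driven almost entirely by age and location, which is one reason the choice of features and k may deserve a second look.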

Value nonlivables as a function of livables within building

Nonlivable units are difficult to value before disaggregation. If we exclude them from aggregation and then assign them a value based only on their relative share of a building's total value, we could avoid shifting the values of livable units.
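
One possible reading of this proposal, sketched in Python under stated assumptions: the building total is inferred from the livable units alone (their summed values divided by their combined ownership share), and each nonlivable unit then receives its own share of that total. The function name, the input layout, and the gross-up step are assumptions for illustration, not the repo's method.

```python
def value_nonlivables(livable_values, shares):
    """Value nonlivable units from their share of the building total.

    livable_values: {pin: predicted value} for livable units only.
    shares: {pin: ownership share} for every unit in the building (sums to 1).
    Livable values are left untouched; only nonlivable PINs are returned.
    """
    livable_share = sum(shares[pin] for pin in livable_values)
    if livable_share <= 0:
        raise ValueError("building has no livable ownership share")
    # Assumption: gross up the livable total to an implied building total
    building_total = sum(livable_values.values()) / livable_share
    return {
        pin: building_total * share
        for pin, share in shares.items()
        if pin not in livable_values
    }


# Made-up building: two livable units and one parking space
livable = {"unit_1": 190_000, "unit_2": 190_000}
shares = {"unit_1": 0.475, "unit_2": 0.475, "parking_1": 0.05}
nonlivable = value_nonlivables(livable, shares)
```

In this example the implied building total is about 400,000, so the parking space receives roughly 20,000 while both livable values stay exactly as predicted.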

[Infra updates] Copy res model infra updates to the condo model

We have a number of issues in the backlog to make it easier to deploy and run the residential model:

Once these changes have been deployed to the res model and we're feeling confident in their stability and their usefulness, we should replicate them in the condo model as well.

There may be opportunities to factor some shared code out into a composite action, but that would have the downside of requiring us to manage a third repo containing the action, which would need to be updated and versioned in order to make any changes. It may be simpler to just duplicate the logic between these two repos, unless A) the logic is nearly 100% identical or B) we realize that we'll want to use such an action in other repos as well.

Maintain feature parity with `model-res-avm`

model-res-avm has undergone significant modeling upgrades this year. We need to make sure these upgrades are incorporated into the condo model. This includes:

  • Model features
  • Pipeline structure (Setup.R, removing sales val, etc.)
  • Adjustments to reporting ingest and structure
  • AWS EC2 integration

Add price breakdowns by strata to condo pipeline report

It's currently difficult to evaluate how precisely strata are built and imputed. We should add a section to the pipeline report that breaks down sale prices (min, median, mean, max) by strata, separately for assigned and imputed strata.
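
The breakdown described above could look something like the following Python sketch (the actual report is built in R). The record layout, with "strata", "imputed", and "price" keys, is a hypothetical stand-in for whatever the pipeline's data actually contains.

```python
from collections import defaultdict
from statistics import mean, median


def summarize_by_strata(sales):
    """Summarize sale prices per (strata, imputed) group.

    sales: iterable of dicts with hypothetical keys
    "strata" (int), "imputed" (bool), and "price" (number).
    """
    groups = defaultdict(list)
    for sale in sales:
        groups[(sale["strata"], sale["imputed"])].append(sale["price"])
    return {
        key: {
            "n": len(prices),
            "min": min(prices),
            "median": median(prices),
            "mean": mean(prices),
            "max": max(prices),
        }
        for key, prices in sorted(groups.items())
    }


# Made-up sales
sales = [
    {"strata": 1, "imputed": False, "price": 100_000},
    {"strata": 1, "imputed": False, "price": 200_000},
    {"strata": 1, "imputed": True, "price": 150_000},
    {"strata": 2, "imputed": False, "price": 320_000},
]
summary = summarize_by_strata(sales)
```

Placing assigned and imputed groups side by side for the same strata makes it easier to see whether imputed buildings sell in a noticeably different price range.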

Update pipeline to run with new column names

We've changed a lot of column names since we last ran the pipeline. We'll need to update instances where they're hard-coded to make sure the pipeline can run for the sake of investigating the importance of characteristics.

Set 299 parking space values

In the 2021 model, the CCAO used a few different methods to value parking. Primarily, it used a flat $10K FMV for most spaces. We should revisit this method and replace it with a variable rate or regression-based model.

Add new filters for ingesting sales

default.vw_pin_sale will soon be unfiltered by default, so we need to add the following conditions to our queries to make sure we don't ingest unwanted sales:

AND NOT sale.sale_filter_is_outlier
AND NOT sale.sale_filter_deed_type
AND NOT sale.sale_filter_less_than_10k
AND NOT sale.sale_filter_same_sale_within_365
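
For anyone applying the same filters after the data has already been pulled, here is a small Python sketch of equivalent logic. The four column names come from the conditions above; the function name and the row layout are assumptions. It mirrors SQL's behavior by keeping a row only when every flag is explicitly False, so a missing (NULL) flag drops the row just as `AND NOT NULL` would.

```python
SALE_FILTER_COLUMNS = (
    "sale_filter_is_outlier",
    "sale_filter_deed_type",
    "sale_filter_less_than_10k",
    "sale_filter_same_sale_within_365",
)


def keep_sale(sale):
    """Return True only if every filter flag is explicitly False.

    A flag that is None or absent excludes the row, matching how a SQL
    WHERE clause treats `NOT NULL` (the condition is not true).
    """
    return all(sale.get(col) is False for col in SALE_FILTER_COLUMNS)


# Made-up rows
clean = {col: False for col in SALE_FILTER_COLUMNS}
outlier = {**clean, "sale_filter_is_outlier": True}
unknown = {**clean, "sale_filter_deed_type": None}

kept = [row for row in (clean, outlier, unknown) if keep_sale(row)]
```

If rows with unknown flags should be kept instead, the check would need to change, and the SQL would need a matching COALESCE or IS NOT TRUE.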

Update documentation

This repo has undergone major changes over the last year. We need to update the READMEs to accurately reflect those changes.
