Lifecycle: experimental CRAN status R-CMD-check

addr

The goal of addr is to clean, parse, harmonize, and hash messy real-world addresses in R.

Addresses that were not validated at the time of collection are often heterogeneously formatted, making them difficult to compare or link to other sets of addresses. The addr package is designed to (1) clean character strings of addresses, (2) use the usaddress library to tag address components, (3) expand common street type abbreviations, and (4) paste together select components to create a standardized address. Standardized addresses can then be hashed to create "hashdresses", which serve as keys for merging with other sets of addresses.
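
To make the steps above concrete, here is a minimal base R sketch of steps (1), (3), and (4). It is not addr's implementation: standardize_sketch() and its small abbreviation lookup are hypothetical, and the tagging step (2) is skipped entirely, so components such as apartment numbers are not dropped the way addr_standardize() drops them.

```r
# Hypothetical helper, NOT part of addr: illustrates cleaning, abbreviation
# expansion, and pasting with base R only (no component tagging).
standardize_sketch <- function(x) {
  # (1) clean: lowercase, drop everything except letters, digits, and spaces
  x <- tolower(x)
  x <- gsub("[^a-z0-9 ]", "", x)
  x <- gsub(" +", " ", trimws(x))

  # (3) expand a few street type abbreviations (illustrative lookup only)
  abbrev <- c(ave = "avenue", av = "avenue", rd = "road", blvd = "boulevard")

  words <- strsplit(x, " ", fixed = TRUE)
  vapply(words, function(w) {
    hit <- w %in% names(abbrev)
    w[hit] <- abbrev[w[hit]]
    # (4) paste the pieces back together into one string
    paste(w, collapse = " ")
  }, character(1))
}

standardize_sketch(c(
  "3333 Burnet Ave Cincinnati OH 45220",
  "3333 bUrNeT Av. Cincinnati OH 45220"
))
```

Because this sketch matches whole words without knowing which component they belong to, it would also expand an abbreviation that appears in a street or city name. Tagging components first, as addr does with usaddress, avoids that.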

Installation

You can install the development version of addr from GitHub with:

# install.packages("pak")
pak::pak("cole-brokamp/addr")

addr requires a working Rust toolchain; install one using rustup.

Example

library(addr)

Transform messy, real-world addresses into a character vector of standardized addresses:

addr_standardize(
  x = c(
    "3333 Burnet Avenue Apt 2 Cincinnati OH 45220",
    "3333 bUrNeT Avenue Cincinnati OH 45220",
    "3333 Burnet Avenue Apt #2 Cincinnati OH 45220",
    "3333 Burnet Ave Cincinnati OH 45220",
    "3333 Burnet Av. Cincinnati OH 45220"
  )
)
#> [1] "3333 burnet avenue cincinnati oh 45220"
#> [2] "3333 burnet avenue cincinnati oh 45220"
#> [3] "3333 burnet avenue cincinnati oh 45220"
#> [4] "3333 burnet avenue cincinnati oh 45220"
#> [5] "3333 burnet avenue cincinnati oh 45220"

Alternatively, return a hash of each standardized address:

addr_hash(
  x = c(
    "3333 Burnet Avenue Apt 2 Cincinnati OH 45220",
    "3333 bUrNeT Avenue Cincinnati OH 45220",
    "3333 Burnet Avenue Apt #2 Cincinnati OH 45220",
    "3333 Burnet Ave Cincinnati OH 45220",
    "3333 Burnet Av. Cincinnati OH 45220"
  )
)
#> [1] "da219816d9cb3e1bb53291312cfa1dfd" "da219816d9cb3e1bb53291312cfa1dfd"
#> [3] "da219816d9cb3e1bb53291312cfa1dfd" "da219816d9cb3e1bb53291312cfa1dfd"
#> [5] "da219816d9cb3e1bb53291312cfa1dfd"
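
Since differently formatted versions of the same address share one hash, the hash can serve as a join key. The following is a sketch using base R's merge(). The two data frames and their column names are hypothetical, their hash columns are assumed to have been computed beforehand with addr_hash(), and the all-zero hash is a made-up placeholder for an address that has no match.

```r
# Hypothetical data set 1: records whose addresses were already hashed
visits <- data.frame(
  id = c("a", "b"),
  hash = c(
    "da219816d9cb3e1bb53291312cfa1dfd",  # hash shown in the example above
    "00000000000000000000000000000000"   # made-up placeholder, no match
  ),
  stringsAsFactors = FALSE
)

# Hypothetical data set 2: a reference table keyed by the same kind of hash
parcels <- data.frame(
  parcel = "p1",
  hash = "da219816d9cb3e1bb53291312cfa1dfd",
  stringsAsFactors = FALSE
)

# Left join on the hash: unmatched rows are kept with NA in `parcel`
linked <- merge(visits, parcels, by = "hash", all.x = TRUE)
linked
```

Using all.x = TRUE keeps unmatched records visible, which makes it easier to see which addresses failed to link.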

For finer control, use addr_tag() to generate tagged address components:

addr_tag(c(
  "290 Ludlow Avenue Apt #2 Cincinnati OH 45220",
  "3333 Burnet Ave Cincinnati OH 45219"
))
#> [[1]]
#>       AddressNumber          StreetName  StreetNamePostType       OccupancyType 
#>               "290"            "Ludlow"            "Avenue"               "Apt" 
#> OccupancyIdentifier           PlaceName           StateName             ZipCode 
#>                 "2"        "Cincinnati"                "OH"             "45220" 
#> 
#> [[2]]
#>      AddressNumber         StreetName StreetNamePostType          PlaceName 
#>             "3333"           "Burnet"              "Ave"       "Cincinnati" 
#>          StateName            ZipCode 
#>               "OH"            "45219"
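
The printed result suggests a list with one named character vector per address, and not every address carries every tag. Under that assumption, one way to pull a single component across addresses is sketched below. The tags object is re-created by hand from the output above so the example runs without addr, and get_tag() is a hypothetical helper, not a function in the package.

```r
# Tagged components copied by hand from the addr_tag() output shown above
tags <- list(
  c(
    AddressNumber = "290", StreetName = "Ludlow",
    StreetNamePostType = "Avenue", OccupancyType = "Apt",
    OccupancyIdentifier = "2", PlaceName = "Cincinnati",
    StateName = "OH", ZipCode = "45220"
  ),
  c(
    AddressNumber = "3333", StreetName = "Burnet",
    StreetNamePostType = "Ave", PlaceName = "Cincinnati",
    StateName = "OH", ZipCode = "45219"
  )
)

# Hypothetical helper: return one component, or NA when the tag is absent
get_tag <- function(tag, name) {
  if (name %in% names(tag)) tag[[name]] else NA_character_
}

zips <- vapply(tags, function(tag) get_tag(tag, "ZipCode"), character(1))
units <- vapply(tags, function(tag) get_tag(tag, "OccupancyIdentifier"), character(1))
```

Returning NA for a missing tag keeps the result the same length as the input, so it can be stored as a column next to the original addresses.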
