This project forked from zerodahero/address-normalization


Very basic address normalizer. Based on port of the perl module Geo::StreetAddress::US originally written by Schuyler D. Erle.


Basic Address Normalizer

Purpose

This package is intended as a first layer of address normalization and standardization. The recommended use is to pre-parse and normalize an address, then compare it against an existing cache or record set using the hash functions.

It normalizes US mailing addresses without the need for an external service. It is a port of the Perl module Geo::StreetAddress::US, originally written by Schuyler D. Erle.

This is a fork of khartnett/address-normalization. Kudos for the original work!

Limitations

This is a very basic normalizer. It realistically only handles US-based addresses, and should not be considered dependable for strict address-to-address comparison. This normalizer does not verify the validity of the address! If you depend on accurate addresses, use some other means (most likely a third-party service) to verify them.

Why?

I forked and added features to this package because I needed a decent first layer to pre-normalize addresses before sending them to our standardization service. This limits the number of calls to the service and our strict dependence on it, and it also lets us catch a few easy-to-match scenarios here and there, which makes for a better user experience.

Alternatives

Libpostal is probably the best in its class in this area. I decided not to use Libpostal because (1) it requires a few GB of space, which is undesirable in my current environment, and (2) it's probably overkill, since I consider our third-party service to be authoritative in the matter anyway.

Installation

composer require zerodahero/address-normalization

Usage

Normalizing

<?php
use ZeroDaHero\Normalizer;
$normalizer = new Normalizer();

// This returns a \ZeroDaHero\Address object with the parsed components
$address = $normalizer->parse('204 southeast Smith Street Harrisburg, or 97446');

$address->getAddressComponents();
/* output:
[
    "number" => "204",
    "street" => "Smith",
    "street_type" => "St",
    "unit" => "",
    "unit_prefix" => "",
    "suffix" => "",
    "prefix" => "SE",
    "city" => "Harrisburg",
    "state" => "OR",
    "postal_code" => "97446",
    "postal_code_ext" => null,
    "street_type2" => null,
    "prefix2" => null,
    "suffix2" => null,
    "street2" => null,
] */

$address->toString();
/* string_result:
"204 SE Smith St, Harrisburg, OR 97446"
*/

Comparing

<?php
use ZeroDaHero\Normalizer;
$normalizer = new Normalizer();

$address1 = $normalizer->parse('204 southeast Smith Street Harrisburg, or 97446');
$address2 = $normalizer->parse('204 SE Smith St. Harrisburg, Oregon 97446');
// Same street, different number
$address3 = $normalizer->parse('207 SE Smith St. Harrisburg, Oregon 97446');

$address1->is($address2); // true
$address2->is($address3); // false
$address1->isSameStreet($address3); // true

// or can compare hashes directly
$address1->getFullHash() === $address2->getFullHash(); // true
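
Because equal addresses produce equal hashes, the hash can serve as a cache key, which is the use described under Purpose. The following is a minimal sketch of that idea, not part of the package: findCachedAddress, the $cache layout, and $demoHasher are all hypothetical names. The stand-in hasher exists only so the sketch runs without the package installed; in real use the callable would presumably wrap $normalizer->parse($raw)->getFullHash().

```php
<?php
/**
 * Return the cached record for a raw address, or null on a cache miss.
 *
 * $hasher is a hypothetical callable mapping a raw address string to a hash key.
 * With this package it could wrap the calls shown above, for example:
 *   fn (string $raw) => $normalizer->parse($raw)->getFullHash()
 */
function findCachedAddress(array $cache, string $rawAddress, callable $hasher): ?array
{
    $key = $hasher($rawAddress);

    return $cache[$key] ?? null;
}

// Stand-in hasher (assumption, NOT the package's algorithm): lowercases the
// input and collapses punctuation and whitespace. Much weaker than real parsing.
$demoHasher = fn (string $raw): string => md5(
    trim(preg_replace('/[^a-z0-9]+/', ' ', strtolower($raw)))
);

// Cache keyed by hash; the record shape is made up for illustration.
$cache = [
    $demoHasher('204 SE Smith St, Harrisburg, OR 97446') => ['id' => 1],
];

$hit = findCachedAddress($cache, '204 se smith st harrisburg or 97446', $demoHasher);
$miss = findCachedAddress($cache, '207 SE Smith St, Harrisburg, OR 97446', $demoHasher);
```

On a miss, the caller would fall back to whatever authoritative service it uses and store the result under the same key.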

Formatting

<?php
use ZeroDaHero\Normalizer;
$normalizer = new Normalizer();

$address = $normalizer->parse('204 southeast Smith Street Harrisburg, or 97446');

$address->toString();
// or
(string) $address;
/* string:
  "204 SE Smith St, Harrisburg, OR 97446"
*/

$address->toArray();
/* array:
  [
    '204 SE Smith St',
    'Harrisburg, OR 97446'
  ]
*/

Hashing

If you only need a consistent way of hashing (e.g. if you're starting with a dependable 5-part address, such as one from a third-party service), you can build a SimpleAddress.

<?php
use ZeroDaHero\SimpleAddress;

$address = new SimpleAddress('1234 Main St NE', null, 'Minneapolis', 'MN', '55401');
$address->getHash(); // full hash minus zip
$address->getFullHash(); // full hash including zip

// or do it all with the factory method:
SimpleAddress::hashFromParts('1234 Main St NE', null, 'Minneapolis', 'MN', '55401');

// CANNOT hash street, since the component parts don't exist
$address->getStreetHash(); // throws exception
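
One use for a consistent hash is finding duplicate records in an existing data set. The sketch below groups record IDs by hash. It is an illustration under assumptions: groupByAddressHash, the record field names (line1, line2, city, state, zip), and $demoPartsHasher are hypothetical. The stand-in hasher is not the package's algorithm; in real use the callable would presumably be a closure around SimpleAddress::hashFromParts().

```php
<?php
/**
 * Group record IDs by address hash so duplicates land in the same bucket.
 *
 * $hasher is a hypothetical callable that takes the five address parts and
 * returns a string key, e.g. a closure around SimpleAddress::hashFromParts().
 */
function groupByAddressHash(array $records, callable $hasher): array
{
    $groups = [];
    foreach ($records as $record) {
        $key = $hasher(
            $record['line1'],
            $record['line2'],
            $record['city'],
            $record['state'],
            $record['zip']
        );
        $groups[$key][] = $record['id'];
    }

    return $groups;
}

// Stand-in hasher (assumption): case-insensitive join of the five parts.
$demoPartsHasher = fn (string $line1, ?string $line2, string $city, string $state, string $zip): string =>
    md5(strtolower(implode('|', [$line1, $line2 ?? '', $city, $state, $zip])));

// Made-up records; only the letter case of line1 differs between IDs 1 and 2.
$records = [
    ['id' => 1, 'line1' => '1234 Main St NE', 'line2' => null, 'city' => 'Minneapolis', 'state' => 'MN', 'zip' => '55401'],
    ['id' => 2, 'line1' => '1234 MAIN ST NE', 'line2' => null, 'city' => 'Minneapolis', 'state' => 'MN', 'zip' => '55401'],
    ['id' => 3, 'line1' => '99 Elm Ave', 'line2' => null, 'city' => 'Minneapolis', 'state' => 'MN', 'zip' => '55401'],
];

$groups = groupByAddressHash($records, $demoPartsHasher);

// Keep only the buckets that contain more than one record ID.
$duplicates = array_values(array_filter($groups, fn (array $ids): bool => count($ids) > 1));
```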
