
osm-search / tiger-data


Preprocessing US Census TIGER data for Nominatim geocoder

License: GNU General Public License v2.0

Shell 2.73% Python 97.27%

tiger-data's People

Contributors

lonvia, mtmail, ripnyt-ripnyt


tiger-data's Issues

Filter negative house number ranges

From Nominatim's PostgreSQL log file:

WARNING:  Negative house number range (-201 to 499)
WARNING:  Negative house number range (-123 to 101)
WARNING:  Negative house number range (-498 to -200)
WARNING:  Negative house number range (-499 to -201)
WARNING:  Negative house number range (-101 to 199)
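
A filter for these ranges could look roughly like the sketch below. It assumes the preprocessing step has each range available as a pair of integers before any output is written. The function name `is_valid_range` and the sample range `(100, 198)` are hypothetical and only serve as illustration; the negative pairs are taken from the warnings above.

```python
def is_valid_range(start: int, end: int) -> bool:
    """Return True when neither end of the house number range is negative."""
    return start >= 0 and end >= 0


# Ranges from the log excerpt above, plus one made-up valid range.
ranges = [(-201, 499), (-123, 101), (-498, -200), (100, 198)]

# Keep only the ranges Nominatim would not warn about.
kept = [r for r in ranges if is_valid_range(*r)]
print(kept)
```

Dropping the whole range, rather than trying to repair one end, avoids guessing what the source data meant.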

Tiger edge.shp is not the best source for geocoding.

Just a note: I read the TIGER data specification, and it says the best source for geocoding is the "addrfeat.shp" file rather than "Edge.shp".

I got this reply from @lonvia: "The address range data didn't exist when Tiger support was implemented, so that would count as feature request (i.e. somebody needs to program it)"


Tiger import: sql errors in log

Hi,

Last week I imported a new planet with all additional data (master branch, PostgreSQL 12).

The TIGER data SQL files contain more than 33 million "select tiger_line_import" statements. After the import, the PostgreSQL log file showed errors for approximately 1.5% of these statements:

  • 410,000 errors of type "invalid syntax for type integer"
    (the second parameter should be an integer), e.g.
    SELECT tiger_line_import(ST_GeomFromText('LINESTRING(-85.937244 31.384285,-85.937262 31.384142,-85.937319 31.383953,-85.937452 31.383749,-85.937794 31.383686,-85.938286 31.383649,-85.938534 31.383803,-85.938865 31.384175)',4326), 'UN98', 'UN00', 'all', 'Pearl St', 'Coffee, AL', '36351');

  • 60,000 errors of type "Road too short for number range"
    e.g. Road too short for number range 424 to 5364 on E State St, Clarke, AL (4.0973280119849804e-07)
    PL/pgSQL function tiger_line_import(geometry,integer,integer,text,text,text,text) line 44 at RAISE

While this doesn't stop Nominatim from working, maybe we should leave statements that don't match the database structure out of the preprocessed TIGER files?

Thanks
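
As a rough illustration of the pre-filter suggested in this issue, the sketch below checks whether both house number parameters are plain integers before a statement would be emitted. The helper names `parse_house_number` and `has_integer_range` are hypothetical, and the sketch assumes the raw values are available as strings during preprocessing. `'100'` and `'198'` are made-up values; `'UN98'` and `'UN00'` come from the failing statement quoted above.

```python
import re

# Only ASCII digits count as a valid house number here.
_INT_RE = re.compile(r"[0-9]+")


def parse_house_number(raw: str):
    """Return the value as an int, or None if it is not a plain integer."""
    text = raw.strip()
    return int(text) if _INT_RE.fullmatch(text) else None


def has_integer_range(start_raw: str, end_raw: str) -> bool:
    """Return True when both range ends can be passed as integer parameters."""
    return (parse_house_number(start_raw) is not None
            and parse_house_number(end_raw) is not None)


print(has_integer_range("UN98", "UN00"))  # values from the failing statement
print(has_integer_range("100", "198"))    # made-up valid values
```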

Change format from SQL to CSV

The current SQL format restricts how the data can be processed: it works in PostgreSQL only. Nominatim is currently moving in a direction where the data might be preprocessed in Python before being added to the database. Therefore I would like to move to a simple CSV output format with the following columns:

  • from
  • to
  • interpolation
  • street
  • city
  • state (as a two-letter abbreviation)
  • postcode
  • geometry (as a WKT)

I'll start with a converter from the existing SQL to this CSV format, but eventually the TIGER conversion script needs to be changed to emit CSV natively.
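
A converter along these lines might be sketched as follows. This is not the project's actual implementation. It assumes each statement sits on one line in the `SELECT tiger_line_import(ST_GeomFromText('...',4326), from, to, interpolation, street, place, postcode);` shape seen in the import logs, that single quotes inside text values are escaped by doubling, and that the place value has the form "City, ST". The names `STATEMENT_RE`, `statement_to_row` and `convert` are hypothetical, and the sample statement is made up.

```python
import csv
import io
import re

FIELDS = ["from", "to", "interpolation", "street",
          "city", "state", "postcode", "geometry"]

# Assumed statement layout; the house numbers may be quoted or unquoted.
STATEMENT_RE = re.compile(
    r"SELECT tiger_line_import\(ST_GeomFromText\('(?P<wkt>[^']+)',\s*4326\),\s*"
    r"'?(?P<start>[^',]*)'?,\s*'?(?P<end>[^',]*)'?,\s*"
    r"'(?P<interpolation>[^']*)',\s*'(?P<street>(?:[^']|'')*)',\s*"
    r"'(?P<place>(?:[^']|'')*)',\s*'(?P<postcode>[^']*)'\);"
)


def statement_to_row(statement: str):
    """Turn one SQL statement into a CSV row dict, or None if it does not match."""
    match = STATEMENT_RE.search(statement)
    if match is None:
        return None
    place = match["place"].replace("''", "'")
    # Split "City, ST" on the last comma; leave state empty if there is none.
    if ", " in place:
        city, state = place.rsplit(", ", 1)
    else:
        city, state = place, ""
    return {
        "from": match["start"],
        "to": match["end"],
        "interpolation": match["interpolation"],
        "street": match["street"].replace("''", "'"),
        "city": city,
        "state": state,
        "postcode": match["postcode"],
        "geometry": match["wkt"],
    }


def convert(sql_lines, out) -> None:
    """Write a header and one CSV row per recognised statement to `out`."""
    writer = csv.DictWriter(out, fieldnames=FIELDS)
    writer.writeheader()
    for line in sql_lines:
        row = statement_to_row(line)
        if row is not None:
            writer.writerow(row)


# Made-up sample statement in the assumed layout.
sample = ("SELECT tiger_line_import(ST_GeomFromText("
          "'LINESTRING(-85.9 31.3,-85.8 31.2)',4326), "
          "'100', '198', 'even', 'Pearl St', 'Coffee, AL', '36351');")
buffer = io.StringIO()
convert([sample], buffer)
print(buffer.getvalue())
```

Because the WKT geometry contains commas, the `csv` module quotes that column automatically, so the output stays parseable by any standard CSV reader.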
