mapzen / chef-metroextractor

Creates metro extracts/shapefiles from OSM planet data.

Home Page: https://mapzen.com/data/metro-extracts

License: GNU General Public License v3.0

Python 3.55% Ruby 32.84% HTML 53.64% Shell 9.98%

chef-metroextractor's Issues

Neighbourhoods in imposm shp and geojson files (NYC/Chicago)

I'm new to looking at OSM data. For both NYC and Chicago, the places file seems very sparse within the actual city boundaries: there are no neighborhood names, mostly just the surrounding towns and suburbs.

Here's the places file for the NYC metro extract plotted on top of a Google Maps base for comparison.

The data I'd like to see in the places file seems to be in OSM as place type: neighbourhood (for example https://www.openstreetmap.org/node/158842996#map=18/40.72927/-73.98736&layers=C), but in the extract's places file there are no features with that type.

A bit of searching around doesn't turn up anything for why that might be. Is that something that can be added to the extracts?
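One way to confirm which place types an extract actually contains is to tally them. The following is a minimal sketch, not part of the extract tooling: it assumes the places GeoJSON stores the place type under a `type` property (as the admin file quoted elsewhere in this tracker does), and the sample features are made up for illustration.

```python
import json
from collections import Counter


def count_place_types(geojson_text, type_key="type"):
    """Tally features by place type in a GeoJSON FeatureCollection.

    type_key is an assumption about the property name used by the extract.
    """
    collection = json.loads(geojson_text)
    return Counter(
        (feature.get("properties") or {}).get(type_key)
        for feature in collection.get("features", [])
    )


# Made-up sample standing in for a places file.
sample = json.dumps({
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "properties": {"name": "A", "type": "town"}, "geometry": None},
        {"type": "Feature", "properties": {"name": "B", "type": "town"}, "geometry": None},
        {"type": "Feature", "properties": {"name": "C", "type": "suburb"}, "geometry": None},
    ],
})

counts = count_place_types(sample)
print(counts)
print("neighbourhood features:", counts["neighbourhood"])
```

A count of zero for `neighbourhood` would match what is described above.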

add power polygons to landuse

please add power polygons to landuse, like we did over in tiles:

tilezen/vector-datasource#199

Enhancement: change processing order of SHP files

Reference issue #19.

Wrong character encoding in warsaw-poland.imposm-geojson

Hi,
There's a problem with UTF-8 character encoding in imposm-geojson files. For instance, a fragment of the warsaw_poland-admin.geojson file looks like:

{ "type": "Feature", "properties": { "id": 2.0, "osm_id": -336132.0, "name": "JÃ³zefosÅ�aw", "type": "administrative", "admin_leve": 8.0 }, "geometry": { "type": "Polygon", "coordinates"

but the name of this area should be: "Józefosław".
This name is correctly encoded in warsaw_poland.osm-admin.dbf file, so it seems it's getting mangled during conversion to geojson format.
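The garbled name is consistent with UTF-8 bytes being decoded as a single-byte encoding somewhere in the conversion. A small sketch of that effect follows; the choice of Latin-1 is an assumption, since the thread does not say which encoding the conversion step actually used.

```python
# UTF-8 bytes of the correct name, wrongly decoded as Latin-1 (assumed encoding).
original = "Józefosław"
mangled = original.encode("utf-8").decode("latin-1")

# The two-byte UTF-8 sequences for "ó" and "ł" each become two characters.
print(repr(mangled))

# As long as no bytes were lost, reversing the two steps restores the name.
repaired = mangled.encode("latin-1").decode("utf-8")
print(repaired)
```

If the file on disk already contains replacement characters in place of the undecodable bytes, the round trip above is no longer possible and the data has to be regenerated from the source.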

issues with portland metro extract

Hi @heffergm, I don't know if this is the right place to file this, but... I spent a few hours yesterday trying to figure out why I couldn't render the Willamette River, only fill its islands with water. It looks like this.

Today I created a new database, imported the old metro.teczno data, and the river looks pretty much as I'd expect it to.

So something seems to have changed in the extraction. Any ideas?

relation based extracts

is there a way to generate extracts based on an OSM relation? or GeoJSON/GPX/whatever describing the shape?

that is, a cut along an irregular shape rather than a rectangle.
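For context, cutting along an irregular boundary comes down to a point-in-polygon test on each node instead of a bounding-box comparison. The sketch below uses the standard ray-casting test; it is only an illustration of the idea, not how the extractor works, and the triangle coordinates are made up.

```python
def point_in_polygon(lon, lat, ring):
    """Ray-casting test: is (lon, lat) inside the polygon ring?

    ring is a list of (lon, lat) vertices; the closing edge is implied.
    """
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point count.
        if (y1 > lat) != (y2 > lat):
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


# Made-up triangular boundary and two candidate nodes.
boundary = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0)]
kept = point_in_polygon(1.0, 1.0, boundary)
dropped = point_in_polygon(3.0, 3.0, boundary)
print(kept, dropped)
```

Both test points fall inside the triangle's bounding box, but only the first lies inside the shape itself, which is the difference between a rectangular and an irregular cut.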

Tags are incorrectly truncated in OSM line shapefile

For instance, let's say I load the Dallas, TX OSM line shapefile into my database using shp2pgsql:

$ shp2pgsql -I -s 4326 -c /tmp/dallas_texas_osm_line public.road_geometries | psql postgres://:@localhost:5432/my_db

Inspecting the tags in the database, we see that some of the values in the tags column are abruptly truncated.

For example, if I query:

SELECT name, osm_id, tags FROM road_geometries WHERE osm_id = 10022714

The result is:

name   | Edgehill Road
osm_id | 10022714
tags   | "tiger:cfcc"=>"A41", "tiger:tlid"=>"103516618:103532208:103516622:103556965:103556966:225914365:103532156", "tiger:county"=>"Johnson, TX", "tiger:source"=>"tiger_import_dch_v0.6_20070830", "tiger:reviewed"=>"no", "tiger:zip_left"=>"76028", "tiger:name_ba

The last visible tag, "tiger:name_ba, is cut off partway through its key.

These truncated tags make the data less usable. I suspect the truncation omits some of the tags that are present in the source data, and it makes individual tags difficult to parse out, for instance by converting the field to the hstore type.
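As a stopgap, the complete pairs can still be recovered from a truncated tags string before any hstore conversion. This is a rough sketch under assumptions: the helper name `parse_truncated_hstore` is hypothetical, the regex only handles the simple quoted `"key"=>"value"` form shown above, and the sample string is a shortened version of the reported value.

```python
import re

# Matches one complete "key"=>"value" pair, allowing backslash escapes.
PAIR = re.compile(r'"((?:[^"\\]|\\.)*)"=>"((?:[^"\\]|\\.)*)"')


def parse_truncated_hstore(text):
    """Return (complete_pairs, leftover) for an hstore-style string.

    leftover is whatever trails the last complete pair; a non-empty value
    suggests the field was cut off.
    """
    pairs = {}
    end = 0
    for match in PAIR.finditer(text):
        pairs[match.group(1)] = match.group(2)
        end = match.end()
    leftover = text[end:].lstrip(", ")
    return pairs, leftover


tags = '"tiger:cfcc"=>"A41", "tiger:reviewed"=>"no", "tiger:name_ba'
pairs, leftover = parse_truncated_hstore(tags)
print(pairs)
print("truncated remainder:", leftover)
```

This only salvages what survived in the file; tags lost to the truncation still have to come from the source data.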

missing node for way in extract

hey @heffergm I noticed in the Vancouver extract I sourced from metro extracts there is a way which seems to link to a node which isn't present in the same extract file.

Any idea what might be causing this?

$> cd /tmp
$> wget https://s3.amazonaws.com/metro-extracts.mapzen.com/vancouver_canada.osm.pbf
$> osmconvert --out-csv vancouver_canada.osm.pbf | grep "257738889\|23787107"

way 23787107    Brooksbank Elementary

You can see that way:23787107 is present in the file but its child node:257738889 is not.
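To check a whole extract for this kind of dangling reference, rather than grepping for one pair of IDs, something like the following could be used. It is a sketch that assumes the extract has first been converted to OSM XML; the function name is hypothetical, and the sample document is hand-written, with made-up coordinates, to mirror the case above.

```python
import xml.etree.ElementTree as ET


def find_dangling_refs(osm_xml):
    """Map each way id to the node refs that are absent from the same file."""
    root = ET.fromstring(osm_xml)
    node_ids = {node.get("id") for node in root.iter("node")}
    dangling = {}
    for way in root.iter("way"):
        missing = [
            nd.get("ref") for nd in way.iter("nd")
            if nd.get("ref") not in node_ids
        ]
        if missing:
            dangling[way.get("id")] = missing
    return dangling


# Hand-written sample; the node coordinates are invented for illustration.
sample = """<osm version="0.6">
  <node id="1" lat="49.32" lon="-123.04"/>
  <way id="23787107">
    <nd ref="1"/>
    <nd ref="257738889"/>
    <tag k="name" v="Brooksbank Elementary"/>
  </way>
</osm>"""

result = find_dangling_refs(sample)
print(result)
```

Parsing the whole document at once is fine for a small sample; for a full metro extract a streaming parser such as `xml.etree.ElementTree.iterparse` would be a better fit.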

Upgrade to PostGIS 2.2 to fix UTF8 > WIN1252 encoding errors

Paul was kind enough to point out that there is a bug in the current version of PostGIS that produces garbled character encodings. Once v2.2 is out, we should upgrade so Metro Extract shapefiles get the fix.

From Paul: "@mapzen unfortunately though you specify SHAPE_ENCODING the DBFs still have a WIN1252 encoding header, limitation of pgsql2shp, fix in 2.2" on September 23, 2015

https://twitter.com/pwramsey/status/646795842557341696

Offer a denormalized format for document-store databases

It would be ideal for me to have OSM data available in a denormalized format.

This could be GeoJSON or any other format which would allow easy insertion of the data into a NoSQL database like Elasticsearch or MongoDB.

A fair few people have been asking for this sort of thing but as far as I'm aware it's not available online at this time.

Currently if you want to insert the osm data in a document-store, you have a few options:

[standard approach] Write the normalized data to a RDBMS and export it

Simply put, you import the data into PostgreSQL, then write queries to JOIN it and map it across to your docstore.

[insanity approach] Try to normalize the data in real-time using `RAM`

You can try to take small extracts and assemble the data in memory. This uses huge amounts of RAM and is not very practical.

[3rd party approach] Download the data in a format that works for you

This is what I'm proposing: basically, you download the data in GeoJSON and insert the records one by one into the docstore without having to install another db or tools.
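To make "denormalized" concrete: a way in OSM only lists node IDs, so each record has to have its node coordinates embedded before it can stand alone in a document store. The sketch below shows that step for a single way. The dictionary shapes, the `denormalize_way` name, and the sample values are assumptions for illustration, not an existing export format.

```python
import json


def denormalize_way(way, nodes):
    """Embed node coordinates into a way, producing a GeoJSON Feature.

    nodes maps node id -> (lon, lat). Refs missing from nodes are skipped.
    """
    coords = [nodes[ref] for ref in way["refs"] if ref in nodes]
    return {
        "type": "Feature",
        "properties": {"osm_id": way["id"], **way["tags"]},
        "geometry": {"type": "LineString", "coordinates": coords},
    }


# Made-up sample data: two known nodes and a way that also references a third.
nodes = {1: (-73.99, 40.73), 2: (-73.98, 40.74)}
way = {"id": 100, "refs": [1, 2, 3], "tags": {"highway": "residential"}}

feature = denormalize_way(way, nodes)
line = json.dumps(feature)  # one self-contained record per line
print(line)
```

Each printed line can be inserted into a document store as is, with no join back to a node table. The node lookup table is also why the in-memory approach gets expensive on large extracts.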
