libbeaches's Introduction

Extracting coastline vectors from planet.osm

This document discusses various automated methods to extract, from the Open Street Map dataset, a set of vectors representing the coastline and some of the challenges related to each method.

The planet.osm file

This section provides some basic technical info on the way Open Street Map publishes data.
The planet.osm file and its various extracts consist of 3 types of entities. Each entity has a positive integer id and some associated metadata in the form of key=value pairs.

  1. Node - A point with a latitude and longitude, with precision of about 1cm.
  2. Way - A sequence of nodes. Basically a linestring. Order is important. May be closed.
    • e.g. a closed way with natural=water may be a pond.
  3. Relation - Usually a set of ways. Make no assumptions about order.
    • e.g. a relation with type=multipolygon can be used to represent a river, possibly with islands.
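
As a rough illustration of the three entity types, here is a minimal Python sketch. The class names, field names, and the `is_closed` helper are hypothetical and chosen for this example only; they are not taken from any OSM library. Relation members are modelled as (member type, member id, role) triples, which is an assumption about how a reader might want to hold them in memory.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Node:
    id: int
    lat: float
    lon: float
    tags: Dict[str, str] = field(default_factory=dict)


@dataclass
class Way:
    id: int
    node_ids: List[int]  # order matters
    tags: Dict[str, str] = field(default_factory=dict)

    def is_closed(self) -> bool:
        # Assumption: a closed way repeats its first node id at the end and
        # needs at least 3 distinct nodes (4 references) to enclose an area.
        return len(self.node_ids) >= 4 and self.node_ids[0] == self.node_ids[-1]


@dataclass
class Relation:
    id: int
    # (member type, member id, role); no ordering is assumed.
    members: List[Tuple[str, int, str]]
    tags: Dict[str, str] = field(default_factory=dict)


# A closed way tagged natural=water, which may be a pond.
pond = Way(id=1, node_ids=[10, 11, 12, 10], tags={"natural": "water"})
# A multipolygon relation that could represent a river with an island.
river = Relation(
    id=2,
    members=[("way", 100, "outer"), ("way", 101, "inner")],
    tags={"type": "multipolygon"},
)
```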

Files are usually sorted so that nodes come first, then ways, then relations. The author calls this order "the worst one possible."

For a quick sense of scale - just so we know what we're up against - there are currently (Oct-2018) 4.8 billion nodes in the OSM dataset (https://www.openstreetmap.org/stats/data_stats.html). That means we need 33 bits to represent an id. (The id space is dense but not strictly sequential.) Let's round that up to the next register size, say 64 bits. Latitude and longitude are each encoded as 32-bit integers. So to store a node with its id and location we'll need 8+4+4=16 bytes, and to store all of the nodes we'll need 4.8 G nodes * 16 bytes/node = 76.8 GB!
In conclusion, there are already a lot of nodes and this number grows pretty quickly.
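
The back-of-the-envelope numbers above can be checked with a few lines of Python. This is only a sketch: the `pack_node` helper and the 1e7 scaling factor for coordinates are assumptions made for the example, not a description of how any particular tool stores nodes.

```python
import struct

NODE_COUNT = 4_800_000_000  # Oct-2018 figure quoted above

# Bits needed to count this many ids.
ID_BITS = NODE_COUNT.bit_length()  # 33

# 64-bit id + 32-bit latitude + 32-bit longitude, with no padding.
NODE_FORMAT = "<qii"
BYTES_PER_NODE = struct.calcsize(NODE_FORMAT)  # 16

TOTAL_GB = NODE_COUNT * BYTES_PER_NODE / 1e9  # 76.8


def pack_node(node_id: int, lat: float, lon: float) -> bytes:
    """Pack one node into 16 bytes.

    Assumption: coordinates are stored as degrees * 1e7, which fits in a
    signed 32-bit integer for any valid latitude or longitude.
    """
    return struct.pack(NODE_FORMAT, node_id, round(lat * 1e7), round(lon * 1e7))


print(ID_BITS, BYTES_PER_NODE, TOTAL_GB)
```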

What is the "coastline"

Well, there's an entire wiki page on that: https://wiki.openstreetmap.org/wiki/Coastline
Basically, it's ambiguous. Empirically, there is no consistency among OSM contributors.

The coastline as prescribed in the OSM dataset is rather conservative.

For many applications we'll want to augment these vectors with the boundaries of the water reachable from the sea.

"natural"="coastline"

https://wiki.openstreetmap.org/wiki/Tag:natural%3Dcoastline
This is a special key-value pair that is only applied to ways. This tag is unique in being applied consistently and correctly. This, combined with its water-on-the-right semantics, makes it the only tag that (almost) defines a winding direction. (More on this below.)
Sidebar: When applied incorrectly this tag can cause bad rendering artifacts. Since the most active users of OSM polygons are people rendering base maps, errors get corrected or reverted quickly.

Water must be on the right

From the wiki linked above:

The direction the ways are drawn is very important! They must be drawn so that the land is on the left side and water on the right side...

In practice, this is half right. The natural=coastline pair guarantees water on the right, but do not assume that land is on the left. A way, or some subset of its nodes, may also be labeled as the boundary of a water polygon. This is common for river mouths. In this case, water is on the left (and the right). This is necessary since ways labeled as coastline are expected to form a closed ring around continents or islands.
Labeling the way as the exterior of a water polygon does not guarantee that the left side is water. River-banks may be labeled as coastline; i.e. the river is effectively inside the ocean. A river bank bordering land may (or may not) be labeled exactly the same as a river mouth bordering the ocean. Features like estuaries that can be internal to other water polygons can further complicate determining whether there is indeed land on the left side. (Yes, none of this makes sense.)
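
To make the water-on-the-right rule concrete, here is a small sketch that computes a probe point just to the right of a directed segment. It treats (lon, lat) as planar x/y coordinates, which is an assumption that only holds for short segments away from the poles and the antimeridian. The function name `right_side_probe` and the default `offset` are hypothetical. For a segment of a natural=coastline way the probe should land in water; the same trick with the sign flipped gives a left-side probe, which, as described above, may or may not be land.

```python
import math
from typing import Tuple

Point = Tuple[float, float]  # (lon, lat), treated as planar (x, y)


def right_side_probe(a: Point, b: Point, offset: float = 1e-5) -> Point:
    """Return a point `offset` units to the right of the midpoint of a -> b."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    length = math.hypot(dx, dy)
    if length == 0:
        raise ValueError("segment has zero length")
    mid_x, mid_y = (a[0] + b[0]) / 2, (a[1] + b[1]) / 2
    # For direction (dx, dy), the unit normal pointing right is (dy, -dx) / length.
    return (mid_x + offset * dy / length, mid_y - offset * dx / length)


# Heading north, the right-hand side is east.
probe = right_side_probe((0.0, 0.0), (0.0, 1.0), offset=0.1)
```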

The images below show two different river/sea interfaces. Both have opinionated and diligent stewards.

Water Polygons

To generate a more complete coastline we'll need to include the water polygons that are reachable from the sea.

Polygons

  • Polygons are represented in the OSM data in 2 ways.
    1. A closed way.
    2. A relation labeled type=multipolygon.

Polygons do not have winding direction. A closed way can be CW or CCW. The order and orientation of ways within a multipolygon relation should be considered random. This makes assembling polygons more complicated but somewhat simplifies editing. (More on trying to correctly edit polygons later!)
The only reliable way to find the interior side of a polygon is to load the location of every node and perform a ray-cast or similar. Note: there are lots of nodes.
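
A minimal sketch of both checks, assuming the node locations for one ring have already been loaded into a list of (x, y) pairs and can be treated as planar coordinates. The function names are hypothetical. `signed_area` uses the shoelace formula to recover the winding direction, and `point_in_ring` is a standard even-odd ray cast.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (x, y), e.g. (lon, lat) treated as planar


def signed_area(ring: List[Point]) -> float:
    """Shoelace formula: positive for CCW rings, negative for CW rings."""
    total = 0.0
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return total / 2.0


def point_in_ring(point: Point, ring: List[Point]) -> bool:
    """Even-odd ray cast: count edge crossings of a ray going in +x."""
    x, y = point
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point count.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# A closed unit square, listed counter-clockwise.
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0), (0.0, 0.0)]
```

Both functions tolerate a ring whose last point repeats the first, since the extra zero-length edge contributes nothing. Points lying exactly on the boundary are not handled specially in this sketch.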

Reachability

We'll consider two polygons to be adjacent if their boundaries share at least 2 consecutive nodes. A polygon is reachable from another if it is adjacent to it or if it is adjacent to a polygon that is reachable.
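
One way to sketch this definition in Python, assuming each polygon is given as a single closed ring of node ids (first id repeated at the end) and reading "share at least 2 consecutive nodes" as "share an edge". Both are simplifying assumptions, and the function names are hypothetical. Multipolygons with several rings would need their rings merged into one edge set first.

```python
from collections import defaultdict, deque
from typing import Dict, FrozenSet, List, Set


def ring_edges(node_ids: List[int]) -> Set[FrozenSet[int]]:
    """Undirected edges between consecutive node ids of a ring."""
    return {frozenset((a, b)) for a, b in zip(node_ids, node_ids[1:]) if a != b}


def reachable(polygons: Dict[int, List[int]], start: int) -> Set[int]:
    """Ids of all polygons reachable from `start`, including `start` itself."""
    # Index: edge -> ids of the polygons whose boundary contains that edge.
    edge_owners: Dict[FrozenSet[int], Set[int]] = defaultdict(set)
    for poly_id, ring in polygons.items():
        for edge in ring_edges(ring):
            edge_owners[edge].add(poly_id)

    # Breadth-first search over the adjacency implied by shared edges.
    seen = {start}
    queue = deque([start])
    while queue:
        poly_id = queue.popleft()
        for edge in ring_edges(polygons[poly_id]):
            for other in edge_owners[edge]:
                if other not in seen:
                    seen.add(other)
                    queue.append(other)
    return seen


polygons = {
    1: [1, 2, 3, 4, 1],   # e.g. the sea
    2: [2, 3, 5, 2],      # shares edge 2-3 with polygon 1
    3: [5, 6, 7, 5],      # touches polygon 2 only at node 5
    4: [8, 9, 10, 8],     # isolated
    5: [3, 5, 11, 3],     # shares edge 3-5 with polygon 2
}
from_sea = reachable(polygons, 1)
```

Here polygon 3 is left out because a single shared node is not enough to make two polygons adjacent under the definition above.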

Finding the set of reachable polygons

Tools for working with OSM
