osmbe / road-completion-old

This repository contains all the code needed to compare open-data road datasets to OSM data.

Languages: JavaScript 86.59%, Shell 10.47%, PLpgSQL 1.44%, Dockerfile 1.50%
Topics: vector-tiles, tippecanoe, mbtiles, osm-data

road-completion-old's People

Contributors

gplv2, hispanicmojitos, jbelien, jodidl, joostschouppe, warrieka, xivk


road-completion-old's Issues

Figure out a way to directly write mvt

Figure out a way to write MVT (Mapbox Vector Tiles) directly from the difference script. This would let us rebuild the data faster by processing only the modified tiles.
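A minimal sketch of what this could look like (assuming geojson-vt and vt-pbf, neither of which the project necessarily uses today): cut the difference output into tiles in memory and serialize each tile as MVT.

    // Sketch: encode the difference output directly as Mapbox Vector Tiles,
    // using geojson-vt to cut tiles and vt-pbf to serialize them.
    const fs = require('fs');
    const geojsonvt = require('geojson-vt');
    const vtpbf = require('vt-pbf');

    const difference = JSON.parse(fs.readFileSync('difference.geojson'));
    const tileIndex = geojsonvt(difference, { maxZoom: 14 });

    // Encode one tile (example coordinates); a real implementation would
    // loop over the modified tiles only.
    const tile = tileIndex.getTile(14, 8390, 5469);
    if (tile) {
      const buffer = vtpbf.fromGeojsonVt({ difference: tile });
      fs.writeFileSync('14-8390-5469.mvt', buffer);
    }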

Set up link to Maproulette

The output file should be pushed to Maproulette after it gets generated. Using the API, you can update "tasks" (missing roads) with a new status.

In the other direction, we need to harvest comments and changed statuses from Maproulette to feed the process with information about false positives. See the dedicated issue.
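A hypothetical sketch of the push direction (the endpoint path, the status code value, and the apiKey header are all assumptions to verify against the MapRoulette API documentation):

    const fetch = require('node-fetch');

    // Update the status of a MapRoulette task, e.g. after the output file
    // has been regenerated. Status 1 is assumed to mean "fixed".
    async function setTaskStatus(taskId, status, apiKey) {
      await fetch(`https://maproulette.org/api/v2/task/${taskId}/${status}`, {
        method: 'PUT',
        headers: { apiKey: apiKey },
      });
    }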

More advanced filtering

Look at Joost's suggestion:

I also built in a filter:
"if the length outside the buffer is < 10 m, OR less than 10% of the road's total length, drop it from the error list"
That way you eliminate most tiny missing bits and many false-positive dead-end streets without removing too many real errors.
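A minimal sketch of that rule with @turf/length (the 10 m and 10% thresholds come from the quote above; the function and variable names are made up):

    const length = require('@turf/length').default;

    // outsidePart: the part of the road that falls outside the OSM buffer
    // fullRoad:    the complete road feature from the REF data
    function isRealError(outsidePart, fullRoad) {
      const outsideKm = length(outsidePart); // kilometers by default
      const totalKm = length(fullRoad);
      // Drop candidates whose outside-buffer part is shorter than 10 m
      // or less than 10% of the road's total length.
      return outsideKm >= 0.01 && outsideKm / totalKm >= 0.1;
    }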

Before creating a task, segments should be merged if they have a node in common and have the same name

Official road data tends to be split at every junction. That means that if a longer road is missing, a lot of microtasks could potentially be created. It is probably best to create just one task for those. Otherwise, a lot of microtasks will turn up as "false positives" or "already mapped".

If this is too complicated, it could be avoided by having frequent enough automatic checking of task completion.

Picture to illustrate the problem: only one segment is returned as a Maproulette task, but people are going to map the whole thing.

[image]
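A rough sketch of such a merge (not the project's actual code; it assumes plain LineString features, uses the Wegenregister street-name attribute LSTRNM, and does not handle segments that would need to be reversed first):

    // Greedily merge LineString segments that share an endpoint and carry
    // the same street name, so one missing road becomes one task.
    function endpointKey(coord) {
      return coord.map((c) => c.toFixed(6)).join(',');
    }

    function mergeSegments(features) {
      const pool = features.map((f) => JSON.parse(JSON.stringify(f)));
      const merged = [];
      while (pool.length > 0) {
        const current = pool.shift();
        let extended = true;
        while (extended) {
          extended = false;
          for (let i = 0; i < pool.length; i++) {
            const candidate = pool[i];
            if (candidate.properties.LSTRNM !== current.properties.LSTRNM) continue;
            const a = current.geometry.coordinates;
            const b = candidate.geometry.coordinates;
            if (endpointKey(a[a.length - 1]) === endpointKey(b[0])) {
              // candidate continues where current ends
              current.geometry.coordinates = a.concat(b.slice(1));
            } else if (endpointKey(b[b.length - 1]) === endpointKey(a[0])) {
              // candidate ends where current begins
              current.geometry.coordinates = b.concat(a.slice(1));
            } else {
              continue;
            }
            pool.splice(i, 1);
            extended = true;
            break;
          }
        }
        merged.push(current);
      }
      return merged;
    }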

Reporting false positives

Maproulette has an option to mark something as a false positive. However, there are many reasons something might be a false positive:

  • the analysis might be wrong
  • the REF data might be wrong

In the first case, do we want to remove it from further analysis? Or just improve our process to make sure it is dropped?
In the second case, we do want it removed, both from our analysis and from the REF data.

I think the only way to do that is to have a second line of mappers who look at just the false positives.

So what we need is this:

  • query Maproulette to collect everything marked as a false positive, copying any available comments (see the sketch after this list)
  • add these to a list (without destroying what is already on the list)
  • optional: allow mappers to manually add objects to the list, for those who use our mapping layer outside of Maproulette
  • allow mappers to visualize the list for further analysis, and give access to REF-data managers so they can fix those classified as "REF data error" (see below)
  • allow mappers to classify the false positives as REF data error, processing error, or resolved, and to identify things that need a survey
  • feed the REF data errors back into our processing to avoid offering them to mappers again
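A hypothetical sketch of the harvest step (the endpoint path and the false-positive status value 2 are assumptions to check against the MapRoulette API):

    const fetch = require('node-fetch');

    // Collect all tasks of a challenge that mappers marked as false
    // positive, so they can feed the second-line review list.
    async function harvestFalsePositives(challengeId) {
      const res = await fetch(
        `https://maproulette.org/api/v2/challenge/${challengeId}/tasks`
      );
      const tasks = await res.json();
      return tasks.filter((task) => task.status === 2);
    }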

This means that, from the regular Maproulette user's perspective, every time they answer NO on the flowchart below, they just mark it as a false positive. We can encourage them to leave a comment through the system, so the second-line mappers already have an idea of what's going on.

[flowchart image]

Buffer difference

When we compare road data with OSM data, we use a buffer and then check what is (not) in this buffer.

By using the turf.difference function, we only get the missing (or extra) geometry, not the whole geometry and the data associated with it.

It would be better to have the full geometry and full object of what's missing.
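One possible approach, sketched with @turf/boolean-within (an assumption, not the current implementation): test each REF road against the merged OSM buffer and keep the whole feature, with its properties, whenever it is not entirely covered.

    const booleanWithin = require('@turf/boolean-within').default;

    // osmBuffer: a single (Multi)Polygon buffer around the OSM roads
    // refRoads:  FeatureCollection of REF road LineStrings
    // Keep the full feature (geometry + properties) of every road that
    // is not entirely inside the buffer.
    const missing = refRoads.features.filter(
      (road) => !booleanWithin(road, osmBuffer)
    );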

Finish upgrading difference script

Finish upgrading the difference script to the new turfjs.

The main remaining issue is that the turfjs API has changed. The main todo is merging all the individual buffers into one polygon.

Once that is done, this is expected to work.
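A sketch of that todo with @turf/buffer and @turf/union (the module names and the 15 m radius are assumptions; the exact turf version in use may differ):

    const buffer = require('@turf/buffer').default;
    const union = require('@turf/union').default;

    // Buffer each OSM road (here: 15 m) and dissolve the individual
    // buffer polygons into a single polygon, as the todo above describes.
    // Pairwise union in a loop is slow but fine for a sketch.
    const buffered = buffer(osmRoads, 0.015, { units: 'kilometers' });
    let merged = buffered.features[0];
    for (let i = 1; i < buffered.features.length; i++) {
      merged = union(merged, buffered.features[i]);
    }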

Pass the centroid of the problem to the editor

I don't really know how the parameter for the editor is defined, but it seems to be based on the map bounds of the missing road. That means that if the issue is not zoomed in a lot, or you haven't centered your map properly, you will be very confused when you arrive in the editor.
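A small sketch with @turf/centroid (the URL is iD's standard #map fragment on openstreetmap.org; zoom level 18 is an arbitrary choice):

    const centroid = require('@turf/centroid').default;

    // Center the editor on the centroid of the missing road instead of
    // relying on the task's map bounds.
    const [lon, lat] = centroid(missingRoad).geometry.coordinates;
    const editorUrl = `https://www.openstreetmap.org/edit#map=18/${lat}/${lon}`;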

logical code problem in filter

Given this code:

    // filter out roads that are shorter than 30m and have no name
    tigerRoads.features.forEach(function (road, i) {
      if (filter(road)) tigerRoads.features.splice(i, 1);
    });

It will not do what you expect. Since you modify the array while looping over it, the indexes shift, and you end up deleting either too few elements or the wrong ones.

For example: say the filter function has to delete i = 6, 7 and 12. It will first delete index 6, after which the remaining elements shuffle down to leave no gap, but the loop counter keeps advancing; the element that was at index 7 is now at index 6 and gets skipped.

I'm using some of the code ideas from here in the GRB tool, which is why I looked into this.

See https://stackoverflow.com/questions/13244888/delete-record-from-javascript-array-object
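A sketch of the usual fix: build a new array with Array.prototype.filter instead of splicing inside the loop.

    // Keep only the roads that do NOT match the filter. No in-place
    // splicing, so no index shifting while iterating.
    tigerRoads.features = tigerRoads.features.filter(function (road) {
      return !filter(road);
    });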

Tippecanoe issue: not enough disk space or RAM?

Here is the error message I get:

For layer 0, using name "wegsegmentgeojson"
1015137 features, 153762574 bytes of geometry, 18682 bytes of separate metadata, 22823715 bytes of string pool
tile 0/0/0 size is 893771 with detail 12, >500000
tile 1/1/0 size is 1700988 with detail 12, >500000
tile 1/1/0 size is 893596 with detail 11, >500000
tile 2/2/1 size is 3187965 with detail 12, >500000
tile 2/2/1 size is 1700970 with detail 11, >500000
tile 2/2/1 size is 893596 with detail 10, >500000
tile 3/4/2 size is 5781936 with detail 12, >500000
tile 3/4/2 size is 3187965 with detail 11, >500000
tile 3/4/2 size is 1700971 with detail 10, >500000
tile 3/4/2 size is 893596 with detail 9, >500000
tile 4/8/5 has 310322 features, >200000
Try using -B (and --drop-lines or --drop-polygons if needed) to set a higher base zoom level.



*** NOTE TILES ONLY COMPLETE THROUGH ZOOM 3 ***
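Following tippecanoe's own hint in the log, one thing to try (a guess, not a verified fix; the base zoom value and file names are placeholders):

    # Raise the base zoom and allow line dropping so that low-zoom tiles
    # stay under the 500 KB limit.
    tippecanoe -o wegsegment.mbtiles -B 8 --drop-lines wegsegment.geojson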

Attribute comparison

In the case of Wegenregister, these are interesting attribute comparisons:

  • Straatnaam (LSTRNM & RSTRNM): street name
  • Morfologie van de weg (LBLMORF): road morphology (motorway, carriageway, cycleway, dirt road, …)
  • Status (LBLSTATUS): in use or not in use; in practice, just filter so that only roads "in gebruik" (in use) are taken into account
  • Toegangsbeperking (LBLTGBEP): access restriction (private or public)
  • Wegcategorie (LBLWEGCAT): road category

A bit harder to access:

  • In AttWegverharding: LBLTYPE (paved or unpaved)
  • In AttRijstroken: LBLRICHT (indication of one-way streets)
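A hypothetical sketch of what such a comparison could look like once a REF road has been matched to an OSM way (the tag mapping below is a rough assumption, not a worked-out schema):

    // Map Wegenregister attributes to (assumed) OSM tag equivalents and
    // report mismatches for a matched road pair.
    const attributeChecks = [
      { ref: 'LSTRNM', osm: 'name' },     // street name (left side)
      { ref: 'LBLTGBEP', osm: 'access' }, // access restriction
    ];

    function compareAttributes(refProps, osmTags) {
      return attributeChecks
        .filter(({ ref, osm }) =>
          refProps[ref] && osmTags[osm] && refProps[ref] !== osmTags[osm])
        .map(({ ref, osm }) => ({
          attribute: ref,
          tag: osm,
          refValue: refProps[ref],
          osmValue: osmTags[osm],
        }));
    }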

We need good source data, and for now it seems that only Nevele is willing to vouch for its data quality. We are in touch with their GIS officer.

Taking into account coverage limits

Some OSM vector tiles will not have any coverage. I suppose these can be excluded easily.
But what will happen with tiles that only have partial coverage? In the first phase this poses no problem, as we will focus on "missing in OSM". In a second phase, I suppose the output will have to be filtered by a coverage polygon of the external dataset. Or maybe we can filter the OSM vector tile before analysis?
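A sketch of the easy exclusion step, assuming @mapbox/tilebelt and @turf/boolean-intersects (the coverage polygon itself would have to be obtained from the external dataset's publisher):

    const tilebelt = require('@mapbox/tilebelt');
    const bboxPolygon = require('@turf/bbox-polygon').default;
    const booleanIntersects = require('@turf/boolean-intersects').default;

    // Skip tiles whose bounding box does not touch the coverage polygon
    // of the external dataset.
    function tileHasCoverage(x, y, z, coveragePolygon) {
      const tileBbox = bboxPolygon(tilebelt.tileToBBOX([x, y, z]));
      return booleanIntersects(tileBbox, coveragePolygon);
    }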

Group adjacent output

Current output is for every single segment. In the current REF data, a new segment begins at every junction (and only at junctions). In the case of a new development, that means the area is split into many tasks. A mapper will be tempted to do the entire area, so the other tasks need manual or automatic closing. In the case of a Maproulette Challenge, there would be no problem if it is updated very often.

In the case of false positives, the current single-segment approach is a bit problematic. One mapper could mark a segment as a false positive (or as problematic in a different way), but it is a lot of work to mark all the segments in the area as false positives. The next mapper might then be less critical and still copy the data.
This might be solved by grouping segments together.

The disadvantage of any grouping is that mapper feedback (false positive, too difficult, a comment) is then forced onto the entire group. This makes the feedback harder to give ("I could map some of this area, but a little part was wrong") and harder to process at the data source ("I can no longer simply match the feedback to my original data object, but just get some info about a wider area").

Suggested solution:

  • group output that is adjacent and has the same name under one object
  • join the main unique identifiers (several WS_GIDN or one street-name ID) to the task

Ask AIV for a URL to download GRB roads from

This project works best if the external data (e.g. government data) is updated frequently. In Flanders, we can choose between:

  • Wegenregister: available at a URL, but the URL changes with every (infrequent) update
  • GRB wvb: updated very often, but can only be downloaded manually

@bertvannuffelen suggested asking about improvements in the way AIV offers data. In both cases, a URL that a script could call would be extremely useful. Priority is GRB. Note: GRB by URL would also be useful for GRB hoofdgebouw and GRB bijgebouw, as downloading them is currently a painful process for @gplv2.

Unsupported MBTiles format

Hi everyone,
I am trying to run this project on my own dataset. I have my own GeoJSON file, with only an "id" property and only "LineString" geometries.

I am converting it to MBTiles using tippecanoe, but when running index.js I get "unsupported MBTiles format". What could be the possible reasons?

I am able to visualise the .mbtiles file on Mapbox, so I am not sure what is wrong.
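One diagnostic worth trying (a guess; the code that raises this error is not shown here): an MBTiles file is just a SQLite database, so inspect its metadata table and check that the format row says pbf, which is what a vector-tile reader expects.

    # The MBTiles spec requires a metadata table with name/value rows,
    # including "format" (pbf for vector tiles, png/jpg for raster).
    sqlite3 my.mbtiles "SELECT name, value FROM metadata;"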

Do a reverse check

We are checking if all roads in the REF data are also in OSM.

We also need to check whether all roads in OSM are in the REF data. In some cases this will lead to corrections in OSM, especially geometry improvements, but in most cases the REF data will probably turn out to be wrong.

What I would suggest is that we do not do this yet. Let us ask REF data management to do it; we can offer training. They can then give the job of validating REF data to anyone, not just GIS professionals. The people doing it will fix a lot of OSM errors, but will mostly be giving detailed feedback to REF data management. By doing this together with the REF people, we can shape the output so that it becomes directly useful to them. We could use something similar to what I describe here, except that the second-line mappers would be REF data management GIS professionals rather than advanced OSM volunteers.
