Hi, I'm trying to use libosmscout's MapSer

`MapService::LoadMissingTileData()` produces more data than expected about libosmscout HOT 7 OPEN

frank-aurich commented on September 25, 2024

`MapService::LoadMissingTileData()` produces more data than expected

from libosmscout.

Comments (7)

Framstag commented on September 25, 2024

See AreaWayIndex and AreaIndex. Libosmscout uses bitmap based index per type. That means, that for each type there is a flat tile-based index. Each tile has a list of containing objects. The size of a tile depends on the actual type, especially distribution. The "right" magnification is decided based on AreaIndexGenerator::FitsIndexCriteria(). Type with many object instances should have smaller tiles than type with less data.

Idea is to optimize the size of the index on disk with too much performance loss. The mechanism could be influenced to get smaller tiles with less data overhead but bigger size on disk, though currently values seem to be hard coded.

So, indexes are not perfect, it is expected that you get possibly more data than selected.

The painter should filter them out, if the corresponding parameter is set (which the index cannot, since it does not hold the actual data and thus has no boundary to filter against).

from libosmscout.

frank-aurich commented on September 25, 2024

it is expected that you get possibly more data than selected.

But I'm getting WAY MORE data then selected. That doesn't seem right?

The painter should filter them out, if the corresponding parameter is set

What parameter is that, and how does it do the filtering?

from libosmscout.

Framstag commented on September 25, 2024

it is expected that you get possibly more data than selected.

But I'm getting WAY MORE data then selected. That doesn't seem right?

The painter should filter them out, if the corresponding parameter is set

What parameter is that, and how does it do the filtering?

Checked the code. There is no such parameter. The code in MapPainter.cpp checks for "visible in bounding box" using the local IsVisibleWay() and IsVisibleArea() methods.

from libosmscout.

Framstag commented on September 25, 2024

it is expected that you get possibly more data than selected.

But I'm getting WAY MORE data then selected. That doesn't seem right?

You can exchange the given code in AreaIndexGenerator.h starting at line 180 witht he following code:

    GeoBox boundingBox=typeData.tileBox.GetCenter().GetBoundingBox(typeData.indexLevel);

    progress.Info("Writing map for "+
                  typeInfo.GetName()+" , "+
                  ByteSizeToString(1.0*dataOffsetBytes*typeData.tileBox.GetCount()+dataSize)+" ("+
                  GetEllipsoidalDistance(boundingBox.GetTopLeft(),boundingBox.GetBottomRight()).AsString()+")");

This will dump the diagonal in Km of the center tile of the given index for the given type.

On my system it returns 15,757647 km for index level 11, 7,881098 km for level 12 and 3,939981 km for level 12. Yes these are huge areas and thus may return some data. Heuristics should make sure that you do not get too much data in relation to the limits (per tile).

By changing the following params:

                     parameter.GetAreaRouteMinMag(),
                     parameter.GetAreaRouteIndexMaxLevel(),

You should be able to enforce higher starting level and thus smaller tiles. You should get less overhead data, but it would be interesting if database decreases or increased and what influence on performance you will get.

from libosmscout.

Framstag commented on September 25, 2024

Corresponding changes in code improved and committed

from libosmscout.

frank-aurich commented on September 25, 2024

I can see an improvement with the changes done to the import process:
Before:

After:

Feature (way) count was reduced from ~2800 to ~1800, which is definitely better than before.

I still find it curious that there are tiny ways of lower class that are sprinkled all over the map, and are far away from the requested tile.

I'll see how I can work with this.

from libosmscout.

Framstag commented on September 25, 2024

As I explained. The code looks for each type that is active in the style sheet in the index. Depending on the type, its data count, the index tile may be differently sized...and thus may contain more or less data. And it at least load one tile but depending on the bonding box even multiple tiles.

It would be interesting to know which types have a high count.

You can check the output of the index generator yourself (step 16 for example):

highway_footway 0 | 2919 < 1236 < 136 *121* - 4412
   WW highway_footway has more than 1% cells with much too high entry count, will use smaller tile size

There are 0 tiles that are empty
There are 2919 tiles, that have a low fill count
1236 has an "ok" count
136 have a higher than requested fill count
121 have a much higher fill count
In sum we have 4412 tiles

As you can see, distribution is not even. Nevertheless, for ways we still use a bitmap count, that ties the most effective tile size. An alternative would be a tree-based index - however these are normally bigger than more expensive. In the perfect world, one would analyze distribution and would decide differently depending on the type. However, this is not implemented.

The sprinkle effect is created because likely these values still are contained din one big or a few big tiles.

If you make TypeData.tileBox accessible from AreaIndex.h during rendering, you should be able to render the various tiles of each type.

My changes resulted in reducing the limit for tile size (now smaller tiles are allowed) and in a slightly improved distribution handling. For further optimization, you likely have to further increase the magnification limit. See

ImportParameter.cpp

      areaWayIndexMinMag(10), // Should not be >= than areaWayIndexMaxMag
      areaWayIndexMaxMag(15),

If this does not help enough, we would have to change the "good distribution" criteria in AreaIndexGenerator,h:

    size_t tooLowValue=4*average/10;
    size_t tooHighValue=64+32;
    size_t muchTooHighValue=128+64;
...
    if (double(muchTooHighCount) / double(allCount) >= 0.01) {
      progress.Warning(typeInfo.GetName() + " has more than 1% cells with much too high entry count, will use smaller tile size");
      return false;
    }

    if (double(tooHighCount) / double(allCount) >= 0.05) {
      progress.Warning(typeInfo.GetName() + " has more than 5% cells with too high entry count, will use smaller tile size");
      return false;
    }

    if (double(tooLowCount) / double(allCount) >= 0.2) {
      progress.Warning(typeInfo.GetName() + " has more than 20% cells with <40% of average filling");
    }
...

@Karry : Note that my changes may already have influence on current database size and performance.

from libosmscout.

`MapService::LoadMissingTileData()` produces more data than expected about libosmscout HOT 7 OPEN

Comments (7)

Related Issues (20)

Recommend Projects

React

Vue.js

Typescript

TensorFlow

Django

Laravel

D3

Recommend Topics

javascript

web

server

Machine learning

Visualization

Game

Recommend Org

Facebook

Microsoft

Google

Alibaba

D3

Tencent

Jobs