plokhotnyuk / rtree2d

RTree2D is a 2D immutable R-tree for ultra-fast nearest and intersection queries in plane and spherical coordinates

License: Apache License 2.0

Scala 99.21% Shell 0.79%
scala rtree 2d str sort-tile-recursive high-performance spatial-index geo-index scala-js scala-native

rtree2d's Introduction

RTree2D

RTree2D is a 2D immutable R-tree with STR (Sort-Tile-Recursive) packing for ultra-fast nearest and intersection queries.

Goals

Our main requirements were:

  • efficiency - we wanted the R-tree to be able to search through millions of entries efficiently, even when entries overlap heavily; we also needed to rebuild R-trees quickly, at a per-minute rate, with minimal pressure on the GC
  • immutability - different threads needed to be able to work with the same R-tree without problems, while another thread builds a new version of the R-tree reusing immutable entries from the previous version

To achieve these goals we have used:

  • STR packing, which is one of the most efficient packing methods and produces a balanced R-tree
  • a memory representation and access patterns that are aware of the cache hierarchy of contemporary CPUs
  • an efficient TimSort version of merge sort from Java, which minimizes memory access during packing
  • efficient implementations of nearest and range search functions with a minimum of virtual calls and allocations
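
The STR packing idea mentioned above can be sketched in a few lines of plain Scala. This is only an illustration of the general Sort-Tile-Recursive technique, not the library's optimized implementation: the names StrSketch, Rect, Leaf, Branch, pack, and size are hypothetical and do not belong to the RTree2D API, and the default capacity of 16 is an arbitrary choice for the sketch.

```scala
// Illustrative sketch of STR (Sort-Tile-Recursive) packing.
// All names here are hypothetical and are NOT part of the RTree2D API.
object StrSketch {
  final case class Rect(minX: Float, minY: Float, maxX: Float, maxY: Float) {
    def centerX: Float = (minX + maxX) / 2
    def centerY: Float = (minY + maxY) / 2
    // Smallest rectangle that covers both this rectangle and `o`
    def union(o: Rect): Rect =
      Rect(math.min(minX, o.minX), math.min(minY, o.minY), math.max(maxX, o.maxX), math.max(maxY, o.maxY))
  }

  sealed trait Node { def mbr: Rect }
  final case class Leaf(mbr: Rect, value: String) extends Node
  final case class Branch(mbr: Rect, children: Vector[Node]) extends Node

  // Packs one level: groups `nodes` into parents with at most `capacity` children each.
  private def packLevel(nodes: Vector[Node], capacity: Int): Vector[Node] = {
    val parentCount = math.ceil(nodes.size.toDouble / capacity).toInt
    val sliceCount = math.ceil(math.sqrt(parentCount.toDouble)).toInt
    val sliceSize = sliceCount * capacity
    nodes
      .sortWith(_.mbr.centerX < _.mbr.centerX) // 1. sort by X center
      .grouped(sliceSize)                      // 2. cut into vertical slices
      .flatMap(slice => slice.sortWith(_.mbr.centerY < _.mbr.centerY).grouped(capacity)) // 3. sort each slice by Y and tile it
      .map[Node](group => Branch(group.map(_.mbr).reduce(_.union(_)), group)) // 4. wrap each tile in a parent node
      .toVector
  }

  // Repeats the packing bottom-up until a single root node remains.
  @annotation.tailrec
  def pack(nodes: Vector[Node], capacity: Int = 16): Node = {
    require(nodes.nonEmpty && capacity > 1)
    if (nodes.size == 1) nodes.head
    else pack(packLevel(nodes, capacity), capacity)
  }

  // Counts the leaves reachable from `node`.
  def size(node: Node): Int = node match {
    case _: Leaf => 1
    case Branch(_, children) => children.map(size).sum
  }
}
```

Because every level is built from sorted, evenly sized tiles, all leaves end up at the same depth, which is what makes the resulting tree balanced.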

How to use

Add the library to a dependency list:

libraryDependencies += "com.github.plokhotnyuk.rtree2d" %% "rtree2d-core" % "0.11.13"

Entries of the R-tree are represented by RTreeEntry instances, which contain a payload and the 4 coordinates of its minimum bounding rectangle (MBR).

Add the import, create entries, build an R-tree from them, and use it to search for the nearest entry or to search for intersections by point or rectangle requests:

import com.github.plokhotnyuk.rtree2d.core._                                                         
import EuclideanPlane._                                                                  
                                                                                         
val box1 = entry(1.0f, 1.0f, 2.0f, 2.0f, "Box 1")                                        
val box2 = entry(2.0f, 2.0f, 3.0f, 3.0f, "Box 2")                                        
val entries = Seq(box1, box2)                                                            
                                                                                         
val rtree = RTree(entries)                                                               
                                                                                         
assert(rtree.entries == entries)                                                         
assert(rtree.nearestOption(0.0f, 0.0f) == Some(box1))                      
assert(rtree.nearestOption(0.0f, 0.0f, maxDist = 1.0f) == None)                          
assert(rtree.nearestK(0.0f, 0.0f, k = 1) == Seq(box1))                     
assert(rtree.nearestK(0.0f, 0.0f, k = 2, maxDist = 10f) == Seq(box2, box1))  
assert(rtree.searchAll(0.0f, 0.0f) == Nil)                                               
assert(rtree.searchAll(1.5f, 1.5f) == Seq(box1))                                         
assert(rtree.searchAll(2.5f, 2.5f) == Seq(box2))                                         
assert(rtree.searchAll(2.0f, 2.0f) == Seq(box1, box2))                                   
assert(rtree.searchAll(2.5f, 2.5f, 3.5f, 3.5f) == Seq(box2))                             
assert(rtree.searchAll(1.5f, 1.5f, 2.5f, 2.5f).forall(entries.contains))                 

RTree2D can be used for indexing spherical coordinates, where the X-axis is used for latitudes and the Y-axis for longitudes, both in degrees. Resulting distances are in kilometers:

import com.github.plokhotnyuk.rtree2d.core._
import SphericalEarth._

val city1 = entry(50.0614f, 19.9383f, "Kraków")
val city2 = entry(50.4500f, 30.5233f, "Kyiv")
val entries = Seq(city1, city2)

val rtree = RTree(entries, nodeCapacity = 4/* the best capacity for nearest queries for spherical geometry */)

assert(rtree.entries == entries)
assert(rtree.nearestOption(0.0f, 0.0f) == Some(city1))
assert(rtree.nearestOption(50f, 20f, maxDist = 1.0f) == None)
assert(rtree.nearestK(50f, 20f, k = 1) == Seq(city1))
assert(rtree.nearestK(50f, 20f, k = 2, maxDist = 1000f) == Seq(city2, city1))
assert(rtree.searchAll(50f, 30f, 51f, 31f) == Seq(city2))
assert(rtree.searchAll(0f, -180f, 90f, 180f).forall(entries.contains))

The precision of 32-bit floating-point numbers allows locating points with a maximum error of ±1 meter at the anti-meridian.
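
The ±1 meter figure can be checked with a small back-of-the-envelope calculation. The sketch below assumes a mean Earth radius of 6371008.8 m and measures the error along the equator, where one degree of longitude is longest; the object and value names are hypothetical and not part of the library.

```scala
// Rough check of 32-bit float precision for longitudes near ±180 degrees.
// Names and the chosen Earth radius are assumptions made for this sketch.
object FloatPrecisionSketch {
  val MeanEarthRadiusM: Double = 6371008.8

  // Length of one degree of arc along a great circle, in meters
  val metersPerDegree: Double = 2 * math.Pi * MeanEarthRadiusM / 360

  // Spacing between adjacent float values at 180 degrees
  val ulpDegrees: Double = java.lang.Math.ulp(180.0f).toDouble

  // Rounding to the nearest float is off by at most half of that spacing
  val maxRoundingErrorM: Double = ulpDegrees / 2 * metersPerDegree
}
```

Under these assumptions the worst-case rounding error stays below one meter; for coordinates with smaller absolute values the spacing between floats, and hence the error, is smaller.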

The spherical model of the Earth used here, with the mean radius and the Haversine formula, gives ±0.3% accuracy in distance calculations compared with Vincenty’s formulae on an oblate spheroid model.
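
For reference, the Haversine formula on a sphere can be sketched as follows. This is a textbook version in double precision, not the library's code; the object name, the method name, and the exact radius constant (6371.0088 km) are assumptions made for the illustration.

```scala
// Textbook Haversine great-circle distance on a spherical Earth model.
// Names and the radius constant are assumptions, not the library's API.
object HaversineSketch {
  val MeanEarthRadiusKm: Double = 6371.0088

  // Coordinates are in degrees; the result is in kilometers.
  def distanceKm(lat1: Double, lon1: Double, lat2: Double, lon2: Double): Double = {
    val phi1 = math.toRadians(lat1)
    val phi2 = math.toRadians(lat2)
    val sinHalfDLat = math.sin(math.toRadians(lat2 - lat1) / 2)
    val sinHalfDLon = math.sin(math.toRadians(lon2 - lon1) / 2)
    val a = sinHalfDLat * sinHalfDLat + math.cos(phi1) * math.cos(phi2) * sinHalfDLon * sinHalfDLon
    // Clamp to 1.0 to guard against rounding pushing `a` slightly above the valid range
    2 * MeanEarthRadiusKm * math.asin(math.sqrt(math.min(1.0, a)))
  }
}
```

With the coordinates from the example above, HaversineSketch.distanceKm(50.0614, 19.9383, 50.4500, 30.5233) gives roughly 750 km between Kraków and Kyiv.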

Please check out the Scala docs in the sources and tests for other functions that allow filtering or accumulating found entries without allocations.

How it works

The charts below are the latest results of benchmarks which compare RTree2D with Archery, David Moten's rtree, and JTS libraries in the following environment: Intel® Core™ i9-11900H CPU @ 2.5GHz (max 4.9GHz), 32 GB of DDR4-3200 RAM, Ubuntu 22.04, Oracle JDK 17.

The main metric tested by the benchmarks is execution time in nanoseconds, so lower values are better. Please check out the Run benchmarks section below for how to test other metrics, such as allocations in bytes or counts of some CPU events.

Benchmarks have the following parameters:

  • geometry switches the geometry between plane and spherical (currently available only for the RTree2D library)
  • nearestMax is the maximum number of entries to return for a nearest query
  • nodeCapacity is the maximum number of child nodes (BEWARE: Archery uses a hard-coded limit of 50 child nodes)
  • overlap is the size of entries relative to the interval between them
  • partToUpdate is the part of the R-tree to update
  • rectSize is the size of the rectangle request relative to the interval between points
  • shuffle is a flag to turn shuffling of entries before R-tree building on or off
  • size is the number of entries in the R-tree

The apply benchmark tests building R-trees from a sequence of entries.

apply

The nearest benchmark tests searching for the entry of the R-tree that is nearest to the specified point.

nearest

The nearestK benchmark tests searching for up to 10 entries in the R-tree that are nearest to the specified point.

nearest

The searchByPoint benchmark tests requests that search for entries which intersect with the specified point.

searchByPoint

The searchByRectangle benchmark tests requests that search for entries which intersect with the specified rectangle, which can intersect with up to 100 entries.

searchByRectangle

The entries benchmark tests returning all entries that are indexed in the R-tree.

entries

The update benchmark tests rebuilding the R-tree with removal of 10% of its entries and addition of another 10% of entries.

update

Charts with their results are available in subdirectories (one for each value of the overlap parameter) of the docs directory.

How to contribute

Build

To compile, run tests, and check coverage for different Scala versions, use the following commands:

sbt clean +test
sbt clean coverage test coverageReport mimaReportBinaryIssues

Run benchmarks

Benchmarks are developed in a separate module using the Sbt plugin for the JMH tool.

Feel free to modify the benchmarks and check how they work with your data, JDK, and Scala versions.

To see throughput with allocation rate, run benchmarks with the GC profiler using the following command:

sbt -java-home /usr/lib/jvm/jdk-17 clean 'rtree2d-benchmark/jmh:run -prof gc -rf json -rff rtries.json .*'

It will save the benchmark report in the rtree2d-benchmark/rtries.json file.

Results that are stored in JSON can be easily plotted in JMH Visualizer by dragging & dropping your file(s) onto the drop zone, or by using the source or sources parameters with an HTTP link to your file(s) in the URLs: http://jmh.morethan.io/?source=<link to json file> or http://jmh.morethan.io/?sources=<link to json file1>,<link to json file2>.
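
As a small convenience, such links can be assembled programmatically. The helper below is a hypothetical sketch that only follows the two URL shapes described above; it does not URL-encode the report links, and the object and method names are made up for illustration.

```scala
// Hypothetical helper that builds JMH Visualizer links from report URLs.
// It mirrors the `source` / `sources` URL shapes and performs no URL encoding.
object JmhVisualizerLinks {
  val Base: String = "http://jmh.morethan.io/"

  def link(reportUrls: Seq[String]): String = {
    require(reportUrls.nonEmpty, "at least one report URL is required")
    reportUrls match {
      case Seq(single) => s"$Base?source=$single"                // one report
      case many        => s"$Base?sources=${many.mkString(",")}" // several reports to compare
    }
  }
}
```

For example, JmhVisualizerLinks.link(Seq("https://example.com/a.json", "https://example.com/b.json")) produces a link that compares two reports, where the example.com addresses are placeholders for wherever your JSON files are hosted.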

Also, you can run benchmarks and visualize results with the charts command. It adds the -rf and -rff options to all passed options and supplies them to the jmh:run task, then groups results per benchmark and plots the main score series to separate images. Here is an example of how it can be called for a specified JVM version, value of the overlap parameter, and patterns of benchmark names:

sbt -java-home /usr/lib/jvm/zulu-17 clean 'charts -p overlap=1 -p rectSize=10 -p nearestMax=10 -p nodeCapacity=16 -p partToUpdate=0.1 -p geometry=plane .*'

Results will be placed in a cross-build suffixed subdirectory of the rtree2d-benchmark/target directory in *.png files (one file with a chart for each benchmark):

$ ls rtree2d-benchmark/target/scala-2.12/*.png

rtree2d-benchmark/target/scala-2.12/apply[geometry=plane,nearestMax=10,nodeCapacity=16,overlap=1,partToUpdate=0.1,rectSize=10].png
...
rtree2d-benchmark/target/scala-2.12/searchByRectangle[geometry=plane,nearestMax=10,nodeCapacity=16,overlap=1,partToUpdate=0.1,rectSize=10].png

For testing RTree2D with spherical geometry and different node capacities, use the following command (chart files will be placed in the same directory as above):

sbt -java-home /usr/lib/jvm/zulu-17 clean 'charts -p overlap=1 -p rectSize=10 -p nearestMax=10 -p nodeCapacity=4,8,16 -p partToUpdate=0.1 -p geometry=spherical RTree2D.*'

Publish locally

Publish to local Ivy repo:

sbt +publishLocal

Publish to local Maven repo:

sbt +publishM2

Release

For version numbering use the Recommended Versioning Scheme that is used in the Scala ecosystem.

Double check binary and source compatibility, including behavior, and release using the following command (credentials are required):

sbt -java-home /usr/lib/jvm/jdk-8 -J-Xmx8g clean release

Do not push changes to GitHub until the promoted artifacts for the new version are available for download from the Maven Central Repository, to avoid binary compatibility check failures in triggered Travis CI builds.

rtree2d's People

Contributors

anderender, plokhotnyuk, scala-steward

rtree2d's Issues

Line / box intersections?

I'm not sure if I'm missing something obvious, but is it possible to calculate the intersections of a line (specified by lat/long start) with a box on a spherical earth?

I can only see point / box interactions?

Support Consumer<RTreeEntry[A]> as alternative to returning a Seq?

Thank you for this tool. Very useful. I am using it from Java.

For the methods that return a Seq<RTreeEntry>, it would be nice to have corresponding methods that accept an additional Consumer<RTreeEntry> argument and return void.

That would avoid the memory allocation of a large List/Seq when possible. Possible?

R-tree serialization support

R-tree serialization could be a pretty nice extra feature to support as a part of this fantastic tiny library.

Is there any interest in supporting this kind of functionality, or was it intentionally not implemented? It could be nice to establish / contribute to some sort of standard for the R-tree serialization format (if there is none yet).

// definitely a feature request / enhancement / help wanted issue

Cannot plot charts for benchmarks with iterations greater than 2 sec

Full stack trace:

[error] java.lang.RuntimeException: Values less than or equal to zero not allowed with logarithmic axis
[error] 	at org.jfree.chart.axis.LogarithmicAxis.autoAdjustRange(LogarithmicAxis.java:528)
[error] 	at org.jfree.chart.axis.NumberAxis.configure(NumberAxis.java:414)
[error] 	at org.jfree.chart.axis.Axis.setPlot(Axis.java:968)
[error] 	at org.jfree.chart.plot.XYPlot.<init>(XYPlot.java:674)
[error] 	at Bencharts$.$anonfun$apply$2(Bencharts.scala:56)
[error] 	at Bencharts$.$anonfun$apply$2$adapted(Bencharts.scala:31)
[error] 	at scala.collection.immutable.Map$Map2.foreach(Map.scala:146)
[error] 	at Bencharts$.apply(Bencharts.scala:31)
[error] 	at $d22675fc1fd118a5f375$.$anonfun$benchmark$8(build.sbt:129)
[error] 	at scala.Function1.$anonfun$compose$1(Function1.scala:44)
[error] 	at sbt.internal.util.$tilde$greater.$anonfun$$u2219$1(TypeFunctions.scala:39)
[error] 	at sbt.std.Transform$$anon$4.work(System.scala:66)
[error] 	at sbt.Execute.$anonfun$submit$2(Execute.scala:262)
[error] 	at sbt.internal.util.ErrorHandling$.wideConvert(ErrorHandling.scala:16)
[error] 	at sbt.Execute.work(Execute.scala:271)
[error] 	at sbt.Execute.$anonfun$submit$1(Execute.scala:262)
[error] 	at sbt.ConcurrentRestrictions$$anon$4.$anonfun$submitValid$1(ConcurrentRestrictions.scala:174)
[error] 	at sbt.CompletionService$$anon$2.call(CompletionService.scala:36)
[error] 	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
[error] 	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
[error] 	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
[error] 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
[error] 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
[error] 	at java.lang.Thread.run(Thread.java:748)

java.lang.ClassCastException: com.github.plokhotnyuk.rtree2d.core.RTreeNode cannot be cast to com.github.plokhotnyuk.rtree2d.core.RTreeEntry

Hi!

Not sure if this is a good place to ask, but I am kind of stuck.

We've been using rtree2d without any issues for roughly 1.5 years.
Suddenly, this error keeps appearing randomly when using .nearestK():

java.lang.ClassCastException: com.github.plokhotnyuk.rtree2d.core.RTreeNode cannot be cast to com.github.plokhotnyuk.rtree2d.core.RTreeEntry
	at com.github.plokhotnyuk.rtree2d.core.RTreeNode.nearest(RTree.scala:364)
	at com.github.plokhotnyuk.rtree2d.core.RTreeNode.nearest(RTree.scala:378)
	at com.github.plokhotnyuk.rtree2d.core.RTreeNode.nearest(RTree.scala:378)
	at com.github.plokhotnyuk.rtree2d.core.RTree$$anon$3.<init>(RTree.scala:204)
	at com.github.plokhotnyuk.rtree2d.core.RTree.nearestK(RTree.scala:203)
	at com.github.plokhotnyuk.rtree2d.core.RTree.nearestK$(RTree.scala:200)
	at com.github.plokhotnyuk.rtree2d.core.RTreeNode.nearestK(RTree.scala:353)
       [...omitted...]
	at akka.actor.typed.internal.BehaviorImpl$ReceiveMessageBehavior.receive(BehaviorImpl.scala:150)
	at akka.actor.typed.Behavior$.interpret(Behavior.scala:274)
	at akka.actor.typed.Behavior$.interpretMessage(Behavior.scala:230)
	at akka.actor.typed.internal.adapter.ActorAdapter.handleMessage(ActorAdapter.scala:126)
	at akka.actor.typed.internal.adapter.ActorAdapter.aroundReceive(ActorAdapter.scala:106)
	at akka.actor.ActorCell.receiveMessage(ActorCell.scala:573)
	at akka.actor.ActorCell.invoke(ActorCell.scala:543)
	at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:269)
	at akka.dispatch.Mailbox.run(Mailbox.scala:230)
	at akka.dispatch.Mailbox.exec(Mailbox.scala:242)
	at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
	at java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056)
	at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1692)
	at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:157)

Sadly, I've not been able to reproduce it locally, or even by querying our service for locations which seemingly led to the crash.

The tree should contain less than 20k entries. This number has not increased significantly in the past few months.
The tree is constructed with the default capacity.
k is set to 10, maxDist varies in the 3000m range.

rtree2d version: 0.9.0 (can't spot any code changes in that area)
Scala version: 2.13.2
Java version: 8 (Amazon Corretto)

Has this been encountered before?
Any pointers towards how we could mitigate this would be much appreciated!

NoSuchMethod error creating entries in Scala 2.11.12 / Spark 2.4.3

Hi, when trying the examples in the readme I run into this following error.
I have downloaded the jar to my local maven repo.

  • Spark 2.4.3
  • Scala 2.11.12

scala> import com.sizmek.rtree2d.core._
import com.sizmek.rtree2d.core._

scala> import SphericalEarth._
import SphericalEarth._

scala> val city1 = entry(50.0614f, 19.9383f, "Krakow")
java.lang.NoSuchMethodError: scala.Product.$init$(Lscala/Product;)V
  at com.sizmek.rtree2d.core.RTreeEntry.<init>(RTree.scala:332)
  at com.sizmek.rtree2d.core.Spherical.entry(Geometry.scala:107)
  at com.sizmek.rtree2d.core.Spherical.entry$(Geometry.scala:102)
  at com.sizmek.rtree2d.core.SphericalEarth$.entry(Geometry.scala:230)
  ... 53 elided

findNearestOption does not find point within a distance in particular situations

Andriy, first of all, I really admire the work you've done on this library. The performance is amazing.

I found one issue which is related to the algorithm of searching nearest points.
I have a tree constructed with nodeCapacity = 4, distance calculator is SphericalEarth. The tree is represented as below:

RTreeNode(-26.05,21.64581,-17.80165,27.84296)
    RTreeNode(-26.05,21.64581,-17.80165,25.91667)
      RTreeNode(-21.69785,21.64581,-18.36536,23.41667)
        RTreeEntry(-21.69785,21.64581,-21.69785,21.64581,BWGNZ)
        RTreeEntry(-18.36536,21.84219,-18.36536,21.84219,BWSWX)
        RTreeEntry(-21.66667,22.05,-21.66667,22.05,BWSUN)
        RTreeEntry(-19.98333,23.41667,-19.98333,23.41667,BWMUB)
      RTreeNode(-26.05,21.77962,-23.9988,25.67728)
        RTreeEntry(-23.9988,21.77962,-23.9988,21.77962,BWHUK)
        RTreeEntry(-26.05,22.45,-26.05,22.45,BWTBY)
        RTreeEntry(-24.60167,24.72806,-24.60167,24.72806,BWJWA)
        RTreeEntry(-25.22435,25.67728,-25.22435,25.67728,BWLOQ)
      RTreeNode(-21.41494,23.75201,-17.80165,25.59263)
        RTreeEntry(-19.16422,23.75201,-19.16422,23.75201,BWKHW)
        RTreeEntry(-17.80165,25.16024,-17.80165,25.16024,BWBBK)
        RTreeEntry(-21.3115,25.37642,-21.3115,25.37642,BWORP)
        RTreeEntry(-21.41494,25.59263,-21.41494,25.59263,BWLET)
      RTreeNode(-24.87158,25.86556,-24.62694,25.91667)
        RTreeEntry(-24.62694,25.86556,-24.62694,25.86556,BWMGS)
        RTreeEntry(-24.87158,25.86989,-24.87158,25.86989,BWRSA)
        RTreeEntry(-24.65451,25.90859,-24.65451,25.90859,BWGBE)
        RTreeEntry(-24.66667,25.91667,-24.66667,25.91667,BWGAB)
    RTreeNode(-23.10275,26.71077,-21.17,27.84296)
      RTreeNode(-23.10275,26.71077,-21.97895,27.84296)
        RTreeEntry(-22.38754,26.71077,-22.38754,26.71077,BWSER)
        RTreeEntry(-23.10275,26.83411,-23.10275,26.83411,BWMAH)
        RTreeEntry(-22.54605,27.12507,-22.54605,27.12507,BWPAL)
        RTreeEntry(-21.97895,27.84296,-21.97895,27.84296,BWPKW)
      RTreeNode(-21.17,27.50778,-21.17,27.50778)
        RTreeEntry(-21.17,27.50778,-21.17,27.50778,BWFRW)

When searching for the nearest point to -24.65527, 25.91904 with the maxDist parameter set to 50 km, I am expecting the tree to return Some(RTreeEntry(-24.65451,25.90859,-24.65451,25.90859,BWGBE)). However, the method returns None. If maxDist is set to infinity, all works fine.

I did some investigation and the situation can be represented graphically like below:
image
image

A is the point that I'm looking for the nearest objects, and B is the expected closest object.

Not nearest distance selected at the antimeridian

Following asserts do not pass:

import SphericalEarthDistanceCalculator.calculator._
assert(distance(0, 0, RTreeEntry(0, 10, 3)) === distance(0, -180, RTreeEntry(-10, -160, 10, 170, 3)))
assert(distance(0, 0, RTreeEntry(0, 10, 3)) === distance(0, 180, RTreeEntry(-10, -170, 10, 160, 3)))

Incorrect distance calculation / can't express box in Western Hemisphere?

val innerBox: RTreeEntry[String] = entry[String](17.84f, -67.38f, 18.62f, -65.51f, "innerBox")

innerBox.asJson.toString()

val h_latitude = 18.0f
val h_longditude = -74.3f

SphericalEarth.distanceCalculator.distance(h_latitude, h_longditude, innerBox)

This produces an answer of zero, which I think is "wrong" (or at least, not what I want).

What I think is happening is that the -ve longitude constraint is going "the wrong way" around the world. However, I can't switch the numbers around, because the library complains at me.

Any hint on how this might work?
