How could this be faster? (nearby-cities issue, 11 comments, closed)

zeke commented on August 22, 2024
How could this be faster?

from nearby-cities.

Comments (11)

mourner commented on August 22, 2024

Hey everyone, I just released geokdbush — a new ultra-high-performance library for k-nearest-neighbor queries of points on Earth (including proper handling of dateline wrapping, Earth curvature etc). It's inspired by sphere-knn, but uses a different algorithm and builds on kdbush.

It indexes all points from all-the-cities in under 70ms — which is 15 times faster than sphere-knn. Queries are very fast too. Here are my benchmark results for the all-the-cities dataset:

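| benchmark           | geokdbush | sphere-knn | naive |
| ------------------- | --------- | ---------- | ----- |
| index 138398 points | 69ms      | 967ms      | n/a   |
| query 1000 closest  | 4ms       | 4ms        | 155ms |
| query 50000 closest | 31ms      | 368ms      | 155ms |
| query all 138398    | 80ms      | 29.7s      | 155ms |
| 1000 queries of 1   | 55ms      | 165ms      | 18.4s |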
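
For readers wondering what the "naive" column might stand for: the thread does not show the benchmark code, so the following is only a guess at the usual baseline, which measures the distance to every point, sorts, and takes the first k. The function names and the three sample cities are made up for illustration. It would also explain why the naive time barely changes with k while a single query stays expensive.

```javascript
// Great-circle distance in kilometres between two lon/lat points (haversine formula).
function haversineKm(lon1, lat1, lon2, lat2) {
  const R = 6371; // mean Earth radius in km
  const toRad = (deg) => (deg * Math.PI) / 180;
  const dLat = toRad(lat2 - lat1);
  const dLon = toRad(lon2 - lon1);
  const a =
    Math.sin(dLat / 2) ** 2 +
    Math.cos(toRad(lat1)) * Math.cos(toRad(lat2)) * Math.sin(dLon / 2) ** 2;
  return 2 * R * Math.asin(Math.sqrt(a));
}

// Naive k-nearest: measure every point, sort by distance, keep the first k.
// The cost is dominated by the full scan and sort, whatever k is.
function naiveNearest(points, lon, lat, k) {
  return points
    .map((p) => ({ point: p, km: haversineKm(lon, lat, p.lon, p.lat) }))
    .sort((a, b) => a.km - b.km)
    .slice(0, k)
    .map((entry) => entry.point);
}

// Hypothetical sample data, not taken from all-the-cities.
const sampleCities = [
  { name: 'Paris', lon: 2.3522, lat: 48.8566 },
  { name: 'Los Angeles', lon: -118.2437, lat: 34.0522 },
  { name: 'Santa Barbara', lon: -119.6982, lat: 34.4208 },
];

const nearestTwo = naiveNearest(sampleCities, -119.71, 34.45, 2);
console.log(nearestTwo.map((c) => c.name));
```

A spatial index avoids the full scan by discarding most points without measuring them, which is where the gap in the last row comes from.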

mourner commented on August 22, 2024

Usually you don't need all the cities when doing such a query, only K closest ones or cities within a certain radius. To do these kinds of queries really fast, index the cities with RBush. Check out:

zeke commented on August 22, 2024

Thanks, @mourner! I will have to give that a try tonight. I peeked at the test file for rbush-knn and each coordinate is an array with four elements, e.g. [7,47,8,47]. What are these coordinates? Will it work if I just pass in lng/lat pairs?

morganherlocker commented on August 22, 2024

@zeke those are bboxes (bboxen?). For points, you can duplicate the pair like: [lon, lat, lon, lat].
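
A minimal sketch of that duplication, with hypothetical helper names. The array form matches the [7,47,8,47] shape quoted from the rbush-knn tests. The object form is an assumption on my part: as far as I know, newer rbush releases expect `minX`/`minY`/`maxX`/`maxY` properties instead, so check the version you install.

```javascript
// Turn a [lon, lat] pair into a zero-area bounding box: [minX, minY, maxX, maxY].
function pointToBBox([lon, lat]) {
  return [lon, lat, lon, lat];
}

// Same idea in object form (assumed shape for newer rbush versions).
function pointToBBoxObject([lon, lat]) {
  return { minX: lon, minY: lat, maxX: lon, maxY: lat };
}

const lonLatPairs = [[7, 47], [8, 47.5]];
const bboxArrays = lonLatPairs.map(pointToBBox);
const bboxObjects = lonLatPairs.map(pointToBBoxObject);

console.log(bboxArrays);  // [[7, 47, 7, 47], [8, 47.5, 8, 47.5]]
console.log(bboxObjects);
```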

zeke commented on August 22, 2024

I see. For reference:

zeke commented on August 22, 2024

Hey @mourner I just noticed this: mapbox/supercluster#5 -- should I use kdbush too?

tmcw commented on August 22, 2024

If I were a betting man, I'd say https://github.com/darkskyapp/sphere-knn is by far the best option, and it solves problems that rbush etc will not by handling dateline wrapping etc.
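
To make the dateline point concrete, here is a small self-contained sketch. It is not code from sphere-knn or rbush, and the two coordinates are invented. Two points on either side of the 180° meridian look almost a full circle apart when longitude is treated as a flat x coordinate, although they are only about 20 km apart on the sphere.

```javascript
// Signed longitude difference in degrees, wrapped into [-180, 180).
function wrappedLonDelta(fromLon, toLon) {
  return ((((toLon - fromLon + 180) % 360) + 360) % 360) - 180;
}

// Great-circle distance in km (haversine); unaffected by the dateline.
function greatCircleKm(a, b) {
  const R = 6371; // mean Earth radius in km
  const rad = (deg) => (deg * Math.PI) / 180;
  const h =
    Math.sin(rad(b.lat - a.lat) / 2) ** 2 +
    Math.cos(rad(a.lat)) * Math.cos(rad(b.lat)) * Math.sin(rad(b.lon - a.lon) / 2) ** 2;
  return 2 * R * Math.asin(Math.sqrt(h));
}

// Hypothetical points just west and just east of the dateline.
const westOfLine = { lon: 179.9, lat: -16.5 };
const eastOfLine = { lon: -179.9, lat: -16.5 };

// What a flat index sees: a gap of nearly 360 degrees.
const planarDelta = Math.abs(eastOfLine.lon - westOfLine.lon);
// What is actually there: a gap of about 0.2 degrees.
const wrappedDelta = wrappedLonDelta(westOfLine.lon, eastOfLine.lon);
const separationKm = greatCircleKm(westOfLine, eastOfLine);

console.log({ planarDelta, wrappedDelta, separationKm });
```

A planar index would rank these two points as far apart, so a nearest-neighbour query close to the dateline can miss neighbours on the other side unless the library compensates.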

mourner commented on August 22, 2024

@zeke basically any kind of spatial index that supports knn queries will be good enough for this use case. kdbush is kind of extreme, it's crazy fast but created for use cases where you need that extra performance, and limited in features (and knn search not implemented yet). Sphere KNN looks like a great option.

rprieto commented on August 22, 2024

Thanks for all the advice on this thread! I was thinking of adding geolocation to http://thumbsup.github.io and need the extra speed when processing large collections. I tried switching to sphere-knn and here are the results.

Module load time

  • Current = 900ms to load all-the-cities
  • Sphere KNN = 900ms (same) + extra 1.2s to initialise the data structure

Search time

  • Current = 1.6s on example provided (Mission Canyon)
  • Sphere KNN = really depends on the options!
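
| Max results | Distance limit | Number of results | Top result     | Time taken |
| ----------- | -------------- | ----------------- | -------------- | ---------- |
| undefined   | undefined      | 138398            | Mission Canyon | 17s        |
| 1000        | undefined      | 1000              | Mission Canyon | 6ms        |
| 100         | undefined      | 100               | Mission Canyon | 2ms        |
| 10          | undefined      | 10                | Mission Canyon | 1ms        |
| unlimited   | 1000km         | 1501              | Mission Canyon | 6ms        |
| unlimited   | 100km          | 57                | Mission Canyon | 2ms        |
| unlimited   | 10km           | 3                 | Mission Canyon | 1ms        |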

So all in all, unless you're asking for all cities sorted by distance (17s) then you can probably get results within a few milliseconds. Note that I haven't done extensive testing, just

  • resolved half a dozen lat/long manually, and the results were good (correct city)
  • tried processing 500 lat/longs in bulk with max-results = 1, and sphere-knn found all 500 within 8ms total
  • the unit tests still pass too
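
For anyone wanting to repeat the bulk measurement, a rough sketch of a timing harness follows. It is not the script used for the numbers above. `timeBulkLookups` and `stubLookup` are hypothetical names, and the stub only mimics the `lookup(latitude, longitude, maxResults)` call shape so the snippet runs without installing anything. It assumes a Node.js version where `performance` is a global.

```javascript
// Run `lookup` once per coordinate with max-results = 1 and report the total time.
function timeBulkLookups(lookup, coords) {
  const start = performance.now();
  const results = coords.map(({ latitude, longitude }) => lookup(latitude, longitude, 1));
  const elapsedMs = performance.now() - start;
  return { results, elapsedMs };
}

// Stand-in for a real index lookup; replace with the function returned by your index.
const stubLookup = (latitude, longitude, maxResults) =>
  [{ name: 'stub city', lat: latitude, lon: longitude }].slice(0, maxResults);

// Hypothetical input coordinates.
const bulkRun = timeBulkLookups(stubLookup, [
  { latitude: 34.45, longitude: -119.71 },
  { latitude: 48.85, longitude: 2.35 },
]);

console.log(`${bulkRun.results.length} lookups in ${bulkRun.elapsedMs.toFixed(2)}ms`);
```

Timing the whole batch rather than each call keeps timer overhead out of the measurement, which matters when individual queries take well under a millisecond.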

Interestingly the code for nearby-cities almost becomes non-existent!
Do you think it's worth a PR?

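const cities = require('all-the-cities')
const sphereKnn = require('sphere-knn')
// Build the index once, when the module is loaded
const lookup = sphereKnn(cities)
module.exports = function (input) {
  // Defaults: at most 100 results within 100000 (sphere-knn measures this in metres, i.e. 100 km)
  return lookup(input.latitude, input.longitude, input.maxResults || 100, input.maxDistance || 100000)
}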

zeke commented on August 22, 2024

Thanks for sharing, @rprieto.

> Do you think it's worth a PR?

Definitely! 👍

zeke commented on August 22, 2024

Looks great, @mourner. Want to open a PR?
