pelias / api

HTTP API for Pelias Geocoder

Home Page: http://pelias.io

License: MIT License

api's Introduction

A modular, open-source search engine for our world.

Pelias is a geocoder powered completely by open data, available freely to everyone.

What is Pelias?
Pelias is a search engine for places worldwide, powered by open data. It turns addresses and place names into geographic coordinates, and turns geographic coordinates into places and addresses. With Pelias, you’re able to turn your users’ place searches into actionable geodata and transform your geodata into real places.

We think open data, open source, and open strategy win over proprietary solutions at any part of the stack and we want to ensure the services we offer are in line with that vision. We believe that an open geocoder improves over the long-term only if the community can incorporate truly representative local knowledge.

Pelias

A modular, open-source geocoder built on top of Elasticsearch for fast and accurate global search.

What's a geocoder do anyway?

Geocoding is the process of taking input text, such as an address or the name of a place, and returning a latitude/longitude location on the Earth's surface for that place.

... and a reverse geocoder, what's that?

Reverse geocoding is the opposite: returning a list of places near a given latitude/longitude point.

What are the most interesting features of Pelias?

  • Completely open-source and MIT licensed
  • A powerful data import architecture: Pelias supports many open-data projects out of the box but also works great with private data
  • Support for searching and displaying results in many languages
  • Fast and accurate autocomplete for user-facing geocoding
  • Support for many result types: addresses, venues, cities, countries, and more
  • Modular design, so you don't need to be an expert in everything to make changes
  • Easy installation with minimal external dependencies

What are the main goals of the Pelias project?

  • Provide accurate search results
  • Work equally well for a small city and the entire planet
  • Be highly configurable, so different use cases can be handled easily and efficiently
  • Provide a friendly, welcoming, helpful community that takes input from people all over the world

Where did Pelias come from?

Pelias was created in 2014 as an early project at Mapzen. After Mapzen's shutdown in 2017, Pelias is now part of the Linux Foundation.

How does it work?

Magic! (Just kidding) Like any geocoder, Pelias combines full text search techniques with knowledge of geography to quickly search over many millions of records, each representing some sort of location on Earth.

The Pelias architecture has three main components and several smaller pieces.

A diagram of the Pelias architecture.

Data importers

The importers filter, normalize, and ingest geographic datasets into the Pelias database. Currently there are six officially supported importers:

We are always discussing supporting additional datasets. Pelias users can also write their own importers, for example to import proprietary data into your own instance of Pelias.

Database

The underlying datastore that does most of the query heavy-lifting and powers our search results. We use Elasticsearch. Currently versions 7 and 8 are supported.

We've built a tool called pelias-schema that sets up Elasticsearch indices properly for Pelias.

Frontend services

This is where the actual geocoding process happens, and includes the components that users interact with when performing geocoding queries. The services are:

  • API: The API service defines the Pelias API, and talks to Elasticsearch or other services as needed to perform queries.
  • Placeholder: A service built specifically to capture the relationship between administrative areas (a catch-all term meaning anything like a city, state, country, etc). Elasticsearch does not handle relational data very well, so we built Placeholder specifically to manage this piece.
  • PIP: For reverse geocoding, it's important to be able to perform point-in-polygon (PIP) calculations quickly. The PIP service is very good at quickly determining which admin area polygons a given point lies in.
  • Libpostal: Pelias uses the libpostal project for parsing addresses using the power of machine learning. We use a Go service built by the Who's on First team to make this happen quickly and efficiently.
  • Interpolation: This service knows all about addresses and streets. With that knowledge, it is able to supplement the known addresses that are stored directly in Elasticsearch and return fairly accurate estimated address results for many more queries than would otherwise be possible.

Dependencies

These are software projects that are not used directly but are used by other components of Pelias.

There are lots of these, but here are some important ones:

  • model: Provides a single library for creating documents that fit the Pelias Elasticsearch schema. This is a core component of our flexible importer architecture.
  • wof-admin-lookup: A library for performing administrative lookup using point-in-polygon math. Previously included in each of the importers but now only used by the PIP service.
  • query: This is where most of our actual Elasticsearch query generation happens.
  • config: Pelias is very configurable, and all of it is driven from a single JSON file which we call pelias.json. This package provides a library for reading, validating, and working with this configuration. It is used by almost every other Pelias component
  • dbclient: A Node.js stream library for quickly and efficiently importing records into Elasticsearch

Helpful tools

Finally, while not part of Pelias proper, we have built several useful tools for working with and testing Pelias.

Notable examples include:

  • acceptance-tests: A Node.js command line tool for testing a full planet build of Pelias and ensuring everything works. Familiarity with this tool is very important for ensuring Pelias is working. It supports all Pelias features and has special facilities for testing autocomplete queries.
  • compare: A web-based tool for comparing different instances of Pelias (for example a production and staging environment). We have a reference instance at pelias.github.io/compare/
  • dashboard: Another web-based tool for providing statistics about the contents of a Pelias Elasticsearch index such as import speed, number of total records, and a breakdown of records of various types.

Documentation

The main documentation lives in the pelias/documentation repository.

Additionally, the README file in each of the component repositories listed above provides more detail on that piece.

Here's an example API response for a reverse geocoding query:
$ curl -s "search.mapzen.com/v1/reverse?size=1&point.lat=40.74358294846026&point.lon=-73.99047374725342&api_key={YOUR_API_KEY}" | json
{
    "geocoding": {
        "attribution": "https://search.mapzen.com/v1/attribution",
        "engine": {
            "author": "Mapzen",
            "name": "Pelias",
            "version": "1.0"
        },
        "query": {
            "boundary.circle.lat": 40.74358294846026,
            "boundary.circle.lon": -73.99047374725342,
            "boundary.circle.radius": 500,
            "point.lat": 40.74358294846026,
            "point.lon": -73.99047374725342,
            "private": false,
            "querySize": 1,
            "size": 1
        },
        "timestamp": 1460736907438,
        "version": "0.1"
    },
    "type": "FeatureCollection",
    "features": [
        {
            "geometry": {
                "coordinates": [
                    -73.99051,
                    40.74361
                ],
                "type": "Point"
            },
            "properties": {
                "borough": "Manhattan",
                "borough_gid": "whosonfirst:borough:421205771",
                "confidence": 0.9,
                "country": "United States",
                "country_a": "USA",
                "country_gid": "whosonfirst:country:85633793",
                "county": "New York County",
                "county_gid": "whosonfirst:county:102081863",
                "distance": 0.004,
                "gid": "geonames:venue:9851011",
                "id": "9851011",
                "label": "Arlington, Manhattan, NY, USA",
                "layer": "venue",
                "locality": "New York",
                "locality_gid": "whosonfirst:locality:85977539",
                "name": "Arlington",
                "neighbourhood": "Flatiron District",
                "neighbourhood_gid": "whosonfirst:neighbourhood:85869245",
                "region": "New York",
                "region_a": "NY",
                "region_gid": "whosonfirst:region:85688543",
                "source": "geonames"
            },
            "type": "Feature"
        }
    ],
    "bbox": [
        -73.99051,
        40.74361,
        -73.99051,
        40.74361
    ]
}
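
A request and response like the one above can be handled in a few lines of Node.js. This is only a sketch: buildReverseUrl and summarize are hypothetical helper names, the base URL is a placeholder for your own instance, and the code relies only on the point.lat, point.lon and size parameters and the label and coordinates fields visible in the example.

```javascript
// Build a reverse-geocoding URL from the query parameters used in the example.
// The base URL is an assumption: point it at your own Pelias instance.
function buildReverseUrl(baseUrl, lat, lon, size = 1) {
  const url = new URL('/v1/reverse', baseUrl);
  url.searchParams.set('point.lat', String(lat));
  url.searchParams.set('point.lon', String(lon));
  url.searchParams.set('size', String(size));
  return url.toString();
}

// Reduce a GeoJSON FeatureCollection response to label plus coordinates.
// GeoJSON stores coordinates as [lon, lat].
function summarize(response) {
  return response.features.map((feature) => ({
    label: feature.properties.label,
    lon: feature.geometry.coordinates[0],
    lat: feature.geometry.coordinates[1],
  }));
}

// Trimmed copy of the example response.
const sampleResponse = {
  type: 'FeatureCollection',
  features: [
    {
      type: 'Feature',
      geometry: { type: 'Point', coordinates: [-73.99051, 40.74361] },
      properties: { label: 'Arlington, Manhattan, NY, USA', layer: 'venue' },
    },
  ],
};

console.log(buildReverseUrl('http://localhost:3100', 40.7436, -73.9905));
console.log(summarize(sampleResponse));
```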

How can I install my own instance of Pelias?

To try out Pelias quickly, use our Docker setup. It uses Docker and docker-compose to allow you to quickly set up a Pelias instance for a small area (by default Portland, Oregon) in under 30 minutes.

Do you offer a free geocoding API?

You can sign up for a trial API key at Geocode Earth. A commercial service has been operated by the core development team behind Pelias since 2014 (previously at search.mapzen.com). Discounts and free plans are available for free and open-source software projects.

What's it built with?

Pelias itself (the import pipelines and API) is written in Node.js, which makes it highly accessible for other developers and performant under heavy I/O. It aims to be modular and is distributed across a number of Node packages, each with its own repository under the Pelias GitHub organization.

For a select few components that have performance requirements that Node.js cannot meet, we prefer to write things in Go. A good example of this is the pbf2json tool that quickly converts OSM PBF files to JSON for our OSM importer.

Elasticsearch is our datastore of choice because of its unparalleled full text search functionality, scalability, and sufficiently robust geospatial support.

Contributing

We built Pelias as an open source project not just because we believe that users should be able to view and play with the source code of tools they use, but to get the community involved in the project itself.

Especially with a geocoder with global coverage, it's just not possible for a small team to do it alone. We need you.

Anything that we can do to make contributing easier, we want to know about. Feel free to reach out to us via GitHub, Gitter, email, or Twitter. We'd love to help people get started working on Pelias, especially if you're new to open source or programming in general.

We have a list of Good First Issues for new contributors.

Both this meta-repo and the API service repo are worth looking at, as they're where most issues live. We also welcome reporting issues or suggesting improvements to our documentation.

The current Pelias team can be found on GitHub as missinglink and orangejulius.

Members emeritus include:

api's People

Contributors

antoine-de, avulfson17, blackmad, dianashk, echelon9, gitter-badger, greenkeeper[bot], greenkeeperio-bot, hannesj, heffergm, hkrishna, joxit, mihneadb, missinglink, orangejulius, pmezard, rabidllama, riordan, sevko, stephenlacy, stvno, sweco-semhul, tadjik1, thismakessand, tigerlily-he, tpedelose, trescube, vanessayuenn, vesameskanen, worace

api's Issues

[operations] Domain Root

Currently, when you hit the domain root http://pelias.mapzen.com/ it serves a plain white HTML page. 404 error pages behave the same way.

Recently we have been adding the root URL to documentation, notably here: https://github.com/pelias/pelias#im-a-developer-can-i-get-access-to-the-api

A couple of questions:

  1. Should all endpoints return application/json?
  2. What should be served at the domain root for the pelias service?
  3. What should be served in case of 404 errors; what encoding?

outputGenerator tweaks

In the following example, the 'text' should be 'Arbil, Iraq' but due to the way the outputGenerator is written it matches nothing for local and 'Arbil' for regional (the name is also 'Arbil').

If I simply add admin1 to the end of the local array then both local and regional will match 'Arbil' and when the de-duper removes one entry, we are back with the text being 'Arbil'.

Not sure if I described this well, but basically we need to continue checking values in the local and regional arrays until we find a value which is not a duplicate of the name or another admin value, otherwise a POI with { "name": "a", "local_admin": "a", "admin1_abbr": "a", "admin0": "b" } will always just be a when it would be nicer if it were a, b.

Hope that makes sense ;)

{
  "type": "FeatureCollection",
  "features": [
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          44.180874,
          36.369878
        ]
      },
      "properties": {
        "id": "1110:adm1:iq:irq:arbil",
        "type": "admin1",
        "layer": "admin1",
        "name": "Arbil",
        "alpha3": "IRQ",
        "admin0": "Iraq",
        "admin1": "Arbil",
        "text": "Arbil"
      }
    },
...
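
The rule described above (keep checking candidate fields until one is not a duplicate of the name or of an admin value already used) could be sketched as follows. This is not the actual outputGenerator code: buildText and the field groups in SCHEMA are hypothetical and chosen only to reproduce the two cases mentioned in this issue.

```javascript
// Hypothetical field groups, in output order: local, regional, national.
const SCHEMA = [
  ['local_admin', 'locality', 'neighborhood', 'admin1'],
  ['admin1_abbr', 'admin1', 'admin2'],
  ['admin0'],
];

// For each group, take the first field value that has not been used yet,
// so duplicates of the name or of earlier admin values are skipped.
function buildText(props, schema = SCHEMA) {
  const seen = new Set([props.name]);
  const parts = [props.name];
  for (const group of schema) {
    const value = group
      .map((field) => props[field])
      .find((candidate) => candidate && !seen.has(candidate));
    if (value) {
      seen.add(value);
      parts.push(value);
    }
  }
  return parts.join(', ');
}

console.log(buildText({ name: 'a', local_admin: 'a', admin1_abbr: 'a', admin0: 'b' }));
console.log(buildText({ name: 'Arbil', alpha3: 'IRQ', admin0: 'Iraq', admin1: 'Arbil' }));
```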

Specific queries result in 5xx error responses and/or timeouts

Example:

time curl -v "http://pelias.mapzen.com/reverse?&input=a&lat=0&lon=0"
* Hostname was NOT found in DNS cache
*   Trying 54.88.134.185...
* Connected to pelias.mapzen.com (54.88.134.185) port 80 (#0)
> GET /reverse?&input=a&lat=0&lon=0 HTTP/1.1
> User-Agent: curl/7.37.1
> Host: pelias.mapzen.com
> Accept: */*
>
< HTTP/1.1 503 Service Unavailable
< Date: Thu, 26 Mar 2015 17:53:31 GMT
* Server Jetty(9.2.z-SNAPSHOT) is not blacklisted
< Server: Jetty(9.2.z-SNAPSHOT)
< Via: 1.1 Repose (Repose/6.2.1.2)
< Content-Length: 0
< Connection: keep-alive
<
* Connection #0 to host pelias.mapzen.com left intact

real    0m3.954s
user    0m0.004s
sys     0m0.005s

caching strategy / etags

The API etags are messed up because of the date property in the body changing all the time.

The etag should be the same for the same content.

A few fixes come to mind:

  1. move date to header, may be hard for ajax consumers to deal with
  2. remove date, will be hard for frontends to handle stale /suggest requests
  3. manually compute the etag then add the date
  4. something else
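
Option 3 could look roughly like the sketch below: hash the response body with the date property left out, so identical content always yields the same etag. The etagFor helper is hypothetical and SHA-1 is an arbitrary choice; any stable digest would do.

```javascript
const crypto = require('crypto');

// Compute an etag from everything except the volatile `date` property.
function etagFor(body) {
  const { date, ...stable } = body; // drop the timestamp before hashing
  const hash = crypto
    .createHash('sha1')
    .update(JSON.stringify(stable))
    .digest('hex');
  return `"${hash}"`;
}

const first = { type: 'FeatureCollection', features: [], date: 1427269933035 };
const second = { type: 'FeatureCollection', features: [], date: 1427278983264 };

// Same content, different dates: the two etags are equal.
console.log(etagFor(first) === etagFor(second));
```

The date can still be added to the body after the etag has been computed, which keeps the stale-request protection for /suggest consumers.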

Personally, I would prefer to drop etags completely in favour of max-age headers: they are more performant because they make far fewer network requests, and they are well suited to content which doesn't change often.

With any solution there will be an issue with the date property, which exists to ensure that database and network congestion doesn't cause older requests to change the UI after a newer request has already been received and rendered.

Some serious thought needs to go into ideal cache control strategies that account for near-side and far-side caching, mobile experience and typeahead UI.

ref: http://vadmyst.blogspot.de/2005/09/server-side-etag-vs-cache-control-max.html

boost results by query locality

A query for 1710 Drew, Houston currently ranks 1710 Drew Place, Claremont, CA higher than 1710 Drew Street, Houston, TX. According to @hkrishna, we only boost matches for admin0, admin1, and admin1_abbr, meaning that the exact locality match doesn't get taken into account at all here. We should factor it into boosting if it doesn't significantly impact query times.

Feedback endpoint

The Pelias API should have an endpoint where a client can report what the query was and which result item the user selected or used. This information would help populate our acceptance test cases and refine search quality.

params:

  • input Ex: 123 Main St, New York
  • resultSelected Ex: 123 Main St, Manhattan, NY (text/object?)
  • resultIndex Ex: 1 (number)
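
A minimal sketch of validating these three parameters before storing the feedback. The validateFeedback helper and its error messages are hypothetical; only the parameter names come from the list above.

```javascript
// Return a list of validation errors; an empty list means the payload is usable.
function validateFeedback(params) {
  const errors = [];

  if (typeof params.input !== 'string' || params.input.trim() === '') {
    errors.push('input must be a non-empty string');
  }

  if (params.resultSelected === undefined || params.resultSelected === null) {
    errors.push('resultSelected is required');
  }

  // Query-string values arrive as strings, so convert before checking.
  const raw = params.resultIndex;
  const index = raw === undefined || raw === null || raw === '' ? NaN : Number(raw);
  if (!Number.isInteger(index) || index < 0) {
    errors.push('resultIndex must be a non-negative integer');
  }

  return errors;
}

console.log(validateFeedback({
  input: '123 Main St, New York',
  resultSelected: '123 Main St, Manhattan, NY',
  resultIndex: '1',
}));
```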

layer aliases

It might be nice to have 'aliases' for the 'layers' param.

e.g.
?layers=poi could expand to ['geoname','osmnode','osmway'] and
?layers=admin could expand to ['admin0','admin1','admin2','neighborhood']
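
A sketch of how the expansion might work, using the two aliases proposed above. The expandLayers helper is hypothetical; names that are not aliases pass through unchanged and duplicates are dropped.

```javascript
// Alias table taken from the proposal; a Map avoids clashes with
// inherited object properties such as "constructor".
const LAYER_ALIASES = new Map([
  ['poi', ['geoname', 'osmnode', 'osmway']],
  ['admin', ['admin0', 'admin1', 'admin2', 'neighborhood']],
]);

// Expand a comma-separated layers parameter into concrete layer names.
function expandLayers(param) {
  const expanded = new Set();
  for (const token of param.split(',')) {
    const name = token.trim();
    if (name === '') continue;
    const layers = LAYER_ALIASES.get(name) || [name];
    for (const layer of layers) expanded.add(layer);
  }
  return [...expanded];
}

console.log(expandLayers('poi'));
console.log(expandLayers('admin,geoname,admin0'));
```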

this may do away with the need for multiple suggester endpoints?

@hkrishna thoughts?

Search/Filter by Category

Related to pelias/openstreetmap#33 and pelias/pelias#50

This issue deals with using the category mapped on a document. Here are a few use cases:

  • find all airports in a bbox
  • search for a transit station
  • find the 10 nearest mexican restaurants

The API should be able to take a parameter, let's say category, and use its value as a filter in the queries. For example:

localhost:3100/search?input=maysville&category=restaurant

should return restaurants that are named maysville

localhost:3100/reverse?lat=0&lon=0&category=restaurant:mexican

should return the nearest mexican restaurants to lat/lon 0,0
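
The matching rule implied by these examples (restaurant matches any restaurant, restaurant:mexican only the more specific value) can be sketched in plain JavaScript. In practice this would be an Elasticsearch filter; the helpers below are hypothetical and assume each document carries a category array of colon-delimited strings.

```javascript
// A wanted category matches itself and any more specific sub-category.
function matchesCategory(docCategories, wanted) {
  return docCategories.some(
    (category) => category === wanted || category.startsWith(wanted + ':')
  );
}

// Keep only the documents whose categories match.
function filterByCategory(docs, wanted) {
  return docs.filter((doc) => matchesCategory(doc.category || [], wanted));
}

const docs = [
  { name: 'maysville', category: ['restaurant:american'] },
  { name: 'taqueria', category: ['restaurant:mexican'] },
  { name: 'central station', category: ['transit:station'] },
];

console.log(filterByCategory(docs, 'restaurant').map((doc) => doc.name));
console.log(filterByCategory(docs, 'restaurant:mexican').map((doc) => doc.name));
```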

Use categories as a scoring function

Each point has a field called category that contains some valuable information. It would be nice to use this information in the scoring algorithm.

We could have a file weighted_categories.yml where we can define what weight each tag value corresponds to.

   airport: 12
   railway: 10
   restaurant: 8
   tourism: 11
   poi:landmark: 10
   ...

This should be a query-time boost, easily doable by adding a script and using it in the API sorting logic.
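
To illustrate the intended effect, here is the weighting applied in application code rather than in an Elasticsearch script. The weights are the ones listed above; categoryScore and sortByCategory are hypothetical helpers, and taking the highest matching weight is an assumption about how multiple categories would combine.

```javascript
// Weights from the proposed weighted_categories.yml.
const CATEGORY_WEIGHTS = new Map([
  ['airport', 12],
  ['railway', 10],
  ['restaurant', 8],
  ['tourism', 11],
  ['poi:landmark', 10],
]);

// Score a document by its highest-weighted category (0 if none match).
function categoryScore(categories) {
  return categories.reduce(
    (best, category) => Math.max(best, CATEGORY_WEIGHTS.get(category) || 0),
    0
  );
}

// Return a new array sorted by descending category score.
function sortByCategory(docs) {
  return [...docs].sort(
    (a, b) => categoryScore(b.category || []) - categoryScore(a.category || [])
  );
}

const points = [
  { name: 'x', category: ['restaurant'] },
  { name: 'y', category: ['airport'] },
  { name: 'z', category: [] },
];

console.log(sortByCategory(points).map((point) => point.name));
```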

Related to pelias/openstreetmap#33 and pelias/pelias#50

pretty-format results with a `pretty` parameter

It'd be great if the API could optionally pretty-format JSON via a parameter like pretty, so that users would be able to see this when curling it:

{
    "bbox": [
        -73.993641,
        40.721969,
        -73.993641,
        40.721969
    ],
    "date": 1427901010432,
    "features": [
        {
            "geometry": {
                "coordinates": [
                    -73.993641,
                    40.721969
                ],
                "type": "Point"
            },
            "properties": {
                "admin0": "United States",
                "admin1": "New York",
                "admin2": "New York",
                "alpha3": "USA",
                "id": "2ed31dafbbb9419da585e24a29550ce5",
                "layer": "openaddresses",
                "local_admin": "Manhattan",
                "locality": "New York",
                "name": "220 Bowery",
                "neighborhood": "Downtown",
                "text": "220 Bowery, Manhattan, New York",
                "type": "openaddresses"
            },
            "type": "Feature"
        }
    ],
    "type": "FeatureCollection"
}

rather than:

{"type":"FeatureCollection","features":[{"type":"Feature","properties":{"id":"2ed31dafbbb9419da585e24a29550ce5","type":"openaddresses","layer":"openaddresses","name":"220 Bowery","alpha3":"USA","admin0":"United States","admin1":"New York","admin2":"New York","local_admin":"Manhattan","locality":"New York","neighborhood":"Downtown","text":"220 Bowery, Manhattan, New York"},"geometry":{"type":"Point","coordinates":[-73.993641,40.721969]}}],"bbox":[-73.993641,40.721969,-73.993641,40.721969],"date":1427901003524}

without having to pipe it through something like python -m json.tool. It'd make command-line examples cleaner.
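
One way this could be done is with the indent argument of JSON.stringify. The serialize helper below is hypothetical, and treating pretty=false and pretty=0 as "off" is an assumption about how the flag should be read.

```javascript
// Serialize a response body, indenting by 4 spaces when `pretty` is set.
function serialize(body, query) {
  const pretty =
    query.pretty !== undefined && query.pretty !== 'false' && query.pretty !== '0';
  return pretty ? JSON.stringify(body, null, 4) : JSON.stringify(body);
}

const body = { type: 'FeatureCollection', features: [] };

console.log(serialize(body, {}));
console.log(serialize(body, { pretty: 'true' }));
```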

suggest and reverse problems

I'm having a hard time getting the API to work properly, and also understanding which components I actually need. I changed the layers URL in the demo to work with our maps. This works. I also changed the API URL to our installation, but it will only show search results.
Suggest queries don't return anything:

{"type":"FeatureCollection","features":[],"date":1427269933035}

reverse actually returns an error:
TypeError: Cannot read property 'geometry' of undefined
    at demo.js:145
    at angular.js:8598
    at angular.js:12234
    at k.$eval (angular.js:13436)
    at k.$digest (angular.js:13248)
    at k.$apply (angular.js:13540)
    at q (angular.js:8884)
    at u (angular.js:9099)
    at XMLHttpRequest.E.onreadystatechange (angular.js:9038)
angular.js:10683 (anonymous function)
angular.js:7858 (anonymous function)
angular.js:12242 (anonymous function)
angular.js:13436 k.$eval
angular.js:13248 k.$digest
angular.js:13540 k.$apply
angular.js:8884 q
angular.js:9099 u
angular.js:9038 E.onreadystatechange

This is from a search query:

{"type":"FeatureCollection","features":[{"type":"Feature","properties":{"id":"288998480","layer":"osmway","name":"Wien","alpha3":"AUT","admin0":"Austria","admin1":"Wien","admin2":"Wien","locality":"Wien","neighborhood":"Mariabrunn","text":"Wien"},"geometry":{"type":"Point","coordinates":[16.244355,48.201155]}},{"type":"Feature","properties":{"id":"8091317","layer":"osmnode","name":"Wien Ottakring","alpha3":"AUT","admin0":"Austria","admin1":"Wien","admin2":"Wien","locality":"Wien","neighborhood":"Breitensee","text":"Wien Ottakring, Wien"},"geometry":{"type":"Point","coordinates":[16.311194,48.211817]}},{"type":"Feature","properties":{"id":"8091320","layer":"osmnode","name":"Wien Hernals","alpha3":"AUT","admin0":"Austria","admin1":"Wien","admin2":"Wien","locality":"Wien","neighborhood":"Dornbach","text":"Wien Hernals, Wien"},"geometry":{"type":"Point","coordinates":[16.31483,48.223272]}},{"type":"Feature","properties":{"id":"8091331","layer":"osmnode","name":"Wien Gersthof","alpha3":"AUT","admin0":"Austria","admin1":"Wien","admin2":"Wien","locality":"Wien","neighborhood":"Gersthof","text":"Wien Gersthof, Wien"},"geometry":{"type":"Point","coordinates":[16.329063,48.230864]}},{"type":"Feature","properties":{"id":"8091423","layer":"osmnode","name":"Wien Heiligenstadt","alpha3":"AUT","admin0":"Austria","admin1":"Wien","admin2":"Wien","locality":"Wien","neighborhood":"Nuszdorf","text":"Wien Heiligenstadt, Wien"},"geometry":{"type":"Point","coordinates":[16.365545,48.248981]}},{"type":"Feature","properties":{"id":"9232378","layer":"osmnode","name":"Wien Jedlersdorf","alpha3":"AUT","admin0":"Austria","admin1":"Wien","admin2":"Wien","locality":"Wien","neighborhood":"Neujedlersdorf","text":"Wien Jedlersdorf, Wien"},"geometry":{"type":"Point","coordinates":[16.395964,48.274071]}},{"type":"Feature","properties":{"id":"9232410","layer":"osmnode","name":"Wien 
Floridsdorf","alpha3":"AUT","admin0":"Austria","admin1":"Wien","admin2":"Wien","locality":"Wien","neighborhood":"Neujedlersdorf","text":"Wien Floridsdorf, Wien"},"geometry":{"type":"Point","coordinates":[16.400016,48.256354]}},{"type":"Feature","properties":{"id":"9232449","layer":"osmnode","name":"Wien Traisengasse","alpha3":"AUT","admin0":"Austria","admin1":"Wien","admin2":"Wien","locality":"Wien","neighborhood":"Brigittenau","text":"Wien Traisengasse, Wien"},"geometry":{"type":"Point","coordinates":[16.383248,48.234871]}},{"type":"Feature","properties":{"id":"9232502","layer":"osmnode","name":"Wien Strebersdorf","alpha3":"AUT","admin0":"Austria","admin1":"Wien","admin2":"Wien","locality":"Wien","neighborhood":"Schwarzlackenausiedlung","text":"Wien Strebersdorf, Wien"},"geometry":{"type":"Point","coordinates":[16.381598,48.285661]}},{"type":"Feature","properties":{"id":"27024450","layer":"osmnode","name":"Wien Simmering","alpha3":"AUT","admin0":"Austria","admin1":"Wien","admin2":"Wien","locality":"Wien","neighborhood":"Simmering","text":"Wien Simmering, Wien"},"geometry":{"type":"Point","coordinates":[16.419459,48.170099]}}],"bbox":[16.244355,48.170099,16.419459,48.285661],"date":1427278983264}

Using the vagrant image is not an option for me, I need to integrate this into our infrastructure properly.

Any pointers what I could be missing?

Address Schema Parameter

Let the user decide what version of output text they would like through the API, perhaps with an additional optional parameter, let's say addressParts.

  • &addressParts=all (by default): will return name, local, regional, national as defined in outputSchema.json Ex: Arbil, Lye, West Midlands, GBR
  • &addressParts=local,regional: will return name, local, regional Ex: Arbil, Lye, West Midlands

We can also push this one step further and define formats for mobile &addressParts=mobile in a config file which will cater to smaller screens.
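
A sketch of how addressParts might select which pieces go into the text, using the example values above. The formatText helper and the shape of its parts argument are hypothetical; a mobile preset could be added as another named value alongside all.

```javascript
// Output order of the optional parts; the name always comes first.
const PART_ORDER = ['local', 'regional', 'national'];

// Build the text from the name plus the requested parts.
function formatText(parts, addressParts = 'all') {
  const wanted =
    addressParts === 'all'
      ? PART_ORDER
      : addressParts.split(',').map((part) => part.trim());
  const selected = PART_ORDER.filter((part) => wanted.includes(part) && parts[part]);
  return [parts.name, ...selected.map((part) => parts[part])].join(', ');
}

const exampleParts = {
  name: 'Arbil',
  local: 'Lye',
  regional: 'West Midlands',
  national: 'GBR',
};

console.log(formatText(exampleParts));
console.log(formatText(exampleParts, 'local,regional'));
```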

This issue is opened as a product of the ongoing discussions at #93

security: express middleware

Investigate https://github.com/evilpacket/helmet and possibly talk to adam.

This ticket is to review the helmet npm module and establish whether including some of its functionality in our app will decrease the risk of malicious attack.

Security is always important. We only store open data, which is freely available anyway, and no personal data, so theft is not an issue. Still, we have a responsibility to ensure security for our users as best we can.

"express is a very thin http framework, the heavy lifting is done by the middleware and their philosophy continues to be to have as little as possible enabled by default"

SearchPhaseExecutionException

On the current master branch with the latest pelias-schema installed, I get the following errors when running the functional tests:

@hkrishna what's going on here!!?

 GET http://localhost:3100/search?input=lake&lat=29.49136&lon=-82.50622 test/ciao/search/success.coffee 
 ✘ valid response
 expected { message: 'SearchPhaseExecutionException[Failed to execute phase [query_fetch], all shards failed; shardFailures {[fDpHOdhCTM6AMkcFNpc1tg][pelias][0]: SearchParseException[[pelias][0]: query[filtered(name.default:lake)->BooleanFilter(+cache(GeoDistanceFilter(center_point, PLANE, 50000.0, 29.49, -82.51)))],from[-1],size[10]: Parse Failure [Failed to parse source [{"query":{"filtered":{"query":{"query_string":{"query":"lake","fields":["name.default"],"default_operator":"OR"}},"filter":{"bool":{"must":[{"geo_distance":{"distance":"50km","distance_type":"plane","optimize_bbox":"indexed","_cache":true,"center_point":{"lat":"29.49","lon":"-82.51"}}}]}}}},"size":10,"sort":["_score",{"_geo_distance":{"center_point":{"lat":29.49136,"lon":-82.50622},"order":"asc","unit":"km"}},{"_script":{"file":"admin_boost","type":"number","order":"desc"}},{"_script":{"file":"population","type":"number","order":"desc"}},{"_script":{"params":{"weights":{"geoname":0,"address":4,"osmnode":6,"osmway":6,"poi-address":8,"neighborhood":10,"local_admin":12,"locality":12,"admin2":12,"admin1":14,"admin0":2}},"file":"weights","type":"number","order":"desc"}}],"track_scores":true}]]]; nested: ElasticsearchIllegalArgumentException[Unable to find on disk script admin_boost]; }]' } to not exist

only reverse-geocode against boundaries at high zoom levels

Would it make sense to only reverse-geocode against administrative boundaries at very high zoom levels? For instance, clicking around the center of the US at zoom level 3 returns:

{
    "geometry": {
        "coordinates": [
            -100.75994,
            38.30442
        ],
        "type": "Point"
    },
    "properties": {
        "admin0": "United States",
        "admin1": "Kansas",
        "admin2": "Scott County",
        "alpha3": "USA",
        "id": "5445311",
        "layer": "geoname",
        "name": "Dry Lake",
        "text": "Dry Lake, Scott County, Kansas",
        "type": "geoname"
    },
    "type": "Feature"
}

I feel like that's too granular, and a country/state result like the following might be more intuitive:

{
    "geometry": {
        "coordinates": [
            0.314297,
            45.153259
        ],
        "type": "Point"
    },
    "properties": {
        "admin0": "United States",
        "alpha3": "USA",
        "id": "329:adm0:us:usa:_",
        "layer": "admin0",
        "name": "United States",
        "text": "United States",
        "type": "admin0"
    },
    "type": "Feature"
}
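
If the client passed its zoom level along, the API could restrict the layers it reverse-geocodes against. The sketch below only illustrates the idea: layersForZoom is hypothetical and the zoom thresholds are made-up assumptions that would need tuning.

```javascript
// Map a map zoom level to the layers worth reverse-geocoding against.
// Thresholds are illustrative assumptions, not tested values.
function layersForZoom(zoom) {
  if (zoom <= 4) return ['admin0'];
  if (zoom <= 7) return ['admin0', 'admin1'];
  if (zoom <= 11) return ['admin1', 'admin2', 'locality'];
  return null; // zoomed in far enough: no layer restriction
}

console.log(layersForZoom(3));
console.log(layersForZoom(16));
```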

API (possibly express) swallows runtime exceptions

The API appears to silently swallow runtime exceptions that would otherwise cause it to print a stack trace and crash. For instance, I've changed query/search.js:17 from:

var query = queries.distance( centroid, { size: params.size } );

to

var query = queries.distance( centroid, { size: param.size } );

Note that params.size became param.size, and, because param doesn't exist, you'd expect the process to crash with ReferenceError: param is not defined. When running the API via node index.js, however:

$ curl localhost:3100/search?input=foobar
{"error":{}}

The request obviously errored and the status code is 500, yet the API keeps running without any output besides listening on 3100. If you try to use the query/search module directly, on the other hand, it errors as expected:

$ node -e 'console.log(require("./query/search")({}));'
/home/sevko/src/mapzen/api/query/search.js:17
  var query = queries.distance( centroid, { size: param.size } );
                                                  ^
ReferenceError: param is not defined
    at generate (/home/sevko/src/mapzen/api/query/search.js:17:51)
    at [eval]:1:38
    at Object.exports.runInThisContext (vm.js:74:17)
    at Object.<anonymous> ([eval]-wrapper:6:22)
    at Module._compile (module.js:460:26)
    at evalScript (node.js:431:25)
    at startup (node.js:90:7)
    at node.js:814:3
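
One plausible explanation for the empty {"error":{}} body is that an Error's message and stack are non-enumerable, so serializing the error object directly yields {}. The sketch below shows that behaviour and an Express-style error-handling middleware (four arguments) that logs the stack and returns the message. makeErrorHandler is a hypothetical name, and this is not necessarily how the API's own handler is written.

```javascript
// An Error serializes to an empty object because its own
// properties (message, stack) are not enumerable.
const example = new ReferenceError('param is not defined');
console.log(JSON.stringify({ error: example }));

// Build an Express-style error handler. Express recognises error
// middleware by its four-argument signature, so `next` must stay.
function makeErrorHandler(log = console.error) {
  return function errorHandler(err, req, res, next) {
    log(err.stack); // make the failure visible on the server side
    res.status(500).json({
      error: { name: err.name, message: err.message },
    });
  };
}
```

With Express, a handler like this would be registered after all routes, for example with app.use(makeErrorHandler()).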

/suggest 'layers' params

as per #22 we should allow consumers to filter suggestions by layer on '/suggest' as they do on '/search'.

We have the 'dataset' category context suggester already set up to allow this.

Address details Parameter

If a user just wants the text property in the output and doesn't care about the rest, they should be able to request a smaller payload by setting a parameter, let's say addressDetails, to false.

Currently, each point that gets returned by the API looks like the following:

{  
   type:"Feature",
   properties:{  
      id:"127099401",
      layer:"osmway",
      name:"Mays",
      alpha3:"GBR",
      admin0:"United Kingdom",
      admin1:"Bournemouth",
      admin2:"Dorset",
      locality:"Bournemouth",
      neighborhood:"Westbourne",
      text:"Mays, Westbourne, Dorset"
   },
   geometry:{  
      type:"Point",
      coordinates:[  
         -1.902019,
         50.721979
      ]
   }
}

With the &addressDetails=false parameter, the output is slimmed down to the following:

{  
   type:"Feature",
   properties:{  
      text:"Mays, Westbourne, Dorset"
   },
   geometry:{  
      type:"Point",
      coordinates:[  
         -1.902019,
         50.721979
      ]
   }
}
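
The slimming step itself is small. A sketch, assuming a hypothetical slimFeature helper applied to each feature before the response is sent:

```javascript
// Keep only `text` in the properties when address details are not wanted.
function slimFeature(feature, addressDetails = true) {
  if (addressDetails) return feature;
  return {
    type: feature.type,
    properties: { text: feature.properties.text },
    geometry: feature.geometry,
  };
}

const fullFeature = {
  type: 'Feature',
  properties: {
    id: '127099401',
    layer: 'osmway',
    name: 'Mays',
    alpha3: 'GBR',
    text: 'Mays, Westbourne, Dorset',
  },
  geometry: { type: 'Point', coordinates: [-1.902019, 50.721979] },
};

console.log(JSON.stringify(slimFeature(fullFeature, false)));
```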

suggest nearby oddities

http://pelias.mapzen.com/suggest/nearby?input=a&lat=51.5328850&lon=-0.0652280

lat/lon is for: London, UK.

I would expect to see more than one result, and results closer to what /search returns. Instead, the only result is:

{
  "type": "FeatureCollection",
  "features": [
    {
      "type": "Feature",
      "properties": {
        "id": "6255146",
        "type": "geoname",
        "layer": "geoname",
        "name": "Africa",
        "text": "Africa"
      },
      "geometry": {
        "type": "Point",
        "coordinates": [
          21.09375,
          7.1881
        ]
      }
    }
  ],
  "bbox": [
    21.09375,
    7.1881,
    21.09375,
    7.1881
  ],
  "date": 1427141000665
}

Bounding box support for /suggest

/suggest should be able to take in a bbox parameter and return suggestions that are contained within the given bounding box.

Here is a use-case provided by @mattkrick in pelias/pelias#81

"For my app, I'd like to autocomplete from a list of schools in a given town. If I use the suggest endpoint, I get suggestions from the other side of the country, but if I use the search endpoint with a bbox, it can't predict things mid-word. I think this could be best solved with a bbox on the suggest endpoint."

resolves: pelias/pelias#81
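
To illustrate the intended semantics only, here is a sketch that parses a bbox string and checks whether a GeoJSON point feature falls inside it. The `west,south,east,north` ordering is an assumption (it matches the GeoJSON `bbox` order used in API responses), the function names are hypothetical, and a real implementation would more likely filter inside the suggester query than post-filter results.

```javascript
// Hypothetical parser: expects 'west,south,east,north' (assumed ordering).
function parseBbox(str) {
  const parts = String(str).split(',').map(Number);
  if (parts.length !== 4 || parts.some(Number.isNaN)) { return null; }
  const [west, south, east, north] = parts;
  return { west, south, east, north };
}

// True when the feature's point lies inside the box (edges included).
// Boxes that cross the antimeridian are not handled in this sketch.
function inBbox(feature, bbox) {
  const [lon, lat] = feature.geometry.coordinates;
  return lon >= bbox.west && lon <= bbox.east &&
         lat >= bbox.south && lat <= bbox.north;
}
```

Usage would look like `features.filter(f => inBbox(f, parseBbox(req.query.bbox)))`, where `req.query.bbox` is likewise an assumed name.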

add logging

Use winston or a similar logging library.

filter by alpha3

It might be nice to let consumers restrict the results to only those that lie within a given country, identified by its alpha3 code.

e.g. I want an autocomplete for admin records in GBR.
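
As a rough sketch of the behaviour being requested, the filter below keeps only features whose `alpha3` property matches the requested country code. The function name is hypothetical, and doing this after the query (rather than inside it) is only for illustration.

```javascript
// Hypothetical post-filter: keep features whose properties.alpha3 matches.
// The comparison is case-insensitive on the requested code ('gbr' matches 'GBR').
function filterByAlpha3(features, alpha3) {
  const wanted = String(alpha3).toUpperCase();
  return features.filter(
    (f) => f.properties && f.properties.alpha3 === wanted
  );
}
```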

input text interpretation

When entering location strings where the text is delimited by a symbol, eg. 'London, UK' or 'London, ON', the context suggester fails to distinguish the name component from the admin component(s), which are conventionally separated by a comma.

We need to start looking at some very basic NLP even if it's just something like only using the tokens before the first comma.

Here's a visual illustration of the problem:
ss

The gif shows 2 UX issues:

  1. Not reproducible: when you click a successful match, you cannot use the returned string to search for and find that same place; you must first remove everything after the first comma.
  2. You cannot paste an address string directly into the input, eg. pasting "Hackney Town Hall, Hackney, Greater London" yields 0 results.
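
The "tokens before the first comma" idea mentioned above could be sketched as follows. The function name is hypothetical, and this is the naive version only: it ignores whatever follows the comma rather than using it as an admin hint.

```javascript
// Hypothetical helper: return the name component of the input, i.e. the
// text before the first comma, with surrounding whitespace removed.
function nameComponent(input) {
  const idx = input.indexOf(',');
  return (idx === -1 ? input : input.slice(0, idx)).trim();
}
```

With this, 'London, UK' and 'London, ON' would both be sent to the suggester as 'London', and the pasted address from item 2 would be reduced to 'Hackney Town Hall'.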

thoughts @hkrishna ?

note, to reproduce the above query bias: https://mapzen.com/pelias#loc=18,51.54503,-0.05639

Specify properties to send back in results

@baldur mentioned that it'd be useful to allow cherry-picking the properties returned by the API to reduce payload size. For instance, you could limit the following:

{
    "geometry": {
        "coordinates": [
            -73.94958,
            40.6501
        ],
        "type": "Point"
    },
    "properties": {
        "admin0": "United States",
        "admin1": "New York",
        "admin2": "Kings County",
        "alpha3": "USA",
        "id": "5110302",
        "layer": "geoname",
        "name": "Brooklyn",
        "text": "Brooklyn, Kings County, New York",
        "type": "geoname"
    },
    "type": "Feature"
}

to

{
    "geometry": {
        "coordinates": [
            -73.94958,
            40.6501
        ],
        "type": "Point"
    },
    "properties": {
        "admin0": "United States",
        "admin1": "New York",
        "admin2": "Kings County",
        "text": "Brooklyn, Kings County, New York",
        "type": "geoname"
    },
    "type": "Feature"
}
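
A minimal sketch of such cherry-picking, assuming the caller supplies the list of property names to keep. The function name and the idea of passing the names as an array are assumptions; the issue does not specify how the list would be expressed in the request.

```javascript
// Hypothetical helper: copy the feature, keeping only the listed properties.
// Names that the feature does not have are silently skipped.
function pickProperties(feature, keep) {
  const properties = {};
  for (const key of keep) {
    if (Object.prototype.hasOwnProperty.call(feature.properties, key)) {
      properties[key] = feature.properties[key];
    }
  }
  return Object.assign({}, feature, { properties });
}
```

Calling `pickProperties(feature, ['admin0', 'admin1', 'admin2', 'text', 'type'])` on the first example would produce the second one.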

Add boundary.country parameter to /search and /reverse endpoints

The geometry of some country polygons makes it difficult to use bbox filtering to exclude neighbouring nations.

It would be nice to have an API param, called alpha3 or similar, which excludes results from outside the given country.

When used in combination with a coarse geocoder it would be very easy to create a coarse search for a single country using the mapzen service.

Inconsistent Autocomplete

This UX issue is somewhat subjective, so I'll try to explain why I think it is a problem.

Consider the gif below:
ss

As characters are added to the query "Hackney City Farm", the results change drastically: they differ in which layers the data comes from, in the ratio of local to distant entries, and in the total number of items returned.

For all input up to and including "hackney " we only get administrative entries back. I personally think this is fine, except for the large number of distant or low-population admin entries that appear before the first word is complete.

When you reach 9 characters, eg. "hackney c", you get an odd mix of admin entries and places; this is also where some entries from over 6000km away appear in the results.

When you type "hackney ci" you don't get any results at all. I find this behaviour very odd and it seems like a bug.

After typing "hackney cit" or more characters the correct local POI is returned.

note, to reproduce the above query bias: https://mapzen.com/pelias#loc=18,51.54503,-0.05639

cc/ @pelias/owners
