
Comments (7)

da115115 commented on September 27, 2024

As you will see in #12, that specific example (EWR) was discussed, and it was decided (e.g., by @aseredyn, @wsteitz, @alexprengere, @pgrandjean and me) that the city served by the Newark airport (EWR) would be New York first, and only then Newark and Iselin.

Now, I understand that your issue is more about how to sort in between Newark and Iselin.
Well, IATA (http://www.iata.org/publications/Pages/code-search.aspx) refers to Iselin as the city for EWR and to Newark for the airport. EWR, as a city, does not appear on their public Web page: the only POR serving Iselin is ZME, Metropark station (and I agree that it is counter-intuitive), and only airport PORs appear on that page.

To recap that specific example:

  1. NYC (New York City) is served by EWR (Newark)
  2. EWR (Iselin / Newark) is/are served by ZME (Metropark station)

There is a unique/primary key for the POR in OpenTravelData, and that primary key is the (IATA code, location type, Geonames ID) combination, to be found as the first column of opentraveldata/optd_por_best_known_so_far.csv.

With the example of EWR, it gives:

  • EWR-A-5101809^EWR^40.69125^-74.17883^NYC,EWR^
  • EWR-C-5099738^EWR^40.57538^-74.32237^EWR^
  • EWR-C-5101798^EWR^40.73566^-74.17237^EWR^
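
To make the key structure concrete, here is a minimal Python sketch that splits the three lines above into their parts. The field layout (primary key, IATA code, latitude, longitude, served city codes) is only inferred from those three excerpts, and the names `PorKey`, `PorRecord` and `parse_por_line` are made up for illustration; the real CSV file has more columns and may need different handling.

```python
from typing import List, NamedTuple


class PorKey(NamedTuple):
    """Primary key: (IATA code, location type, Geonames ID)."""
    iata_code: str
    location_type: str
    geonames_id: int


class PorRecord(NamedTuple):
    key: PorKey
    iata_code: str
    latitude: float
    longitude: float
    city_codes: List[str]


def parse_por_line(line: str) -> PorRecord:
    # Assumption: fields are '^'-separated, in the order seen in the excerpts.
    fields = line.split("^")
    iata_code, location_type, geonames_id = fields[0].split("-")
    return PorRecord(
        key=PorKey(iata_code, location_type, int(geonames_id)),
        iata_code=fields[1],
        latitude=float(fields[2]),
        longitude=float(fields[3]),
        # The served city codes are comma-separated; drop empty entries.
        city_codes=[code for code in fields[4].split(",") if code],
    )


# The three EWR lines quoted in this comment.
SAMPLE_LINES = [
    "EWR-A-5101809^EWR^40.69125^-74.17883^NYC,EWR^",
    "EWR-C-5099738^EWR^40.57538^-74.32237^EWR^",
    "EWR-C-5101798^EWR^40.73566^-74.17237^EWR^",
]

records = [parse_por_line(line) for line in SAMPLE_LINES]
for record in records:
    print(record.key, record.city_codes)
```

The same IATA code appears in three distinct keys, which is why the code alone is not enough to identify a POR.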

And, how to choose between Iselin and Newark, or among all the cities sharing the same IATA code? Well, it is up to you, on a case-by-case basis: you will easily admit that, since Iselin is the official IATA city and EWR means Newark to almost everyone else, it is hard to pick the best one.

So, to answer the question raised by that issue, all the cities referenced by OpenTravelData for a given IATA code are relevant, and correspond (in a non-exhaustive way) to all the cities served by a given travel-related POR (e.g., airport, heliport, train/bus station, port). To add new such cities, one can add the corresponding record in the opentraveldata/optd_por_best_known_so_far.csv file.

from opentraveldata.

tadhgpearson commented on September 27, 2024

Also - thanks to your answer we identified an important improvement in our APIs.

Some cities (like EWR) are not associated with their airports through the same code. Therefore, when requesting location data for a specific code, we always have to look for the airport and the city in both tables. We can't just assume that the airport with the same location code is always referenced from the city.
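
A rough Python sketch of that lookup, using only the three EWR lines quoted earlier in the thread. It assumes that location type "A" marks an airport and "C" a city, and that the last field lists the city codes served; the `lookup` function and its return shape are hypothetical, not our actual API.

```python
# (primary key, IATA code, served city codes), copied from the EWR lines
# quoted earlier in the thread.
POR = [
    ("EWR-A-5101809", "EWR", ["NYC", "EWR"]),
    ("EWR-C-5099738", "EWR", ["EWR"]),
    ("EWR-C-5101798", "EWR", ["EWR"]),
]


def location_type(primary_key):
    # The key looks like "<IATA code>-<location type>-<Geonames ID>".
    return primary_key.split("-")[1]


def lookup(code, por=POR):
    """Collect airports and cities for a code instead of assuming a 1:1 match."""
    # Assumption: "A" = airport, "C" = city.
    airports = [r for r in por if r[1] == code and location_type(r[0]) == "A"]
    cities = [r for r in por if r[1] == code and location_type(r[0]) == "C"]
    # City codes served by the matching airports, in file order, no duplicates.
    served = []
    for _, _, city_codes in airports:
        for city_code in city_codes:
            if city_code not in served:
                served.append(city_code)
    return {"airports": airports, "cities": cities, "served_city_codes": served}


result = lookup("EWR")
print(result["served_city_codes"])
```

For EWR, the served city codes include NYC, which carries a different code from the airport, so a lookup by identical code alone would miss it.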


tadhgpearson commented on September 27, 2024

Here's a better example - EWR on lines 1072 / 4073
Travel agents will tell you that EWR is the city code for NEWaRk NJ - see, for example, on airportcod.es
In the POR file, we have the choice of Iselin and Newark - and while I don't doubt that theoretically both cities could apply to this code, if you had to select just one, you would select Newark. Is there any way to automate this decision based on the data provided?


tadhgpearson commented on September 27, 2024

Thanks for the detailed explanation Denis.

Regarding your last questions - I don't think it's hard to pick the most common implementation in these examples.

You can Google it. For example: Google EWR Iselin, and you will get 32k results. Google EWR Newark, you will get 10M. Similarly, AAJ Cajana gives 164k results, AAJ "Awa Dam" gives just 137 results.

Similarly, I guess I could perhaps use a word distance score, to see whether a larger proportion of the IATA code's letters appears in one name than in the other. That would work in these two examples, but I don't know how it would scale.
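
One possible reading of that idea, as a small Python sketch: count how many letters of the IATA code also occur in the city name, ignoring order and case, and divide by the length of the code. The function names `letter_overlap` and `pick_by_overlap` are made up for illustration, and this is just one of several scores that would fit the description.

```python
from collections import Counter


def letter_overlap(iata_code, name):
    """Share of the IATA code's letters that also occur in the name."""
    if not iata_code:
        raise ValueError("iata_code must not be empty")
    code_letters = Counter(iata_code.upper())
    name_letters = Counter(ch for ch in name.upper() if ch.isalpha())
    # Multiset intersection: a repeated letter only counts as often as it
    # appears in both the code and the name.
    shared = sum((code_letters & name_letters).values())
    return shared / len(iata_code)


def pick_by_overlap(iata_code, names):
    # On a tie, max() keeps the first name in the list.
    return max(names, key=lambda name: letter_overlap(iata_code, name))


print(pick_by_overlap("EWR", ["Iselin", "Newark"]))
print(pick_by_overlap("AAJ", ["Awa Dam", "Cajana"]))
```

With this score, Newark and Cajana both contain every letter of their code, while Iselin and Awa Dam contain only part of it. Ties and short names are likely to be common across the whole file, so it is probably only useful as one signal among others.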

It would be better to have signals in the file that I could use to select the correct option systematically. Are there any you would suggest?


da115115 commented on September 27, 2024

You could use your idea of doing a Google search of " ", potentially through the Google search API. That way, we wouldn't have to add anything to OpenTravelData. Because, again, it is a matter of choice. Here, as Iselin is the official city referenced by IATA, some people will want to pick Iselin as the city for EWR. And, in some other cases, Newark will be more natural. So, it is up to every user to choose whatever they prefer.
Adding the Google search popularity, or any other popularity measure, to the optd_por_public.csv file is doable in theory, but it is not simple to derive or build. The main issue is how to get a popularity measure for every POR referenced by OpenTravelData. If you come up with some equivalent open/freely accessible data file, we could then join it with the POR data. But, otherwise, it will be difficult.
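
If such a popularity file existed, the join itself would be simple. The Python sketch below assumes a hypothetical table keyed by (IATA code, city name); the figures are the rounded result counts mentioned earlier in this thread, not data from OpenTravelData, and `rank_cities` is a made-up name.

```python
# Hypothetical popularity table: (IATA code, city name) -> result count.
# The numbers are the rounded figures quoted earlier in this thread.
POPULARITY = {
    ("EWR", "Iselin"): 32_000,
    ("EWR", "Newark"): 10_000_000,
    ("AAJ", "Cajana"): 164_000,
    ("AAJ", "Awa Dam"): 137,
}


def rank_cities(iata_code, city_names, popularity):
    """Order candidate cities from most to least popular.

    Cities missing from the popularity table count as 0; sorted() is stable,
    so cities with equal counts keep their input order.
    """
    return sorted(
        city_names,
        key=lambda name: popularity.get((iata_code, name), 0),
        reverse=True,
    )


print(rank_cities("EWR", ["Iselin", "Newark"], POPULARITY))
print(rank_cities("AAJ", ["Awa Dam", "Cajana"], POPULARITY))
```

The hard part remains getting a count for every POR, not the join.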


tadhgpearson commented on September 27, 2024

Thanks Denis.
So, you've made a good point here - Iselin is the official city referenced by IATA, and Newark is referenced from... ? Is there any way to identify the official city that's been referenced by IATA in the data set?


da115115 commented on September 27, 2024

Is there any way to identify the official city that's been referenced by IATA in the data set?

As a matter of fact, with that specific EWR example, IATA references only Iselin as a city. I just added Newark as a city for EWR, because 1) it used to be the case before, and 2) it makes a lot of sense!

Note, though, that IATA may reference several distinct cities for the same IATA code. For instance:

  • RDU for Raleigh and Durham, North Carolina, United States
  • EAP for Basel, Switzerland, and Mulhouse, France. That one is a really convoluted example (nice to test your API)!
