Comments (7)
As you will see in #12, that specific example (EWR) was discussed, and it was decided (e.g., by @aseredyn, @wsteitz, @alexprengere, @pgrandjean and me) that the city served by the Newark airport (EWR) would be New York first, and only then Newark and Iselin.
Now, I understand that your issue is more about how to sort in between Newark and Iselin.
Well, IATA (http://www.iata.org/publications/Pages/code-search.aspx) refers to Iselin as the city for EWR and to Newark for the airport. EWR, as a city, does not appear on their public Web page: only airport PORs are listed there, and the only POR serving Iselin is (and I agree that it is counter-intuitive) ZME, Metropark station.
To recap that specific example:
- NYC (New York City) is served by EWR (Newark)
- EWR (Iselin / Newark) is/are served by ZME (Metropark station)
There is a unique/primary key for the POR in OpenTravelData, and that primary key is the (IATA code, location type, Geonames ID) combination, to be found as the first column of opentraveldata/optd_por_best_known_so_far.csv.
With the example of EWR, it gives:
- EWR-A-5101809^EWR^40.69125^-74.17883^NYC,EWR^
- EWR-C-5099738^EWR^40.57538^-74.32237^EWR^
- EWR-C-5101798^EWR^40.73566^-74.17237^EWR^
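As a minimal sketch of how that primary key can be read, the snippet below splits the first column of each sample row into its three parts. The `PorKey` class and the `parse_por_key` function are hypothetical names used for illustration only; the rows are the truncated EWR samples quoted above, not full records from the file.

```python
from typing import NamedTuple


class PorKey(NamedTuple):
    """Hypothetical container for the (IATA code, location type, Geonames ID) key."""
    iata_code: str
    location_type: str
    geonames_id: int


def parse_por_key(row: str) -> PorKey:
    """Extract the primary key from the first '^'-separated column of a row."""
    primary_key = row.split("^", 1)[0]
    iata_code, location_type, geonames_id = primary_key.split("-")
    return PorKey(iata_code, location_type, int(geonames_id))


# Truncated sample rows, copied from the EWR example above.
rows = [
    "EWR-A-5101809^EWR^40.69125^-74.17883^NYC,EWR^",
    "EWR-C-5099738^EWR^40.57538^-74.32237^EWR^",
    "EWR-C-5101798^EWR^40.73566^-74.17237^EWR^",
]

keys = [parse_por_key(row) for row in rows]
for key in keys:
    print(key)
```

The three rows share the same IATA code, but each key is distinct once the location type and the Geonames ID are taken into account.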
As for how to choose between Iselin and Newark, or among all the cities sharing the same IATA code: it is up to you, on a case-by-case basis. Since Iselin is the official IATA city while EWR means Newark for almost everyone else, you will easily admit that it is hard to pick the best one.
So, to answer the question raised by that issue, all the cities referenced by OpenTravelData for a given IATA code are relevant, and correspond (in a non-exhaustive way) to all the cities served by a given travel-related POR (e.g., airport, heliport, train/bus station, port). To add new such cities, one can add the corresponding record in the opentraveldata/optd_por_best_known_so_far.csv file.
from opentraveldata.
Also - thanks to your answer we identified an important improvement in our APIs.
Some cities (like EWR) are not associated with their airports through the same code. Therefore, when requesting location data for a specific code, we always have to look for the airport and the city in both tables. We can't just assume that the airport with the same location code is always referenced from the city.
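A rough sketch of that kind of lookup, in Python, could index the rows by IATA code and then by location type, so that airport and city records are retrieved together. Several things here are assumptions rather than documented facts: that the location types `A` and `C` stand for airport and city (inferred from the EWR sample rows), that the fifth `^`-separated field lists the served city codes, and the `index_by_code` helper itself, which is a hypothetical name.

```python
from collections import defaultdict

# Truncated sample rows from the EWR example earlier in this thread.
SAMPLE_ROWS = [
    "EWR-A-5101809^EWR^40.69125^-74.17883^NYC,EWR^",
    "EWR-C-5099738^EWR^40.57538^-74.32237^EWR^",
    "EWR-C-5101798^EWR^40.73566^-74.17237^EWR^",
]


def index_by_code(rows):
    """Group records as index[iata_code][location_type] -> [(geonames_id, served_cities)]."""
    index = defaultdict(lambda: defaultdict(list))
    for row in rows:
        fields = row.split("^")
        iata_code, location_type, geonames_id = fields[0].split("-")
        # Assumption: the fifth field is a comma-separated list of served city codes.
        served_cities = [code for code in fields[4].split(",") if code]
        index[iata_code][location_type].append((int(geonames_id), served_cities))
    return index


index = index_by_code(SAMPLE_ROWS)
airports = index["EWR"]["A"]  # assumed: 'A' marks airport records
cities = index["EWR"]["C"]    # assumed: 'C' marks city records

print("airport records:", airports)
print("city records:", cities)
```

With the sample rows, the single airport record lists both NYC and EWR as served cities, while two distinct city records carry the EWR code, which is why neither table can be skipped.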
from opentraveldata.
Here's a better example - EWR
on lines 1072 / 4073
Travel agents will tell you that EWR is the city code for NEWaRk NJ - see, for example, on airportcod.es
In the POR file, we have the choice of Iselin and Newark - and while I don't doubt that theoretically both cities could apply to this code, if you had to select just one, you would select Newark. Is there any way to automate this decision based on the data provided?
from opentraveldata.
Thanks for the detailed explanation Denis.
Regarding your last questions - I don't think it's hard to pick the most common implementation in these examples.
You can Google it. For example: Google EWR Iselin, and you will get 32k results. Google EWR Newark, you will get 10M. Similarly, AAJ Cajana gives 164k results, AAJ "Awa Dam" gives just 137 results.
Similarly, I guess I could perhaps use a word distance score, to see whether a larger proportion of one name than of the other is made up of letters from the IATA code. That would work in these two examples, but I don't know how it would scale.
It would be better to have signals in the file that I could use to select the correct option systematically. Are there any you would suggest?
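To make that idea concrete, here is a small sketch of one possible reading of it: the share of a city name's letters that also occur in the IATA code. This is a plain letter-overlap ratio, not a true edit distance, and the `letter_overlap_score` function is a hypothetical helper, not something provided by OpenTravelData.

```python
def letter_overlap_score(iata_code: str, city_name: str) -> float:
    """Return the fraction of letters in city_name that also occur in iata_code."""
    code_letters = set(iata_code.lower())
    name_letters = [ch for ch in city_name.lower() if ch.isalpha()]
    if not name_letters:
        return 0.0
    matches = sum(1 for ch in name_letters if ch in code_letters)
    return matches / len(name_letters)


# The two examples discussed in this thread.
candidates = {
    "EWR": ["Iselin", "Newark"],
    "AAJ": ["Cajana", "Awa Dam"],
}

for code, names in candidates.items():
    best = max(names, key=lambda name: letter_overlap_score(code, name))
    print(code, "->", best)
```

On these two codes the score prefers Newark and Cajana, but it is only a heuristic: many IATA codes share few or no letters with the name of the city they serve, so it should not be relied on without checking against more of the data.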
from opentraveldata.
You could use your idea of doing a Google search of " ", potentially through the Google search API. That way, we wouldn't have to add anything to OpenTravelData. Because, again, it is a matter of choice. Here, as Iselin is the official city referenced by IATA, some people will want to pick Iselin as the city for EWR. And, in some other cases, Newark will be more natural. So, it is up to every user to choose whatever they prefer.
Adding the Google search popularity, or any other popularity measure, to the optd_por_public.csv file is doable in theory, but it is not simple to derive/build. The main issue is how to get a popularity measure for every POR referenced by OpenTravelData. If you come up with some equivalent open/freely accessible data file, we could then join it with the OpenTravelData records. Otherwise, it will be difficult.
from opentraveldata.
Thanks Denis.
So, you've made a good point here - Iselin is the official city referenced by IATA, and Newark is referenced from... ? Is there any way to identify the official city that's been referenced by IATA in the data set?
from opentraveldata.
> Is there any way to identify the official city that's been referenced by IATA in the data set?
As a matter of fact, with that specific EWR example, IATA references only Iselin as a city. I just added Newark as a city for EWR, because 1) it used to be the case before, and 2) it makes a lot of sense!
Note, though, that IATA may reference several distinct cities for the same IATA code. For instance:
- RDU for Raleigh and Durham, North Carolina, United States
- EAP for Basel, Switzerland, and Mulhouse, France. That one is a really convoluted example (nice to test your API)!
from opentraveldata.