Comments (7)
I plan to write a script that will fix the order of the city code list in the reference file, so that it is sorted by decreasing PageRank. That way users can take the first city code if they want the most relevant one.
The script will be a one-off rather than an automated process, so newly added POR will have to follow the same ordering in the reference file.
From what I can tell, the biggest affected airports are:
- FLL that will now have MIA as preferred city code
- BWI that will now have WAS as preferred city code
- ICN that will now have SEL as preferred city code (except this was already fixed in #163)
Users who prefer the old order just have to sort the city codes alphabetically.
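A minimal sketch of the proposed ordering. The PageRank values are made up for illustration, and the function name `sort_city_codes` and the comma separator are assumptions, not something confirmed by the reference file.

```python
# Hypothetical PageRank values, for illustration only (not taken from OPTD).
PAGERANK = {"FLL": 0.21, "MIA": 0.48, "BWI": 0.25, "WAS": 0.52}


def sort_city_codes(city_code_list, pagerank, sep=","):
    """Return the city code list re-ordered by decreasing PageRank.

    Codes without a known PageRank sort last; ties fall back to
    alphabetical order so that the result is deterministic.
    """
    codes = [code for code in city_code_list.split(sep) if code]
    codes.sort(key=lambda code: (-pagerank.get(code, 0.0), code))
    return sep.join(codes)


print(sort_city_codes("FLL,MIA", PAGERANK))  # MIA,FLL
print(sort_city_codes("BWI,WAS", PAGERANK))  # WAS,BWI
```

Getting back the old order is then a plain `",".join(sorted(codes.split(",")))` on the consumer side.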
from opentraveldata.
There is now a sorting function in the tools/awklib/geo_lib.awk AWK library, invoked from the tools/add_city_name.awk AWK script.
Writing a similar sorting function, so that PageRank values are used rather than alphabetical order, should not be difficult.
The issue is limited to a small number of IATA codes. The file attached below provides a suggested change of sequence within the city_code_list, based on some manual research and an industry source. This manual sorting can be used to edit optd_best_known_so_far.csv and replace the alphabetical sorting used today. Similar changes in the future can be made by editing optd_best_known_so_far.csv directly.
The aim is to have the best city code as the first element of the list. The most important change is EWR, which should be mapped to NYC. The other changes could use a double check. The airports are sorted by their size in terms of schedule capacity.
All the modifications have been implemented (for the sorting order of the served cities). See the two above changes (85a5198 and a423d06) to confirm that I did not forget any, and that everything is now as you intended.
yes, thank you Denis, I guess we can close the issue.
Note that the list of POR, for a given city, is still sorted by alphabetical order. The PageRank based sorting algorithm still needs to be implemented. So, let's keep that ticket open.
What has been implemented is the sorting order for the list of cities for a given POR. So, it's kind of the dual of the issue above.
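As a rough sketch of that remaining piece (the list of POR for a given city, sorted by PageRank rather than alphabetically), assuming each POR row carries its own PageRank and its list of served cities. The row layout, the function name `por_by_city` and the PageRank values are hypothetical.

```python
from collections import defaultdict


def por_by_city(por_rows):
    """Map each city code to its POR codes, by decreasing PageRank.

    por_rows is an iterable of (por_code, pagerank, city_codes) tuples.
    Ties fall back to alphabetical order of the POR code.
    """
    by_city = defaultdict(list)
    for por_code, pagerank, city_codes in por_rows:
        for city in city_codes:
            by_city[city].append((por_code, pagerank))
    return {
        city: [code for code, _ in sorted(pors, key=lambda p: (-p[1], p[0]))]
        for city, pors in by_city.items()
    }


# Hypothetical PageRank values, for illustration only.
rows = [
    ("EWR", 0.40, ["NYC"]),
    ("JFK", 0.55, ["NYC"]),
    ("LGA", 0.30, ["NYC"]),
]
print(por_by_city(rows))  # {'NYC': ['JFK', 'EWR', 'LGA']}
```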
So it turns out that sorting by decreasing PageRank has its limitations. For example, the FLL airport serves both the FLL and MIA cities, but MIA has a higher PageRank. So PageRank might say the city code for FLL is MIA, but IATA says it is FLL.
Rather than relying on PageRank for sorting, I propose to align with IATA. Re-sorting by PageRank is fairly easy to implement "downstream" for users anyway.
I wrote a script to compare the IATA and OPTD city code lists when multiple cities are involved. In 3af5408, I fixed the order of about 40 points of reference [1], to make sure the first city code listed is the one referenced by IATA, regardless of PageRank.
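The kind of comparison described above could look roughly like the following. This is not the actual script: the function name `align_with_iata`, the input dictionaries and the sample entries are assumptions made for illustration.

```python
def align_with_iata(optd_city_lists, iata_city):
    """Move the IATA city code to the front of each multi-city OPTD list.

    Returns (fixes, unresolved): fixes maps a POR to its re-ordered list,
    unresolved lists the POR whose IATA city code is absent from the OPTD
    list, which a re-ordering alone cannot fix.
    """
    fixes, unresolved = {}, []
    for por, codes in optd_city_lists.items():
        if len(codes) < 2:
            continue  # single-city cases are not a sorting problem
        expected = iata_city.get(por)
        if expected is None or codes[0] == expected:
            continue  # unknown to IATA, or already aligned
        if expected in codes:
            fixes[por] = [expected] + [c for c in codes if c != expected]
        else:
            unresolved.append(por)
    return fixes, unresolved


# Illustrative inputs only, not extracted from the IATA or OPTD files.
optd = {"FLL": ["MIA", "FLL"], "EAP": ["BSL", "MLH"], "JFK": ["NYC"]}
iata = {"FLL": "FLL", "EAP": "EAP", "JFK": "NYC"}
fixes, unresolved = align_with_iata(optd, iata)
print(fixes)       # {'FLL': ['FLL', 'MIA']}
print(unresolved)  # ['EAP']
```

Keeping the unresolved cases in a separate list makes it easy to review the entries where the fix is a missing city rather than a wrong order.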
So now we can say that the order of any city code list involving multiple cities should reflect IATA [2]. Consumers of the data who implemented the "downstream PageRank sorting" should probably just remove it, as it now only affects 23 points of reference: those where the city code associated by IATA is not the one with the highest PageRank [3].
I consider this issue closed now. New additions to the reference file should follow the same principle: the first city code should be the right one.
Footnotes
1. All except EAP are fixed. For EAP, the issue was not the order, but a missing entry in GeoNames ("EAP city" does not exist).
2. Unfortunately, there are about 60 other city code issues, for cases where only one city code is involved. This is out of scope for this issue, as it does not relate to sorting.
3. For example, applying this sort makes MIA the city code of FLL, which may not be what you want.