Comments (7)

alexprengere commented on September 26, 2024

I plan to write a script that will fix the order of the city code list in the reference file, so that it is sorted by decreasing PageRank. That way, users can take the first city code if they want the most relevant one.
The sort will not run automatically, so newly added POR will have to follow the same ordering in the reference file.
From what I can tell, the biggest affected airports are:

  • FLL that will now have MIA as preferred city code
  • BWI that will now have WAS as preferred city code
  • ICN that will now have SEL as preferred city code (except this was already fixed in #163)

Users who prefer the old order just have to sort the cities alphabetically.
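A minimal sketch of the sort described above, in Python. The PageRank values, the input lists, and the function name `sort_city_codes` are assumptions made for illustration; they are not taken from the reference file or from the actual script.

```python
def sort_city_codes(city_codes, pagerank):
    """Order city codes by decreasing PageRank, breaking ties alphabetically.

    Codes without a known PageRank are treated as 0.0 and end up last.
    """
    return sorted(city_codes, key=lambda code: (-pagerank.get(code, 0.0), code))


# Hypothetical PageRank values, for illustration only.
pagerank = {"MIA": 0.40, "FLL": 0.25, "WAS": 0.50, "BWI": 0.20}

fll_cities = sort_city_codes(["FLL", "MIA"], pagerank)  # MIA comes first
bwi_cities = sort_city_codes(["BWI", "WAS"], pagerank)  # WAS comes first

# Getting the old order back is a plain alphabetical sort.
fll_alphabetical = sorted(fll_cities)
```

Taking `fll_cities[0]` then gives the city with the highest PageRank, which is the behaviour the proposal aims for.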

from opentraveldata.

da115115 commented on September 26, 2024

There is now a sorting function in the tools/awklib/geo_lib.awk AWK library, invoked from the tools/add_city_name.awk AWK script.

Writing a similar sorting function that uses PageRank values rather than alphabetical order should not be difficult.

aseredyn commented on September 26, 2024

The issue is limited to a small number of IATA codes. The file attached below suggests a new sequence within the city_code_list, based on some manual research and an industry source. This manual sorting can be used to edit optd_best_known_so_far.csv and replace the alphabetical sorting used today. Similar changes in the future can be made by editing optd_best_known_so_far.csv directly.

The aim is to have the best city code as the first element of the list. The most important change is EWR, which should be mapped to NYC. The other changes could use a double check. The airports are sorted by size in terms of schedule capacity.

city_code_list_mod.txt

da115115 commented on September 26, 2024

All the modifications have been implemented (for the sorting order of the served cities). See the two above changes (85a5198 and a423d06) to confirm that I did not forget any, and that everything is now as you intended.

aseredyn commented on September 26, 2024

Yes, thank you Denis. I guess we can close the issue.

da115115 commented on September 26, 2024

Note that the list of POR for a given city is still sorted alphabetically. The PageRank-based sorting algorithm still needs to be implemented. So, let's keep this ticket open.

What has been implemented is the sorting order for the list of cities for a given POR. So, it's kind of the dual of the issue above.

alexprengere commented on September 26, 2024

So it turns out PageRank decreasing has its limitations. For example, the FLL airport serves both FLL and MIA cities, but MIA has a higher PageRank. So the PageRank might say the city code for FLL is MIA, but IATA says it is FLL.

Rather than relying on PageRank for sorting, I propose to align with IATA. Re-sorting by PageRank is fairly easy to implement "downstream" for users anyway.

I wrote a script to compare the IATA and OPTD city code lists when multiple cities are involved. In 3af5408, I fixed the order for about 40 points of reference [1], to make sure the first city code listed is the one referenced by IATA, regardless of PageRank.
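As a rough illustration of that kind of comparison (not the actual script), here is a Python sketch. The two dictionaries, their contents, and the function names `find_mismatches` and `align_with_iata` are hypothetical.

```python
def find_mismatches(optd, iata):
    """Return the POR codes that list several cities and whose first
    city code differs from the one IATA associates with the POR."""
    return [
        por
        for por, cities in optd.items()
        if len(cities) > 1 and por in iata and cities[0] != iata[por]
    ]


def align_with_iata(cities, iata_city):
    """Move the IATA city code to the front, keeping the other codes in
    their existing order. Return None when the IATA city is missing from
    the list, since that is not an ordering problem."""
    if iata_city not in cities:
        return None
    return [iata_city] + [code for code in cities if code != iata_city]


# Hypothetical data, for illustration only.
optd = {"FLL": ["MIA", "FLL"], "EWR": ["NYC", "EWR"], "NCE": ["NCE"]}
iata = {"FLL": "FLL", "EWR": "NYC", "NCE": "NCE"}

mismatches = find_mismatches(optd, iata)
fixed = {por: align_with_iata(optd[por], iata[por]) for por in mismatches}
```

POR with a single city code are skipped on purpose, because only the ordering of multi-city lists is being checked here.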

So now we can say that the order of any city code list with multiple cities should reflect IATA [2]. Consumers of the data who implemented the "downstream PageRank sorting" should probably just remove it, as it now only affects 23 points of reference: those where the IATA-associated city code is not the one with the highest PageRank [3].
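For consumers who want to check which entries a downstream PageRank sort would still change, a small Python sketch follows. The PageRank values and the name `first_is_not_top_pagerank` are assumptions for illustration only.

```python
def first_is_not_top_pagerank(cities, pagerank):
    """Return True when some city in the list has a strictly higher
    PageRank than the first (IATA-aligned) city code."""
    first_rank = pagerank.get(cities[0], 0.0)
    return any(pagerank.get(code, 0.0) > first_rank for code in cities[1:])


# Hypothetical PageRank values, for illustration only.
pagerank = {"MIA": 0.40, "FLL": 0.25, "NYC": 0.90, "EWR": 0.30}

fll_affected = first_is_not_top_pagerank(["FLL", "MIA"], pagerank)
ewr_affected = first_is_not_top_pagerank(["NYC", "EWR"], pagerank)
```

With these made-up values, only the FLL list would be reordered by a PageRank sort, which is the situation described for the remaining points of reference.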

I consider this issue closed now. New additions to the reference file should follow the same principle: the first city code should be the right one.

Footnotes

  1. All except EAP are fixed. For EAP, the issue was not the order, but a missing entry in GeoNames ("EAP city" does not exist).

  2. Unfortunately, there are about 60 other city code issues, for cases where only one city code is involved. They are out of scope for this issue, as they do not relate to sorting.

  3. For example, applying this sort makes MIA the city code of FLL, which may not be what you want.
