lquirosd / order_relation_operator

Learning to Sort Handwritten Text Lines in Reading Order through Estimated Binary Order Relations

License: MIT License

Shell 7.81% Python 92.19%
document-layout-analysis reading-order handwritten-text-recognition sorting-algorithm

order_relation_operator's Issues

making scripts run

In trying to reproduce, I stumbled across several problems:

  • in get_data.sh, OHG and ABP datasets have been deactivated (but they are needed for run.sh)
  • in run.sh, the CLI of MLP.py has changed
  • in run.sh, the file RDF.py is nowhere to be found (I guess it should be RTC.py)
  • in the training data PAGE XML files, some regions/lines have no type in @custom; text_line_features_from_xml then falls back to TextRegion, which is not in the configured list of categories, so it fails with ValueError: 'TextRegion' is not in list
  • also, 2 regions/lines in the OHG dataset have paragraph as type, which likewise fails
  • also, the calls for the ABP dataset need a --level=region (since there is no type in @custom); otherwise no categories match at all
  • in RTC.py, the import text_line_dataset.TextLineInMemory should be changed to text_line_dataset.PairsInMemoryDataset
  • in metrics.py, the implementation of Kendall's Tau must handle the case where there is just 1 sample (to prevent ZeroDivisionError)
  • src/basic_ro.py now requires an additional argument lines|regions|lines_hier, and its 2nd argument contains the level in its name (e.g. test_line.pickle instead of test.pickle)
  • in MLP.py, when region level was introduced, the number of features changed (from len(categories) + 6 to len(categories) + 7), so the shape of the augmentation mask must change as well
  • in run.sh, the plot commands do not work: it looks like plot_reading_order.py should be replaced with the new CLI plot_reading_order_2.py. Also, in both files, the calculation of the center coordinates from the feature vectors is wrong (the line-level calculation has been commented out in favour of a region-level calculation)
  • how do I train a hierarchical model? In run.sh, there's no --hierarchical...
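
To illustrate the Kendall's Tau point above, here is roughly the kind of guard I have in mind. This is only a sketch under assumptions: the function name `kendall_tau`, its signature, and the choice to return 1.0 when there are no pairs are all hypothetical and not taken from metrics.py, which may define the metric differently.

```python
from itertools import combinations


def kendall_tau(reference, hypothesis):
    """Kendall's Tau between two orderings of the same distinct items.

    Hypothetical helper for illustration; not the function in metrics.py.
    """
    n = len(reference)
    if n < 2:
        # With fewer than 2 items there are no pairs, so n * (n - 1) / 2 == 0
        # and the usual formula would raise ZeroDivisionError.
        # Assumed convention: a trivial ordering counts as perfect agreement.
        return 1.0
    # Rank of each item in the hypothesis ordering.
    position = {item: rank for rank, item in enumerate(hypothesis)}
    # A pair is discordant if the hypothesis reverses its reference order.
    discordant = sum(
        1 for a, b in combinations(reference, 2) if position[a] > position[b]
    )
    total_pairs = n * (n - 1) // 2
    return 1.0 - 2.0 * discordant / total_pairs
```

With this guard, a page containing a single line or region yields a score instead of an exception; whether 1.0 is the right value for that case is a design choice that should match how the scores are averaged.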

Spoiler: I have already worked on workarounds and fixes, so I am likely going to create a PR, but I would really appreciate some directions in advance.
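
For the missing-type failures (TextRegion and paragraph), one possible direction is a lookup that only falls back to a category that is actually configured. This is a sketch, not the code in text_line_features_from_xml: the helper name `category_index`, its `fallback` parameter, and the category names in the usage note are made up for illustration.

```python
def category_index(region_type, categories, fallback=None):
    """Return the index of region_type in categories.

    Hypothetical helper: if region_type is not configured, use `fallback`
    instead, provided the fallback itself is a configured category.
    """
    if region_type in categories:
        return categories.index(region_type)
    if fallback is not None and fallback in categories:
        return categories.index(fallback)
    # No usable category: fail with a message that names the configured list.
    raise ValueError(
        f"{region_type!r} is not in the configured categories {categories!r}"
    )
```

For example, with a made-up configuration `["heading", "body"]`, calling `category_index("TextRegion", ["heading", "body"], fallback="body")` returns 1, while omitting the fallback still raises ValueError, but with the configured list in the message.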
