GithubHelp home page GithubHelp logo

vintagentleman / ud_old_church_slavonic-proiel Goto Github PK

View Code? Open in Web Editor NEW

This project forked from universaldependencies/ud_old_church_slavonic-proiel

0.0 0.0 0.0 4.51 MB

Old Church Slavonic data from the PROIEL project.

License: Other

ud_old_church_slavonic-proiel's Introduction

Summary

The Old Church Slavonic (OCS) UD treebank is based on the Old Church Slavonic data from the PROIEL treebank and contains the text of the Codex Marianus New Testament translation.

Introduction

The Old Church Slavonic (OCS) UD treebank is based on the Old Church Slavonic data from the PROIEL treebank, which is maintained at the Department of Philosophy, Classics, History of Arts and Ideas at the University of Oslo. The conversion is based on the 20180408 release of the PROIEL treebank available from https://github.com/proiel/proiel-treebank/releases. The original annotators are acknowledged in the files available there. The conversion code is available in the Rubygem proiel-cli, https://github.com/proiel/proiel-cli.

The treebank contains the text of the Codex Marianus New Testament translation. The original annotation guidelines are available at http://folk.uio.no/daghaug/syntactic_guidelines.pdf. The text and tokenization comes from Corpus Cyrillo-Methodianum Helsingiense http://www.helsinki.fi/slaavilaiset/ccmh/marianus.html.

Acknowledgements

The data have been automatically converted to the UD scheme by Dag Haug. Thanks to all the original annotators!

Data splits

The development set consists of Matthew 5, 6, 21, and 24, Mark 5 and 6, Luke 4, 7, 8 and 18, and John 10, 11, 17, 19. The test data consists of Matthew 7, 8, 22, 23 and 25, Mark 7, 8 and 15, Luke 5, 9, 10 and 19, and John 12, 13, 18, 21.

References

Dag T. T. Haug and Marius L. Jøhndal. 2008. 'Creating a Parallel Treebank of the Old Indo-European Bible Translations'. In Caroline Sporleder and Kiril Ribarov (eds.). Proceedings of the Second Workshop on Language Technology for Cultural Heritage Data (LaTeCH 2008) (2008), pp. 27-34.

Changelog

  • 2021-05-15 v2.8
    • Undocumented feature Strength replaced with Variant=Short as in other Slavic treebanks.
    • Undocumented Aspect=Res converted to VerbForm=PartRes|Tense=Past as in the Old East Slavic treebanks.
    • Negative copula не.бꙑти lemmatized as бꙑти, added Polarity=Neg.
  • 2018-07-01 v2.2
    • Repository renamed from UD_Old_Church_Slavonic to UD_Old_Church_Slavonic-PROIEL.
  • 2017-03-01 v2.0
    • The treebank was converted to UDv2 and the data splits were changed.
  • 2015-11-15 v1.2
    • Initial release in Universal Dependencies.

=== Machine-readable metadata (DO NOT REMOVE!) ================================ Data available since: UD v1.2 License: CC BY-NC-SA 3.0 Includes text: yes Genre: bible Lemmas: converted from manual UPOS: converted from manual XPOS: manual native Features: converted from manual Relations: converted from manual Contributors: Haug, Dag Contributing: elsewhere Contact: [email protected]

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.