Comments (6)
That sounds good to me yes. I was hoping to do releases today if you think ucto is ready?
from ucto.
--textclass actually does seem useful still as a convenient shortcut, don't you think?
--textclass actually does seem useful still as a convenient shortcut, don't you think?
Yes, it is, although it is in fact unwise to use it :)
But it is widely used, I fear.
I am still a user of --textclass especially when working with OCRed text which needs to be post-corrected before tokenization.
Good to know. The functionality wouldn't be lost anyway, as there's --inputclass and --outputclass, but we'll keep --textclass as a shortcut for setting both at once so nothing will change here.
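To make the "shortcut" relationship concrete, here is a rough Python sketch of how --textclass could be expanded into the two explicit options. This is only an illustration, not ucto's actual implementation: the helper name `expand_textclass` is hypothetical, and accepting both the `--textclass=VALUE` and `--textclass VALUE` spellings is an assumption.

```python
def expand_textclass(args):
    """Rewrite --textclass into explicit --inputclass/--outputclass options.

    Hypothetical helper for illustration only; not part of ucto.
    """
    expanded = []
    i = 0
    while i < len(args):
        arg = args[i]
        if arg.startswith("--textclass="):
            # Form: --textclass=VALUE
            value = arg.split("=", 1)[1]
        elif arg == "--textclass" and i + 1 < len(args):
            # Form: --textclass VALUE (assumed spelling; consumes the next argument)
            i += 1
            value = args[i]
        else:
            # Any other argument is passed through unchanged.
            expanded.append(arg)
            i += 1
            continue
        # The shortcut sets the same class for both input and output.
        expanded += [f"--inputclass={value}", f"--outputclass={value}"]
        i += 1
    return expanded
```

For example, `expand_textclass(["--textclass=OCR", "in.xml"])` yields the same argument list as passing --inputclass and --outputclass with the value OCR yourself.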
Thanks for the further info! I'll check out whether it would make more sense for me to use --inputclass and --outputclass instead.