highfestiva / bcp47.py
BCP47 LCID language codes, plain and simple
License: MIT License
It seems this might require an update, since a few languages that show up in language_tags aren't showing up here. Please consider updating it, as I am not very familiar with correcting or updating it myself.
'ar-XA' should come up with key like -- 'Arabic'
'cmn-CN' should come up with key like -- 'Mandarin Chinese', 'China'
'cmn-TW' should come up with key like -- 'Mandarin Chinese', 'Taiwan, Province of China'
'sr-RS' should come up with key like -- 'Serbian', 'Serbia'
'yue-HK' should come up with key like -- 'Yue Chinese', 'Cantonese', 'Hong Kong'
For now, I have commented out the import of bcp47 and am using language_tags instead, although bcp47 seems to be the newer of the two.
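One way to picture the lookups requested above is to split each tag into its language and region subtags and resolve them separately. This is only a rough sketch: `LANGUAGES`, `REGIONS` and `describe` are hypothetical names, not part of bcp47 or language_tags, and the tables contain only the names quoted in this issue.

```python
from typing import Optional, Tuple

# Hypothetical lookup tables, filled only with the names quoted in this issue.
LANGUAGES = {
    "ar": "Arabic",
    "cmn": "Mandarin Chinese",
    "sr": "Serbian",
    "yue": "Yue Chinese",
}
REGIONS = {
    "CN": "China",
    "TW": "Taiwan, Province of China",
    "RS": "Serbia",
    "HK": "Hong Kong",
}


def describe(tag: str) -> Tuple[Optional[str], Optional[str]]:
    """Return (language name, region name); None where a subtag is unknown."""
    language, _, region = tag.partition("-")
    return LANGUAGES.get(language.lower()), REGIONS.get(region.upper())


for tag in ("ar-XA", "cmn-CN", "cmn-TW", "sr-RS", "yue-HK"):
    print(tag, describe(tag))
```

Resolving the two subtags independently means a tag such as 'ar-XA' still yields 'Arabic' even when the region is not in the table.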
Example here:
https://github.com/highfestiva/bcp47.py/blob/master/bcp47/bcp47.py#L42
Is this normal or a typo?
Howdy!
Examining the ecosystem for available BCP47 implementations, I've been highly discouraged by my findings. Not one implements the tag specification properly, not one processes Unicode extensions (e.g. timezone, currency, etc.), and this package is only a mapping of English-localized names to two-component abbreviated tags.
That is an entire package for one dictionary, which is of limited use and does not actually make use of the source data, which maps Microsoft Language ID integers (LCID; generally expressed as two-byte hex codes) to language tags. The one publicly accessible dependent project on GitHub ("Used By") uses it entirely as an enum of possible tag strings, without any form of parsing, meaning that project will reject perfectly valid language tags.
The example I'm attempting to parse: en-CA-u-tz-cator-cu-CAD
A single string can contain 100% of the language, region, locale, timezone, currency, calendar system, week-starts-on, accounting negative numbers preference (-10 v. (10)), and so forth, so on.
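To make the structure of that example tag concrete, here is a minimal sketch that pulls out the language, the region, and the keywords of the Unicode ("u") extension. It is deliberately not a conformant BCP47 parser: `ParsedTag` and `parse_tag` are hypothetical names, and scripts, variants, extended language subtags, other extensions, private use and grandfathered tags are all ignored.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class ParsedTag:
    language: str
    region: Optional[str] = None
    unicode_keywords: Dict[str, str] = field(default_factory=dict)


def parse_tag(tag: str) -> ParsedTag:
    """Simplified parse: language, optional region, optional "u" extension."""
    subtags = tag.split("-")
    parsed = ParsedTag(language=subtags[0].lower())
    pos = 1

    # Region: two letters (e.g. CA) or three digits (e.g. 419).
    if pos < len(subtags):
        s = subtags[pos]
        if (len(s) == 2 and s.isalpha()) or (len(s) == 3 and s.isdigit()):
            parsed.region = s.upper()
            pos += 1

    # Unicode extension: the singleton "u", then two-character keys,
    # each followed by the subtags that make up its value.
    if pos < len(subtags) and subtags[pos].lower() == "u":
        pos += 1
        key = None
        # Any other single-character subtag would start a different extension.
        while pos < len(subtags) and len(subtags[pos]) > 1:
            s = subtags[pos].lower()
            if len(s) == 2:
                key = s
                parsed.unicode_keywords[key] = ""
            elif key is not None:
                current = parsed.unicode_keywords[key]
                parsed.unicode_keywords[key] = f"{current}-{s}" if current else s
            pos += 1

    return parsed


tag = parse_tag("en-CA-u-tz-cator-cu-CAD")
print(tag.language, tag.region, tag.unicode_keywords)
```

For the example above this separates `en`, `CA`, and the `tz` and `cu` keywords, which is exactly the information an enum of fixed tag strings cannot represent.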
I would like to propose a transfer of maintainership for this package, if you'd be OK with that? My plan would be to make a final point release of this existing codebase to add a DeprecationWarning generated on import (with advice to pin versions if needed), and then to implement a proper, conformant BCP47 Language Tag parser/generator and datatype representation, similar to what I did with the uri package, sourcing languages, regions, and localization of the labels/names for these from the official CLDR dataset.
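The deprecation step could look roughly like the following, using the standard library's `warnings` module. The helper name `warn_deprecated` and the message wording are assumptions for illustration, not taken from the existing codebase.

```python
import warnings


def warn_deprecated() -> None:
    """Emit the deprecation notice (hypothetical helper and wording)."""
    warnings.warn(
        "This release of bcp47 is deprecated and the package will be "
        "replaced by a full language tag parser; pin your installed "
        "version if you depend on the current name-to-tag mapping.",
        DeprecationWarning,
        stacklevel=2,  # attribute the warning to the caller, not this helper
    )
```

Calling `warn_deprecated()` at the bottom of the package's `__init__.py` would run it on first import. Note that Python hides `DeprecationWarning` by default unless it is triggered from `__main__` or warnings are enabled (for example with `python -W default`), so the advice to pin versions would also be worth repeating in the release notes.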
Thank you for your consideration!
-- A hopeful Alice.