Comments (13)
Hi David
For reference here's some background first, telling you stuff you probably already know.
The multiple keysym support which we have in libxkbcommon is single key -> multiple keysyms; the biggest use case for it I'd say is unicode combining characters (in case there isn't a precomposed one to use).
Compose is the "dual": multiple keys -> unicode string or keysym. This allows inputting many more keysyms than you'd want to have in a layout/shift level.
X does this entirely on the client side in the input method. For the X input method, all the supporting code is in libX11. On most systems the keysym sequence -> string mappings are defined in files at /usr/share/X11/locale/ . The particular file used is locale-dependent, but almost all of them are exactly like /usr/share/X11/locale/en_US.UTF-8/Compose :
http://cgit.freedesktop.org/xorg/lib/libX11/tree/nls/en_US.UTF-8/Compose.pre
You can define Multi_key with an XKB option like "compose:ralt". As key presses come in, the resulting keysyms are fed into a state machine, which produces the string on the right-hand side when the appropriate sequence is matched.
Preferably there is some indication to the user that they're in the middle of a match. Otherwise it can be confusing when pressing e.g. ` (which produces, say, XKB_KEY_dead_macron) appears to do nothing.
The format itself is regular and easy to parse, here's the grammar:
http://cgit.freedesktop.org/xorg/lib/libX11/tree/modules/im/ximcp/imLcPrs.c?id=6cb02b166361200da35ba14f52cd9aaa493eb0ea#n63
(I've never seen the MODIFIER part being used though).
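To make the format concrete, here is a minimal sketch of a parser for a single rule line, assuming the simplified shape `<keysym> ... : "string" keysym # comment`. The function name `parse_compose_line` is made up for illustration; the sketch ignores the MODIFIER part, include lines, and every string escape other than `\"` and `\\`, so it is not what libX11 actually does.

```python
import re

# One <keysym> event on the left-hand side (modifiers are not handled).
_EVENT = re.compile(r"<([A-Za-z0-9_]+)>")
# Right-hand side: optional "string", optional keysym name, optional comment.
_RHS = re.compile(r'\s*(?:"((?:[^"\\]|\\.)*)")?\s*([A-Za-z0-9_]+)?\s*(?:#.*)?')


def parse_compose_line(line):
    """Return (lhs_keysym_names, string_or_None, keysym_or_None), or None."""
    stripped = line.strip()
    # Blank lines, comments and include lines carry no rule in this sketch.
    if not stripped or stripped.startswith(("#", "include")):
        return None
    lhs, sep, rhs = stripped.partition(":")
    events = _EVENT.findall(lhs)
    match = _RHS.fullmatch(rhs)
    if not sep or not events or match is None:
        return None
    string, keysym = match.groups()
    if string is None and keysym is None:
        return None  # a rule needs a string, a keysym, or both
    if string is not None:
        # Assumption: only \" and \\ are unescaped here.
        string = re.sub(r'\\(["\\])', r"\1", string)
    return events, string, keysym
```

Feeding it a line from the en_US.UTF-8 Compose file gives back the list of keysym names on the left and the string and keysym name on the right; lines it cannot handle come back as None.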
But there are many different implementations of the concept. Some relevant ones are:
Gtk uses a hard-coded table derived at some point from libX11 Compose:
https://git.gnome.org/browse/gtk+/tree/gtk/gtkimcontextsimpleseqs.h
Qt5 has its own implementation, which does read libX11 Compose files:
https://qt.gitorious.org/qt/qtbase/source/1d039184543c3c1079a56e98ca22d9774166ed3f:src/plugins/platforminputcontexts/compose
Hard-coded table for Wayland input methods:
http://cgit.freedesktop.org/wayland/weston/tree/clients/weston-simple-im.c
from libxkbcommon.
Now for support in libxkbcommon.
As seen above, most clients won't be using it. So clearly we shouldn't force it on the user. Luckily this support, I think, is mostly orthogonal to what libxkbcommon currently does, which is to produce keysyms. So if we want it in libxkbcommon we can put it in a separate helper library/header.
Also, any proper support for Compose would require some adjustments and support from the application. The issues you mentioned are some of what I mean, but different applications will want to handle those differently. So libxkbcommon shouldn't act as a middle man but just provide the boring stuff:
- Finding the appropriate Compose files: looking at the appropriate path, locale, ~/.XCompose, includes, etc. (Btw, since the Compose files are shipped with libX11 we'd probably want to move them to xkeyboard-config or similar).
- Parsing the file into some data structure which allows fast sequence matching and querying. Hopefully we can find something which doesn't use too much memory.
- (Optional) The state machine, which is fed keysyms and reports matches/prefixes or whatever else is of interest.
Basically providing the mechanism while still allowing the application/toolkit to do whatever it wants.
And this can all be entirely separate from the keymap handling part, or in a new library entirely. Once there's a decent API it should be relatively straightforward to code.
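As a rough illustration of the last two items in the list above, here is a sketch of a trie keyed by keysym names plus a small state machine that is fed one keysym at a time. The class names and status strings are invented for this example and are not an existing libxkbcommon API. One simplification to be aware of: a node that has children is always treated as a prefix, so a sequence that is itself a prefix of a longer one would never complete.

```python
class ComposeTrie:
    """Maps keysym-name sequences to results, one dict node per prefix."""

    def __init__(self):
        self.root = {"children": {}, "result": None}

    def add(self, sequence, result):
        node = self.root
        for sym in sequence:
            node = node["children"].setdefault(
                sym, {"children": {}, "result": None}
            )
        node["result"] = result


class ComposeState:
    """Hypothetical state machine: feed keysyms, read status and result."""

    NOTHING = "nothing"      # keysym is not part of any sequence
    COMPOSING = "composing"  # in the middle of a sequence (a prefix matched)
    COMPOSED = "composed"    # a full sequence matched, result is available
    CANCELLED = "cancelled"  # a started sequence was broken off

    def __init__(self, trie):
        self.trie = trie
        self.node = trie.root
        self.status = self.NOTHING
        self.result = None

    def feed(self, sym):
        self.result = None
        child = self.node["children"].get(sym)
        if child is None:
            # No match: either we were idle, or a sequence just got cancelled.
            at_root = self.node is self.trie.root
            self.status = self.NOTHING if at_root else self.CANCELLED
            self.node = self.trie.root
        elif child["children"]:
            self.status = self.COMPOSING
            self.node = child
        else:
            self.status = self.COMPOSED
            self.result = child["result"]
            self.node = self.trie.root
        return self.status
```

The application decides what to do with each status, e.g. showing a hint while the status is "composing" and swallowing the keysym that cancelled a sequence. That keeps the policy out of the library.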
from libxkbcommon.
Thanks a lot for the info. The format is indeed regular, should be easy to write a parser by hand. I'd really like to see that added to xkbcommon as otherwise everyone needs to implement it on their own (and I see no reason for that). Obviously, it should be optional. I was thinking of something like:
- xkb_context looks up the compose-path in addition to include-paths. I don't see any real reason to not re-use the same context for keymaps and compose, do you?
- An xkb_compose object to load a given compose-file, query it and iterate over the single rules
- Maybe a state-machine that takes keysyms and returns keysyms. But it's fairly trivial so maybe not worth it.
I will be working on a rough API this week and send it to the list.
Thanks Ran!
from libxkbcommon.
Oh so you'll be working on it, great :)
I had actually written a parser for this file format about a year ago, and your comment prompted me to brush it up and integrate with current libxkbcommon code. I've also added some boilerplate API in order to run some tests; it's based almost entirely on the xkb_keymap / xkb_state duo. Unfortunately I think it's still missing some stuff, and I never got to writing the data structure for the sequences (which ought to be the fun part). So it's just a constant-sized array now :)
Anyway I've put it on a branch now, feel free to do (or not) whatever you want with it.
https://github.com/bluetech/libxkbcommon/commits/compose
If you post something to the list, I'll take some time to review it.
from libxkbcommon.
Awesome! I hate writing lexers/parsers and you obviously have a knack for it :) Will pick that up but might have to wait for the weekend.
from libxkbcommon.
I had some time yesterday between exams, and Gatis reminded me about this. So had some fun hacking on this. I got the basic trie to work and wrote some tests. Needs lots of work, and the API is just provisional so I can see something happen :). But I'll try to iterate on this when I get some more time.
I put the branch in the main repo now:
https://github.com/xkbcommon/libxkbcommon/commits/compose
from libxkbcommon.
OK, since I ran out of ideas for improving the API, I tried to use it with kmscon (as a "real life" example). It's here:
bluetech/kmscon@b61e07a
The relevant part is in uxkb_dev_process(). Doesn't look too bad to me, and works nicely as far as my testing goes.
There's still stuff I ignored (like modifiers in Compose files -- not sure if it's even worth a look) and non-UTF-8 Compose files (which would require parsing a bunch more libX11 files and messing with locales and iconv..).
from libxkbcommon.
Some comments:
- I kinda feel bad for having done nothing on it even though I promised.. Sorry!
- Your example looks wrong: "<`> key (the dead key) and then may produce the symbol ú" shouldn't this be <´>?
- Why is this picked by the locale? I know, X11 legacy.. but this seems weird to me. Shouldn't it be part of the RMLVO? Anyhow, I guess we cannot change it.
- I dislike the *_get_one_sym() thing. We have the multiple-keysym API and I tried to use it consistently. No-one else does, I know.. but I do! For a proof-of-concept your patch seems fine, though.
- https://bugs.freedesktop.org/show_bug.cgi?id=67167 << ugh? That's annoying.. but I'd prefer if we do that in kmscon instead of falling back to get_one_sym()..
- Can you add xkb_compose_state_get_syms()?
- What does xkb_compose_get_utf8() do? Is this because "Compose" files define UTF8 output? And you just try to map it to keysyms internally? What happens if the UTF8 output uses combining-characters? What is the maximum size to pass to this function? If there's no maximum, I'd prefer it if it returns an allocated zero-terminated buffer instead. Or use a state-internal buffer that is overwritten on each call to get_utf8()..
Otherwise the API looks really nice! It should be very easy to integrate into input-methods (if they don't want to do that themselves..) and it also provides an easy way for people that don't want input-methods.
Thanks a lot Ran!
David
from libxkbcommon.
(replying from my mail client, hope it comes out fine...)
On Thu, Feb 13, 2014 at 03:08:26AM -0800, David Herrmann wrote:
> Some comments:
> - I kinda feel bad for having done nothing on it even though I promised.. Sorry!
> - Your example looks wrong: "<`> key (the dead key) and then may produce the symbol ú" shouldn't this be <´>?
Unless I'm misunderstanding your question, I think it's fine. Here's the sequence:
<dead_acute> <u> : "ú" uacute # LATIN SMALL LETTER U WITH ACUTE
And the us(intl) keymap has this:
key <TLDE> { [dead_grave, dead_tilde, grave, asciitilde ] };
> - Why is this picked by the locale? I know, X11 legacy.. but this seems weird to me. Shouldn't it be part of the RMLVO? Anyhow, I guess we cannot change it.
Since the locale is baked into the file search procedure, and into the file format itself (with %L include-statement expansion), I must use it. Since I didn't want to do setlocale() or other tricks from within the library, I made it an explicit parameter to the functions. If you don't want to setlocale(), you can do what I did in the kmscon patch.
But even though no one likes locales, I think just using the name is convenient, it provides a reasonable default without needing explicit configuration (and most map to en_US.UTF-8 anyway). I would actually not have considered it too unreasonable if the RMLVO used it as well...
If you want, you can add a configuration option, so the user can explicitly choose which file to use. For normal applications, you wouldn't need to, since ~/.XCompose takes priority. But I'm not sure if $HOME is relevant for kmscon.
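A minimal sketch of that lookup order, using only what was said in this thread: the user's ~/.XCompose wins, otherwise the locale's file under /usr/share/X11/locale/ is used. The function `find_compose_file` and its parameters are hypothetical. The sketch also assumes the locale name maps directly to a directory name, which skips whatever aliasing libX11 does, and it takes the locale as an explicit argument instead of calling setlocale().

```python
import os

# Directory mentioned earlier in this thread; systems may differ.
X11_LOCALE_DIR = "/usr/share/X11/locale"


def find_compose_file(locale, home=None, exists=os.path.exists):
    """Return the first existing Compose file path, or None.

    `exists` is injectable so the lookup order can be tried without
    touching the real filesystem.
    """
    candidates = []
    if home:
        # The user's own file takes priority over the locale default.
        candidates.append(f"{home}/.XCompose")
    candidates.append(f"{X11_LOCALE_DIR}/{locale}/Compose")
    for path in candidates:
        if exists(path):
            return path
    return None
```

A program like kmscon, for which $HOME may not be meaningful, could simply pass home=None and rely on the locale default or on an explicit configuration value.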
> - I dislike the *_get_one_sym() thing. We have the multiple-keysym API and I tried to use it consistently. No-one else does, I know.. but I do! For a proof-of-concept your patch seems fine, though.
Hmm, OK, so what we have now is:
single key -> single keysym (basic keymap)
single key -> multiple keysyms (extended xkbcommon keymaps)
multiple keys -> single keysym (basic compose)
What you're proposing is:
multiple keys -> multiple keysyms (extended xkbcommon compose)
That's very flexible :)
I suppose it certainly makes sense to have combining characters in Compose files, i.e., non-precomposed (sorry) unicode characters. Also the format naturally extends itself for that (just allow multiple keysyms on the right-hand side).
But:
- I don't think it will ever be used.
- There's already the utf8 thing with partly overlapping functionality.
- Complicates the API, most people use the get_one_sym() variant or otherwise handle just the single-keysym case.
- If we support it generically the trie will consume more memory.
So my feeling is it's not worth it, and can be added later if we want it. But I can add a get_syms() which always returns 1 or 0 for consistency, that makes sense. What do you think?
> - https://bugs.freedesktop.org/show_bug.cgi?id=67167 << ugh? That's annoying.. but I'd prefer if we do that in kmscon instead of falling back to get_one_sym()..
The RFC period on that one has ended I'm afraid :)
I still plan to fix the few keymaps which need it, so this becomes entirely a non-issue. For kmscon I would just ignore it, but it fitted nicely with the compose stuff, so I didn't see a reason not to. But you can do whatever you feel like here.
> - Can you add xkb_compose_state_get_syms()?
> - What does xkb_compose_get_utf8() do? Is this because "Compose" files define UTF8 output? And you just try to map it to keysyms internally? What happens if the UTF8 output uses combining-characters? What is the maximum size to pass to this function? If there's no maximum, I'd prefer it if it returns an allocated zero-terminated buffer instead. Or use a state-internal buffer that is overwritten on each call to get_utf8()..
Yes, each sequence can result in either a string, a keysym, or both. If there isn't a string I return the keysym's utf8 representation, but most of the time they have both. See Xutf8LookupString(3) for what Xlib does there. (There's no string -> keysym mapping though).
And yes, the string can be arbitrary; try adding this to your ~/.XCompose, run xterm and type 1:
<1> : "hello"
I'll fix the function some way.
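The string/keysym fallback described above can be sketched like this. Both function names are made up for the example, and the keysym conversion is deliberately partial: it only covers Latin-1 keysyms (which equal their code point) and direct Unicode keysyms (0x01000000 + code point), not the full legacy keysym tables.

```python
def keysym_to_utf8(keysym):
    """Partial keysym -> text conversion (Latin-1 and Unicode keysyms only)."""
    if 0x20 <= keysym <= 0x7E or 0xA0 <= keysym <= 0xFF:
        return chr(keysym)  # Latin-1 keysyms match their code point
    if 0x01000100 <= keysym <= 0x0110FFFF:
        return chr(keysym - 0x01000000)  # direct Unicode keysym
    return ""  # anything else is out of scope for this sketch


def compose_result_utf8(string, keysym):
    """Prefer the rule's string; fall back to the keysym's text form."""
    if string is not None:
        return string
    if keysym is not None:
        return keysym_to_utf8(keysym)
    return ""
```

Since the string is arbitrary (as with the "hello" rule), the result has no fixed upper length, which is why a caller-supplied fixed-size buffer is awkward here.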
> Otherwise the API looks really nice! It should be very easy to integrate into input-methods (if they don't want to do that themselves..) and it also provides an easy way for people that don't want input-methods.
Nah, don't think IMs would want to use this, they have their own. Maybe some lightweight fallback one (and they'd need more API for sure).
Thanks for the comments!
Ran
from libxkbcommon.
Status update: the xkbcommon-compose implementation is complete, I've updated the https://github.com/xkbcommon/libxkbcommon/commits/compose branch.
Next I'll send for comments, and try to split the Compose files from libX11 to a separate repo.
from libxkbcommon.
This looks really cool! I will play around with it later and report back if anything goes unexpected.
Regarding the compose files: Why not include them in xkeyboard-config? It's not really a keyboard configuration, but the compose files are pretty useless if used without keyboards.. so kinda co-dependent.
from libxkbcommon.
Both xkeyboard-config and "xlocale-config" can conceivably be useful independently, so I wouldn't call them co-dependent. Also, the compose files must drag along some Xlib locale nonsense (and Xlib will depend on it). So I don't see either the Xlib or xkeyboard-config maintainers going for that. Path of least resistance...
from libxkbcommon.
The branch has been merged now; should be a part of v0.5.0.
from libxkbcommon.