alacritty / vte
Home Page: https://docs.rs/vte/
License: Apache License 2.0
Parser for virtual terminal emulators
Would it be possible to use vte to read user input (keyboard / mouse) in raw mode?
I am doing experiments here.
And it appears that some key sequences are not reported like:
Thanks.
Sorry, I was searching issues related with sixel with my phone and I don't know how a new one got created instead.
For clap and related applications, I've been looking into ANSI parsers. None quite meet my needs, so I was looking at writing my own (or forking vte and turning it into what I want), but I figured I'd reach out first in case there is a way to make this work within vte and it's of interest.
vte seems like it is optimized for developers for use in alacritty. Performance would be most noticeable in highly interactive applications, which have a high escape code to text ratio. Since every control code needs to be processed to render correctly for alacritty, vte cares more about the control code side of printable control codes like \n. As the build times and binary size are likely a drop in the bucket for alacritty, I'm assuming they haven't been optimized.
My priorities are the opposite of the above. My applications are static and likely to have a low escape code to text ratio, and I would want to treat all printable control codes as text. Technically, this can all be handled with vte's design, but the char-by-char processing won't be optimal. Instead I'd want to deal with slices of text. clap is also a heavily used crate and there is a lot of interest in managing its build times and binary size. I could likely get away without the proc macro generated state tables and replace them with a function built on match expressions, with little performance hit but massive compile time and binary size improvements.
What I'm trying to decide is how much to contribute to vte so it handles both cases, or whether it's better to go my own route. Thoughts?
(Sorry, I wasn't sure if there was a better medium to reach out.)
I am trying to use vte to parse APC commands, which it seems to ignore by setting the state to SosPmApcString and then Anywhere through the duration of the command. Would it be possible to adapt the Parser interface to support parsing these commands?
When feeding Крас\x1bниво, only Крас is passed to input, while other terminals (vte, xterm) output Красиво. On the other hand, Hello\x1bworld results in input receiving Hello and orld, with only the w consumed.
Running tmux | target/debug/examples/parselog always outputs simply the following:
[print] '['
[print] 'e'
[print] 'x'
[print] 'i'
[print] 't'
[print] 'e'
[print] 'd'
[print] ']'
[execute] 0a
All actions inside tmux are ignored. Maybe this is actually expected behavior and I just don't understand how tmux interacts with the terminal.
From what I can find, escape sequence parameters are "optional numeric values separated by semicolons".
As the values are separated by semicolons, it makes sense that if a sequence ends with a semicolon, the terminal should expect another parameter. Because the parameter is an optional number, it should default to 0.
For example, the code \e[4;m should have the same effect as \e[4;0m, that is, none at all.
I tested this specific sequence in kitty, termite, xterm, st and alacritty, and only alacritty draws the following text underlined.
It should be a simple fix and I will include it in my PR for #22
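To make the expected behavior concrete, here is a minimal sketch of the rule described above: split the parameter string on semicolons and treat an empty field as 0. The function name parse_params is hypothetical and this is not vte's actual parsing code, only an illustration of the intended result.

```rust
/// Split the parameter portion of a CSI sequence on `;`.
/// An empty field (such as the one after a trailing `;`) defaults to 0.
/// Note: this sketch also maps any non-numeric or overflowing field to 0.
fn parse_params(raw: &str) -> Vec<u16> {
    raw.split(';')
        .map(|field| field.parse::<u16>().unwrap_or(0))
        .collect()
}

fn main() {
    // `\e[4;m` carries the parameter string "4;",
    // `\e[4;0m` carries "4;0". Both should yield the same parameters.
    println!("{:?}", parse_params("4;"));
    println!("{:?}", parse_params("4;0"));
}
```

With this rule both sequences produce the same parameter list, so the underline set by 4 is immediately reset by the trailing 0.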
The utf8parse package could use a test validating the implementation. My thought about how to do this is to find some complicated UTF-8 file and compare the results of parsing it with this package's parser against the standard library's std::str::from_utf8().
Implement a utf8parse::Receiver for something wrapping a String. Read the UTF-8 test file into a buffer. For each byte in the buffer, advance() the utf8parse::Parser. After running all bytes through the parser, let the string be the actual value. Now, call String::from_utf8() on the test file buffer; let that be the expected value. Assert that those values are equal.
Hi!
What do I do with testes\demo.vte? How do I know I didn't break anything?
Found with afl.rs.
Stacktrace:
thread 'pty reader' panicked at 'index out of bounds: the len is 16 but the index is 16', /home/tibor/.cargo/registry/src/github.com-1ecc6299db9ec823/vte-0.7.0/src/lib.rs:205:17
stack backtrace:
...
11: rust_begin_unwind
at src/libstd/panicking.rs:375
12: core::panicking::panic_fmt
at src/libcore/panicking.rs:84
13: core::panicking::panic_bounds_check
at src/libcore/panicking.rs:62
14: vte::Parser::perform_action
at /home/tibor/.cargo/registry/src/github.com-1ecc6299db9ec823/vte-0.7.0/src/lib.rs:205
15: vte::Parser::perform_state_change
at /home/tibor/.cargo/registry/src/github.com-1ecc6299db9ec823/vte-0.7.0/src/lib.rs:151
16: vte::Parser::advance
at /home/tibor/.cargo/registry/src/github.com-1ecc6299db9ec823/vte-0.7.0/src/lib.rs:128
17: alacritty_terminal::ansi::Processor::advance
at ./alacritty_terminal/src/ansi.rs:141
18: alacritty_terminal::event_loop::EventLoop<T,U>::pty_read
at ./alacritty_terminal/src/event_loop.rs:245
19: alacritty_terminal::event_loop::EventLoop<T,U>::spawn::{{closure}}
at ./alacritty_terminal/src/event_loop.rs:364
Test case:
Unminimized test case: test.vte.zip
The input I'm working with is already a sequence of chars containing Unicode codepoints. Converting these into a UTF-8 byte stream so that they can then be converted right back into codepoints seems wasteful. Could we have an alternate API which takes a char directly, and skips over the UTF-8 encoding stage?
This type is used in set_mouse_cursor_icon. Since downstream users need to implement the trait, they need access to the type, but we don't provide a way to get it from this crate, so they have to add a crate to their Cargo.toml.
On macOS, the F1 key is reported as the ESC-O-P ([27, 79, 80]) sequence.
But vte reports two events:
vte % cargo run --example parselog
^[OP
[esc_dispatch] intermediates=[], ignore=false, byte=4f
[print] 'P'
Could vte be extended to report a single event?
Thanks.
https://en.wikipedia.org/wiki/C0_and_C1_control_codes#C1_control_codes_for_general_use
Next character invokes a graphic character from the G2 or G3 graphic sets respectively.
alacritty 0.4.1
It seems parameter(s) are not reset at the end of a DCS. Some examples:
"hello" should be colored red, but isn't:
echo -e '\eP1X\e\\\e[31mhello\e[m'
"hello" should not be colored, but is:
echo -e '\eP31X\e\\\e[mhello\e[m'
"hello" should not be bold, but is:
echo -e '\eP1X\e\\\e[mhello\e[m'
All the above works in xterm and urxvt.
For a real-world example, see https://codeberg.org/dnkl/page, it uses the (not yet standardized) BSU/ESU DCS sequences and it is completely broken in Alacritty (do cat <large-file> | page and scroll).
Trying to build with rustc 1.50.0-nightly (c919f490b 2020-11-17), I get the following error:
Checking vte v0.9.0 (/tmp/vte)
error[E0658]: mutable references are not allowed in constant functions
--> src/table.rs:9:1
|
9 | / generate_state_changes!(state_changes, {
10 | | Anywhere {
11 | | 0x18 => (Ground, Execute),
12 | | 0x1a => (Ground, Execute),
... |
170 | | }
171 | | });
| |___^
|
= note: see issue #57349 <https://github.com/rust-lang/rust/issues/57349> for more information
= help: add `#![feature(const_mut_refs)]` to the crate attributes to enable
= note: this error originates in a macro (in Nightly builds, run with -Z macro-backtrace for more info)
error[E0015]: calls in constant functions are limited to constant functions, tuple structs and tuple variants
--> src/table.rs:9:1
|
9 | / generate_state_changes!(state_changes, {
10 | | Anywhere {
11 | | 0x18 => (Ground, Execute),
12 | | 0x1a => (Ground, Execute),
... |
170 | | }
171 | | });
| |___^
|
= note: this error originates in a macro (in Nightly builds, run with -Z macro-backtrace for more info)
error: aborting due to 2 previous errors
Some errors have detailed explanations: E0015, E0658.
For more information about an error, try `rustc --explain E0015`.
error: could not compile `vte`
To learn more, run the command again with --verbose.
This also breaks the alacritty-git build for me.
I've been doing some fuzzing of vte, and discovered that the input b"\x1bP;;;:::::;;:::::;;;;;;;;;;;;;0;;::;;p\x1b" causes ParamsIter to loop forever.
I'm trying to get my head around the escape code format enough to troubleshoot more usefully, but maybe you can figure out what's going on before I can.
I'm looking at using vte for a use case where I want to translate invalid UTF-8 bytes into Unicode replacement characters, however vte seems to silently swallow some invalid UTF-8 bytes. For example, if I feed it input consisting of the byte 0x90, it produces no events.
Would it make sense to add Execute rules to the Ground table for 0x90 and other formerly special C1 codes?
Would it make sense to introduce something like an InvalidUtf8 action, to fill in the Ground table in general?
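For reference, the outcome being asked for matches what the standard library's lossy decoding does with the same input. A small sketch using only std (the helper name to_lossy is hypothetical, and this says nothing about how vte would implement it):

```rust
/// Decode bytes as UTF-8, replacing each invalid sequence with U+FFFD.
/// This uses the standard library only; it is not vte's API.
fn to_lossy(bytes: &[u8]) -> String {
    String::from_utf8_lossy(bytes).into_owned()
}

fn main() {
    // 0x90 is a lone continuation byte, so it is invalid on its own.
    let decoded = to_lossy(&[0x90]);
    assert_eq!(decoded, "\u{FFFD}");

    // Valid text around the bad byte is preserved.
    let mixed = to_lossy(b"a\x90b");
    assert_eq!(mixed, "a\u{FFFD}b");
    println!("{} {}", decoded, mixed);
}
```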
Hi, I was eager to benchmark your table-based utf8 parsing approach against the standard library implementation, so I did:
https://github.com/ConnyOnny/utf8perf
If my testing setup is not wrong (see main.rs) it seems branching is not everything.
How it works is described at https://invisible-island.net/xterm/ctlseqs/ctlseqs.html .
This is used to query terminfo features without relying on the actual files. It's supported by notcurses, most modern terminals, etc. It could be useful when you ssh into a system that doesn't have the terminfo entry for your TERM, and it could help terminals whose terminfo isn't widely distributed to work with at least modern toolkits.
The trait interface limits the error handling considerably when doing any advanced work in the callbacks. E.g. when doing file I/O the errors cannot be easily returned without employing temporary state.
Exposing an enum of the states (that would match the input parameters to the trait's functions) that can be accessed via a similar function to advance would allow this alternative use case.
This does increase the API surface area. However, the implementation can be reworked so that internally it uses the enum and only at the API surface is it decided whether to return that value directly vs calling the appropriate function.
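A rough sketch of what such an enum-returning interface could look like from the caller's side. Everything here is hypothetical (the Event enum, classify, and log_events are not part of vte), and the byte classification is a toy that only distinguishes C0 controls from printable ASCII rather than implementing the real state machine. The point is only that the caller can propagate I/O errors with `?` instead of stashing them in temporary state.

```rust
use std::io::{self, Write};

/// Hypothetical event type mirroring two of the `Perform` callbacks.
#[derive(Debug, PartialEq)]
enum Event {
    Print(char),
    Execute(u8),
}

/// Toy stand-in for an enum-returning `advance`: classifies a single byte.
/// A real parser would keep state across bytes; this one does not.
fn classify(byte: u8) -> Option<Event> {
    match byte {
        0x00..=0x1f => Some(Event::Execute(byte)),
        0x20..=0x7e => Some(Event::Print(byte as char)),
        _ => None,
    }
}

/// Because events are returned rather than delivered through callbacks,
/// write errors can be returned directly to the caller.
fn log_events<W: Write>(input: &[u8], out: &mut W) -> io::Result<()> {
    for &byte in input {
        if let Some(event) = classify(byte) {
            writeln!(out, "{:?}", event)?;
        }
    }
    Ok(())
}

fn main() -> io::Result<()> {
    log_events(b"hi\n", &mut io::stdout())
}
```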
I was looking into implementing styled underlines into alacritty, like what you might see in kitty or gnome vte based terminals - that is normal underline, double underline, curly underline (undercurl), and have it have a different color than the foreground.
Looking at the kitty implementation (spec), the terminal emulator has to handle not only semicolon-separated parameters (\e[0;4m), but also colon-separated parameters (\e[4:3m).
I'm not an expert on the standard defining how those sequences should be handled, so I was going by kitty's source code. What's being done in their codebase is not applicable to this repo, because they have the entire sequence buffered up until the command byte ('m' in this case) and then parse it, dispatching every semicolon-separated parameter as its own sequence, with special cases for color parameters. In that way, a command like \e[0;4;58:2::186:93:0m would be split into three function calls: handle 0m, handle 4m, handle 58:2::186:93:0m.
This could be implemented in the vte crate (I think) by adding a dimension to the params array, and multiple calls to the Performer implementor. Not sure what the performance cost would be though.
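To illustrate the "extra dimension" idea, here is a minimal sketch that parses a buffered parameter string into a list of parameters, each with its own list of colon-separated subparameters. The function name parse_subparams is hypothetical, empty fields are assumed to default to 0, and a real implementation in vte would work byte by byte with fixed-size storage rather than on a buffered string.

```rust
/// Parse a CSI parameter string into parameters (split on `;`),
/// each holding its subparameters (split on `:`).
/// Empty or non-numeric fields default to 0 in this sketch.
fn parse_subparams(raw: &str) -> Vec<Vec<u16>> {
    raw.split(';')
        .map(|param| {
            param
                .split(':')
                .map(|sub| sub.parse::<u16>().unwrap_or(0))
                .collect::<Vec<u16>>()
        })
        .collect()
}

fn main() {
    // Parameter string of `\e[0;4;58:2::186:93:0m`.
    let params = parse_subparams("0;4;58:2::186:93:0");
    assert_eq!(params.len(), 3);
    for group in &params {
        println!("{:?}", group);
    }
}
```

Each inner list corresponds to one of the three "handle" calls in the example above.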
On crates.io, the crate doesn't contain the license files, would you please add them?
Thanks.
Hello all! Back in #111 symlinks got made to include the license files in the crate, but the latest released version 0.1.1 doesn't include them, which can be seen on https://docs.rs/crate/vte_generate_state_changes/latest/source/. Maybe an old version of cargo was used to make the package that didn't include them? I checked locally and using the latest version works:
% cd vte_generate_state_changes
% cargo package
...
% ls ../target/package/vte_generate_state_changes-0.1.1
Cargo.toml Cargo.toml.orig LICENSE-APACHE LICENSE-MIT src target
So maybe all that needs to happen is to bump the version and regenerate the package?
It's used to report whether mode is supported/set, etc.
This is used by sync updates as a way to detect the feature support. See https://gist.github.com/christianparpart/d8a62cc1ab659194337d73e399004036?permalink_comment_id=3946967#feature-detection
The mode could still be used without it, but it's probably worth adding support for this escape in general.
A concerned citizen, although apparently not concerned enough to use github, has noticed something that concerns them.
Some binaries, named vim10m_match and vim10m_table, have made it into the vte 0.3.3 release on crates.io, visible at https://docs.rs/crate/vte/0.3.3/source/
These binaries then get vendor'd into the actual rustc source downloads.
Could you please do a release which.. doesn't have random binaries, and other unpublished code, in?
In the current state, a control sequence that is supplied more than the maximum number of parameters weirdly smashes the trailing ones into a big number, behaving as if the parameter separators didn't exist. For example, the control sequence \e[38;2;0;255;0;48;2;255;0;0;1;2;3;4;5;6;0mHello gets parsed as [38, 2, 0, 255, 0, 48, 2, 255, 0, 0, 1, 2, 3, 4, 5, 60] - notice the 60 at the end.
AFAIK, the standard doesn't define any maximum number of parameters. Since I couldn't find any canonical way to handle the excess parameters, this is the behavior of other terminal emulators (data collected mostly by trying it on my machine and browsing code):
(Couldn't figure out what xterm does, seems to me like it just craps its pants and doesn't give a damn)
I'm inclined to implement the first option, but I don't have a strong opinion on it. The current way just doesn't seem "correct" in any shape or form and the VTE crate is the only one where I've seen this behavior so far.
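The reported behavior can be reproduced with a small model. The sketch below assumes a cap of 16 parameters (inferred from the 16 entries in the output above) and uses hypothetical function names. parse_smashing mimics the bug by ignoring separators once the list is full; parse_truncating shows one possible alternative, dropping the excess, which is not necessarily the option preferred above.

```rust
/// Assumed cap, inferred from the 16-entry output in the report.
const MAX_PARAMS: usize = 16;

/// Models the reported behavior: once the last slot is reached, `;` no
/// longer starts a new parameter, so later digits pile onto the last one.
fn parse_smashing(raw: &str) -> Vec<u16> {
    let mut params = Vec::new();
    let mut current: u16 = 0;
    for byte in raw.bytes() {
        match byte {
            b'0'..=b'9' => {
                let digit = (byte - b'0') as u16;
                current = current.saturating_mul(10).saturating_add(digit);
            }
            // Only honor the separator while there is room for another slot.
            b';' if params.len() < MAX_PARAMS - 1 => {
                params.push(current);
                current = 0;
            }
            _ => {}
        }
    }
    params.push(current);
    params
}

/// One possible alternative: keep the first MAX_PARAMS and drop the rest.
fn parse_truncating(raw: &str) -> Vec<u16> {
    raw.split(';')
        .take(MAX_PARAMS)
        .map(|field| field.parse::<u16>().unwrap_or(0))
        .collect()
}

fn main() {
    let raw = "38;2;0;255;0;48;2;255;0;0;1;2;3;4;5;6;0";
    println!("{:?}", parse_smashing(raw));
    println!("{:?}", parse_truncating(raw));
}
```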
I am writing an OS and I was in need of an ANSI parser. I started to write my own, and then was pointed at your excellent crate. Thank you!
I note that const MAX_OSC_RAW: usize = 1024 and that is a little large for me. Would you accept a PR to make it a const-generic? Something like:
/// Parser for raw _VTE_ protocol which delegates actions to a [`Perform`]
///
/// [`Perform`]: trait.Perform.html
#[derive(Default)]
pub struct Parser<const OSC_RAW_SIZE: usize = MAX_OSC_RAW> {
    state: State,
    intermediates: [u8; MAX_INTERMEDIATES],
    intermediate_idx: usize,
    params: Params,
    param: u16,
    #[cfg(feature = "no_std")]
    osc_raw: ArrayVec<u8, OSC_RAW_SIZE>,
    #[cfg(not(feature = "no_std"))]
    osc_raw: Vec<u8>,
    osc_params: [(usize, usize); MAX_OSC_PARAMS],
    osc_num_params: usize,
    ignoring: bool,
    utf8_parser: utf8::Parser,
}
The const-generic parameter will be there even in no-std mode, but as it has a default, I don't think it'll matter?
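As a standalone illustration of how a defaulted const-generic parameter behaves for callers, here is a minimal sketch with a hypothetical fixed-capacity buffer type OscBuf standing in for the parser's OSC storage. It is not the proposed Parser itself and assumes a compiler with const-generic defaults (Rust 1.59 or later).

```rust
const MAX_OSC_RAW: usize = 1024;

/// Hypothetical fixed-capacity byte buffer; N defaults to MAX_OSC_RAW.
struct OscBuf<const N: usize = MAX_OSC_RAW> {
    data: [u8; N],
    len: usize,
}

impl<const N: usize> OscBuf<N> {
    fn new() -> Self {
        Self { data: [0; N], len: 0 }
    }

    /// Append a byte; returns false (and drops the byte) when full.
    fn push(&mut self, byte: u8) -> bool {
        if self.len == N {
            return false;
        }
        self.data[self.len] = byte;
        self.len += 1;
        true
    }

    fn as_slice(&self) -> &[u8] {
        &self.data[..self.len]
    }

    fn capacity(&self) -> usize {
        N
    }
}

fn main() {
    // Existing callers that write the bare type name get the default size.
    let default_buf: OscBuf = OscBuf::new();
    assert_eq!(default_buf.capacity(), 1024);

    // Callers with tight memory budgets can pick a smaller size.
    let mut small = OscBuf::<4>::new();
    for &byte in b"hello" {
        small.push(byte);
    }
    assert_eq!(small.as_slice(), &b"hell"[..]);
}
```

Code that names the type without a parameter keeps compiling against the default, which is why the extra parameter should be mostly invisible to existing users.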
my understanding is that the introducer for a device control string should follow the same syntax as a csi sequence, which means that the final character is used to determine the operation to perform. the hook method in the Perform trait doesn't receive the final byte though (like csi_dispatch does), so this doesn't appear to be possible.
(it's possible that i'm confused about something here - i don't really have a need for this, i just noticed it when trying to add some debug logs for the parsing process.)