alacritty / vte Goto Github PK


Parser for virtual terminal emulators

Home Page: https://docs.rs/vte/

License: Apache License 2.0

Rust 100.00%

rust vte terminal parser

vte's Issues

Read user input

Would it be possible to use vte to read user input (keyboard / mouse) in raw mode?
I am doing experiments here.
And it appears that some key sequences are not reported like:

Shift-Tab
Alt-Enter
Alt-Backspace

Thanks.

Sixel

Sorry, I was searching issues related with sixel with my phone and I don't know how a new one got created instead.

For clap and related applications, I've been looking into ANSI parsers. None quite meet my needs, so I was looking at writing my own (or forking vte and turning it into what I want), but I figured I'd reach out first in case there is a way to make this work within vte and it's of interest.

vte seems to be optimized for use in alacritty. Performance would be most noticeable in highly interactive applications, which have a high ratio of escape codes to text. Since alacritty needs to process every control code to render correctly, vte cares more about the control code side of printable control codes like \n. As the build times and binary size are likely a drop in the bucket for alacritty, I'm assuming they haven't been optimized.

My concerns are the opposite of the above. My applications are static and likely to have a low ratio of escape codes to text, and I would want to treat all printable control codes as text. Technically, this can all be handled with vte's design, but char-by-char processing won't be optimal. Instead, I'd want to deal with slices of text. clap is also a heavily used crate, and there is a lot of interest in managing its build times and binary size. I could likely get away without the proc-macro-generated state tables and replace them with a function built from match expressions, with little performance hit but massive compile time and binary size improvements.

What I'm trying to decide is how much to contribute to vte so that it handles both cases, or whether it's better to go my own route. Thoughts?

(Sorry, I wasn't sure if there was a better medium to reach out.)
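To illustrate the slice-based approach described above, here is a minimal sketch that splits input into borrowed text slices and CSI sequences using plain loops instead of generated state tables. The names `Segment` and `segments` are hypothetical and not part of vte; only `ESC [ ... final` sequences are recognized, and nothing is validated.

```rust
/// A piece of the input: either plain text or a complete CSI escape sequence.
#[derive(Debug, PartialEq)]
pub enum Segment<'a> {
    Text(&'a str),
    Csi(&'a str),
}

/// Splits `input` into borrowed slices. Only `ESC [ ... final` (CSI)
/// sequences are recognized; everything else, including `\n`, stays text.
/// Hypothetical helper for illustration, not a vte API.
pub fn segments(input: &str) -> Vec<Segment<'_>> {
    let bytes = input.as_bytes();
    let mut out = Vec::new();
    let mut text_start = 0;
    let mut i = 0;
    while i < bytes.len() {
        if bytes[i] == 0x1b && bytes.get(i + 1) == Some(&b'[') {
            // Scan for the final byte (0x40..=0x7e) that ends the sequence.
            let mut end = i + 2;
            while end < bytes.len() && !(0x40..=0x7e).contains(&bytes[end]) {
                end += 1;
            }
            if end < bytes.len() {
                // Flush any text that preceded the escape sequence.
                if text_start < i {
                    out.push(Segment::Text(&input[text_start..i]));
                }
                out.push(Segment::Csi(&input[i..=end]));
                i = end + 1;
                text_start = i;
                continue;
            }
        }
        i += 1;
    }
    // Whatever remains (including an unterminated sequence) is kept as text.
    if text_start < bytes.len() {
        out.push(Segment::Text(&input[text_start..]));
    }
    out
}
```

With mostly static output, the bulk of the input ends up in a few `Text` slices, so the per-character cost is limited to the scan for `ESC`.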

Support for APC?

I am trying to use vte to parse APC commands, which it seems to ignore by setting the state to SosPmApcString and then Anywhere through the duration of the command. Would it be possible to adapt the Parser interface to support parsing these commands?
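For reference, a rough sketch of what extracting APC payloads could look like outside of vte. It assumes the 7-bit forms only: `ESC _` as the introducer and `ESC \` as the string terminator. The function name `apc_payloads` is hypothetical, and this is not how vte's state machine works internally.

```rust
/// Returns the payload of every complete `ESC _ ... ESC \` (APC) string
/// found in `input`. Hypothetical helper; the 8-bit C1 forms are ignored.
pub fn apc_payloads(input: &[u8]) -> Vec<&[u8]> {
    const ESC: u8 = 0x1b;
    let mut payloads = Vec::new();
    let mut i = 0;
    while i + 1 < input.len() {
        if input[i] == ESC && input[i + 1] == b'_' {
            let start = i + 2;
            // Look for the string terminator `ESC \` after the introducer.
            let terminator = input[start..]
                .windows(2)
                .position(|w| w[0] == ESC && w[1] == b'\\');
            match terminator {
                Some(len) => {
                    payloads.push(&input[start..start + len]);
                    // Continue scanning after the terminator.
                    i = start + len + 2;
                    continue;
                }
                // Unterminated APC string: stop and report nothing for it.
                None => break,
            }
        }
        i += 1;
    }
    payloads
}
```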

Different processing of cyrillic text compared to ascii

With the input Крас\x1bниво, only Крас is processed, while other terminals (vte, xterm) output Красиво. On the other hand, Hello\x1bworld is processed as Hello and orld, consuming only the w.

Parselog for tmux doesn't work as expected

Running tmux | target/debug/examples/parselog always outputs simply the following:

[print] '['
[print] 'e'
[print] 'x'
[print] 'i'
[print] 't'
[print] 'e'
[print] 'd'
[print] ']'
[execute] 0a

All actions inside tmux are ignored. Maybe this is actually expected behavior and I just don't understand how tmux interacts with the terminal.

Interpret a trailing semicolon as an implicit '0' parameter

From what I can find, escape sequence parameters are "optional numeric values separated by semicolons".

As the values are separated by semicolons, it makes sense that if a sequence ends with a semicolon, the terminal should expect another parameter. Because the parameter is an optional number, it should default to 0.

For example, the code \e[4;m should have the same effect as \e[4;0m, that is, none at all.

I tested this specific sequence in kitty, termite, xterm, st and alacritty, and only alacritty draws the following text underlined.

It should be a simple fix and I will include it in my PR for #22
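A minimal sketch of the proposed rule, assuming the parameter text between `ESC [` and the final byte is already isolated. `parse_params` is a hypothetical helper, not vte's implementation; note that it also maps unparsable or overflowing fields to 0.

```rust
/// Parses semicolon-separated CSI parameters. Empty fields, including the
/// one implied by a trailing semicolon, default to 0.
/// Hypothetical helper for illustration only.
pub fn parse_params(raw: &str) -> Vec<u16> {
    if raw.is_empty() {
        // No parameter text at all: report no parameters.
        return Vec::new();
    }
    raw.split(';')
        // An empty or invalid field falls back to the default value 0.
        .map(|field| field.parse::<u16>().unwrap_or(0))
        .collect()
}
```

Under this rule the parameter text `4;` and `4;0` yield the same list, so \e[4;m and \e[4;0m behave identically.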

Add test for UTF-8 Parsing

The utf8parse package could use a test validating the implementation. My thought about how to do this is to find some complicated UTF-8 file and compare results of parsing it with this package's parser and the standard library's std::str::from_utf8().

Approach

Implement a utf8parse::Receiver for something wrapping a String.
Read the UTF-8 test file into a buffer.
For each byte in the buffer, advance() the utf8parse::Parser.
After running all bytes through the parser, let the string be the actual value.
Now, call String::from_utf8() on the test file buffer; let that be the expected value.
Assert that those values are equal.
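The approach can be sketched as follows. Everything here is a stand-in so that the example is self-contained: the `Receiver` trait is a local assumption rather than the crate's definition, and `ToyParser` is a hypothetical byte-at-a-time decoder that only aims to handle valid input. In a real test, the parser under test would take its place.

```rust
/// Local stand-in for the callback trait (an assumption for this sketch).
pub trait Receiver {
    fn codepoint(&mut self, c: char);
    fn invalid_sequence(&mut self);
}

/// Receiver that collects decoded characters into a `String`.
#[derive(Default)]
pub struct StringWrapper(pub String);

impl Receiver for StringWrapper {
    fn codepoint(&mut self, c: char) {
        self.0.push(c);
    }
    fn invalid_sequence(&mut self) {
        self.0.push('\u{FFFD}');
    }
}

/// Hypothetical incremental decoder; swap in the parser being tested.
#[derive(Default)]
pub struct ToyParser {
    code: u32,
    remaining: u8,
}

impl ToyParser {
    pub fn advance<R: Receiver>(&mut self, receiver: &mut R, byte: u8) {
        match (self.remaining, byte) {
            (0, 0x00..=0x7f) => receiver.codepoint(byte as char),
            // Lead bytes: remember the payload bits and how many bytes follow.
            (0, 0xc2..=0xdf) => { self.code = (byte & 0x1f) as u32; self.remaining = 1; }
            (0, 0xe0..=0xef) => { self.code = (byte & 0x0f) as u32; self.remaining = 2; }
            (0, 0xf0..=0xf4) => { self.code = (byte & 0x07) as u32; self.remaining = 3; }
            // Continuation byte: shift in six more bits.
            (n, 0x80..=0xbf) if n > 0 => {
                self.code = (self.code << 6) | (byte & 0x3f) as u32;
                self.remaining -= 1;
                if self.remaining == 0 {
                    match char::from_u32(self.code) {
                        Some(c) => receiver.codepoint(c),
                        None => receiver.invalid_sequence(),
                    }
                }
            }
            // Anything else is reported as invalid and the state is reset.
            _ => { self.remaining = 0; receiver.invalid_sequence(); }
        }
    }
}

/// Runs every byte through the parser and returns the "actual" value.
pub fn decode_with_parser(bytes: &[u8]) -> String {
    let mut parser = ToyParser::default();
    let mut receiver = StringWrapper::default();
    for &byte in bytes {
        parser.advance(&mut receiver, byte);
    }
    receiver.0
}
```

The test then compares `decode_with_parser(&buffer)` against `String::from_utf8(buffer)` for the same file contents.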

how do I run the tests?

Hi!
What do I do with tests\demo.vte? How do I know I didn't break anything?

Out of bounds panic in vte::Parser::perform_action

Found with afl.rs.

Stacktrace:

thread 'pty reader' panicked at 'index out of bounds: the len is 16 but the index is 16', /home/tibor/.cargo/registry/src/github.com-1ecc6299db9ec823/vte-0.7.0/src/lib.rs:205:17
stack backtrace:
  ...
  11: rust_begin_unwind
             at src/libstd/panicking.rs:375
  12: core::panicking::panic_fmt
             at src/libcore/panicking.rs:84
  13: core::panicking::panic_bounds_check
             at src/libcore/panicking.rs:62
  14: vte::Parser::perform_action
             at /home/tibor/.cargo/registry/src/github.com-1ecc6299db9ec823/vte-0.7.0/src/lib.rs:205
  15: vte::Parser::perform_state_change
             at /home/tibor/.cargo/registry/src/github.com-1ecc6299db9ec823/vte-0.7.0/src/lib.rs:151
  16: vte::Parser::advance
             at /home/tibor/.cargo/registry/src/github.com-1ecc6299db9ec823/vte-0.7.0/src/lib.rs:128
  17: alacritty_terminal::ansi::Processor::advance
             at ./alacritty_terminal/src/ansi.rs:141
  18: alacritty_terminal::event_loop::EventLoop<T,U>::pty_read
             at ./alacritty_terminal/src/event_loop.rs:245
  19: alacritty_terminal::event_loop::EventLoop<T,U>::spawn::{{closure}}
             at ./alacritty_terminal/src/event_loop.rs:364

Test case:
Unminimized test case: test.vte.zip

Support for direct Unicode input

The input I'm working with is already a sequence of chars containing Unicode codepoints. Converting these into a UTF-8 byte stream so that they can then be converted right back into codepoints seems wasteful. Could we have an alternate API which takes a char directly and skips the UTF-8 encoding stage?

Re-export `CursorIcon`

This type is used in set_mouse_cursor_icon. Since downstream users need to implement the trait, they need a way to import the type, but we don't re-export it, so they have to add the crate to their own Cargo.toml.

SS3 sequence

On macOS, the F1 key is reported as the ESC-O-P ([27, 79, 80]) sequence.
But vte reports two events:

vte % cargo run --example parselog
^[OP
[esc_dispatch] intermediates=[], ignore=false, byte=4f
[print] 'P'

Could vte be extended to report a single event?
Thanks.

https://en.wikipedia.org/wiki/C0_and_C1_control_codes#C1_control_codes_for_general_use

Next character invokes a graphic character from the G2 or G3 graphic sets respectively.
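As a workaround on the application side, the three bytes can be recognized as one event before they ever reach a parser. This is a minimal sketch under the assumption that the whole sequence arrives in one buffer; `Key` and `parse_ss3` are hypothetical names, and the F2 to F4 mappings follow the common xterm convention rather than anything stated in this issue.

```rust
/// Keys reported through `ESC O <final>` (SS3) sequences in this sketch.
#[derive(Debug, PartialEq)]
pub enum Key {
    F1,
    F2,
    F3,
    F4,
    /// Any other final byte, passed through untouched.
    Unknown(u8),
}

/// Decodes a leading `ESC O <final>` as a single event and returns the key
/// together with the number of bytes consumed. Hypothetical helper.
pub fn parse_ss3(input: &[u8]) -> Option<(Key, usize)> {
    match input {
        [0x1b, b'O', fin, ..] => {
            let key = match *fin {
                b'P' => Key::F1,
                b'Q' => Key::F2,
                b'R' => Key::F3,
                b'S' => Key::F4,
                other => Key::Unknown(other),
            };
            Some((key, 3))
        }
        // Not an SS3 sequence (or not enough bytes yet).
        _ => None,
    }
}
```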

DCS parameters not reset at unhook

alacritty 0.4.1

It seems parameter(s) are not reset at the end of a DCS. Some examples:

"hello" should be colored red, but isn't:

echo -e '\eP1X\e\\\e[31mhello\e[m'

"hello" should not be colored, but is:

echo -e '\eP31X\e\\\e[mhello\e[m'

"hello" should not be bold, but is:

echo -e '\eP1X\e\\\e[mhello\e[m'

All the above works in xterm and urxvt.

For a real-world example, see https://codeberg.org/dnkl/page, it uses the (not yet standardized) BSU/ESU DCS sequences and it is completely broken in Alacritty (do cat <large-file> | page and scroll).

build failure with nightly (mutable references are not allowed in constant functions)

Trying to build with rustc 1.50.0-nightly (c919f490b 2020-11-17), I get the following error:

    Checking vte v0.9.0 (/tmp/vte)
error[E0658]: mutable references are not allowed in constant functions
   --> src/table.rs:9:1
    |
9   | / generate_state_changes!(state_changes, {
10  | |     Anywhere {
11  | |         0x18 => (Ground, Execute),
12  | |         0x1a => (Ground, Execute),
...   |
170 | |     }
171 | | });
    | |___^
    |
    = note: see issue #57349 <https://github.com/rust-lang/rust/issues/57349> for more information
    = help: add `#![feature(const_mut_refs)]` to the crate attributes to enable
    = note: this error originates in a macro (in Nightly builds, run with -Z macro-backtrace for more info)

error[E0015]: calls in constant functions are limited to constant functions, tuple structs and tuple variants
   --> src/table.rs:9:1
    |
9   | / generate_state_changes!(state_changes, {
10  | |     Anywhere {
11  | |         0x18 => (Ground, Execute),
12  | |         0x1a => (Ground, Execute),
...   |
170 | |     }
171 | | });
    | |___^
    |
    = note: this error originates in a macro (in Nightly builds, run with -Z macro-backtrace for more info)

error: aborting due to 2 previous errors

Some errors have detailed explanations: E0015, E0658.
For more information about an error, try `rustc --explain E0015`.
error: could not compile `vte`

To learn more, run the command again with --verbose.

This also breaks the alacritty-git build for me.

Infinite loop in ParamsIter

I've been doing some fuzzing of vte, and discovered that the input b"\x1bP;;;:::::;;:::::;;;;;;;;;;;;;0;;::;;p\x1b" causes ParamsIter to loop forever.

I'm trying to get my head around the escape code format enough to troubleshoot more usefully, but maybe you can figure out what's going on before I can.

Handling invalid UTF-8 bytes

I'm looking at using vte for a use case where I want to translate invalid UTF-8 bytes into Unicode replacement characters; however, vte seems to silently swallow some invalid UTF-8 bytes. For example, if I feed it input consisting of the byte 0x90, it produces no events.

Would it make sense to add Execute rules to the Ground table for 0x90 and other formerly special C1 codes?

Would it make sense to introduce something like an InvalidUtf8 action, to fill in the Ground table in general?
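For comparison, this is the behavior being asked for, shown with the standard library rather than vte: `String::from_utf8_lossy` turns each invalid sequence into U+FFFD. The wrapper name `decode_lossy` is only for illustration.

```rust
/// Decodes `bytes` as UTF-8, replacing invalid sequences with U+FFFD.
/// Thin wrapper over the standard library, for illustration only.
pub fn decode_lossy(bytes: &[u8]) -> String {
    String::from_utf8_lossy(bytes).into_owned()
}
```

A lone 0x90 byte becomes a single replacement character here instead of disappearing.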

utf8 parsing performance

Hi, I was eager to benchmark your table-based utf8 parsing approach against the standard library implementation, so I did:
https://github.com/ConnyOnny/utf8perf

If my testing setup is not wrong (see main.rs) it seems branching is not everything.

Add support for XTGETTCAP

How it works is available at https://invisible-island.net/xterm/ctlseqs/ctlseqs.html .

This is used to query terminfo capabilities without relying on the actual terminfo files. It's supported by notcurses, most modern terminals, etc. It would be useful when you ssh into a system that lacks the terminfo entry for your TERM, and it could help terminals whose terminfo isn't widely distributed work with at least modern toolkits.

Provide enum interface for next parser action

The trait interface limits the error handling considerably when doing any advanced work in the callbacks. E.g. when doing file I/O the errors cannot be easily returned without employing temporary state.

Exposing an enum of the states (that would match the input parameters to the trait's functions) that can be accessed via a similar function to advance would allow this alternative use case.

This does increase the API surface area. However, the implementation can be reworked so that internally it uses the enum and only at the API surface is it decided whether to return that value directly vs calling the appropriate function.
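A rough sketch of the idea, with an ASCII-only toy classifier standing in for the real state machine. `Action`, `next_action`, and `copy_printable` are hypothetical names; the point is only that a returned value lets the caller use `?` for I/O errors without stashing them in temporary state.

```rust
use std::io::{self, Write};

/// Hypothetical enum mirroring two of the trait callbacks.
#[derive(Debug, PartialEq, Clone, Copy)]
pub enum Action {
    Print(char),
    Execute(u8),
}

/// Toy classifier: handles ASCII only and knows nothing about escape
/// sequences. A real version would wrap the parser's state machine.
pub fn next_action(byte: u8) -> Option<Action> {
    match byte {
        0x00..=0x1f => Some(Action::Execute(byte)),
        0x20..=0x7e => Some(Action::Print(byte as char)),
        _ => None,
    }
}

/// Writes every printable character to `out` and returns how many were
/// written. The caller drives the loop, so write errors propagate with `?`.
pub fn copy_printable<W: Write>(input: &[u8], out: &mut W) -> io::Result<usize> {
    let mut written = 0;
    for &byte in input {
        if let Some(Action::Print(c)) = next_action(byte) {
            write!(out, "{}", c)?;
            written += 1;
        }
    }
    Ok(written)
}
```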

Implement colon separated CSI parameters

I was looking into implementing styled underlines into alacritty, like what you might see in kitty or gnome vte based terminals - that is normal underline, double underline, curly underline (undercurl), and have it have a different color than the foreground.

Looking at the kitty implementation (spec), the terminal emulator has to handle not only semicolon separated parameters (\e[0;4m), but also colon separated parameters (\e[4:3m).

I'm not an expert on the standard defining how those sequences should be handled, so I was going by kitty's source code. What's being done in their codebase is not applicable to this repo, because they have the entire sequence buffered up until the command byte ('m' in this case) and then parse it, dispatching every semicolon separated parameter as its own sequence, with special cases for color parameters. In that way, a command like \e[0;4;58:2::186:93:0m would be split into three function calls: handle 0m, handle 4m, handle 58:2::186:93:0m.

This could be implemented in the vte crate (I think) by adding a dimension to the params array, and multiple calls to the Performer implementor. Not sure what the performance cost would be though.
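The extra dimension could look roughly like this: each semicolon-separated parameter becomes a list of colon-separated subparameters. This is a sketch on an already buffered parameter string, not a proposal for the streaming implementation; `parse_subparams` is a hypothetical name, and empty fields default to 0.

```rust
/// Splits CSI parameter text into parameters (separated by `;`), each
/// holding its subparameters (separated by `:`). Empty or invalid fields
/// become 0. Hypothetical helper for illustration only.
pub fn parse_subparams(raw: &str) -> Vec<Vec<u16>> {
    if raw.is_empty() {
        return Vec::new();
    }
    raw.split(';')
        .map(|param| {
            param
                .split(':')
                .map(|sub| sub.parse::<u16>().unwrap_or(0))
                .collect()
        })
        .collect()
}
```

For the parameter text `0;4;58:2::186:93:0` this produces three entries, matching the three "handle" calls described above.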

Please add the license files to the published utf8parse crate

On crates.io, the crate doesn't contain the license files, would you please add them?

Thanks.

Publish a new version of vte_generate_state_changes to include license files

Hello all! Back in #111 symlinks got made to include the license files in the crate, but the latest released version 0.1.1 doesn't include them, which can be seen on https://docs.rs/crate/vte_generate_state_changes/latest/source/. Maybe an old version of cargo was used to make the package that didn't include them? I checked locally and using the latest version works:

% cd vte_generate_state_changes
% cargo package
...
% ls ../target/package/vte_generate_state_changes-0.1.1
Cargo.toml  Cargo.toml.orig  LICENSE-APACHE  LICENSE-MIT  src  target

So maybe all that needs to happen is to bump the version and regenerate the package?

Add support for DECRQM

It's used to report whether mode is supported/set, etc.

This is used by sync updates as a way to detect the feature support. See https://gist.github.com/christianparpart/d8a62cc1ab659194337d73e399004036?permalink_comment_id=3946967#feature-detection

The mode could still be used without it, but it's probably worth adding support for this escape in general.

Unexpected binaries in the published crate

A concerned citizen, although apparently not concerned enough to use github, has noticed something that concerns them.
Some binaries, named vim10m_match and vim10m_table, have made it into the vte 0.3.3 release on crates.io, visible at https://docs.rs/crate/vte/0.3.3/source/

These binaries then get vendor'd into the actual rustc source downloads.

Could you please do a release which.. doesn't have random binaries, and other unpublished code, in?

Migrate to 2018 edition

Decide on the correct way of handling excess control sequence parameters

In the current state, a control sequence that is supplied more than the maximum number of parameters weirdly smashes the trailing ones into a big number, behaving as if the parameter separators didn't exist. For example, the control sequence \e[38;2;0;255;0;48;2;255;0;0;1;2;3;4;5;6;0mHello gets parsed as [38, 2, 0, 255, 0, 48, 2, 255, 0, 0, 1, 2, 3, 4, 5, 60] - notice the 60 at the end.

AFAIK, the standard doesn't define any maximum number of parameters. Since I couldn't find any canonical way to handle excess parameters, this is the behavior of other terminal emulators (data collected mostly by trying it on my machine and browsing code):

Trash the entire control sequence - st, kitty, GNOME VTE (kitty seems to then print the sequence as normal printable characters)
Trash the excess parameters - apparently DEC VT100+ compatible emulators do this
After reaching the maximum capacity, instead of appending, overwrite the last parameter with new ones - Konsole (not a great solution, however their limit is 4096 parameters, so you won't run into this all that often)

(Couldn't figure out what xterm does, seems to me like it just craps its pants and doesn't give a damn)

I'm inclined to implement the first option, but I don't have a strong opinion on it. The current way just doesn't seem "correct" in any shape or form and the VTE crate is the only one where I've seen this behavior so far.
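A small sketch covering the first two options on a buffered parameter string, assuming a limit of 16 as in the example above. `parse_bounded` and `Parsed` are hypothetical names: excess parameters are dropped rather than merged, and the `overflowed` flag lets a caller discard the whole sequence instead.

```rust
/// Parameter limit assumed for this sketch.
pub const MAX_PARAMS: usize = 16;

/// Result of parsing with a parameter limit.
#[derive(Debug, PartialEq)]
pub struct Parsed {
    pub params: Vec<u16>,
    /// True when the input held more than `MAX_PARAMS` parameters.
    pub overflowed: bool,
}

/// Parses semicolon-separated parameters, keeping at most `MAX_PARAMS`.
/// Excess parameters are dropped instead of being merged into the last one.
pub fn parse_bounded(raw: &str) -> Parsed {
    let mut params = Vec::with_capacity(MAX_PARAMS);
    let mut overflowed = false;
    for field in raw.split(';') {
        if params.len() == MAX_PARAMS {
            // Option 2: ignore the rest. For option 1, the caller can check
            // `overflowed` and throw the whole sequence away.
            overflowed = true;
            break;
        }
        params.push(field.parse::<u16>().unwrap_or(0));
    }
    Parsed { params, overflowed }
}
```

With the 17-parameter example, the last kept value is 6 rather than 60, and `overflowed` is set.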

Limiting size of osc_raw in no-std mode

I am writing an OS and I was in need of an ANSI parser. I started to write my own, and then was pointed at your excellent crate. Thank you!

I note that const MAX_OSC_RAW: usize = 1024 and that is a little large for me. Would you accept a PR to make it a const-generic? Something like:

/// Parser for raw _VTE_ protocol which delegates actions to a [`Perform`]
///
/// [`Perform`]: trait.Perform.html
#[derive(Default)]
pub struct Parser<const OSC_RAW_SIZE: usize = MAX_OSC_RAW> {
    state: State,
    intermediates: [u8; MAX_INTERMEDIATES],
    intermediate_idx: usize,
    params: Params,
    param: u16,
    #[cfg(feature = "no_std")]
    osc_raw: ArrayVec<u8, OSC_RAW_SIZE>,
    #[cfg(not(feature = "no_std"))]
    osc_raw: Vec<u8>,
    osc_params: [(usize, usize); MAX_OSC_PARAMS],
    osc_num_params: usize,
    ignoring: bool,
    utf8_parser: utf8::Parser,
}

The const-generic parameter will be there even in no-std mode, but as it has a default, I don't think it'll matter?
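To show how a defaulted const-generic behaves for callers, here is a standalone sketch using a plain array instead of ArrayVec. `OscBuffer` is a hypothetical type, not part of vte, and it assumes a toolchain with const-generic defaults (Rust 1.59 or later).

```rust
/// Default capacity, mirroring the constant mentioned above.
pub const MAX_OSC_RAW: usize = 1024;

/// Fixed-capacity byte buffer whose size defaults to `MAX_OSC_RAW`.
/// Hypothetical type for illustration only.
pub struct OscBuffer<const N: usize = MAX_OSC_RAW> {
    data: [u8; N],
    len: usize,
}

impl<const N: usize> OscBuffer<N> {
    pub fn new() -> Self {
        Self { data: [0; N], len: 0 }
    }

    pub fn capacity(&self) -> usize {
        N
    }

    /// Appends a byte; returns false and drops the byte when full.
    pub fn push(&mut self, byte: u8) -> bool {
        if self.len == N {
            return false;
        }
        self.data[self.len] = byte;
        self.len += 1;
        true
    }

    pub fn as_slice(&self) -> &[u8] {
        &self.data[..self.len]
    }
}
```

Existing users who write the type without a parameter (`let buf: OscBuffer = OscBuffer::new();`) keep the 1024-byte default, while `OscBuffer<64>` opts into a smaller buffer.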

hook method appears to be missing the final character

my understanding is that the introducer for a device control string should follow the same syntax as a csi sequence, which means that the final character is used to determine the operation to perform. the hook method in the Perform trait doesn't receive the final byte though (like csi_dispatch does), so this doesn't appear to be possible.

(it's possible that i'm confused about something here - i don't really have a need for this, i just noticed it when trying to add some debug logs for the parsing process.)
