
alacritty / vte


Parser for virtual terminal emulators

Home Page: https://docs.rs/vte/

License: Apache License 2.0

Rust 100.00%
rust vte terminal parser

vte's Issues

Read user input

Would it be possible to use vte to read user input (keyboard / mouse) in raw mode ?
I am doing experiments here.
And it appears that some key sequences are not reported like:

  • Shift-Tab
  • Alt-Enter
  • Alt-Backspace

Thanks.

Sixel

Sorry, I was searching issues related with sixel with my phone and I don't know how a new one got created instead.

Misc clap needs

For clap and related applications, I've been looking into ANSI parsers. None quite meets my needs, so I was looking at writing my own (or forking vte and turning it into what I want), but I figured I'd reach out first in case there is a way to make this work within vte and it's of interest.

vte seems optimized for use in alacritty. Performance matters most in highly interactive applications, which have a high ratio of escape codes to text. Since alacritty has to process every control code to render correctly, vte cares more about the control-code side of printable control codes like \n. As build times and binary size are likely a drop in the bucket for alacritty, I'm assuming they haven't been optimized.

My priorities are the opposite. My applications are static, likely have a low ratio of escape codes to text, and I would want to treat all printable control codes as text. Technically, this can all be handled with vte's design, but char-by-char processing won't be optimal; I'd want to deal with slices of text instead. clap is also a heavily used crate, and there is a lot of interest in managing its build times and binary size. I could likely get away without the proc-macro-generated state tables, replacing them with a function built from match expressions, for little performance hit but massive compile-time and binary-size improvements.

What I'm trying to decide is whether to contribute to vte so it handles both cases, or if it's better to go my own route. Thoughts?

(Sorry, I wasn't sure if there was a better medium to reach out.)

Support for APC?

I am trying to use vte to parse APC commands, which it seems to ignore by setting the state to SosPmApcString and then Anywhere through the duration of the command. Would it be possible to adapt the Parser interface to support parsing these commands?

Parselog for tmux doesn't work as expected

Running tmux | target/debug/examples/parselog always outputs simply the following:

[print] '['
[print] 'e'
[print] 'x'
[print] 'i'
[print] 't'
[print] 'e'
[print] 'd'
[print] ']'
[execute] 0a

All actions inside tmux are ignored. Maybe this is actually expected behavior and I just don't understand how tmux interacts with the terminal.

Interpret a trailing semicolon as an implicit '0' parameter

From what I can find, escape sequence parameters are "optional numeric values separated by semicolons".

As the values are separated by semicolons, it makes sense that if a sequence ends with a semicolon, the terminal should expect another parameter. Because the parameter is an optional number, it should default to 0.

For example, the code \e[4;m should have the same effect as \e[4;0m, that is, none at all.

I tested this specific sequence in kitty, termite, xterm, st and alacritty, and only alacritty draws the following text underlined.

It should be a simple fix and I will include it in my PR for #22
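The rule proposed in this issue can be sketched in a few lines. This is not vte's implementation: parse_csi_params is a hypothetical helper that takes only the parameter bytes (the part between \e[ and the final byte), and it assumes that any empty field counts as 0.

```rust
/// Parse the parameter portion of a CSI sequence into numeric values.
/// Hypothetical helper for illustration; not part of vte's API.
fn parse_csi_params(raw: &str) -> Vec<u16> {
    // No parameter bytes at all means no parameters.
    if raw.is_empty() {
        return Vec::new();
    }
    // Splitting "4;" on ';' yields ["4", ""], so the field after a
    // trailing semicolon shows up as an empty string and becomes 0.
    // In this sketch, non-numeric or overflowing fields also fall back to 0.
    raw.split(';')
        .map(|field| field.parse::<u16>().unwrap_or(0))
        .collect()
}
```

With this rule, parse_csi_params("4;") and parse_csi_params("4;0") produce the same parameter list, which is the behavior the issue asks for.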

Add test for UTF-8 Parsing

The utf8parse package could use a test validating the implementation. My thought about how to do this is to find some complicated UTF-8 file and compare results of parsing it with this package's parser and the standard library's std::str::from_utf8().

Approach

Implement a utf8parse::Receiver for something wrapping a String. Read the UTF-8 test file into a buffer. For each byte in the buffer, advance() the utf8parse::Parser. After running all bytes through the parser, let the wrapped string be the actual value. Now, call String::from_utf8() on the test file buffer; let that be the expected value. Assert that those values are equal.
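A rough sketch of the shape such a test could take. To keep it self-contained, the Receiver trait and ToyParser below are local stand-ins written for this example (assumptions, not the utf8parse crate itself), and the toy decoder is only meant for well-formed input. In the real test, utf8parse's own types would take their place.

```rust
/// Local stand-in for the receiver trait; collects decoded code points.
trait Receiver {
    fn codepoint(&mut self, c: char);
    fn invalid_sequence(&mut self);
}

/// The "something wrapping a String" from the approach above.
struct StringWrapper(String);

impl Receiver for StringWrapper {
    fn codepoint(&mut self, c: char) {
        self.0.push(c);
    }
    fn invalid_sequence(&mut self) {
        // The test file is assumed to be valid UTF-8, so nothing is recorded.
    }
}

/// Hypothetical byte-at-a-time decoder standing in for the parser under test.
#[derive(Default)]
struct ToyParser {
    point: u32,
    remaining: u8,
}

impl ToyParser {
    fn advance<R: Receiver>(&mut self, receiver: &mut R, byte: u8) {
        if self.remaining == 0 {
            // Lead byte: decide how many continuation bytes follow.
            match byte {
                0x00..=0x7f => receiver.codepoint(byte as char),
                0xc0..=0xdf => { self.point = (byte & 0x1f) as u32; self.remaining = 1; }
                0xe0..=0xef => { self.point = (byte & 0x0f) as u32; self.remaining = 2; }
                0xf0..=0xf7 => { self.point = (byte & 0x07) as u32; self.remaining = 3; }
                _ => receiver.invalid_sequence(),
            }
        } else if byte & 0xc0 == 0x80 {
            // Continuation byte: shift in six more bits.
            self.point = (self.point << 6) | (byte & 0x3f) as u32;
            self.remaining -= 1;
            if self.remaining == 0 {
                match char::from_u32(self.point) {
                    Some(c) => receiver.codepoint(c),
                    None => receiver.invalid_sequence(),
                }
            }
        } else {
            self.remaining = 0;
            receiver.invalid_sequence();
        }
    }
}

/// Run every byte through the parser and return the collected string.
fn decode_bytewise(input: &[u8]) -> String {
    let mut parser = ToyParser::default();
    let mut out = StringWrapper(String::new());
    for &byte in input {
        parser.advance(&mut out, byte);
    }
    out.0
}
```

The test then compares decode_bytewise(&buffer) (the actual value) with String::from_utf8(buffer) (the expected value) for the chosen test file.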


Out of bounds panic in vte::Parser::perform_action

Found with afl.rs.

Stacktrace:

thread 'pty reader' panicked at 'index out of bounds: the len is 16 but the index is 16', /home/tibor/.cargo/registry/src/github.com-1ecc6299db9ec823/vte-0.7.0/src/lib.rs:205:17
stack backtrace:
  ...
  11: rust_begin_unwind
             at src/libstd/panicking.rs:375
  12: core::panicking::panic_fmt
             at src/libcore/panicking.rs:84
  13: core::panicking::panic_bounds_check
             at src/libcore/panicking.rs:62
  14: vte::Parser::perform_action
             at /home/tibor/.cargo/registry/src/github.com-1ecc6299db9ec823/vte-0.7.0/src/lib.rs:205
  15: vte::Parser::perform_state_change
             at /home/tibor/.cargo/registry/src/github.com-1ecc6299db9ec823/vte-0.7.0/src/lib.rs:151
  16: vte::Parser::advance
             at /home/tibor/.cargo/registry/src/github.com-1ecc6299db9ec823/vte-0.7.0/src/lib.rs:128
  17: alacritty_terminal::ansi::Processor::advance
             at ./alacritty_terminal/src/ansi.rs:141
  18: alacritty_terminal::event_loop::EventLoop<T,U>::pty_read
             at ./alacritty_terminal/src/event_loop.rs:245
  19: alacritty_terminal::event_loop::EventLoop<T,U>::spawn::{{closure}}
             at ./alacritty_terminal/src/event_loop.rs:364

Test case:
Unminimized test case: test.vte.zip

Support for direct Unicode input

The input I'm working with is already a sequence of chars containing Unicode codepoints. Converting these into a UTF-8 byte stream so that they can then be converted right back into codepoints seems wasteful. Could we have an alternate API which takes a char directly and skips the UTF-8 encoding stage?
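For illustration, the round trip described here looks roughly like the following. feed_bytes is a hypothetical stand-in for a byte-oriented parser entry point; it just records what it receives.

```rust
/// Hypothetical byte-oriented sink standing in for a parser's byte interface.
fn feed_bytes(sink: &mut Vec<u8>, bytes: &[u8]) {
    sink.extend_from_slice(bytes);
}

/// Re-encode every char as UTF-8 only to push it through a byte interface,
/// where it would be decoded back into the same char.
fn feed_chars(sink: &mut Vec<u8>, input: &[char]) {
    let mut buf = [0u8; 4]; // a char needs at most 4 bytes in UTF-8
    for &c in input {
        let encoded = c.encode_utf8(&mut buf);
        feed_bytes(sink, encoded.as_bytes());
    }
}
```

A char-based entry point would let the caller skip both the encode_utf8 call here and the decoding step on the other side.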

Re-export `CursorIcon`

This type is used in set_mouse_cursor_icon, and downstream users need to implement the trait, so they need a way to import the type. We don't provide one, so they have to add the crate that defines it to their Cargo.toml.

DCS parameters not reset at unhook

alacritty 0.4.1

It seems parameter(s) are not reset at the end of a DCS. Some examples:

"hello" should be colored red, but isn't:

echo -e '\eP1X\e\\\e[31mhello\e[m'

"hello" should not be colored, but is:

echo -e '\eP31X\e\\\e[mhello\e[m'

"hello" should not be bold, but is:

echo -e '\eP1X\e\\\e[mhello\e[m'

All the above works in xterm and urxvt.

For a real-world example, see https://codeberg.org/dnkl/page, it uses the (not yet standardized) BSU/ESU DCS sequences and it is completely broken in Alacritty (do cat <large-file> | page and scroll).

build failure with nightly (mutable references are not allowed in constant functions)

Trying to build with rustc 1.50.0-nightly (c919f490b 2020-11-17), I get the following error:

    Checking vte v0.9.0 (/tmp/vte)
error[E0658]: mutable references are not allowed in constant functions
   --> src/table.rs:9:1
    |
9   | / generate_state_changes!(state_changes, {
10  | |     Anywhere {
11  | |         0x18 => (Ground, Execute),
12  | |         0x1a => (Ground, Execute),
...   |
170 | |     }
171 | | });
    | |___^
    |
    = note: see issue #57349 <https://github.com/rust-lang/rust/issues/57349> for more information
    = help: add `#![feature(const_mut_refs)]` to the crate attributes to enable
    = note: this error originates in a macro (in Nightly builds, run with -Z macro-backtrace for more info)

error[E0015]: calls in constant functions are limited to constant functions, tuple structs and tuple variants
   --> src/table.rs:9:1
    |
9   | / generate_state_changes!(state_changes, {
10  | |     Anywhere {
11  | |         0x18 => (Ground, Execute),
12  | |         0x1a => (Ground, Execute),
...   |
170 | |     }
171 | | });
    | |___^
    |
    = note: this error originates in a macro (in Nightly builds, run with -Z macro-backtrace for more info)

error: aborting due to 2 previous errors

Some errors have detailed explanations: E0015, E0658.
For more information about an error, try `rustc --explain E0015`.
error: could not compile `vte`

To learn more, run the command again with --verbose.

This also breaks the alacritty-git build for me.

Infinite loop in ParamsIter

I've been doing some fuzzing of vte, and discovered that the input b"\x1bP;;;:::::;;:::::;;;;;;;;;;;;;0;;::;;p\x1b" causes ParamsIter to loop forever.

I'm trying to get my head around the escape code format enough to troubleshoot more usefully, but maybe you can figure out what's going on before I can.

Handling invalid UTF-8 bytes

I'm looking at using vte for a use case where I want to translate invalid UTF-8 bytes into Unicode replacement characters; however, vte seems to silently swallow some invalid UTF-8 bytes. For example, if I feed it input consisting of the byte 0x90, it produces no events.

Would it make sense to add Execute rules to the Ground table for 0x90 and other formerly special C1 codes?

Would it make sense to introduce something like a InvalidUtf8 action, to fill in the Ground table in general?
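For reference, this is the output being asked for, shown with the standard library rather than vte. String::from_utf8_lossy maps each invalid sequence to U+FFFD; replace_invalid is just an illustrative wrapper name.

```rust
/// Decode bytes as UTF-8, substituting U+FFFD for invalid sequences.
fn replace_invalid(bytes: &[u8]) -> String {
    String::from_utf8_lossy(bytes).into_owned()
}
```

Here replace_invalid(&[0x90]) yields a single replacement character, where the issue reports that vte currently emits no event at all.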

Add support for XTGETTCAP

How it works is available at https://invisible-island.net/xterm/ctlseqs/ctlseqs.html .

This is used to query terminfo capabilities without relying on the actual terminfo files. It's supported by notcurses, most modern terminals, etc. It helps when you ssh into a system that lacks the terminfo entry for your TERM, and it lets terminals whose terminfo isn't widely distributed work with at least the modern toolkits.

Provide enum interface for next parser action

The trait interface limits the error handling considerably when doing any advanced work in the callbacks. E.g. when doing file I/O the errors cannot be easily returned without employing temporary state.

Exposing an enum of the states (that would match the input parameters to the trait's functions) that can be accessed via a similar function to advance would allow this alternative use case.

This does increase the API surface area. However, the implementation can be reworked so that internally it uses the enum and only at the API surface is it decided whether to return that value directly vs calling the appropriate function.
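A minimal sketch of the idea, under loud assumptions: Action and next_action are hypothetical names, only two variants are shown, and the "parser" is a toy classifier rather than vte's state machine. The point is only that a caller-driven loop can propagate I/O errors with `?`.

```rust
use std::io::{self, Write};

/// Hypothetical action enum; each variant mirrors one trait callback.
#[derive(Debug, PartialEq)]
enum Action {
    Print(char),
    Execute(u8),
}

/// Toy stand-in for the parser: C0 controls execute, printable ASCII prints.
fn next_action(byte: u8) -> Option<Action> {
    match byte {
        0x00..=0x1f => Some(Action::Execute(byte)),
        0x20..=0x7e => Some(Action::Print(byte as char)),
        _ => None,
    }
}

/// The caller owns the loop, so write errors are returned directly
/// instead of being stashed in temporary state inside a callback.
fn log_actions<W: Write>(input: &[u8], out: &mut W) -> io::Result<()> {
    for &byte in input {
        match next_action(byte) {
            Some(Action::Print(c)) => writeln!(out, "[print] {:?}", c)?,
            Some(Action::Execute(b)) => writeln!(out, "[execute] {:02x}", b)?,
            None => {}
        }
    }
    Ok(())
}
```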

Implement colon separated CSI parameters

I was looking into implementing styled underlines into alacritty, like what you might see in kitty or gnome vte based terminals - that is normal underline, double underline, curly underline (undercurl), and have it have a different color than the foreground.

Looking at the kitty implementation (spec), the terminal emulator has to handle not only semicolon separated parameters (\e[0;4m), but also colon separated parameters (\e[4:3m).

I'm not an expert on the standard defining how those sequences should be handled, so I was going by kitty's source code. What's being done in their codebase is not applicable to this repo, because they buffer the entire sequence up to the command byte ('m' in this case) and then parse it, dispatching every semicolon-separated parameter as its own sequence, with special cases for color parameters. In that way, a command like \e[0;4;58:2::186:93:0m would be split into three function calls: handle 0m, handle 4m, handle 58:2::186:93:0m.

This could be implemented in the vte crate (I think) by adding a dimension to the params array, and multiple calls to the Performer implementor. Not sure what the performance cost would be though.
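The "extra dimension" could look something like the sketch below. It works on a buffered string for brevity, which is exactly what vte does not do, so treat it only as a picture of the resulting data shape. parse_subparams is a hypothetical helper, and treating an empty subparameter as 0 is an assumption.

```rust
/// Split parameter bytes into parameters (on ';'), each holding its
/// subparameters (on ':'). Empty fields become 0 in this sketch.
fn parse_subparams(raw: &str) -> Vec<Vec<u16>> {
    raw.split(';')
        .map(|param| {
            param
                .split(':')
                .map(|sub| sub.parse::<u16>().unwrap_or(0))
                .collect::<Vec<u16>>()
        })
        .collect()
}
```

For the example above, "0;4;58:2::186:93:0" becomes three groups: [0], [4], and [58, 2, 0, 186, 93, 0].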

Publish a new version of vte_generate_state_changes to include license files

Hello all! Back in #111 symlinks got made to include the license files in the crate, but the latest released version 0.1.1 doesn't include them, which can be seen on https://docs.rs/crate/vte_generate_state_changes/latest/source/. Maybe an old version of cargo was used to make the package that didn't include them? I checked locally and using the latest version works:

% cd vte_generate_state_changes
% cargo package
...
% ls ../target/package/vte_generate_state_changes-0.1.1
Cargo.toml  Cargo.toml.orig  LICENSE-APACHE  LICENSE-MIT  src  target

So maybe all that needs to happen is to bump the version and regenerate the package?

Unexpected binaries in the published crate

A concerned citizen, although apparently not concerned enough to use github, has noticed something that concerns them.
Some binaries, named vim10m_match and vim10m_table, have made it into the vte 0.3.3 release on crates.io, visible at https://docs.rs/crate/vte/0.3.3/source/

These binaries then get vendor'd into the actual rustc source downloads.

Could you please do a release which.. doesn't have random binaries, and other unpublished code, in?

Decide on the correct way of handling excess control sequence parameters

In the current state, a control sequence that is supplied more than the maximum number of parameters weirdly smashes the trailing ones into a big number, behaving as if the parameter separators didn't exist. For example, the control sequence \e[38;2;0;255;0;48;2;255;0;0;1;2;3;4;5;6;0mHello gets parsed as [38, 2, 0, 255, 0, 48, 2, 255, 0, 0, 1, 2, 3, 4, 5, 60] - notice the 60 at the end.

AFAIK, the standard doesn't define any maximum number of parameters. Since I couldn't find any canonical way to handle excess parameters, this is the behavior of other terminal emulators (data collected mostly by trying it on my machine and browsing code):

  • Trash the entire control sequence - st, kitty, GNOME VTE (kitty seems to then print the sequence as normal printable characters)
  • Trash the excess parameters - apparently DEC VT100+ compatible emulators do this
  • After reaching the maximum capacity, instead of appending, overwrite the last parameter with new ones - Konsole (not a great solution, however their limit is 4096 parameters, so you won't run into this all that often)

(Couldn't figure out what xterm does, seems to me like it just craps its pants and doesn't give a damn)

I'm inclined to implement the first option, but I don't have a strong opinion on it. The current way just doesn't seem "correct" in any shape or form and the VTE crate is the only one where I've seen this behavior so far.
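The first two options can be compared with a small sketch. Everything here is hypothetical: the names are invented for illustration, the input is a pre-buffered parameter string, and the limit of 16 is inferred from the 16-element result shown above.

```rust
/// Assumed parameter limit, inferred from the example output above.
const MAX_PARAMS: usize = 16;

/// Two of the policies observed in other terminal emulators.
#[derive(Clone, Copy)]
enum Overflow {
    /// Discard the whole control sequence.
    DropSequence,
    /// Keep the first MAX_PARAMS parameters and ignore the rest.
    DropExcess,
}

/// Parse semicolon-separated parameters, applying `policy` on overflow.
/// Returns None when the sequence as a whole is discarded.
fn parse_limited(raw: &str, policy: Overflow) -> Option<Vec<u16>> {
    let mut params: Vec<u16> = raw
        .split(';')
        .map(|field| field.parse::<u16>().unwrap_or(0))
        .collect();
    if params.len() > MAX_PARAMS {
        match policy {
            Overflow::DropSequence => return None,
            Overflow::DropExcess => params.truncate(MAX_PARAMS),
        }
    }
    Some(params)
}
```

For the 17-parameter sequence above, DropSequence yields None, while DropExcess yields the first 16 values ending in 6 rather than 60.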

Limiting size of osc_raw in no-std mode

I am writing an OS and I was in need of an ANSI parser. I started to write my own, and then was pointed at your excellent crate. Thank you!

I note that const MAX_OSC_RAW: usize = 1024 and that is a little large for me. Would you accept a PR to make it a const-generic? Something like:

/// Parser for raw _VTE_ protocol which delegates actions to a [`Perform`]
///
/// [`Perform`]: trait.Perform.html
#[derive(Default)]
pub struct Parser<const OSC_RAW_SIZE: usize = MAX_OSC_RAW> {
    state: State,
    intermediates: [u8; MAX_INTERMEDIATES],
    intermediate_idx: usize,
    params: Params,
    param: u16,
    #[cfg(feature = "no_std")]
    osc_raw: ArrayVec<u8, OSC_RAW_SIZE>,
    #[cfg(not(feature = "no_std"))]
    osc_raw: Vec<u8>,
    osc_params: [(usize, usize); MAX_OSC_PARAMS],
    osc_num_params: usize,
    ignoring: bool,
    utf8_parser: utf8::Parser,
}

The const-generic parameter will be there even in no-std mode, but as it has a default, I don't think it'll matter?
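To show how a defaulted const generic behaves at the use site, here is a small standalone sketch. OscBuffer is a hypothetical type invented for this example, not vte's Parser; it uses a plain array instead of ArrayVec to stay dependency-free.

```rust
/// Default capacity, matching the constant quoted in the issue.
const MAX_OSC_RAW: usize = 1024;

/// Fixed-capacity byte buffer whose size defaults to MAX_OSC_RAW.
struct OscBuffer<const N: usize = MAX_OSC_RAW> {
    data: [u8; N],
    len: usize,
}

impl<const N: usize> OscBuffer<N> {
    fn new() -> Self {
        Self { data: [0; N], len: 0 }
    }

    /// Append a byte; returns false once the buffer is full.
    fn push(&mut self, byte: u8) -> bool {
        if self.len == N {
            return false;
        }
        self.data[self.len] = byte;
        self.len += 1;
        true
    }

    fn capacity(&self) -> usize {
        N
    }
}
```

Code that names the type as plain OscBuffer keeps the 1024-byte default, while a memory-constrained user can write OscBuffer<64>. The default applies where the type is written out, so a bare OscBuffer::new() may need a type annotation.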

hook method appears to be missing the final character

my understanding is that the introducer for a device control string should follow the same syntax as a csi sequence, which means that the final character is used to determine the operation to perform. the hook method in the Perform trait doesn't receive the final byte though (like csi_dispatch does), so this doesn't appear to be possible.

(it's possible that i'm confused about something here - i don't really have a need for this, i just noticed it when trying to add some debug logs for the parsing process.)
