Comments (10)

spyoungtech commented on August 16, 2024

My plan is to add a method that takes Unicode characters and converts them to the Unicode sequences that AHK scripts understand. For example, a character like '\u2F72' in a Python string would translate to the AHK Unicode sequence {U+2F72}.

In the meantime, it should be possible to use the .send and .send_input methods with the ahk unicode strings directly.

ahk.send_input('Hello unicode {U+2F72}')

Eventually, maybe the .type method will have the capability to do this for you, or maybe a .type_unicode method will be added.
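
A minimal sketch of what such a conversion could look like, assuming that only non-ASCII characters need translating. The name to_ahk_unicode is hypothetical and not part of the library, and the sketch does not escape ASCII characters that have special meaning in Send (such as !, +, ^, #, { and }).

```python
def to_ahk_unicode(text: str) -> str:
    """Replace every non-ASCII character with an AHK {U+XXXX} sequence.

    Hypothetical helper for illustration; ASCII characters pass through unchanged.
    """
    parts = []
    for ch in text:
        if ord(ch) < 128:
            parts.append(ch)
        else:
            # Hexadecimal code point, zero-padded to at least four digits.
            parts.append('{U+%04X}' % ord(ch))
    return ''.join(parts)


print(to_ahk_unicode('Hello unicode \u2F72'))  # Hello unicode {U+2F72}
```

The result could then be passed to ahk.send_input as in the example above.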

from ahk.

ClericPy commented on August 16, 2024

The image path for ahk.image_search cannot contain Chinese characters either.
The title for ahk.find_window can be set with '中文-chinese-words'.encode('gbk').

Will the autohotkey_unicode version work for this?

and...
_run_script
https://github.com/spyoungtech/ahk/blob/master/ahk/script.py#L55

        script_bytes = bytes(script_text, 'utf-8')
        return result.stdout.decode()

Could this support an encoding argument? For example:
bytes(script_text, 'gbk')
Sometimes setting the encoding to gbk fixes the garbled output issue.
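
A rough sketch of what a configurable encoding might look like, assuming the same encoding is used for the script bytes and for the process output. The helper names encode_script and decode_output are hypothetical and are not the library's actual API.

```python
def encode_script(script_text: str, encoding: str = 'utf-8') -> bytes:
    """Encode the generated script text before handing it to the AHK process."""
    return script_text.encode(encoding)


def decode_output(raw: bytes, encoding: str = 'utf-8') -> str:
    """Decode the process output; undecodable bytes become U+FFFD instead of raising."""
    return raw.decode(encoding, errors='replace')


gbk_bytes = encode_script('中文-chinese-words', encoding='gbk')
print(decode_output(gbk_bytes, encoding='gbk') == '中文-chinese-words')  # True
```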

spyoungtech commented on August 16, 2024

Yeah, this makes sense.

From what I understand, the version of AHK that I test against is already unicode compatible. I've been trying to figure out the behavior of how AHK interprets the encoding sent to it.

For instance, when reading from a file, AHK seems to read UTF-8 unicode just fine. When sent via stdin to the subprocess, it seems that there is some sort of encoding mismatch. I suspect a locale or system-default encoding is being used.

When sending the following as UTF-8 encoded bytes to the subprocess:

SendInput ⽲

AutoHotkey ends up sending â½² (which indicates to me it chose to interpret the bytes as cp1252 encoding, rather than UTF-8)
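
The mismatch can be reproduced in plain Python, without AutoHotkey, by decoding the UTF-8 bytes as cp1252. This only illustrates the suspected misinterpretation; it does not show what AHK does internally.

```python
# U+2F72 encoded as UTF-8 is the three bytes e2 bd b2.
utf8_bytes = '\u2F72'.encode('utf-8')
print(utf8_bytes.hex())  # e2bdb2

# Reading those same bytes as cp1252 yields one character per byte.
misread = utf8_bytes.decode('cp1252')
print(misread == 'â½²')  # True
```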

spyoungtech commented on August 16, 2024

What's important is to identify how AHK does encoding detection. I believe it may rely on Windows semantics, which will use the preferred encoding for the locale if the bytes make sense in that encoding.

For example, you can use locale.getpreferredencoding() to identify the locale encoding. However, this may not necessarily be the encoding that AHK uses. In the above case, the bytes that represent the UTF-8 character are also valid in the locale encoding CP1252, which interprets them as â½². I'm not sure the behavior would be the same if the bytes sent were not valid in CP1252.

So knowing the preferred encoding alone may not be enough. I'll have to do some more testing around this and maybe ask some folks more knowledgeable about Windows/AHK's behavior when reading from stdin.
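
As a starting point for that testing, here is a small sketch that checks whether a given UTF-8 byte sequence is also decodable in another encoding. The helper name decodes_cleanly is hypothetical. It relies on Python's cp1252 codec, which rejects a few unassigned bytes such as 0x81; Windows itself may be more lenient, so this is only an approximation of what AHK sees.

```python
import locale


def decodes_cleanly(data: bytes, encoding: str) -> bool:
    """Return True if data can be decoded in the given encoding without errors."""
    try:
        data.decode(encoding)
    except UnicodeDecodeError:
        return False
    return True


# The locale's preferred encoding varies by machine (e.g. cp1252 or cp936 on Windows).
print(locale.getpreferredencoding(False))

ambiguous = '\u2F72'.encode('utf-8')    # e2 bd b2: every byte is assigned in cp1252
unambiguous = '\u4E01'.encode('utf-8')  # e4 b8 81: 0x81 is unassigned in Python's cp1252

print(decodes_cleanly(ambiguous, 'cp1252'))    # True
print(decodes_cleanly(unambiguous, 'cp1252'))  # False
```

Sending a character like the second one would show whether AHK falls back to UTF-8 when the bytes do not fit the locale encoding.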

kymikoloco commented on August 16, 2024

(Edit: Oops, didn't realize this was a Python wrapper, this probably won't solve anything, but it might help someone like me who stumbled upon this issue first :) )

Try saving the *.ahk file as UTF-8 with BOM to make sure AHK sends your UTF-8 text as the correct encoding.

https://www.reddit.com/r/AutoHotkey/comments/9zz9q3/why_do_hotstrings_sometimes_dont_work_depending/

AutoHotkey treats files as ANSI unless it has a very good reason not to. A file in UTF-8 encoding will work sometimes, but only if it has the Byte-Order-Mark (BOM) explicitly declared.

Encoding          AHK treats as
ANSI/ASCII        ANSI
UTF-8 (no BOM)    ANSI
UTF-8-BOM         Unicode
UTF-16            Unicode

UTF-8-BOM is the preferred encoding for AutoHotkey scripts because it is a variable-byte-encoding, which means when a character can be encoded using fewer bytes it encodes it with fewer bytes. A script with only standard ASCII/ANSI characters will only be as large as a regular ASCII/ANSI file (plus like 3 bytes for the BOM), then any Unicode characters will take up an extra byte or two.
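
For anyone generating .ahk files from Python, the 'utf-8-sig' codec writes the BOM automatically. The sketch below shows the three extra bytes; write_script is a hypothetical helper name.

```python
import codecs

script_text = 'SendInput \u2F72\n'

with_bom = script_text.encode('utf-8-sig')  # UTF-8 with a leading BOM
without_bom = script_text.encode('utf-8')   # plain UTF-8

print(with_bom.startswith(codecs.BOM_UTF8))  # True
print(len(with_bom) - len(without_bom))      # 3


def write_script(path: str, text: str) -> None:
    """Write an .ahk script as UTF-8 with BOM."""
    with open(path, 'w', encoding='utf-8-sig', newline='') as f:
        f.write(text)
```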

spyoungtech commented on August 16, 2024

Hmm. A quick search seems to indicate that AutoHotkey itself does not have great support for unicode. If you are wanting to use the send commands to send chinese input, that may be a limitation of AutoHotkey.

There are some documented workarounds for this in the AHK forums.

If I can find a good workaround for sending unicode that can be cleanly implemented, I'd be happy to add it as a function in the library.

ClericPy commented on August 16, 2024

Is there some way to set a global encoding for different countries, instead of UTF-8?

spyoungtech commented on August 16, 2024

Yeah. Earlier on in this project, the implementation of calling AHK scripts was to write the .ahk script to a temporary file and then call the AHK executable, providing the temp filename as an argument. I think with that earlier implementation, a lot of the unicode stuff wasn't an issue.

However, the current implementation passes the script text (as bytes) to the AHK process via stdin, which avoids writing the generated script to a file (and with it, problems like permission errors or failing to clean up temp files).

For whatever reason, AHK doesn't seem to use UTF-8 when the script text is passed through stdin. 🤷‍♀️

spyoungtech commented on August 16, 2024

So, the BOM doesn't work when passing script as stdin, but there is a flag to set the codepage, which seems promising. See: #132
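
A sketch of how the codepage flag might be combined with stdin, assuming the AutoHotkey v1.1 command-line switches /CPn (script codepage, 65001 being UTF-8) and * (read the script from stdin). Whether #132 does exactly this is an assumption; build_command and run_script are hypothetical names.

```python
import subprocess


def build_command(executable: str = 'AutoHotkey.exe', codepage: int = 65001) -> list:
    """Build the argument list: codepage switch first, then '*' for stdin."""
    return [executable, f'/CP{codepage}', '*']


def run_script(script_text: str, executable: str = 'AutoHotkey.exe') -> str:
    """Send the script as UTF-8 bytes over stdin and return decoded stdout.

    Requires AutoHotkey to be installed; not executed in this example.
    """
    result = subprocess.run(
        build_command(executable),
        input=script_text.encode('utf-8'),
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    return result.stdout.decode('utf-8', errors='replace')


print(build_command())  # ['AutoHotkey.exe', '/CP65001', '*']
```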

spyoungtech commented on August 16, 2024

This should be essentially resolved with #132

One major exception is daemon mode, in which case you must work around this by using Unicode sequences (e.g. {U+nnnn}) to ensure Unicode characters are correctly understood.
