Can't read Unicode correctly in pymem (CLOSED)

wkingnet avatar wkingnet commented on June 30, 2024
Can't read Unicode correctly.

Comments (6)

wkingnet avatar wkingnet commented on June 30, 2024

Update: the problem has been solved.

I wrote a small piece of code to read and decode UTF-16 text; maybe it will be useful to others in the future too.

I think UTF-8 and other encodings work as well; you only need to replace 'utf-16' in the code.

temp = b''
while pm.read_bytes(address, 1) != b'\x00':
    temp = temp + pm.read_bytes(address, 1)
    address += 1
print(temp.decode('utf-16'))

wkingnet avatar wkingnet commented on June 30, 2024

This code still has a problem.

result = b''
while pymem_instance.read_bytes(address, 1) != b'\x00':
    result = result + pymem_instance.read_bytes(address, 1)
    print(result)
    address += 1

If the string consists entirely of UTF-16 characters, it works. But if the string mixes UTF-16 and ASCII characters, it is still read incorrectly.

As far as I can tell, the reason for the error is that pymem's read_bytes() automatically stores the bytes as ASCII codes, even though what I created in Python is a bytes variable.

I took two screenshots; the only difference between them is the text encoding selected in the CE display.

[screenshot: 111]

[screenshot: 222]
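One possible contributor to the behaviour described above, independent of what pymem does internally: in UTF-16LE every ASCII character is stored with a 0x00 high byte, so a loop that stops at the first single zero byte ends in the middle of the first ASCII character. The sketch below reproduces this with a plain bytes object standing in for process memory. The read_bytes function here is a hypothetical stand-in, not pymem's, and the helper names are made up for illustration.

```python
# Simulated process memory: "A中" in UTF-16LE plus a two-byte terminator.
# 'A' is stored as 41 00, '中' as 2D 4E.
memory = "A中".encode("utf-16-le") + b"\x00\x00"


def read_bytes(address, length):
    """Hypothetical stand-in for pm.read_bytes, backed by the bytes above."""
    return memory[address:address + length]


def read_until_single_null(address):
    """Byte-at-a-time loop: stops at the first 0x00 byte it sees."""
    result = b""
    while read_bytes(address, 1) != b"\x00":
        result += read_bytes(address, 1)
        address += 1
    return result


def read_until_double_null(address):
    """Two bytes at a time: stops only at an aligned 00 00 code unit."""
    result = b""
    while (unit := read_bytes(address, 2)) != b"\x00\x00":
        result += unit
        address += 2
    return result


truncated = read_until_single_null(0)
full = read_until_double_null(0)

print(truncated)                  # only the low byte of 'A' survives
print(full.decode("utf-16-le"))   # the whole string
```

A string made only of characters whose two bytes are both non-zero never triggers the early stop, which would match the observation that pure UTF-16 text read fine.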

srounet avatar srounet commented on June 30, 2024

Thank you for the report. The value is returned as a c_char.

https://docs.python.org/3/library/ctypes.html#ctypes.create_string_buffer

Maybe there should be an alternative that does not break everything within the read_bytes function?
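For context on the ctypes link: a buffer from create_string_buffer can be read back in two ways, and they differ exactly when the data contains zero bytes. The thread does not say which one pymem's read_bytes used at the time, so this is only a sketch of the ctypes behaviour, with made-up sample data.

```python
import ctypes

data = b"A\x00-N"  # "A中" in UTF-16LE: the second byte is 0x00

# Allocate a char buffer and copy the sample bytes into it.
buf = ctypes.create_string_buffer(len(data))
ctypes.memmove(buf, data, len(data))

as_value = buf.value  # C-string view: stops at the first 0x00 byte
as_raw = buf.raw      # every byte in the buffer, embedded zeros included

print(as_value)
print(as_raw)
```

If a reader function hands back the C-string view, any UTF-16 text containing ASCII characters is cut short, while the raw view keeps all bytes and leaves decoding to the caller.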

wkingnet avatar wkingnet commented on June 30, 2024

If the read_bytes function always returned the data as raw binary, like b'\x07\x86\xF8\x66', then everything would be fine, because you can call decode() with 'gbk', 'utf-8', 'utf-16', 'latin-1', or any other encoding you need.
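A small illustration of that point, using only the standard library: raw bytes carry no encoding of their own, so the caller picks the codec at decode time. The sample text is made up, and text.encode(...) merely stands in for bytes that a memory read would return.

```python
text = "中文"

decoded = {}
for codec in ("gbk", "utf-8", "utf-16-le"):
    raw = text.encode(codec)            # stand-in for bytes read from memory
    decoded[codec] = raw.decode(codec)  # decoding with the matching codec

# Decoding with the wrong codec does not fail here, it just yields other text.
mismatch = text.encode("utf-16-le").decode("latin-1")

print(decoded)
print(repr(mismatch))
```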

wkingnet avatar wkingnet commented on June 30, 2024

Updated code: it can now correctly handle text that mixes UTF-16 and ASCII characters.

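import logging

logger = logging.getLogger(__name__)

result = ""
# pm: a pymem.Pymem instance; address: start address of the string
while True:
    _temp = pm.read_bytes(address, 2)  # one UTF-16 code unit
    if _temp == b'\x00\x00':  # two-byte null terminator
        break
    try:
        # explicit little-endian (Windows), so a 2-byte chunk is never taken for a BOM
        result += _temp.decode('utf-16-le')
    except UnicodeDecodeError:
        # e.g. one half of a surrogate pair read on its own
        logger.warning('UTF-16 decode error, falling back to ascii() of the raw bytes')
        result += ascii(_temp)
    address += 2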
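A variation on the loop above, as a sketch only: read a larger chunk, look for the terminator on even offsets, and decode once at the end so that surrogate pairs (emoji, for example) stay together. The function name, the max_bytes and chunk_size parameters, and the idea of passing the reader in as a callable are all assumptions made for illustration; with pymem you would pass pm.read_bytes. Note that reading a chunk past the end of a mapped region may fail in a real process, which this sketch does not handle.

```python
def read_utf16_string(read_bytes, address, max_bytes=512, chunk_size=64):
    """Read a null-terminated UTF-16LE string.

    read_bytes is any callable taking (address, length) and returning bytes.
    chunk_size should be even so code units stay aligned.
    """
    raw = bytearray()
    while len(raw) < max_bytes:
        chunk = read_bytes(address + len(raw), chunk_size)
        if not chunk:
            break  # nothing more to read
        raw += chunk
        # Check each aligned 2-byte unit for the 00 00 terminator.
        for i in range(0, len(raw) - 1, 2):
            if raw[i] == 0 and raw[i + 1] == 0:
                return bytes(raw[:i]).decode("utf-16-le", errors="replace")
    # No terminator found within max_bytes: decode what is there,
    # dropping a trailing odd byte if any.
    usable = len(raw) - (len(raw) % 2)
    return bytes(raw[:usable]).decode("utf-16-le", errors="replace")


# Usage with a fake memory region standing in for a process.
expected = "Hi 中文 😀"
memory = expected.encode("utf-16-le") + b"\x00\x00" + b"junk"


def fake_read(address, length):
    return memory[address:address + length]


value = read_utf16_string(fake_read, 0)
print(value)
```

Decoding the whole span at once with errors="replace" trades the per-unit warning for a single pass, and it avoids the ascii() fallback for characters outside the Basic Multilingual Plane.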

StarrFox avatar StarrFox commented on June 30, 2024

Closing this, assuming it was fixed by #121.

Please comment if not.
