Comments (5)

aundro commented on July 28, 2024

Agreed. I'll also fix get_named_type btw, which suffers from a similar affliction.

aundro commented on July 28, 2024

Thanks a lot for the thorough analysis and detailed explanation!
Indeed, at the moment py_get_numbered_type will attempt to build a str, which, as you mentioned, is conceptually wrong.
That was obviously not a problem in Py2, since bytes and str were the same type there. In Py3 they are distinct types, so now it is a problem.

However, I'm worried that changing this now might break other scripts that expect the returned tuple to contain strings, not bytes.
So I wonder whether the safest fix would be to allow Unicode replacement characters wherever there is a decoding issue.

Thoughts?

EDIT: I spoke too soon. Only the comments are retrieved as strings. I don't think that is conceptually wrong (do you think it is?).
However, not being able to retrieve the entire type because of that is not acceptable, and I'm looking into a fix.

aundro commented on July 28, 2024

…but then returning the comment as a string means that set_numbered_type won't work anymore, since it expects a bytes object. Hmm. I guess that means we must change it to a bytes object, as you suggested.
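
To illustrate why replacement characters would not round-trip, here is a minimal sketch in plain Python. It does not call any IDAPython API, and the `raw_cmt` value is a made-up example of comment bytes that are not valid UTF-8:

```python
# Hypothetical raw comment bytes; 0xFF is never valid in UTF-8.
raw_cmt = b"size \xff field"

# Strict decoding fails outright on the invalid byte.
try:
    raw_cmt.decode("utf-8")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# errors="replace" substitutes U+FFFD for the bad byte, so decoding succeeds...
text = raw_cmt.decode("utf-8", errors="replace")

# ...but re-encoding yields different bytes than the original.
round_trip = text.encode("utf-8")

print(strict_ok)              # False
print(round_trip == raw_cmt)  # False
```

So a comment decoded this way could be displayed, but passing it back to an API that expects the original bytes would silently change the data.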

arizvisa commented on July 28, 2024

Hey Arnaud,

Yeah, I also believe that changing to bytes would be the best thing. But I'm not sure whether there are other places in IDAPython where the "cmt" and "fieldcmts" fields from tinfo_t get decoded. However, I think as long as we're _always_ passing bytes as input to the tinfo_t (via tinfo_t.deserialize, set_numbered_type, or perhaps another similar API), things should be okay.

In my own libraries, I always make sure to wrap those two APIs (tinfo_t.deserialize and set_numbered_type) with str.encode('utf-8') so that the values are always submitted as bytes. This way, when I retrieve them with tinfo_t.serialize or get_numbered_type, they can be stored in a variable as either str or bytes, and they get encoded to bytes before being used. I've been using it like this for a while and haven't personally encountered any issues... Hopefully this is the right call.
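
A rough sketch of that kind of wrapper, in plain Python. The helper name `to_bytes` is hypothetical and not part of IDAPython; it only normalizes a value before it would be handed to an API that expects bytes:

```python
def to_bytes(value, encoding="utf-8"):
    """Return value as bytes, encoding it first if it is a str."""
    if isinstance(value, bytes):
        # Already bytes: pass through untouched, even if not valid UTF-8.
        return value
    if isinstance(value, str):
        return value.encode(encoding)
    raise TypeError("expected str or bytes, got %s" % type(value).__name__)


# Both forms normalize to the same bytes object.
print(to_bytes("field comment"))   # b'field comment'
print(to_bytes(b"field comment"))  # b'field comment'
```

Passing bytes through unchanged is the important part: data that was retrieved as bytes never has to survive a decode step on its way back in.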

aundro commented on July 28, 2024

@arizvisa fixed with 6142005

(Please contact us on support if you want a new build!)
