
Comments (5)

mkmik commented on June 16, 2024

Currently there are 2 main reasons why V7 doesn't store null-terminated strings internally:

  1. JS strings can contain 0 bytes in them.
  2. Short strings are embedded in the NaN payload of IEEE 754 doubles. In particular, strings of length 6 leave no room for a terminator.

Furthermore, the GC can relocate memory, so these pointers stay valid only briefly, which often makes a copy necessary anyway (though not in your example and many similar uses).

That said, the point you make is valid and we should think of a way to avoid copying strings, especially those of non-trivial (4-8 bytes) length.
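
To illustrate the second reason, here is a minimal sketch of packing a short string into the 48-bit payload of a NaN. The tag values, the one-byte length prefix for strings shorter than 6 bytes, and the names `pack_short` and `unpack_short` are assumptions made for this example; V7's real layout may differ. The point is only that a 6-byte string uses every payload byte, so a terminator has nowhere to go.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef uint64_t val_t;

/* Hypothetical tags chosen for this sketch; they are not V7's real values. */
#define TAG_MASK      ((uint64_t)0xFFFF << 48)
#define TAG_STR_SHORT ((uint64_t)0xFFFA << 48) /* 1 length byte + up to 5 data bytes */
#define TAG_STR_6     ((uint64_t)0xFFFB << 48) /* 6 data bytes, length implied by tag */

/* Pack a string of at most 6 bytes (embedded zeros allowed) into a val_t. */
static val_t pack_short(const char *s, size_t len) {
  unsigned char bytes[6] = {0};
  uint64_t payload = 0;
  int i;
  assert(len <= 6);
  if (len == 6) {
    memcpy(bytes, s, 6);           /* all 48 payload bits are string data */
  } else {
    bytes[0] = (unsigned char)len; /* the spare room holds the length */
    memcpy(bytes + 1, s, len);
  }
  for (i = 0; i < 6; i++) payload |= (uint64_t)bytes[i] << (8 * i);
  return (len == 6 ? TAG_STR_6 : TAG_STR_SHORT) | payload;
}

/* Copy the string into out (7 bytes), terminate it there, return its length. */
static size_t unpack_short(val_t v, char out[7]) {
  unsigned char bytes[6];
  size_t len;
  int i;
  for (i = 0; i < 6; i++) bytes[i] = (unsigned char)(v >> (8 * i));
  if ((v & TAG_MASK) == TAG_STR_6) {
    len = 6;
    memcpy(out, bytes, 6);
  } else {
    len = bytes[0] > 5 ? 5 : bytes[0]; /* clamp malformed input */
    memcpy(out, bytes + 1, len);
  }
  out[len] = '\0'; /* the terminator exists only in the caller's buffer */
  return len;
}
```

In this sketch the terminator is added only when the bytes are copied out into a 7-byte buffer, which is why handing out a C string from an inlined value implies a copy.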

from v7.

mkmik commented on June 16, 2024

Clarification: We could include a trailing 0 byte in V7 strings, with a modest storage cost, but it would mean that we cannot fit short 6-byte strings in val_t values.

A possible solution would be this API:

char *v7_to_string(struct v7 *v7, val_t *v, char buf[7], size_t *sizep);

(and copy all 6 byte strings into the user provided buf, null terminate and return it)

but that's perhaps overkill just to enable us to encode 6-char-long strings in val_t values, given that we can do other things, such as interning strings, using a dictionary of common strings, etc.

The null termination per se is not likely to be a big cost, but the relative cost of losing the compact encoding for a 6-char string (stuff like the 'length' property name, stringified numbers from 100000-999999) is quite high (130% more); but I guess usability is quite important as well.
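
The contract of the proposed `v7_to_string` can be sketched roughly as follows. This is a mock, not V7 code: `struct demo_val` and `demo_to_string` are hypothetical stand-ins, the sketch copies every inlined string (not only 6-byte ones) into the caller's buffer, and it assumes that non-inlined strings are already null-terminated in storage.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a V7 value: inlined bytes or a heap pointer. */
struct demo_val {
  int inlined;      /* nonzero: the bytes live in small[] */
  size_t len;       /* string length in bytes */
  char small[6];    /* no room for a terminator when len == 6 */
  const char *heap; /* assumed null-terminated when !inlined */
};

/*
 * Mirrors the shape of the proposed API: returns a terminated string,
 * stores its length in *sizep, and uses buf only for inlined strings.
 */
static const char *demo_to_string(const struct demo_val *v, char buf[7],
                                  size_t *sizep) {
  *sizep = v->len;
  if (!v->inlined) return v->heap; /* long strings are returned without a copy */
  assert(v->len <= 6);
  memcpy(buf, v->small, v->len);
  buf[v->len] = '\0';
  return buf;
}
```

The caller always supplies a 7-byte buffer but cannot know in advance whether the returned pointer refers to it, so the buffer must stay alive for as long as the result is used.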


mkmik commented on June 16, 2024

At least, there could be a preprocessor option to add or omit the termination symbol, but I would not recommend going that way, because custom interfaces implemented and released by different developers may require different options, and it would not be possible to link them all together.

@kkutsner, with an API that always returns the length when getting a pointer to a string and always requires a length to be passed when creating JS strings, adding a null terminator byte in the underlying storage wouldn't break users, right?

I.e., if a module requires null termination, it can be enabled globally even if other parts of the code don't know about it. The only side effect would be the overall footprint.

Am I forgetting something?


kkutsner commented on June 16, 2024

@mmikulicic Thank you for the response, it's pretty detailed. I really have not spent much time looking into the "internals", but from your comments I see the reasons for the current design. Anyway, here are a few comments, just my quick thoughts:

  • "JS strings can contain 0 bytes in them." For this case, I suggest recognising when the passed string is empty and always returning a pointer to a global predefined constant. It may make no sense to store each empty string as a separate data chunk.
  • "Short strings are embedded in the NaN payload of IEEE 754 doubles." Sounds really cool. I need to look at it in detail.
  • "GC can relocate memory so the validity of these pointers is very short." This is expected. I am not talking about storing the retrieved string pointers between different calls of the same method, but it would be useful to guarantee that a string pointer stays valid until the method completes.
  • "...adding a null terminator byte in the underlying storage wouldn't break users" I'm pretty sure this is right. At least, I saw something similar in the Lua interpreter.


mkmik commented on June 16, 2024

JS strings can contain 0 bytes in them.

For this case, I suggest recognising when the passed string is empty and always returning a pointer to a global predefined constant. It may make no sense to store each empty string as a separate data chunk.

I meant that this is a valid JS string: `"hello\0world"`. Its length is 11.

Zero length strings are not allocated in data chunks as they fall into the same class as short strings and are embedded in JS values.
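
A small C sketch of why the `"hello\0world"` case matters: code that relies on null termination sees only the first 5 bytes, while a length-aware interface sees all 11. The names `js_bytes`, `js_len`, `c_string_len`, and `bytes_equal` are made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The JS string "hello\0world" as raw bytes; sizeof also counts C's own
 * implicit terminator, hence the -1. */
static const char js_bytes[] = "hello\0world";
static const size_t js_len = sizeof(js_bytes) - 1;

/* Length as seen by code that trusts null termination. */
static size_t c_string_len(void) { return strlen(js_bytes); }

/* Length-aware comparison that treats zero bytes as ordinary data. */
static int bytes_equal(const char *a, size_t alen, const char *b, size_t blen) {
  return alen == blen && memcmp(a, b, alen) == 0;
}
```

Here `js_len` is 11 while `c_string_len()` returns 5, so an API that hands out only a terminated pointer would silently truncate such strings.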

GC can relocate memory so the validity of these pointers is very short.

This is expected. I am not talking about storing the retrieved string pointers between different calls of the same method, but it would be useful to guarantee that a string pointer stays valid until the method completes.

The runtime guarantees that any string pointer is valid while executing user C code, between calls into the V7 runtime.

There are some V7 functions that are guaranteed to never cause memory allocation (e.g. v7_is_string) and will be documented accordingly. Others can cause allocations even if not expected, for example v7_get might invoke a getter function while reading a property.
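
The practical consequence is a "copy out before calling back in" pattern. The sketch below uses a toy relocating heap to show it; `toy_heap`, `toy_get_string`, `toy_compact`, and `copy_before_gc` are hypothetical names, and nothing here reflects V7's real allocator or API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for a relocating heap: 8 dead bytes, then the live string. */
static char toy_heap[32] = "garbage!hello";
static size_t toy_str_off = 8; /* where the live string currently starts */
static const size_t toy_str_len = 5;

/* Returns a pointer that is valid only until the next toy_compact() call. */
static const char *toy_get_string(size_t *len) {
  *len = toy_str_len;
  return toy_heap + toy_str_off;
}

/* Simulates a GC cycle: slides the live string to the start of the heap. */
static void toy_compact(void) {
  memmove(toy_heap, toy_heap + toy_str_off, toy_str_len);
  memset(toy_heap + toy_str_len, 0, sizeof(toy_heap) - toy_str_len);
  toy_str_off = 0;
}

/* Safe pattern: copy with an explicit length before anything that may allocate. */
static size_t copy_before_gc(char *dst, size_t cap) {
  size_t len;
  const char *p = toy_get_string(&len);
  if (cap == 0) return 0;
  if (len >= cap) len = cap - 1; /* truncate to fit, leaving room for '\0' */
  memcpy(dst, p, len);
  dst[len] = '\0';
  toy_compact(); /* p is stale from here on; dst is still good */
  return len;
}
```

After `toy_compact()` runs, the old pointer refers to bytes that no longer hold the string, which mirrors why a pointer obtained before a call that may allocate should not be reused after it.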
