Comments (5)
Currently there are 2 main reasons why V7 doesn't store null-terminated strings internally:
- JS strings can contain 0 bytes in them.
- Short strings are embedded in the NaN payload of IEEE 754 doubles. In particular, strings of length 6 leave no room for a terminator.
Furthermore, the GC can relocate memory, so these pointers remain valid only briefly, which often makes a copy necessary anyway (though not in your example and many similar uses).
That said, the point you make is valid, and we should think of a way to avoid copying strings, especially those of non-trivial (4-8 bytes) length.
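To make the "no room for a terminator" point concrete, here is a minimal sketch of packing a 6-byte string into a NaN-boxed 64-bit value. The tag value `0xFFFA`, the `val_t` typedef and the helper names are assumptions made for illustration; this is not V7's actual layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef uint64_t val_t; /* stand-in for V7's val_t (assumption) */

/* Hypothetical tag in the top 16 bits: sign bit set, exponent all ones,
 * non-zero mantissa bits, so the whole value reads as a NaN. */
#define TAG_MASK ((uint64_t) 0xFFFF << 48)
#define TAG_STR6 ((uint64_t) 0xFFFA << 48)

/* Pack exactly 6 bytes into the 48-bit payload below the tag. */
val_t pack_str6(const char *s) {
  uint64_t payload = 0;
  int i;
  assert(s != NULL);
  for (i = 0; i < 6; i++) {
    payload |= (uint64_t) (unsigned char) s[i] << (8 * i);
  }
  return TAG_STR6 | payload;
}

/* Copy the 6 bytes back out; returns 0 if v does not carry the tag. */
int unpack_str6(val_t v, char out[6]) {
  int i;
  if ((v & TAG_MASK) != TAG_STR6) return 0;
  for (i = 0; i < 6; i++) {
    out[i] = (char) ((v >> (8 * i)) & 0xFF);
  }
  return 1;
}
```

The 16 tag bits plus 6 payload bytes use all 64 bits, so in this sketch neither a terminator nor a length byte fits alongside a 6-byte string.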
from v7.
Clarification: We could include a trailing 0 byte in V7 strings, with a modest storage cost, but it would mean that we cannot fit short 6-byte strings in val_t values.
A possible solution would be this API:
char *v7_to_string(struct v7 *v7, val_t *v, char buf[7], size_t *sizep);
(and copy all 6-byte strings into the user-provided buf, null-terminate and return it)
but that's perhaps overkill just to enable us to encode 6-char-long strings in val_t values, given that we can do other things, such as interning strings, using a dictionary of common strings, etc.
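A rough sketch of how a function with the proposed signature might behave for the inline 6-byte case. Everything here is assumed for illustration: the name `sketch_to_string`, the helper `make_str6`, the tag value, and the choice to return NULL for anything that is not an inline 6-byte string (a real implementation would return a pointer into heap storage instead).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef uint64_t val_t; /* stand-in for V7's val_t (assumption) */
struct v7;              /* opaque, unused in this sketch */

#define TAG_MASK ((uint64_t) 0xFFFF << 48)
#define TAG_STR6 ((uint64_t) 0xFFFA << 48) /* hypothetical tag */

/* Helper for the sketch: build an inline value from 6 bytes. */
val_t make_str6(const char *s) {
  uint64_t payload = 0;
  int i;
  for (i = 0; i < 6; i++) {
    payload |= (uint64_t) (unsigned char) s[i] << (8 * i);
  }
  return TAG_STR6 | payload;
}

/* Mirrors the proposed signature: the caller supplies a 7-byte buffer,
 * the 6 inline bytes are copied into it and a terminator is appended. */
char *sketch_to_string(struct v7 *v7, val_t *v, char buf[7], size_t *sizep) {
  int i;
  (void) v7;
  if ((*v & TAG_MASK) != TAG_STR6) return NULL; /* heap strings not modelled */
  for (i = 0; i < 6; i++) {
    buf[i] = (char) ((*v >> (8 * i)) & 0xFF);
  }
  buf[6] = '\0';
  *sizep = 6;
  return buf;
}
```

The caller owns `buf`, so the returned pointer stays usable regardless of what the GC does, at the price of one small copy per call.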
The null termination per se is not likely to be a big cost, but the relative cost of losing the compact encoding for a 6-char string (stuff like the 'length' property name, stringified numbers from 100000-999999) is quite high (130% more); but I guess usability is quite important as well.
from v7.
At least, there could be a preprocessor option to add or omit the terminator, but I would not recommend going that way, because custom interfaces implemented by different developers may require different options, and it would then not be possible to link them all together.
@kkutsner, with an API that always returns the length when getting a pointer to a string and always requires a length to be passed when creating JS strings, adding a null terminator byte in the underlying storage wouldn't break users, right?
I.e., if a module requires null termination, it can be enabled globally even if other parts of the code don't know about it. The only side effect would be the overall footprint.
Am I forgetting something?
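A small sketch of that argument, using a hypothetical heap string type (`struct hstr` and its functions are invented for illustration and are not part of V7). Creation takes an explicit length, the getter reports the length, and the extra terminator byte lives only in the underlying storage.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical heap string: len bytes of data plus one hidden terminator. */
struct hstr {
  size_t len;
  char *data;
};

/* Create from pointer + explicit length; returns 0 on allocation failure. */
int hstr_create(struct hstr *s, const char *p, size_t len) {
  s->data = malloc(len + 1);
  if (s->data == NULL) return 0;
  memcpy(s->data, p, len);
  s->data[len] = '\0'; /* extra byte, never counted in len */
  s->len = len;
  return 1;
}

/* Getter always reports the length alongside the pointer. */
const char *hstr_get(const struct hstr *s, size_t *len) {
  *len = s->len;
  return s->data;
}

void hstr_free(struct hstr *s) {
  free(s->data);
  s->data = NULL;
  s->len = 0;
}
```

Callers that rely on the reported length see no difference, while callers that need a C string can pass the pointer on directly; the only visible change is one extra byte per stored string.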
from v7.
@mmikulicic Thank you for the response; it's pretty detailed. I haven't spent much time looking into the internals, but from your comments I can see the reasons for the current design. Anyway, here are a few comments, just my quick thoughts:
- "JS strings can contain 0 bytes in them." For this case, I can suggest recognising when the passed string is empty and always returning a pointer to some global predefined constant. It may make no sense to store each empty string as a separate data chunk.
- "Short strings are embedded in the NaN payload of ieee 754 doubles." Sounds really cool. I need to look at it in detail.
- "GC can relocate memory so the validity of these pointers is very short." This is expected. I am not talking about keeping the retrieved string pointers between different calls of the same method, but it would be useful to guarantee that a string pointer stays valid until the method completes.
- "...adding a null terminator byte in the underlying storage wouldn't break users" I'm pretty sure this is right. At least, I saw something similar in the Lua interpreter.
from v7.
> JS strings can contain 0 bytes in them.
> For this case, I can suggest recognising when the passed string is empty and always returning a pointer to some global predefined constant. It may make no sense to store each empty string as a separate data chunk.
I meant that this is a valid JS string: `"hello\0world"`. Its length is 11.
Zero-length strings are not allocated in data chunks, as they fall into the same class as short strings and are embedded in JS values.
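To illustrate why an embedded 0 byte matters on the C side, the snippet below compares the real byte count of that string with what a terminator-based function reports. The function names are made up for the example.

```c
#include <stddef.h>
#include <string.h>

/* The string from the comment above: 11 bytes with a zero in the middle.
 * The C literal adds its own terminator, so the array holds 12 bytes. */
static const char sample[] = "hello\0world";

/* Length as JS sees it: every byte counts, including the embedded zero. */
size_t sample_js_length(void) {
  return sizeof(sample) - 1; /* 11 */
}

/* Length as a terminator-based C function sees it: stops at the first zero. */
size_t sample_c_length(void) {
  return strlen(sample); /* 5 */
}
```

Any API that hands such a string to C code therefore has to return the length explicitly, whether or not a terminator is also stored.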
> GC can relocate memory so the validity of these pointers is very short.
> This is expected. I am not talking about keeping the retrieved string pointers between different calls of the same method, but it would be useful to guarantee that a string pointer stays valid until the method completes.
The runtime guarantees that any string pointer remains valid while user C code is executing, that is, between calls into the V7 runtime.
Some V7 functions are guaranteed never to cause memory allocation (e.g. v7_is_string) and will be documented accordingly. Others can cause allocations even when that is not expected; for example, v7_get might invoke a getter function while reading a property.
from v7.