Comments (18)
Agreed we should probably just state that strings and byte sequences are mutable.
Mutable strings are very annoying in programming languages, but I don't think they're much of a problem in specs. Maybe we should solicit wider opinions though, especially since I can't remember why they're so bad in programming languages.
from infra.
I guess one downside is that if we have an actual JavaScript string coming from a JavaScript program and we treat it as mutable, that is nonsensical.
from infra.
For statically compiled languages strings are immutable since if you include a given string in multiple source files, the linker will resolve them down to a single copy in the final binary. They also end up in the read-only data section of the executable. Makes sense to state strings and byte-sequences are mutable for JS.
from infra.
Strings are not mutable in JS. This is about what we do in standards with strings.
But that does bring up an interesting point, if JavaScript strings (as defined by Infra in due course) become mutable, does that mean IDL always needs to copy? It might be better if we match JavaScript after all...
from infra.
How is this web observable? The native string type used inside Blink and WebKit is immutable, so is the JS string. As long as specs never expose the mutability in some object identity way I'm not sure it matters what's in the spec, though it doesn't really match how many implementations work.
from infra.
Repeating some discussion Elliott and I had offline:
This is not about anything web observable really. It's about whether we write our specs as "lowercase x" or "set x to the result of lowercasing x". Most specs seem to do the former. The question at hand is whether we should explicitly state that spec-strings are mutable so things like that work, or if we should try to move the spec ecosystem away from it and toward the latter style.
Besides Elliott's point about Blink and WebKit using immutable string types, and how this means a mutable string type in specs makes spec <-> implementation translation harder, he reminded me why mutable strings in programming languages are scary. It's because they can result in spooky action at a distance. I.e. you could pass a string down through many algorithms and then one of them mutates it, and all the others are now affected. That's pretty bad.
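To make the "spooky action at a distance" concern concrete, here is a minimal sketch in Python. It uses `bytearray` as a stand-in for a mutable spec string, and the function names (`normalize_header`, `log_header`) are hypothetical, invented only for this illustration.

```python
def normalize_header(name: bytearray) -> None:
    # Hypothetical deep helper: lowercases its argument in place.
    name[:] = name.lower()


def log_header(name: bytearray) -> str:
    # The caller of log_header has no obvious reason to expect a mutation,
    # but the helper below changes the shared buffer.
    normalize_header(name)
    return name.decode("ascii")


header = bytearray(b"Content-Type")
logged = log_header(header)

# The original buffer was changed by a function two calls away.
print(header)
```

Every holder of `header` now sees the lowercased bytes, even though only the innermost helper asked for that.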
from infra.
Right, passing mutable references around should be done very carefully. There's a difference between that and having a mutable reference that's tightly scoped.
In any case, I don't have a strong opinion about whether we should allow the "lowercase x" thing. Either way, the "set y to be x" pattern has gotchas that people need to watch out for: lowercasing x may or may not cause y to also be lowercase, depending on how it's done. And if there is no aliasing, then you can't tell apart in-place lowercasing and copying lowercasing...
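The aliasing gotcha can be sketched as follows. This is only an illustration in Python, with `bytearray` standing in for a mutable spec string; whether "set y to be x" aliases or copies is exactly the open question, so both readings are shown.

```python
# In-place style: "lowercase x" mutates the one buffer that x names.
x = bytearray(b"ABC")
y_alias = x             # "set y to be x" read as aliasing
y_copy = bytearray(x)   # "set y to be x" read as copying
x[:] = x.lower()        # in-place lowercasing; y_alias changes too, y_copy does not

# Rebinding style: "set x to the result of lowercasing x".
x2 = bytearray(b"ABC")
y2 = x2
x2 = x2.lower()         # x2 now names a new buffer; y2 still names the old one
```

With no second name for the buffer, the two styles leave `x` in the same state, which is the "can't tell apart" case.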
from infra.
It's observable in the sense that it defies logic, depending on how IDL is defined. We've already established that certain objects pass through IDL so any JavaScript references to them can observe changes that happen in the specification algorithm for the IDL method or attribute, such as detaching an ArrayBuffer object.
Now, if I define a method that takes a DOMString x and whose algorithm says "ASCII lowercase x", then having mutable strings on the inside and immutable strings on the outside, without IDL copying the input, would result in some kind of logic error.
from infra.
I think there is a strong implication, which we should perhaps make explicit, that https://heycam.github.io/webidl/#es-to-DOMString and friends copy.
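A rough sketch of what "the conversion copies" could look like, under stated assumptions: `to_spec_string`, `ascii_lowercase_in_place`, and `my_method` are hypothetical names, a Python `str` plays the role of the immutable JavaScript string, and a list of characters plays the role of the mutable spec string. It splits by code point rather than by UTF-16 code unit, which is a simplification.

```python
def to_spec_string(js_string: str) -> list[str]:
    # Hypothetical boundary conversion: always produces a fresh, mutable copy.
    return list(js_string)


def ascii_lowercase_in_place(s: list[str]) -> None:
    # "ASCII lowercase x" written in the mutating style.
    for i, ch in enumerate(s):
        if "A" <= ch <= "Z":
            s[i] = chr(ord(ch) + 0x20)


def my_method(x: str) -> str:
    spec_x = to_spec_string(x)        # the copy happens here, at the boundary
    ascii_lowercase_in_place(spec_x)  # mutation is confined to the copy
    return "".join(spec_x)


caller_value = "HeLLo"
result = my_method(caller_value)
```

Because the copy is made once at the boundary, the caller's value is untouched no matter how the algorithm treats its own string.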
from infra.
@bzbarsky if that's acceptable that would certainly make things easier as we wouldn't have to change much (unless mutable strings are a problem waiting to happen), but it feels a bit like cheating.
That doesn't discount the potential for confusion of course, but we have embraced other subtle differences from JavaScript and as long as everything is defined in detail I'm okay with that.
from infra.
It would be great if everyone here could leave a short reply that is one of these:
- Mutable (aka please make copying at the IDL boundary explicit and be done with it)
- Immutable (aka please stick to the same constraints as programmers have to and don't encourage unnecessary copying)
That would unblock changes to Infra and IDL (which should start adopting the various types defined by Infra). Thanks!
from infra.
So for example, in https://infra.spec.whatwg.org/#collect-a-sequence-of-code-points, this:
"Append that code point to the end of /result/."
would need to be something like:
"Set /result/ to /result/ concatenated with that code point."
Is that right?
from infra.
That is what immutable would end up requiring yes. (I've since found lots of places in the URL Standard that assume mutable strings and basically treat strings like lists, with appending and prepending being available. So personally I'm leaning towards mutable, even though immutable does seem cleaner.)
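The two wordings can be compared side by side. The following is a sketch only, loosely modeled on the shape of "collect a sequence of code points"; the function names are made up, and returning the updated position as a second value is an assumption made here because Python has no by-reference position variable.

```python
from typing import Callable


def collect_append_style(
    input_: str, position: int, condition: Callable[[str], bool]
) -> tuple[str, int]:
    # "Append that code point to the end of result": result is treated as mutable.
    result: list[str] = []
    while position < len(input_) and condition(input_[position]):
        result.append(input_[position])
        position += 1
    return "".join(result), position


def collect_concat_style(
    input_: str, position: int, condition: Callable[[str], bool]
) -> tuple[str, int]:
    # "Set result to result concatenated with that code point": result is immutable
    # and is rebound to a new string on each step.
    result = ""
    while position < len(input_) and condition(input_[position]):
        result = result + input_[position]
        position += 1
    return result, position


def is_ascii_digit(c: str) -> bool:
    return "0" <= c <= "9"


a = collect_append_style("123abc", 0, is_ascii_digit)
b = collect_concat_style("123abc", 0, is_ascii_digit)
```

Both versions produce the same result; the difference is only in how the steps are worded.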
from infra.
OK. Immutable does seem cleaner in that the spec will more closely map to an implementation, which seems like it would be easier to reason about. OTOH I'm not aware of any cases where pretending strings are mutable in specs has caused bugs or problems.
from infra.
Feedback from smaug---- (intentionally not used @): "the reason mutable [in specs] is fine, IMO, is that it probably makes specs easier to read".
from infra.
We have IMO three options:
1. Mutable strings
2. Immutable strings
3. Something subtle where we say that within an algorithm strings are mutable, but when you pass them to another algorithm, a copy is made. (Or, when you receive them, implicitly at the top of your algorithm?) Thus changes in the original algorithm do not propagate to others.
In practice my guess is that (3) matches existing specs and people's intuitions. We'd have to hand-wave in Infra about "pass to another algorithm", perhaps defining that later if we want (e.g. as part of #92).
(1) might be simpler than (3) if we do a survey of specs and find that in fact strings are never updated after being handed off to other algorithms. I guess the main thing to look for would be "in parallel" algorithms operating on the same string? Or passing in e.g. keys from a map somewhere.
(1) can later be changed to (3) in a "non-breaking" fashion if we discover that it's a problem in practice.
If we do (3) we may want to add an explicit note about the parallel between this system and C++'s std::string + const std::string&. Not sure though.
(2)'s only downside is extra spec verbosity, but in a way that is familiar for programmers, so I am not sure it is that bad.
So in conclusion I am OK with any of these. I did want to point out (3) as an explicit option though.
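One way to picture option (3) is as an implicit copy at the top of every algorithm. The sketch below is an assumption-laden illustration, not a proposal for spec text: the `algorithm` decorator and `lowercase_and_return` are hypothetical, and `bytearray` again stands in for a mutable spec string.

```python
import functools


def algorithm(fn):
    """Hypothetical decorator: copy every bytearray argument on entry."""

    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        copied_args = [bytearray(a) if isinstance(a, bytearray) else a for a in args]
        copied_kwargs = {
            k: bytearray(v) if isinstance(v, bytearray) else v
            for k, v in kwargs.items()
        }
        return fn(*copied_args, **copied_kwargs)

    return wrapper


@algorithm
def lowercase_and_return(s: bytearray) -> bytearray:
    # Inside the algorithm the string is freely mutable.
    s[:] = s.lower()
    return s


outer = bytearray(b"ABC")
inner = lowercase_and_return(outer)
```

The callee mutates freely, yet `outer` keeps its original contents, so changes never propagate across an algorithm boundary.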
from infra.
I rather like (3), actually. It's basically the "mutable stringbuilder, immutable string" model, but with implicit coercions between them...
from infra.
I'd like to vote for @domenic's (3). However, the interaction with #139 is interesting, in that it's definitely right for a variable holding a Document or an object to be an alias, and it seems confusing for an algorithm parameter to be different from inlining the algorithm and storing that parameter as a variable.
As @annevk suggested in #139, maybe this is the difference between value and reference types, like how WebIDL says some types are "always passed by value". Strings would be value types, so variables and parameters holding them would copy, unless the specification explicitly says to make a reference ("Let path be a reference to foo.path"). Then I think we'd want to use Rust's rule that you can mutate a string as long as you have the only reference to it.
from infra.