Comments (18)

domenic commented on June 5, 2024

Agreed we should probably just state that strings and byte sequences are mutable.

Mutable strings are very annoying in programming languages, but I don't think they pose much of a problem in specs. Maybe we should solicit wider opinions, though, especially since I can't remember why they're so bad in programming languages.

from infra.

domenic commented on June 5, 2024

I guess one downside is that if we have an actual JavaScript string coming from a JavaScript program and we treat it as mutable, that is nonsensical.

adanilo commented on June 5, 2024

For statically compiled languages strings are immutable since if you include a given string in multiple source files, the linker will resolve them down to a single copy in the final binary. They also end up in the read-only data section of the executable. Makes sense to state strings and byte-sequences are mutable for JS.

annevk commented on June 5, 2024

Strings are not mutable in JS. This is about what we do in standards with strings.

But that does bring up an interesting point, if JavaScript strings (as defined by Infra in due course) become mutable, does that mean IDL always needs to copy? It might be better if we match JavaScript after all...

esprehn commented on June 5, 2024

How is this web observable? The native string type used inside Blink and WebKit is immutable, so is the JS string. As long as specs never expose the mutability in some object identity way I'm not sure it matters what's in the spec, though it doesn't really match how many implementations work.

@bzbarsky

domenic commented on June 5, 2024

Repeating some discussion Elliott and I had offline:

This is not about anything web observable really. It's about whether we write our specs as "lowercase x" or "set x to the result of lowercasing x". Most specs seem to do the former. The question at hand is whether we should explicitly state that spec-strings are mutable so things like that work, or if we should try to move the spec ecosystem away from it and toward the latter style.

Besides Elliott's point that Blink and WebKit use immutable string types, which means a mutable string type in specs makes spec <-> implementation translation harder, he reminded me why mutable strings in programming languages are scary. They can result in spooky action at a distance: you could pass a string down through many algorithms, and if one of them mutates it, all the others are affected. That's pretty bad.
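
As an illustration of the "spooky action at a distance" concern, here is a minimal Python sketch. It is not taken from any spec: `bytearray` is only a stand-in for a hypothetical mutable spec string (Python's own `str` is immutable), and all function names are made up for this example.

```python
# bytearray stands in for a hypothetical mutable spec string.

def ascii_lowercase_in_place(s: bytearray) -> None:
    """The "lowercase x" style: mutates the object the caller passed in."""
    s[:] = s.lower()


def ascii_lowercase_copy(s: bytearray) -> bytearray:
    """The "set x to the result of lowercasing x" style: returns a new value."""
    return s.lower()


def inner_algorithm(name: bytearray) -> None:
    # A nested step that happens to normalize its input in place.
    ascii_lowercase_in_place(name)


def outer_algorithm(name: bytearray) -> None:
    # Passes the same object further down without copying it.
    inner_algorithm(name)


header = bytearray(b"Content-Type")
outer_algorithm(header)
# The caller's value changed even though the caller never lowercased it.
print(header)  # bytearray(b'content-type')

safe = bytearray(b"Content-Type")
lowered = ascii_lowercase_copy(safe)
# With the copying style, the original is untouched.
print(safe, lowered)
```

With the copying style, every algorithm that needs the lowercased value has to ask for it explicitly, which is the extra verbosity being weighed in this thread.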

bzbarsky commented on June 5, 2024

Right, passing mutable references around should be done very carefully. There's a difference between that and having a mutable reference that's tightly scoped.

In any case, I don't have a strong opinion about whether we should allow the "lowercase x" thing. Either way, the "set y to be x" pattern has gotchas that people need to watch out for: lowercasing x may or may not cause y to also be lowercase, depending on how it's done. And if there is no aliasing, then you can't tell apart in-place lowercasing and copying lowercasing...
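
A small Python sketch of that gotcha, again using `bytearray` as an assumed stand-in for a mutable spec string (the variable names are illustrative only):

```python
x = bytearray(b"FOO")

y_alias = x             # "set y to x" read as: y refers to the same string
y_copy = bytearray(x)   # "set y to x" read as: y holds a copy of the value

x[:] = x.lower()        # in-place "lowercase x"

print(y_alias)  # bytearray(b'foo')  -> the alias followed the mutation
print(y_copy)   # bytearray(b'FOO')  -> the copy did not

# Without any alias, in-place and copying lowercasing cannot be told apart.
a = bytearray(b"BAR")
a[:] = a.lower()        # in place

b = bytearray(b"BAR")
b = b.lower()           # rebinding to a lowercased copy

print(a == b)  # True
```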

annevk commented on June 5, 2024

It's observable in the sense that it defies logic, depending on how IDL is defined. We've already established that certain objects pass through IDL so any JavaScript references to them can observe changes that happen in the specification algorithm for the IDL method or attribute, such as detaching an ArrayBuffer object.

Now, if I define a method that takes a DOMString x as "ASCII lowercase x", then combining mutable strings on the inside with immutable strings on the outside, without IDL copying the input, would result in some kind of logic error.

bzbarsky commented on June 5, 2024

I think there is a strong implication, which we should perhaps make explicit, that https://heycam.github.io/webidl/#es-to-DOMString and friends copy.
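
A rough Python sketch of what "the conversion copies" would mean in practice. This is not the Web IDL algorithm: `to_spec_string` and `some_method` are hypothetical names, and a UTF-8 `bytearray` is only an assumed stand-in for a mutable spec string.

```python
def to_spec_string(js_string: str) -> bytearray:
    """Hypothetical stand-in for the IDL conversion step: always makes a copy."""
    return bytearray(js_string, "utf-8")


def some_method(x: str) -> str:
    """A method whose prose says "ASCII lowercase x"."""
    spec_x = to_spec_string(x)   # copy made at the IDL boundary
    spec_x[:] = spec_x.lower()   # mutation is confined to the copy
    return spec_x.decode("utf-8")


caller_value = "HeLLo"
result = some_method(caller_value)

print(caller_value)  # HeLLo  -> the caller's string is unchanged
print(result)        # hello
```

Because the mutation only ever touches the copy, the immutable string on the outside and the mutable string on the inside never have to agree.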

annevk commented on June 5, 2024

@bzbarsky if that's acceptable that would certainly make things easier as we wouldn't have to change much (unless mutable strings are a problem waiting to happen), but it feels a bit like cheating.

That doesn't discount the potential for confusion of course, but we have embraced other subtle differences from JavaScript and as long as everything is defined in detail I'm okay with that.

annevk commented on June 5, 2024

It would be great if everyone here could leave a short reply that is one of these:

  • Mutable (aka please make copying at the IDL boundary explicit and be done with it)
  • Immutable (aka please stick to the same constraints as programmers have to and don't encourage unnecessary copying)

That would unblock changes to Infra and IDL (which should start adopting the various types defined by Infra). Thanks!

zcorpan commented on June 5, 2024

So for example in https://infra.spec.whatwg.org/#collect-a-sequence-of-code-points

this

Append that code point to the end of /result/.

would need to be something like

Set /result/ to /result/ concatenated with that code point.

?
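
The two phrasings can be compared in a short Python sketch. It is a loose approximation of "collect a sequence of code points", not the Infra text: the function names are made up, and the position variable is returned alongside the result rather than updated by reference.

```python
from typing import Callable, List, Tuple


def collect_with_append(
    inp: str, position: int, condition: Callable[[str], bool]
) -> Tuple[str, int]:
    """Mutable style: "Append that code point to the end of result"."""
    result: List[str] = []  # a list of code points acts as the mutable string
    while position < len(inp) and condition(inp[position]):
        result.append(inp[position])
        position += 1
    return "".join(result), position


def collect_with_concatenation(
    inp: str, position: int, condition: Callable[[str], bool]
) -> Tuple[str, int]:
    """Immutable style: "Set result to result concatenated with that code point"."""
    result = ""
    while position < len(inp) and condition(inp[position]):
        result = result + inp[position]  # builds a new string each time
        position += 1
    return result, position


print(collect_with_append("123abc", 0, str.isdigit))         # ('123', 3)
print(collect_with_concatenation("123abc", 0, str.isdigit))  # ('123', 3)
```

Both versions produce the same result; the difference is only in how the steps are worded.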

annevk commented on June 5, 2024

That is what immutable would end up requiring yes. (I've since found lots of places in the URL Standard that assume mutable strings and basically treat strings like lists, with appending and prepending being available. So personally I'm leaning towards mutable, even though immutable does seem cleaner.)

zcorpan commented on June 5, 2024

OK. Immutable does seem cleaner in that the spec will more closely map to an implementation, which seems like it would be easier to reason about. OTOH I'm not aware of any cases where pretending strings are mutable in specs has caused bugs or problems.

annevk commented on June 5, 2024

Feedback from smaug---- (intentionally not used @): "the reason mutable [in specs] is fine, IMO, is that it probably makes specs easier to read".

domenic commented on June 5, 2024

We have IMO three options:

  1. Mutable strings
  2. Immutable strings
  3. Something subtle where we say that within an algorithm strings are mutable, but when you pass them to another algorithm, a copy is made. (Or, when you receive them, implicitly at the top of your algorithm?) Thus changes in the original algorithm do not propagate to others.

In practice my guess is that (3) matches existing specs and people's intuitions. We'd have to hand-wave in Infra about "pass to another algorithm", perhaps defining that later if we want (e.g. as part of #92).

(1) might be simpler than (3) if we do a survey of specs and find that in fact strings are never updated after being handed off to other algorithms. I guess the main thing to look for would be "in parallel" algorithms operating on the same string? Or passing in e.g. keys from a map somewhere.

(1) can later be changed to (3) in a "non-breaking" fashion if we discover that it's a problem in practice.

If we do (3) we may want to add an explicit note about the parallel between this system and C++'s std::string + const std::string&. Not sure though.

(2)'s only downside is extra spec verbosity, but in a way that is familiar for programmers, so I am not sure it is that bad.

So in conclusion I am OK with any of these. I did want to point out (3) as an explicit option though.
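
One way to picture option (3) is a Python sketch in which every algorithm copies its string arguments on entry. The `spec_algorithm` decorator and `normalize` function are hypothetical, and `bytearray` is again only an assumed stand-in for a mutable spec string.

```python
import functools


def spec_algorithm(fn):
    """Hypothetical decorator: copy every bytearray argument at the call boundary."""

    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        copied_args = [bytearray(a) if isinstance(a, bytearray) else a for a in args]
        copied_kwargs = {
            k: bytearray(v) if isinstance(v, bytearray) else v
            for k, v in kwargs.items()
        }
        return fn(*copied_args, **copied_kwargs)

    return wrapper


@spec_algorithm
def normalize(s: bytearray) -> bytearray:
    # Inside the algorithm the string is freely mutable...
    s[:] = s.lower()
    s.extend(b"!")
    return s


original = bytearray(b"ABC")
out = normalize(original)

# ...but the caller's string is unaffected, because a copy was made on entry.
print(original)  # bytearray(b'ABC')
print(out)       # bytearray(b'abc!')
```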

bzbarsky commented on June 5, 2024

I rather like (3), actually. It's basically the "mutable stringbuilder, immutable string" model, but with implicit coercions between them...
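
For readers unfamiliar with that model, a minimal Python sketch: `io.StringIO` plays the mutable builder and `str` the immutable string, with an explicit `getvalue()` call where option (3) would coerce implicitly. The URL pieces are made-up example data.

```python
import io

builder = io.StringIO()            # mutable "stringbuilder"
for part in ("https", "://", "example.test"):
    builder.write(part)            # appending mutates the builder

frozen = builder.getvalue()        # immutable str snapshot handed to others

builder.write("/path")             # later mutation of the builder...

print(frozen)                      # https://example.test  (...does not affect the snapshot)
print(builder.getvalue())          # https://example.test/path
```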

jyasskin commented on June 5, 2024

I'd like to vote for @domenic's (3). However, the interaction with #139 is interesting, in that it's definitely right for a variable holding a Document or an object to be an alias, and it seems confusing for an algorithm parameter to be different from inlining the algorithm and storing that parameter as a variable.

As @annevk suggested in #139, maybe this is the difference between value and reference types, like how WebIDL says some types are "always passed by value". Strings would be value types, so variables and parameters holding them would copy, unless the specification explicitly says to make a reference ("Let path be a reference to foo.path"). Then I think we'd want to use Rust's rule that you can mutate a string as long as you have the only reference to it.
