After the writeup in #39 for @marcelklehr, and a bit of completely unrelated stewing on things, I figured out a way to remove SingleNodes entirely from the codebase, the model, and the serialization standard.
Maps
This was the big reason to have SingleNodes in the first place; they were just reused in Lists for convenience. But we can take a better approach, analogous to BigTable's use of consecutive, immutable SSTables (which fit the CTree immutability model quite nicely).
A map will be defined by its set of keys (in the representation sense). So if I want to create a map representing { "hello" : "world", "goodbye" : "strudel" }, I will start with the following instruction:
[2, [...], 8, ["hello","goodbye"]]
This object now has three childsets open:
- Childset 0 : Competing values for key "hello".
- Childset 1 : Competing values for key "goodbye".
- Childset 2 : Extension merge. Only accepts maps.
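As a rough illustration of the layout above, here is a minimal sketch that derives the childset roles from a map's key array. The helper name `childset_roles` is hypothetical and exists only for this example; it is not part of the codebase or the serialization standard.

```python
def childset_roles(keys):
    """Describe each childset opened by a map definition with the given keys.

    Hypothetical helper for illustration only.
    """
    roles = {}
    # Childsets 0..n-1 follow the order of the keys in the definition.
    for position, key in enumerate(keys):
        roles[position] = f'Competing values for key "{key}"'
    # One extra childset after the last key accepts extension maps only.
    roles[len(keys)] = "Extension merge"
    return roles


for position, role in childset_roles(["hello", "goodbye"]).items():
    print(position, role)
```

A map with n keys therefore always opens n + 1 childsets, with the extension childset at index n.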
Note that the order of the childset-to-key associations is defined by the order of the keys in the map definition instruction (this array is the map's immutable value). We can now insert into these childsets in the traditional way, using integer positions and the node key of the actual value. For example:
[1, [..., 8, "{\\"hello\\,893893892"], 1, "strudel"]
This inserts the string "strudel" into the map at position 1, thus setting the value for representation key "goodbye". These childsets have competition semantics: only the object with the "winning" key will be used.
Childsets with no children, or at deleted positions, do not contribute a key presence to the flattened representation at all. In the above example, we have so far created a map, and set the value for "goodbye", even though we have an unused space for setting the value for "hello". The flattened rep, currently, would be { "goodbye": "strudel"}. If position 1 were marked deleted, the rep would become {}.
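The behaviour described so far (competition within a childset, and no key presence for empty or deleted positions) could be sketched roughly as follows. `MapSketch` and its methods are hypothetical names, values are plain strings instead of real nodes, and the rule that the highest-sorting node key wins is an assumption made for the example.

```python
class MapSketch:
    """Toy map model: keys are fixed at creation, values compete per position."""

    def __init__(self, keys):
        self.keys = list(keys)
        # One childset per key, mapping node key -> value.
        self.children = [dict() for _ in self.keys]
        self.deletions = [False] * len(self.keys)

    def insert(self, position, node_key, value):
        self.children[position][node_key] = value

    def delete(self, position):
        self.deletions[position] = True

    def flatten(self):
        rep = {}
        for i, key in enumerate(self.keys):
            if self.deletions[i] or not self.children[i]:
                continue  # empty or deleted: contributes no key presence
            # ASSUMPTION: the highest-sorting node key is the winner.
            winner = max(self.children[i])
            rep[key] = self.children[i][winner]
        return rep


m = MapSketch(["hello", "goodbye"])
m.insert(1, "k1", "strudel")
print(m.flatten())  # {'goodbye': 'strudel'}
m.delete(1)
print(m.flatten())  # {}
```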
Map extensions work by ordered-replace semantics. All the maps in the extension childset are ordered from least- to most-successful (sorted by keys). For each one, the flat rep is computed, and used to update the parent map's flat-rep-in-progress. This is similar to a plain old dict.update(), except that keys that are explicitly non-present in the child map (whether from deletion, or an empty childset) are removed from the parent rep-in-progress. Thus, a key may be removed and set again multiple times if there is heavy competition in the extension childset.
In pseudocode, this can be expressed pretty simply:
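    def flatten(map_node):
        temp = {}
        for i in range(len(map_node)):
            # A key is present only if its position is not deleted and has a value.
            if (not map_node.deletions[i]) and len(map_node.children[i]):
                # The childset's flatten() is assumed to resolve the winning value.
                temp[map_node.keys[i]] = map_node.children[i].flatten()
        # The childset after the last key holds the extension maps,
        # ordered from least- to most-successful.
        for extension in map_node.children[len(map_node)]:
            temp.update(flatten(extension))
            # deleted_keys: keys explicitly non-present in the extension
            # (deleted position or empty childset).
            for k in extension.deleted_keys:
                temp.pop(k, None)  # the key may never have been set
        return temp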
In the real version, of course, this function would be a method of the Map Node class, and for performance, you would collect child nodes in temp first (and flatten them as a final step), to avoid flattening things you won't actually use in the final result.
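One possible shape for that deferred version is sketched below. It collects the winning child node per key first, using `None` as a marker for keys that are explicitly non-present, and only flattens the survivors at the end. The class names `Leaf` and `MapNode`, the `collect` method, the `None` marker, and the "highest node key wins" ordering are all assumptions made for this sketch, not the actual implementation.

```python
class Leaf:
    """Stand-in for a value node; counts flatten calls to show the deferral."""
    calls = 0

    def __init__(self, value):
        self.value = value

    def flatten(self):
        Leaf.calls += 1
        return self.value


class MapNode:
    def __init__(self, keys):
        self.keys = list(keys)
        # children[i] for i < len(keys): node key -> Leaf.
        # children[len(keys)]: node key -> MapNode (extension childset).
        self.children = [dict() for _ in range(len(self.keys) + 1)]
        self.deletions = [False] * len(self.keys)

    def collect(self):
        """Map each key to its winning child node, or None if explicitly absent."""
        temp = {}
        for i, key in enumerate(self.keys):
            if self.deletions[i] or not self.children[i]:
                temp[key] = None  # explicitly non-present
            else:
                # ASSUMPTION: the highest-sorting node key wins.
                temp[key] = self.children[i][max(self.children[i])]
        extensions = self.children[len(self.keys)]
        for node_key in sorted(extensions):  # least- to most-successful
            # A None entry from the extension removes the key from the result.
            temp.update(extensions[node_key].collect())
        return temp

    def flatten(self):
        # Flatten only the nodes that survived the merge.
        return {k: n.flatten() for k, n in self.collect().items() if n is not None}


base = MapNode(["hello", "goodbye"])
base.children[0]["k1"] = Leaf("world")
base.children[1]["k1"] = Leaf("strudel")

ext = MapNode(["hello", "goodbye"])
ext.children[0]["k1"] = Leaf("there")  # replaces "hello"
# "goodbye" stays empty in ext, so it is removed from the result.
base.children[2]["k2"] = ext

print(base.flatten())  # {'hello': 'there'}
print(Leaf.calls)      # 1
```

Only one leaf is flattened here, even though three were inserted: the values that lose to the extension, or that are removed by it, are never touched.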
Lists
Lists never "needed" SingleNodes in the first place. It's not that much code to have them enforce node key constraints themselves, and you might as well, since that's the only place you'd be using it anymore. Because a list can be extended from any position, and any element in a list can be marked deleted, all SingleNode-based replacement ever did was violate the SPOT (single point of truth) rule by defining multiple ways to replace a value beyond delete-and-insert.
Not much to say about this, really. Position childsets would simply hold the direct value. Unset positions would not be present in the flat rep.