New Lifetimes
After a bit of thought, I've come to the following design for the Lifetime
declaration.
For Visitor Methods: StringLifetime
- Static: The value lives for the lifetime of the program.
- Stack: The value lives on the stack.
- Heap: The value lives on the heap.
- Owned: The value lives on the stack or heap and is managed by some entity.
For Access Methods: ValueLifetime
- Static: The value lives for the lifetime of the program.
- Heap: The value lives on the heap.
- Owned: The value lives on the stack or heap and is managed by some entity.
Visitor Methods
Usage
- Stack values must be copied.
- Static and Heap values can be copied or used directly.
- Owned values must be copied unless the visitor knows that the value's lifetime is safe, in which case it may be used directly.
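The usage rules above can be sketched as a small lookup. This is only an illustration: the enum is a guess at how the proposed StringLifetime might be declared, and Policy and usagePolicy are hypothetical names that are not part of Getty.

```zig
const std = @import("std");

// Assumed shape of the proposed lifetime declaration for visitor methods.
const StringLifetime = enum { Static, Stack, Heap, Owned };

// Hypothetical: what a visitor is allowed to do with a string it receives.
const Policy = enum { must_copy, copy_or_use, copy_unless_known_safe };

// Maps each lifetime to the usage rule listed above.
fn usagePolicy(lifetime: StringLifetime) Policy {
    return switch (lifetime) {
        .Stack => Policy.must_copy,
        .Static, .Heap => Policy.copy_or_use,
        .Owned => Policy.copy_unless_known_safe,
    };
}
```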
Deallocation
- Static and Stack values never need to be deallocated.
- In the visitor, Heap values never have to be deallocated. In the deserializer, Heap values must be deallocated if the visitor didn't use them.
- In the visitor, Owned values never have to be deallocated. In the deserializer, Owned values can:
  - Be leaked if the visitor used it.
  - Be deallocated immediately if the visitor did not use it and the deserializer manages it.
  - Be leaked if the visitor did not use it and either the deserializer manages it but wants to defer cleanup until later or some other entity manages the value.
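As a rough sketch of the deserializer side of these rules, the decision could be written as one function. Cleanup and deserializerCleanup are hypothetical names, and the Owned case picks "free immediately" where the rules also allow deferring the cleanup.

```zig
const std = @import("std");

// Assumed shape of the proposed lifetime declaration for visitor methods.
const StringLifetime = enum { Static, Stack, Heap, Owned };

// Hypothetical: what the deserializer does after the visitor returns.
const Cleanup = enum { none, free_now, leave_for_owner };

fn deserializerCleanup(
    lifetime: StringLifetime,
    used_by_visitor: bool,
    deserializer_manages: bool,
) Cleanup {
    return switch (lifetime) {
        // Static and Stack values never need to be deallocated.
        .Static, .Stack => Cleanup.none,
        // Heap values are freed only if the visitor did not use them.
        .Heap => if (used_by_visitor) Cleanup.none else Cleanup.free_now,
        // Owned values are freed here only when unused and managed by the
        // deserializer; deferring the cleanup would be equally valid.
        .Owned => if (!used_by_visitor and deserializer_manages)
            Cleanup.free_now
        else
            Cleanup.leave_for_owner,
    };
}
```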
Access Methods
Usage
Same as "Visitor Methods - Usage".
Deallocation
Same as "Visitor Methods - Deallocation" except that in the visitor, Heap values must be deallocated upon error or if the value is not returned.
Edit: Static messes up getty.de.free, so I'm removing it from the proposal. How often does one return a static string directly for deserialization anyways...
from getty.
New Indices
One problem with the current solution for the deserializer portion of the issue is that it only enables visitors to use a single, contiguous slice from visitString's input. If the visitor wants to use non-contiguous portions of input as part of its final value, there needs to be a way for the visitor to specify the starting and ending positions of each slice it uses.
To fix this, we can modify the return type of visitString to become the following:
// The starting and ending positions of a string.
pub const Range = struct {
    start: usize,
    end: usize,
};

// Indicates to the deserializer which parts of `input` were used directly by
// the visitor.
//
// If a visitor uses the entirety of `input` as part of its final value, then the
// `Single` variant can be used, which doesn't require any extra allocations.
// Otherwise, the visitor will need to allocate a slice for `Multiple`, which the
// deserializer will clean up afterwards.
pub const Used = union(enum) {
    Single: Range,
    Multiple: []const Range,
};

// The new return type of `visitString`.
//
// If `used` is `null`, then the visitor did not use `input` as part of its final
// return value.
pub fn Return(comptime T: type) type {
    return struct {
        value: T,
        used: ?Used = null,
    };
}
For example, suppose a user wants to deserialize "John Doe" into the type struct { first: []u8, last: []u8 }, where first is John and last is Doe. As it is now, they'd be forced to either make a copy of both John and Doe, or use one of the names directly and copy the other. However, with this new return type, they can assign input[0..4] to first and input[5..] to last, allocate a slice for the used field in the return value, populate it appropriately, and they're done!
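Here is a minimal sketch of what such a visitor body might look like under this proposal. Range, Used, and Return are copied from the definitions above; Name and visitName are hypothetical and do not match Getty's real visitor signature.

```zig
const std = @import("std");

pub const Range = struct { start: usize, end: usize };
pub const Used = union(enum) { Single: Range, Multiple: []const Range };
pub fn Return(comptime T: type) type {
    return struct { value: T, used: ?Used = null };
}

const Name = struct { first: []u8, last: []u8 };

// Hypothetical visitor body: both fields point into `input`, and the two
// ranges that were used are reported back through `used`.
fn visitName(ally: std.mem.Allocator, input: []u8) !Return(Name) {
    const space = std.mem.indexOfScalar(u8, input, ' ') orelse return error.InvalidName;

    // The visitor allocates the range list; per the proposal, the
    // deserializer would be the one to clean it up afterwards.
    const ranges = try ally.alloc(Range, 2);
    ranges[0] = Range{ .start = 0, .end = space };
    ranges[1] = Range{ .start = space + 1, .end = input.len };

    return Return(Name){
        .value = Name{ .first = input[0..space], .last = input[space + 1 ..] },
        .used = Used{ .Multiple = ranges },
    };
}
```

Neither name is copied here; the only allocation is the two-element range list.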
As a sort of sanity check (and to give me a bit of motivation to work on this), I did some very simple deserialization benchmarking with some JSON data that had lots of strings. I deserialized the input data into a struct 100,000 times with both Getty JSON and std.json, and as expected Getty JSON was much slower.
const std = @import("std");
const json = @import("json");

const c_ally = std.heap.c_allocator;

const T = []struct {
    id: []const u8,
    type: []const u8,
    name: []const u8,
    ppu: f64,
    batters: struct {
        batter: []struct {
            id: []const u8,
            type: []const u8,
        },
    },
    topping: []struct {
        id: []const u8,
        type: []const u8,
    },
};

const input =
\\[
\\ {
\\ "id": "0001",
\\ "type": "donut",
\\ "name": "Cake",
\\ "ppu": 0.55,
\\ "batters": {
\\ "batter": [
\\ {
\\ "id": "1001",
\\ "type": "Regular"
\\ },
\\ {
\\ "id": "1002",
\\ "type": "Chocolate"
\\ },
\\ {
\\ "id": "1003",
\\ "type": "Blueberry"
\\ },
\\ {
\\ "id": "1004",
\\ "type": "Devil's Food"
\\ }
\\ ]
\\ },
\\ "topping": [
\\ {
\\ "id": "5001",
\\ "type": "None"
\\ },
\\ {
\\ "id": "5002",
\\ "type": "Glazed"
\\ },
\\ {
\\ "id": "5005",
\\ "type": "Sugar"
\\ },
\\ {
\\ "id": "5007",
\\ "type": "Powdered Sugar"
\\ },
\\ {
\\ "id": "5006",
\\ "type": "Chocolate with Sprinkles"
\\ },
\\ {
\\ "id": "5003",
\\ "type": "Chocolate"
\\ },
\\ {
\\ "id": "5004",
\\ "type": "Maple"
\\ }
\\ ]
\\ },
\\ {
\\ "id": "0002",
\\ "type": "donut",
\\ "name": "Raised",
\\ "ppu": 0.55,
\\ "batters": {
\\ "batter": [
\\ {
\\ "id": "1001",
\\ "type": "Regular"
\\ }
\\ ]
\\ },
\\ "topping": [
\\ {
\\ "id": "5001",
\\ "type": "None"
\\ },
\\ {
\\ "id": "5002",
\\ "type": "Glazed"
\\ },
\\ {
\\ "id": "5005",
\\ "type": "Sugar"
\\ },
\\ {
\\ "id": "5003",
\\ "type": "Chocolate"
\\ },
\\ {
\\ "id": "5004",
\\ "type": "Maple"
\\ }
\\ ]
\\ },
\\ {
\\ "id": "0003",
\\ "type": "donut",
\\ "name": "Old Fashioned",
\\ "ppu": 0.55,
\\ "batters": {
\\ "batter": [
\\ {
\\ "id": "1001",
\\ "type": "Regular"
\\ },
\\ {
\\ "id": "1002",
\\ "type": "Chocolate"
\\ }
\\ ]
\\ },
\\ "topping": [
\\ {
\\ "id": "5001",
\\ "type": "None"
\\ },
\\ {
\\ "id": "5002",
\\ "type": "Glazed"
\\ },
\\ {
\\ "id": "5003",
\\ "type": "Chocolate"
\\ },
\\ {
\\ "id": "5004",
\\ "type": "Maple"
\\ }
\\ ]
\\ }
\\]
;

// Baseline: std.json, which wraps its allocations in an arena internally.
fn stdJson() !void {
    for (0..100_000) |_| {
        const output = try std.json.parseFromSlice(T, c_ally, input, .{});
        defer output.deinit();
    }
}

// Getty JSON, freeing each value individually.
fn gettyJson() !void {
    for (0..100_000) |_| {
        const output = try json.fromSlice(c_ally, T, input);
        defer json.de.free(c_ally, output, null);
    }
}

// Getty JSON, with a fresh arena per iteration.
fn gettyJsonArena() !void {
    for (0..100_000) |_| {
        var arena = std.heap.ArenaAllocator.init(c_ally);
        const arena_ally = arena.allocator();
        defer arena.deinit();

        _ = try json.fromSlice(arena_ally, T, input);
    }
}

// Uncomment exactly one call per build to produce each benchmark binary.
pub fn main() !void {
    //try gettyJson();
    //try gettyJsonArena();
    //try stdJson();
}
$ hyperfine --warmup 5 ./getty ./getty-arena ./std
Benchmark 1: ./getty
Time (mean ± σ): 718.4 ms ± 3.7 ms [User: 713.4 ms, System: 3.5 ms]
Range (min … max): 715.6 ms … 727.9 ms 10 runs
Benchmark 2: ./getty-arena
Time (mean ± σ): 628.7 ms ± 1.5 ms [User: 622.5 ms, System: 4.6 ms]
Range (min … max): 626.6 ms … 631.3 ms 10 runs
Benchmark 3: ./std
Time (mean ± σ): 482.7 ms ± 1.5 ms [User: 476.2 ms, System: 4.9 ms]
Range (min … max): 481.2 ms … 486.4 ms 10 runs
Summary
./std ran
1.30 ± 0.01 times faster than ./getty-arena
1.49 ± 0.01 times faster than ./getty
The unnecessary allocations are, surely, the main factor for Getty's slowness. However, I should note that std.json.parseFromSlice uses an arena allocator implicitly whereas json.fromSlice does not, and you can see from the results that using an arena does indeed make a difference.
With or without an arena, though, Getty's still much slower. So it's time to get started on this issue!
Removed accepted label for now since, as pointed out by fredi, there are major issues with implementing this kind of thing, which I've unfortunately had the chance to run into in my own branch.
For one thing, the multiple ranges idea was a total non-starter. I think my brain farted when I came up with that. The input for visitString is always a slice, which is a single allocation. Ya can't just free individual pieces of a single allocation. So now we have no way of letting visitors use only parts of input.
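To make the problem concrete, here is a small sketch using std.mem.Allocator. The function name is hypothetical; the point is that only the slice originally returned by the allocator may be passed to free.

```zig
const std = @import("std");

// Hypothetical helper showing why sub-slices cannot be freed separately.
fn splitThenFree(ally: std.mem.Allocator) !usize {
    // One allocation holds the whole input.
    const input = try ally.dupe(u8, "John Doe");

    // These are views into that allocation, not allocations of their own.
    const first = input[0..4];
    const last = input[5..];
    const kept = first.len + last.len;

    // Passing `first` or `last` to `ally.free` would be invalid, because the
    // allocator only knows about `input`. The one valid call releases all of
    // it, including the unused space character.
    ally.free(input);
    return kept;
}
```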
Another issue is getty.de.free. How will it be able to know if the strings of a value, especially if they have different lifetimes, should be freed or not?
Before I could implement the lifetime optimizations, I had to do a bit of general allocation work. The merged PR above implements that work.
In summary, Getty now uses an arena internally for all allocations. This simplifies visitors and deserialization blocks as they no longer have to worry about freeing values and allows end users to free everything whenever they want. Additionally, the arena is passed to the methods of Deserializer implementations so they're simplified a tad as well.
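The general pattern looks roughly like the sketch below. It uses std.heap.ArenaAllocator directly and is not Getty's actual API: Person, deserializePerson, and nameLength are hypothetical stand-ins.

```zig
const std = @import("std");

const Person = struct { first: []u8, last: []u8 };

// Hypothetical stand-in for a visitor: every string is copied into `ally`
// and nothing is freed individually, not even on the error path.
fn deserializePerson(ally: std.mem.Allocator, input: []const u8) !Person {
    const space = std.mem.indexOfScalar(u8, input, ' ') orelse return error.InvalidInput;
    return Person{
        .first = try ally.dupe(u8, input[0..space]),
        .last = try ally.dupe(u8, input[space + 1 ..]),
    };
}

// The caller owns the arena and releases everything with one `deinit`.
fn nameLength(backing: std.mem.Allocator, input: []const u8) !usize {
    var arena = std.heap.ArenaAllocator.init(backing);
    defer arena.deinit();

    const person = try deserializePerson(arena.allocator(), input);
    return person.first.len + person.last.len;
}
```

Because the arena frees everything at once, the stand-in visitor needs no errdefer or per-field cleanup.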
Big thanks to fredi for discussing all this with me and steering me in the right direction :D
The last half consists of the lifetime work, which consists of two parts:
- Add StringLifetime and ValueLifetime types.
- Update visitString to return not only the produced value but also an indicator of whether or not the input string was used directly in the final, returned value.
Lifetimes
The lifetime types will be more or less the same as what I've already proposed:
- For visitString, StringLifetime will have the following variants:
  - Stack: The value is on the stack and its lifetime is shorter than the deserialization process.
    - These values must be copied.
  - Heap: The value is on the heap and its lifetime is tied to the arena allocator returned to the end user.
    - These values can be copied or used directly (Getty's default behavior will be to use them directly).
    - If there's an error or the value is copied, do not free the value in the visitor. Either the deserializer will free it or it will be freed by the arena at the end.
  - Managed: The value is on the stack or heap and its lifetime is managed by an entity that is not the arena allocator returned to the end user.
    - These values should be copied, but can be used directly if the user knows that the value's lifetime is safe or acceptable.
    - If there's an error or the value is copied, do not free the value in the visitor. The managing entity will free it later on.
- For access methods (e.g., nextKeySeed), ValueLifetime will have the following variants:
  - Heap: The value is on the heap and its lifetime is tied to the arena allocator returned to the end user.
    - These values can be copied or used directly (Getty's default behavior will be to use them directly).
    - If there's an error or the value is copied, free the value in the visitor if you want. It's part of the arena returned to the user so you could also just leak it and it'll be cleaned up at the end.
  - Managed: The value is on the stack or heap and its lifetime is managed by an entity that is not the arena allocator returned to the end user.
    - These values should be copied, but can be used directly if the user knows that the value's lifetime is safe or acceptable.
    - If there's an error or the value is copied, do not free the value in the visitor. The managing entity will free it later on.
visitString's Return Type
Before, I proposed returning from visitString the produced value and a slice indicating what part of the input parameter was used.
With the arena, the slice is no longer necessary. A simple bool will suffice, where true means that input was used directly.
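A sketch of how the simplified return type and the revised StringLifetime might fit together follows. The declarations are guesses at the eventual shape, and this visitString is a hypothetical free function rather than Getty's visitor method. It always copies Managed strings, which is the conservative reading of "should be copied".

```zig
const std = @import("std");

// Assumed declaration of the revised lifetime variants.
const StringLifetime = enum { Stack, Heap, Managed };

// Assumed shape of the new return type: `used` is true when `input`
// itself ended up in the returned value.
fn Return(comptime T: type) type {
    return struct { value: T, used: bool = false };
}

// Hypothetical visitor: use Heap strings directly, copy everything else
// into the arena.
fn visitString(arena: std.mem.Allocator, input: []u8, lifetime: StringLifetime) !Return([]u8) {
    const R = Return([]u8);
    return switch (lifetime) {
        .Heap => R{ .value = input, .used = true },
        .Stack, .Managed => R{ .value = try arena.dupe(u8, input) },
    };
}
```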
Performance updates after arena changes (using same benchmarking code):
$ hyperfine --warmup 5 ./getty ./std
Benchmark 1: ./getty
Time (mean ± σ): 688.8 ms ± 1.9 ms [User: 684.7 ms, System: 3.0 ms]
Range (min … max): 685.8 ms … 692.3 ms 10 runs
Benchmark 2: ./std
Time (mean ± σ): 484.0 ms ± 2.7 ms [User: 480.7 ms, System: 2.2 ms]
Range (min … max): 481.3 ms … 489.1 ms 10 runs
Summary
./std ran
1.42 ± 0.01 times faster than ./getty
Slightly slower than getty-arena was, but faster than the old getty version. Since there are no lifetimes right now, Getty doesn't free anything at all, including struct keys. Perhaps keeping around all that cruft all the time is slowing things down?
Performance update after some optimizations in getty-json (no peeking, always allocating strings, heap branch first).
Note that std's runtime has increased overall b/c we're now correctly passing in .alloc_always. Beforehand, it was returning a slice into the scanner's input which, if we hadn't used a static string input, would be completely invalid.
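For reference, here is a minimal sketch of passing that option, assuming the std.json API from Zig 0.11 or later, where ParseOptions has an allocate field. Item and parseOwned are hypothetical names.

```zig
const std = @import("std");

const Item = struct { id: []const u8 };

// With `.alloc_always`, every parsed string is copied into the arena owned
// by the returned `Parsed` value instead of pointing into `input`.
fn parseOwned(ally: std.mem.Allocator, input: []const u8) !std.json.Parsed(Item) {
    return std.json.parseFromSlice(Item, ally, input, .{ .allocate = .alloc_always });
}
```

With this option the parsed strings stay valid even if the input buffer is later overwritten or freed.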
$ hyperfine --warmup 5 ./getty ./std
Benchmark 1: ./getty
Time (mean ± σ): 678.8 ms ± 1.3 ms [User: 675.0 ms, System: 2.9 ms]
Range (min … max): 676.3 ms … 680.7 ms 10 runs
Benchmark 2: ./std
Time (mean ± σ): 573.4 ms ± 2.8 ms [User: 567.5 ms, System: 4.6 ms]
Range (min … max): 569.5 ms … 578.8 ms 10 runs
Summary
./std ran
1.18 ± 0.01 times faster than ./getty
We shaved off around 10ms.