ibokuri commented on June 11, 2024

New Lifetimes

After a bit of thought, I've come to the following design for the Lifetime declaration.

For Visitor Methods: StringLifetime

  • Static: The value lives for the lifetime of the program.
  • Stack: The value lives on the stack.
  • Heap: The value lives on the heap.
  • Owned: The value lives on the stack or heap and is managed by some entity.

For Access Methods: ValueLifetime

  • Static: The value lives for the lifetime of the program.
  • Heap: The value lives on the heap.
  • Owned: The value lives on the stack or heap and is managed by some entity.

Visitor Methods

Usage

  • Stack values must be copied.
  • Static and Heap values can be copied or used directly.
  • Owned values must be copied unless the visitor knows that the value's lifetime is safe, in which case it may be used directly.

Deallocation

  • Static and Stack values never need to be deallocated.
  • In the visitor, Heap values never have to be deallocated. In the deserializer, Heap values must be deallocated if the visitor didn't use them.
  • In the visitor, Owned values never have to be deallocated. In the deserializer, Owned values can:
    • Be leaked if the visitor used them.
    • Be deallocated immediately if the visitor did not use them and the deserializer manages them.
    • Be leaked if the visitor did not use them and either the deserializer manages them but wants to defer cleanup until later, or some other entity manages them.

Access Methods

Usage

Same as "Visitor Methods - Usage".

Deallocation

Same as "Visitor Methods - Deallocation" except that in the visitor, Heap values must be deallocated upon error or if the value is not returned.
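In Zig terms, the proposed lifetime declarations might look like the following sketch. The variant names come from the bullets above; the actual declarations are hypothetical.

```zig
// Sketch of the proposed lifetime enums; variant names are from the proposal,
// the declarations themselves are hypothetical.
pub const StringLifetime = enum {
    Static, // lives for the lifetime of the program
    Stack, // lives on the stack
    Heap, // lives on the heap
    Owned, // lives on the stack or heap, managed by some entity
};

pub const ValueLifetime = enum {
    Static,
    Heap,
    Owned,
};
```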


Edit: Static messes up getty.de.free, so I'm removing it from the proposal. How often does one return a static string directly for deserialization anyways...

from getty.

ibokuri commented on June 11, 2024

New Indices

One problem with the current solution for the deserializer portion of the issue is that it only enables visitors to use a single, contiguous slice from visitString's input. If the visitor wants to use non-contiguous portions of input as part of its final value, there needs to be a way for the visitor to specify the starting and ending positions of each slice it uses.

To fix this, we can modify the return type of visitString to become the following:

// The starting and ending positions of a string.
pub const Range = struct {
    start: usize,
    end: usize,
};

// Indicates to the deserializer which parts of `input` were used directly by
// the visitor.
//
// If a visitor uses the entirety of `input` as part of its final value, then the
// `Single` variant can be used, which doesn't require any extra allocations.
// Otherwise, the visitor will need to allocate a slice for `Multiple`, which the
// deserializer will clean up afterwards.
pub const Used = union(enum) {
    Single: Range,
    Multiple: []const Range,
};

// The new return type of `visitString`.
//
// If `used` is `null`, then the visitor did not use `input` as part of its final
// return value.
pub fn Return(comptime T: type) type {
    return struct {
        value: T,
        used: ?Used = null,
    };
}

For example, suppose a user wants to deserialize "John Doe" into the type struct { first: []u8, last: []u8 }, where first is John and last is Doe. As it is now, they'd be forced to either make a copy of both John and Doe, or use one of the names directly and copy the other. However, with this new return type, they can assign input[0..4] to first and input[5..] to last, allocate a slice for the used field in the return value, populate it appropriately, and they're done!
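Under the proposed return type, the "John Doe" case might look like the following sketch. The `Name` type and the `visitString` signature here are illustrative, not Getty's actual API; `Range`, `Used`, and `Return` are reproduced from the proposal above.

```zig
const std = @import("std");

// `Range`, `Used`, and `Return` as defined in the proposal above.
const Range = struct { start: usize, end: usize };
const Used = union(enum) { Single: Range, Multiple: []const Range };
fn Return(comptime T: type) type {
    return struct { value: T, used: ?Used = null };
}

// Illustrative value type for the "John Doe" example; not Getty's API.
const Name = struct { first: []const u8, last: []const u8 };

// Hypothetical visitString body: both names point directly into `input`, and
// `used` records the two ranges so the deserializer can track what was used.
fn visitString(ally: std.mem.Allocator, input: []const u8) !Return(Name) {
    const ranges = try ally.alloc(Range, 2);
    ranges[0] = .{ .start = 0, .end = 4 }; // "John"
    ranges[1] = .{ .start = 5, .end = 8 }; // "Doe"
    return .{
        .value = .{ .first = input[0..4], .last = input[5..] },
        .used = .{ .Multiple = ranges },
    };
}
```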

ibokuri commented on June 11, 2024

As a sort of sanity check (and to give me a bit of motivation to work on this), I did some very simple deserialization benchmarking with some JSON data that had lots of strings. I deserialized the input data into a struct 100,000 times with both Getty JSON and std.json, and, as expected, Getty JSON was much slower.

const std = @import("std");
const json = @import("json");

const c_ally = std.heap.c_allocator;

const T = []struct {
    id: []const u8,
    type: []const u8,
    name: []const u8,
    ppu: f64,
    batters: struct {
        batter: []struct {
            id: []const u8,
            type: []const u8,
        },
    },
    topping: []struct {
        id: []const u8,
        type: []const u8,
    },
};

const input =
    \\[
    \\    {
    \\        "id": "0001",
    \\        "type": "donut",
    \\        "name": "Cake",
    \\        "ppu": 0.55,
    \\        "batters": {
    \\            "batter": [
    \\                {
    \\                    "id": "1001",
    \\                    "type": "Regular"
    \\                },
    \\                {
    \\                    "id": "1002",
    \\                    "type": "Chocolate"
    \\                },
    \\                {
    \\                    "id": "1003",
    \\                    "type": "Blueberry"
    \\                },
    \\                {
    \\                    "id": "1004",
    \\                    "type": "Devil's Food"
    \\                }
    \\            ]
    \\        },
    \\        "topping": [
    \\            {
    \\                "id": "5001",
    \\                "type": "None"
    \\            },
    \\            {
    \\                "id": "5002",
    \\                "type": "Glazed"
    \\            },
    \\            {
    \\                "id": "5005",
    \\                "type": "Sugar"
    \\            },
    \\            {
    \\                "id": "5007",
    \\                "type": "Powdered Sugar"
    \\            },
    \\            {
    \\                "id": "5006",
    \\                "type": "Chocolate with Sprinkles"
    \\            },
    \\            {
    \\                "id": "5003",
    \\                "type": "Chocolate"
    \\            },
    \\            {
    \\                "id": "5004",
    \\                "type": "Maple"
    \\            }
    \\        ]
    \\    },
    \\    {
    \\        "id": "0002",
    \\        "type": "donut",
    \\        "name": "Raised",
    \\        "ppu": 0.55,
    \\        "batters": {
    \\            "batter": [
    \\                {
    \\                    "id": "1001",
    \\                    "type": "Regular"
    \\                }
    \\            ]
    \\        },
    \\        "topping": [
    \\            {
    \\                "id": "5001",
    \\                "type": "None"
    \\            },
    \\            {
    \\                "id": "5002",
    \\                "type": "Glazed"
    \\            },
    \\            {
    \\                "id": "5005",
    \\                "type": "Sugar"
    \\            },
    \\            {
    \\                "id": "5003",
    \\                "type": "Chocolate"
    \\            },
    \\            {
    \\                "id": "5004",
    \\                "type": "Maple"
    \\            }
    \\        ]
    \\    },
    \\    {
    \\        "id": "0003",
    \\        "type": "donut",
    \\        "name": "Old Fashioned",
    \\        "ppu": 0.55,
    \\        "batters": {
    \\            "batter": [
    \\                {
    \\                    "id": "1001",
    \\                    "type": "Regular"
    \\                },
    \\                {
    \\                    "id": "1002",
    \\                    "type": "Chocolate"
    \\                }
    \\            ]
    \\        },
    \\        "topping": [
    \\            {
    \\                "id": "5001",
    \\                "type": "None"
    \\            },
    \\            {
    \\                "id": "5002",
    \\                "type": "Glazed"
    \\            },
    \\            {
    \\                "id": "5003",
    \\                "type": "Chocolate"
    \\            },
    \\            {
    \\                "id": "5004",
    \\                "type": "Maple"
    \\            }
    \\        ]
    \\    }
    \\]
;

fn stdJson() !void {
    for (0..100_000) |_| {
        const output = try std.json.parseFromSlice(T, c_ally, input, .{});
        defer output.deinit();
    }
}

fn gettyJson() !void {
    for (0..100_000) |_| {
        const output = try json.fromSlice(c_ally, T, input);
        defer json.de.free(c_ally, output, null);
    }
}

fn gettyJsonArena() !void {
    for (0..100_000) |_| {
        var arena = std.heap.ArenaAllocator.init(c_ally);
        const arena_ally = arena.allocator();
        defer arena.deinit();

        _ = try json.fromSlice(arena_ally, T, input);
    }
}

pub fn main() !void {
    //try gettyJson();
    //try gettyJsonArena();
    //try stdJson();
}
$ hyperfine --warmup 5 ./getty ./getty-arena ./std
Benchmark 1: ./getty
  Time (mean ± σ):     718.4 ms ±   3.7 ms    [User: 713.4 ms, System: 3.5 ms]
  Range (min … max):   715.6 ms … 727.9 ms    10 runs

Benchmark 2: ./getty-arena
  Time (mean ± σ):     628.7 ms ±   1.5 ms    [User: 622.5 ms, System: 4.6 ms]
  Range (min … max):   626.6 ms … 631.3 ms    10 runs

Benchmark 3: ./std
  Time (mean ± σ):     482.7 ms ±   1.5 ms    [User: 476.2 ms, System: 4.9 ms]
  Range (min … max):   481.2 ms … 486.4 ms    10 runs

Summary
  ./std ran
    1.30 ± 0.01 times faster than ./getty-arena
    1.49 ± 0.01 times faster than ./getty

The unnecessary allocations are surely the main factor in Getty's slowness. However, I should note that std.json.parseFromSlice implicitly uses an arena allocator whereas json.fromSlice does not, and you can see from the results that using an arena does indeed make a difference.

With or without an arena, though, Getty's still much slower. So it's time to get started on this issue!

ibokuri commented on June 11, 2024

Removed accepted label for now since, as pointed out by fredi, there are major issues with implementing this kind of thing, which I've unfortunately had the chance to run into in my own branch.

For one thing, the multiple ranges idea was a total non-starter. I think my brain farted when I came up with that. The input for visitString is always a slice, which is a single allocation. Ya can't just free individual pieces of a single allocation. So now we have no way of letting visitors use only parts of input.
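The core of the problem can be shown with a tiny sketch (allocator and helper name are illustrative): an allocator can only free the exact slice it originally returned, never a sub-slice of it.

```zig
const std = @import("std");

// Why the multiple-ranges idea can't work: `input` comes from one allocation,
// and the allocator only knows how to free that whole allocation.
pub fn demo(ally: std.mem.Allocator) !void {
    const input = try ally.alloc(u8, 8); // one allocation...
    defer ally.free(input);              // ...freed as a whole: fine
    // ally.free(input[0..4]);           // illegal: not a pointer the
    // ally.free(input[5..]);            // allocator ever handed out
}
```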

Another issue is getty.de.free. How will it be able to know if the strings of a value, especially if they have different lifetimes, should be freed or not?

ibokuri commented on June 11, 2024

Before I could implement the lifetime optimizations, I had to do a bit of general allocation work. The merged PR above implements that allocation work.

In summary, Getty now uses an arena internally for all allocations. This simplifies visitors and deserialization blocks as they no longer have to worry about freeing values and allows end users to free everything whenever they want. Additionally, the arena is passed to the methods of Deserializer implementations so they're simplified a tad as well.

Big thanks to fredi for discussing all this with me and steering me in the right direction :D

ibokuri commented on June 11, 2024

The last half is the lifetime work, which consists of two parts:

  • Add StringLifetime and ValueLifetime types.
  • Update visitString to return not only the produced value but also an indicator of whether or not the input string was used directly in the final, returned value.

Lifetimes

The lifetime types will be more or less the same as what I've already proposed:

  • For visitString, StringLifetime will have the following variants:

    • Stack: The value is on the stack and its lifetime is shorter than the deserialization process.
      • These values must be copied.
    • Heap: The value is on the heap and its lifetime is tied to the arena allocator returned to the end user.
      • These values can be copied or used directly (Getty's default behavior will be to use them directly).
      • If there's an error or the value is copied, do not free the value in the visitor. Either the deserializer will free it or it will be freed by the arena at the end.
    • Managed: The value is on the stack or heap and its lifetime is managed by an entity that is not the arena allocator returned to the end user.
      • These values should be copied, but can be used directly if the user knows that the value's lifetime is safe or acceptable.
      • If there's an error or the value is copied, do not free the value in the visitor. The managing entity will free it later on.
  • For access methods (e.g., nextKeySeed), ValueLifetime will have the following variants:

    • Heap: The value is on the heap and its lifetime is tied to the arena allocator returned to the end user.
      • These values can be copied or used directly (Getty's default behavior will be to use them directly).
      • If there's an error or the value is copied, free the value in the visitor if you want. It's part of the arena returned to the user so you could also just leak it and it'll be cleaned up at the end.
    • Managed: The value is on the stack or heap and its lifetime is managed by an entity that is not the arena allocator returned to the end user.
      • These values should be copied, but can be used directly if the user knows that the value's lifetime is safe or acceptable.
      • If there's an error or the value is copied, do not free the value in the visitor. The managing entity will free it later on.
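Sketched as Zig enums, the revised lifetimes might look like this. Variant names come from the list above; the declarations themselves are hypothetical and Getty's actual ones may differ.

```zig
// Sketch of the revised lifetime enums: `Static` is gone and `Owned` has
// become `Managed`. Hypothetical declarations, not Getty's actual API.
pub const StringLifetime = enum { Stack, Heap, Managed };
pub const ValueLifetime = enum { Heap, Managed };
```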

visitString's Return Type

Before, I proposed returning from visitString the produced value and a slice indicating what part of the input parameter was used.

With the arena, the slice is no longer necessary. A simple bool will suffice, where true means that input was used directly.
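A sketch of what the simplified return type could look like (the field name is hypothetical):

```zig
// Sketch of the simplified visitString return type: the `Used` ranges are
// replaced by a plain bool. Field name is hypothetical.
pub fn Return(comptime T: type) type {
    return struct {
        value: T,
        used: bool = false, // true means `input` was used directly
    };
}
```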

ibokuri commented on June 11, 2024

Performance updates after arena changes (using same benchmarking code):

$ hyperfine --warmup 5 ./getty ./std
Benchmark 1: ./getty
  Time (mean ± σ):     688.8 ms ±   1.9 ms    [User: 684.7 ms, System: 3.0 ms]
  Range (min … max):   685.8 ms … 692.3 ms    10 runs

Benchmark 2: ./std
  Time (mean ± σ):     484.0 ms ±   2.7 ms    [User: 480.7 ms, System: 2.2 ms]
  Range (min … max):   481.3 ms … 489.1 ms    10 runs

Summary
  ./std ran
    1.42 ± 0.01 times faster than ./getty

Slightly slower than getty-arena was, but faster than the old getty version. Since there are no lifetimes right now, Getty doesn't free anything at all, including struct keys. Perhaps keeping all that cruft around the whole time is slowing things down?

ibokuri commented on June 11, 2024

Performance update after some optimizations in getty-json (no peeking, always allocating strings, heap branch first).

Note that std's runtime has increased overall because we're now correctly passing in .alloc_always. Before, it was returning a slice into the scanner's input which, had we not used a static string input, would have been completely invalid.

$ hyperfine --warmup 5 ./getty ./std
Benchmark 1: ./getty
  Time (mean ± σ):     678.8 ms ±   1.3 ms    [User: 675.0 ms, System: 2.9 ms]
  Range (min … max):   676.3 ms … 680.7 ms    10 runs

Benchmark 2: ./std
  Time (mean ± σ):     573.4 ms ±   2.8 ms    [User: 567.5 ms, System: 4.6 ms]
  Range (min … max):   569.5 ms … 578.8 ms    10 runs

Summary
  ./std ran
    1.18 ± 0.01 times faster than ./getty

We shaved off around 10ms.
