GithubHelp home page GithubHelp logo

ubpf needs to support maps about ubpf HOT 8 CLOSED

iovisor avatar iovisor commented on July 22, 2024 1
ubpf needs to support maps

from ubpf.

Comments (8)

jpsamaroo avatar jpsamaroo commented on July 22, 2024

I think this would be great to have! Having these maps also implies that we'll want to have helpers for each map, so I guess we'll need to provide a "kernel-compatible" ext_func set.

from ubpf.

Alan-Jowett avatar Alan-Jowett commented on July 22, 2024

I think it might make more sense to provide a function like:

/*
 * Register a map.
 *
 *  Set the address of the map in the lddw instruction based on the name resolved from the 
 *  symbol section of the ELF file.
 *
 * 'name' should be a string with a lifetime longer than the VM.
 *
 * Returns 0 on success, -1 on error.
 */
int ubpf_register_map(struct ubpf_vm *vm, const char *name, void *map);

This would do the following:

  1. During loading of the ELF, process the relocations.
  2. If the relocation points to the .maps section.
  3. Look up symbol for the relocation.
  4. Look up symbol -> map in internal table.
  5. Set map address in lddw instruction.

x64 instruction then is just a load of the immediate value. Any helpers that use maps then get passed the address of the map (this is similar to what Linux jitter does, AFAIK)

from ubpf.

jpsamaroo avatar jpsamaroo commented on July 22, 2024

What about hash maps (and other non-contiguous data structures)? We'd still need special helpers to process them. Also, contiguous array access is usually done first via helper (to get the address of the element passed), so to support programs generated by tools focused on the Linux kernel's implementation, we'd need these helpers available.

from ubpf.

Alan-Jowett avatar Alan-Jowett commented on July 22, 2024

See BPF syscall, maps, verifier, samples, llvm

I am proposing we adopt something similar to the process that Alexi describes in the email thread, but with ubpf_load_elf replacing the instruction with a lddw with the address directly.

In the design listed, bpf_lookup_elem kernel helper function accepts the address of the map, not a FD. So we need the jitter to emit code with the address.

from ubpf.

Alan-Jowett avatar Alan-Jowett commented on July 22, 2024

After researching this, eBPF it appears as if the Linux kernel implements this via a BPF_LD_MAP_FD pseudo instruction, where the jitter replaces imm value with the address of the map.

from ubpf.

YutaroHayakawa avatar YutaroHayakawa commented on July 22, 2024

Just a comment

I think if we support maps in ubpf, it's worth to think about how we implement the map it self, especially about the concurrency. Of course, it depends on the execution environment, but I guess it would be more generic if it support multi threading environment.

In Linux-kernel implementation, there are various form of concurrency for maps. For example, array map basically doesn't perform any locking during lookup/update for performance. On the other hand, hash map takes lock when update/delete.

And the most notable stuff is the eBPF program in kernel is basically running under RCU (e.g. https://elixir.bootlin.com/linux/latest/source/drivers/net/veth.c#L580). This strong assumption makes possible to do the operation like below.

elem = bpf_map_lookup_elem(&map, &key);
...
// in-place update
elem->foo = 0xdeadbeef;

It is possible that another thread deletes the map element between lookup and in-place update, but thanks to the RCU, the memory for the element will never released before this program finish running. So nothing will be broken.

There is a project which implements RCU-like feature in user space (https://liburcu.org). However, this is LGPL so, not compatible with ubpf. Concurrency-kit is a good candidate I think, it's 2-clause-BSD (http://concurrencykit.org). "epoch" feature of it can be used like RCU. For implementation, notice both of them requires to register some context per thread, means maybe we need to enforce users to register it or intercept pthread_create or something...

Actually, I tried to do the same thing for generic-ebpf project (https://github.com/generic-ebpf/generic-ebpf). This was the pitfall I've met.

Good luck :)

from ubpf.

Alan-Jowett avatar Alan-Jowett commented on July 22, 2024

I think it make sense to decouple the maps part from the jitter part of ubpf.

Taking a look at the docs here eBPF maps it seems like it would make sense to implement this as follows:

  1. Consumer of ubpf provides a map_create callback passing (context, map_definition).
  2. Parse the .maps section in the ELF, invoking the map_create callback for each map, storing symbol->map_fd
  3. During the relocation, when we hit a map relocation, modify byte code setting the lddw map pseudo instruction and setting the fd in the instruction.

The consumer of the ubpf can then provide their own map implementation as they see fit via the map_create callback and the bpf_map_* helper functions.

from ubpf.

Alan-Jowett avatar Alan-Jowett commented on July 22, 2024

Going to build this on top of this PR as I need the code to lookup sections by name.

from ubpf.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.