eomii / rules_ll Goto Github PK

View Code? Open in Web Editor NEW

73.0 4.0 8.0 4.01 MB

An Upstream Clang/LLVM-based toolchain for contemporary C++ and heterogeneous programming

Home Page: https://ll.eomii.org

License: Other

Starlark 85.86% Shell 1.35% Nix 12.78%

clang clang-tidy llvm cuda hip gpu-programming bazel build-system openmp sanitizers

rules_ll's Introduction

`rules_ll`

An upstream Clang/LLVM-based toolchain for contemporary C++ and heterogeneous programming.

This project interleaves Nix and Bazel with opinionated Starlark rules for C++.

Builds running within rules_ll-compatible workspaces achieve virtually perfect cache hit rates across machines, using C++ toolchains often several major versions ahead of most other remote execution setups.

The ll_* rules use a toolchain purpose-built around Clang/LLVM. You can't combine ll_* and cc_* targets at the moment, but you can still build cc_* projects in rules_ll-workspaces to leverage the remote execution setup and share caches.

✨ Setup

Install nix with flakes.
Create a rules_ll compatible workspace. To keep the development shell in sync with the rules_ll Bazel module, pin the flake to a specific commit:
```
git init
nix flake init -t github:eomii/rules_ll/<commit>
```
The default toolchains include C++ and HIP for AMDGPU. If you also want to target NVPTX devices (Nvidia GPUs), make sure to read the CUDA license and set comment.allowUnfree and config.cudaSupport in flake.nix.

Warning

Don't use the tags or releases from the GitHub repository. They're an artifact from old versions of rules_ll and probably in a broken state. We'll remove them at some point. Use a pinned commit instead.

Enter the rules_ll development shell:
```
nix develop
```

Tip

Strongly consider setting up direnv so that you don't need to remember running nix develop to enter the flake and exit to exit it.

Consider setting up at least a local remote cache as described in the remote execution guide.

🔗 Links

🚀 C++ modules

Use the interfaces and exposed_interfaces attributes to build C++ modules. C++ modules guide.

load(
    "@rules_ll//ll:defs.bzl",
    "ll_binary",
    "ll_library",
)

ll_library(
    name = "mymodule",
    srcs = ["mymodule_impl.cpp"],
    exposed_interfaces = {
        "mymodule_interface.cppm": "mymodule",
    },
    compile_flags = ["-std=c++20"],
)

ll_binary(
    name = "main",
    srcs = ["main.cpp"],
    deps = [":mymodule"],
)

🧹 Clang-tidy

Build compilation databases to use Clang-Tidy as part of your workflows and CI pipelines. Clang-Tidy guide.

load(
   "@rules_ll//ll:defs.bzl",
   "ll_compilation_database",
)

filegroup(
    name = "clang_tidy_config",
    srcs = [".clang-tidy"],
)

ll_compilation_database(
   name = "compile_commands",
   targets = [
      ":my_very_tidy_target",
   ],
   config = ":clang_tidy_config",
)

😷 Sanitizers

Integrate sanitizers in your builds with the sanitize attribute. Sanitizers guide.

load(
    "@rules_ll//ll:defs.bzl",
    "ll_binary",
)

ll_binary(
    name = "sanitizer_example",
    srcs = ["totally_didnt_shoot_myself_in_the_foot.cpp"],
    sanitize = ["address"],
)

🧮 CUDA and HIP

Use CUDA and HIP without any manual setup. CUDA and HIP guide.

load(
    "@rules_ll//ll:defs.bzl",
    "ll_binary",
)

ll_binary(
    name = "cuda_example",
    srcs = ["look_mum_no_cuda_setup.cu"],
    compilation_mode = "cuda_nvptx",  # Or "hip_nvptx". Or "hip_amdgpu".
    compile_flags = [
        "--std=c++20",
        "--offload-arch=sm_70",  # Your GPU model.
    ],
)

📜 License

Licensed under the Apache 2.0 License with LLVM exceptions.

This repository uses overlays and automated setups for the CUDA toolkit and HIP. Using compilation_mode for heterogeneous toolchains implies acceptance of their licenses.

rules_ll's People

Contributors

Stargazers

Watchers

Forkers

aaronmondal h13035 spamdoodler jaroeichler jannisfengler silvanshade gmh5225 mimed95

rules_ll's Issues

Migrate to zlib-ng

Since zlib still hasn't addressed madler/zlib#633 in almost a year we should consider it unsupported and deprecated.

I've already sent https://reviews.llvm.org/D143320 but that'll take some time to get into LLVM main due to the official overlay not yet using bzlmod by default. However, we can already use the patch in rules_ll.

We should probably also aim to upstream our zlib-ng buildfile to the BCR.

Toolchain transitions need to be more flexible

We already use extensive toolchain transitions to handle our various compilation_modes. It looks like this is not enough anymore.

Our current approach is limited in the following ways:

We cannot create dynamic libraries in the bootstrap toolchain as linking is not supported there. This means that we can't provide builds for dynamic libc++ and friends.
We cannot use ll_binary tools in genrules since that requires the ll_binary to be in exec configuration. We need some way to transition from the compilation_mode-specific target configurations to an exec configuration. This is not supported at the moment.

We need to be careful that opening the toolchains up to handle such cases only leads to excessive rebuilds when absolutely necessary. Otherwise users may end up building LLVM several times just to get a trivial ll_binary working in a genrule. We may also need better platform support to tackle this elegantly.

Things work at the moment because we can fall back to rules_cc for exec tools. This is a very undesirable limitation of the current implementation.

Remote execution images too hard to customize

The only remote execution image currently provided is the default image which we use for the tests and pin in rbe/default/config/BUILD.

The default image includes openssl because the examples require it. This is not ideal. Since all the toolchain and container auto generation can be difficult to grasp we should provide a straightforward, documented way to customize it.

The `ll init` command is a bit whacky

To make sure that we don't accidentally destroy people's workspaces the current ll init only appends some contents to files. If one runs the command more than once this will lead to duplicate code in those files which can look somewhat buggy.

We should probably factor the command out into a separate shell script and add more flexibility/checking/whatever to improve its user experience. This should be an actual shell script instead of a nix string template so that we can properly run linters on it. This likely requires changing the structure of the command in a way that every variable is passed as an external argument. I'm thinking something along the lines of invoking it like this in the flake:

''${ll} \
  --bazelversion=${./.bazelversion} \
  --module=${./examples/MODULE.bazel} \
  --bazelrc=${./examples/.bazelrc}''

Upstream parts of rules_ll into the original Clang/LLVM Bazel overlay

It may be desirable for non-rules_ll users to get bzlmod support for the original Clang/LLVM Bazel overlay. The files whose contents we may be able to upstream are ll/extensions.bzl, MODULE.bzl and .bazelrc. Ideally, bzlmod users should be able to import llvm-project via the bazel-central-registry.

rules_ll specific extensions should remain in this repository and the bazel-eomii-registry.

Linking in WSL uses system libraries

When attempting to adjust linking paths of libraries such as OpenSSL or libcrypto under WSL Ubuntu 22.04 with the "-L" flag the build will use the system libraries instead of the provided ones by the development environment.

Readd vale

As part of the transition to the flake-based workflow we removed Vale.

Getting it to run again is slightly tricky, as we need an additional config step before we can run the vale binary. Let's try to make things work again in a reproducible manner, i.e. ideally without having to rely on vales irreproducible autoinstaller.

rules_ll fails if gcc is used to build the Clang/LLVM based toolchain from upstream

One of the main goals of rules_ll is to build a Clang/LLVM based toolchain from upstream. This should work with Clang and GCC.
One error occurs when running the examples with GCC as default compiler:

error: zlib.h: no such file or directory
Which can be fixed with installing the libz-dev package.

After installing the missing headers, the build fails with following error message:

ERROR: /root/.cache/bazel/_bazel_root/79c7c71f78facf0e35780b9a06528730/external/@rules_ll.override.llvm_project_overlay.llvm-project/llvm/BUILD.bazel:164:11: Compiling llvm/lib/Support/Process.cpp [for tool] failed: (Exit 1): gcc failed: error executing command (from target @@rules_ll.override.llvm_project_overlay.llvm-project//llvm:Support) /usr/bin/gcc -U_FORTIFY_SOURCE -fstack-protector -Wall -Wunused-but-set-parameter -Wno-free-nonheap-object -fno-omit-frame-pointer -g0 -O2 '-D_FORTIFY_SOURCE=1' -DNDEBUG -ffunction-sections ... (remaining 70 arguments skipped)

Use --sandbox_debug to see verbose messages from the sandbox and retain the sandbox build root for debugging
In file included from external/@rules_ll.override.llvm_project_overlay.llvm-project/llvm/lib/Support/Process.cpp:107:
external/@rules_ll.override.llvm_project_overlay.llvm-project/llvm/lib/Support/Unix/Process.inc: In static member function 'static size_t llvm::sys::Process::GetMallocUsage()':
external/@rules_ll.override.llvm_project_overlay.llvm-project/llvm/lib/Support/Unix/Process.inc:93:20: error: aggregate 'llvm::sys::Process::GetMallocUsage()::mallinfo2 mi' has incomplete type and cannot be defined
93 | struct mallinfo2 mi;
| ^~
external/@rules_ll.override.llvm_project_overlay.llvm-project/llvm/lib/Support/Unix/Process.inc:94:10: error: '::mallinfo2' has not been declared
94 | mi = ::mallinfo2();
| ^~~~~~~~~
Target //format_example:format_example failed to build

The system is Ubuntu 20.04.4 LTS, the GCC version is 9.4.0

Rework inclusion handling

Quoted includes and angle includes need to be clearly separated. (-iquote, -isystem, -I, -idirafter -isystem-after).
There should be no implicit header includes. If something uses an unusual include path it should be specified manually. We should adhere to the C++ standard.
There should be a mechanism similar to strip_prefix in rules_cc. Otherwise we have to manually specify includes for external repositories. There were reasons for not implementing strip_prefix like in rules_cc. I will post an update when I remember the details. Currently, the compiler is invoked at the top-level of the action sandbox. If we were to move it into the build subdirectory within that sandbox we will need to change the way inclusions of external headers are handled (maybe prefix with ../../ or something like that).
Extra care needs to be taken that system headers do not accidentally include library headers named like system headers. This is an issue arising from wrong include order.

Migrate CUDA imports to new variants in nixpkgs

NixOS/nixpkgs#224646 (comment) mentioned that the way we currently import CUDA from nix is outdated. We should change imports from the outdated

pkgs.cudaPackages.cudatoolkit

cudaPackages.{lib,cuda_foo}

@JannisFengler @SpamDoodler This might make WSL compatibility work.

Add libxml2-dev to bazel registry

Building Clang from upstream depends on libxml2-dev. libxml2-dev should be added as external dependency in the bazel-eomii-registry and to the dependencies for rules_ll.

Vale pre-commit hook not reporting on error-free files

Vale pre-commit hooks don't work properly at the moment. Tracking progress in errata-ai/vale#575.

Borrowing an Nvidia GPU

Hi Aaron, do you have time to meet in the city center now, so I can borrow you an Nvidia GPU to fix the cuda tool chain errors?
Best regards,
Jannis

ld.lld is unable to find libraries

ld.lld is unable to find libraries when building with Ubuntu 22.04 and gcc 11.2.0.

ERROR: /home/ubuntu/rules_ll/examples/format_example/BUILD.bazel:3:10: LlLinkExecutable format_example/format_example failed: (Exit 1): ld.lld failed: error executing command (from target //format_example:format_example) bazel-out/k8-fastbuild/bin/external/@rules_ll.override/ll/ld.lld --color-diagnostics '-dynamic-linker=/lib64/ld-linux-x86-64.so.2' --lto-O3 --pie --nostdlib -L/usr/lib64 -lm -ldl -lpthread -lc ... (remaining 11 arguments skipped)

Use --sandbox_debug to see verbose messages from the sandbox and retain the sandbox build root for debugging
ld.lld: error: unable to find library -lm
ld.lld: error: unable to find library -ldl
ld.lld: error: unable to find library -lpthread
ld.lld: error: unable to find library -lc
Target //format_example:format_example failed to build

I set the symlinks for crt*.o and Scrt*.o manual, but all the libraries are in /usr/lib/x86_64-linux-gnu and not in /usr/lib64 .
Maybe it makes sense to to set /usr/lib/x86_64-linux-gnu the default?

Document new remote execution toolchains

This is a nuanced topic requiring dedicated docs and architecture explanations.

Consider building our own remote execution service

While we the new remote execution workflows are very efficient, we are still running gigantic builds compared to most other projects. This means that we quickly fall out of the "free" or "open source" tiers of remote execution services. Self-hosting might be inevitable 😅

For a full setup there is buildfarm. Regarding remote caching there is the pretty good bazel-remote, but it might also be fun to try wrapping dragonflydb with the remote-api gRPC calls and use that as cache.

The remote-apis are fairly straightforward, so we could also build an entire stack ourselves.

Consider importing BUILD.bazel overlays as static variables

bazelbuild/bazel#14659 prevents us from leveraging our registry.

We can probably work around this by moving the third-party-overlays/*.BUILD.bazel to static variables.

Handle position independent code compilation

Blocks #4.

Document usage with local CUDA

This is already possible, but we should probably document the workflow. I suspect that this is especially relevant for WSL2 users because the WSL CUDA driver tends to differ from the one we package in rules_ll#unfree.

Maybe we should add explicit checks that set certain rpath values for WSL as well?

Embedded device support (Steamdeck)

I started testing Steamdeck support for GPU (and CPU) code execution. My excuse is that Teslas run on similar APU architecture and I am thinking that performace gained here is worth looking into.

Too many shared libraries in heterogeneous toolchains

After adding support for shared libraries our heterogeneous toolchains broke. The new shared library linking causes us to blindly link all of CUDA's shared libraries which is of course not what we want. Instead of rewriting the linking logic, we may want to consider rewriting the CUDA-related build files and/or making the toolchains finer-grained in the sense that static and shared libraries are more clearly separated.

Sanitizers incompatible with `depends_on_llvm`

We are missing #include <sanitizer/msan_interface.h>. Probably caused by drift from upstream. Should be easy to fix.

osx support

title

Support multiple targets in `compilation_database`

Using the compilation_database rule is too clunky otherwise.

😱 Docs for `ll.defs` unreadable

The docs for ll.defs are messed up. somehow we have some super long lines in there 🤣

:scream: Clang tidy too slow

Heterogeneous code takes forever to check. This is likely caused by all the CUDA and HIP headers we have to include. There should be some builtin default setting to exclude these headers from the checks.

Cannot run tests in CI

Attempts to run the tests in CI via remote execution currently doesn't work because Bazel doesn't like to run in a nix-built container. build and run works, but test doesn't, most likely due to bazelbuild/bazel#12579.

Technically it's already decent coverage if just builds pass, but many issues arise from dynamic linking behavior and are only visible during runtime. So at the moment we'd either have to run all examples manually without the ll_test wrappers, or only run a bazel build cpp without running anything.

Another option would be to build a custom Bazel which we distribute as part of rules_ll. Building a custom Bazel against an LLVM toolchain and statically linking libc++ could be an option that keeps things portable between CI and regular usage, but it might lead to issues for non-nix workflows.

@JannisFengler @SpamDoodler @jaroeichler What do you think? Statically linking Bazel with libc++ would add a few MB to all images, caches, the devenv etc because we'd have duplicate libc++ functions in every subbinary and we'd have to thinkg about infrastructure to support staying upstream with the bazel sources. That would make it easier to get remote execution to work though. Do we want to go down that path or should we try to find another solution?

Add support for packaging

We currently do not support loading loading shared objects during runtime. An ll_pkg rule may be an option.

BMIs are visible to `pcm` -> `o` compilations

While precompilations correctly cannot see each other if they are specified in the same interfaces attribute, the same is not true for the implicit BMI-to-Object compilation. This is a bug. Only files in srcs should be able to see BMIs from interfaces.

Release blockers

We need to address these issues before we can release the next version of rules_ll:

We can't ignore example lockfile but also can't commit it

Surely there is some workaround.

Ignoring examples/flake.lock causes direnv/devenv to break
Committing it would break downstream users because the relative reference to rules_ll via ../ is not reproducible across machines (for different users the absolute path to the directory is different).
Usually one would use git update-index --skip-worktree for this, but that doesn't work with devenv.

What a dilemma lmao.

For now I've sent #67, but that's hardly a satisfactory solution.

WSL: Number of devices is 0

WSL GPU Detection Issue in CUDA Example

Problem

Running the default CUDA example in WSL fails to detect the GPU.

Workaround

Setting LD_LIBRARY_PATH resolves the issue:

export LD_LIBRARY_PATH="/usr/lib/wsl/lib:$LD_LIBRARY_PATH"

Suggestion

rules_ll appends automatically /usr/lib/wsl/lib to LD_LIBRARY_PATH when rules_ll is running in WSL.
(Not sure If this belongs into the nix flake or into the bazel rules)

Improve user experience for the `rbegen` tool

This tool currently requires users to manually run the pre-commit hooks to reliably check whether generated configs have changed. This should be integrated into the rbegen invocation.

We should also add a release attribute to the tool that tags the image with a release version and pushes it to a remote registry. We need this to release the next version of rules_ll.

Rework internal file inputs as preparation for module std

The draft #98 adds experimental support for C++23 module std. Getting things to work required some customizations to the internal file inputs and to the way we handle toolchain.cpp_stdlib. This is not pretty. We should rework things in a way that doesn't require hacky list indexing and .to_list()ing depsets.

Investigate the use of aspects for clang-tidy

Playing around with clippy in rules_rust made me notice how incredibly convenient it would be to have clang-tidy run as a plugin that just prints warnings like a "regular" compiler warning. I'm not sure whtether this is possible, but if i was, it could be a significant improvement for our user experience and would obsolete the ll_compilation_database targets in many cases.

Let's see whether it's possible to copy the rules_rust/clippy behavior to rules_ll/clang-tidy.

😵 GPU examples make my eyes bleed

It can be tricky to write CUDA/HIP code that at least remotely looks like C++. At the moment the examples are littered with // NOLINT directives so that clang-tidy doesn't completely ragequit.

Let's try to find better ways to write these examples.

Rework `aggregate` attribute

We need support for creating dynamic shared objects. Clang plugins such as the hipsycl plugin require this.

The best way to implement this is probably by reworking the aggregate attribute. This will require support for position independent code.

Dependency Dashboard

This issue lists Renovate updates and detected dependencies. Read the Dependency Dashboard docs to learn more.

Repository problems

These problems occurred while renovating this repository. View logs.

WARN: Package lookup failures

Warning

Renovate failed to look up the following dependencies: Could not determine new digest for update (github-tags package eomii/rules_ll).

Files affected: templates/default/MODULE.bazel

Open

These updates have all been created already. Click a checkbox below to force a retry/rebase of any.

🤖 Update dependency rules_java to v7.6.0

Detected dependencies

bazel-module

MODULE.bazel

platforms 0.0.10

rules_cc 0.0.9

bazel_skylib 1.6.1

rules_java 7.5.0

stardoc 0.6.2

llvm-project-overlay 17-init-bcr.3

templates/default/MODULE.bazel

rules_ll <TODO: USE THE COMMIT FROM THE FLAKE HERE HERE>

bazelisk

.bazelversion

bazel 8.0.0-pre.20240422.4

templates/default/.bazelversion

bazel 8.0.0-pre.20240422.4

github-actions

.github/workflows/docs.yml

ubuntu 22.04

.github/workflows/pre-commit.yml

ubuntu 22.04

.github/workflows/scorecard.yml

ubuntu 22.04

templates/default/.github/workflows/pre-commit.yml

ubuntu 22.04

Check this box to trigger a request for Renovate to run again on this repository