Comments (8)
The Docker build also fails for me with the following error:
# first, install bazelisk, then
docker run --name xla --gpus all -w /xla -it -d --rm -v $PWD:/xla tensorflow/build:latest-python3.9 bash
docker exec xla ./configure.py --backend=CUDA --nccl
docker exec xla bazel build --test_output=all --spawn_strategy=sandboxed //xla/...
...
ERROR: /xla/xla/tools/BUILD:694:14: Linking xla/tools/extract_collective_operations failed: (Exit 1): crosstool_wrapper_driver_is_not_gcc failed: error executing command (from target //xla/tools:extract_collective_operations) external/local_config_cuda/crosstool/clang/bin/crosstool_wrapper_driver_is_not_gcc @bazel-out/k8-opt/bin/xla/tools/extract_collective_operations-2.params
Use --sandbox_debug to see verbose messages from the sandbox and retain the sandbox build root for debugging
/usr/bin/ld: bazel-out/k8-opt/bin/xla/stream_executor/cuda/libcuda_platform_cuda_only.lo(cuda_platform.o): in function `stream_executor::gpu::CudaPlatform::~CudaPlatform()':
cuda_platform.cc:(.text._ZN15stream_executor3gpu12CudaPlatformD2Ev+0x18): undefined reference to `stream_executor::ExecutorCache::~ExecutorCache()'
/usr/bin/ld: bazel-out/k8-opt/bin/xla/stream_executor/cuda/libcuda_platform_cuda_only.lo(cuda_platform.o): in function `stream_executor::gpu::CudaPlatform::~CudaPlatform()':
cuda_platform.cc:(.text._ZN15stream_executor3gpu12CudaPlatformD0Ev+0x18): undefined reference to `stream_executor::ExecutorCache::~ExecutorCache()'
/usr/bin/ld: bazel-out/k8-opt/bin/xla/stream_executor/cuda/libcuda_platform_cuda_only.lo(cuda_platform.o): in function `stream_executor::gpu::CudaPlatform::GetExecutor(stream_executor::StreamExecutorConfig const&)':
cuda_platform.cc:(.text._ZN15stream_executor3gpu12CudaPlatform11GetExecutorERKNS_20StreamExecutorConfigE+0x1d): undefined reference to `stream_executor::ExecutorCache::Get(stream_executor::StreamExecutorConfig const&)'
/usr/bin/ld: cuda_platform.cc:(.text._ZN15stream_executor3gpu12CudaPlatform11GetExecutorERKNS_20StreamExecutorConfigE+0x49): undefined reference to `stream_executor::ExecutorCache::GetOrCreate(stream_executor::StreamExecutorConfig const&, std::function<absl::lts_20230802::StatusOr<std::unique_ptr<stream_executor::StreamExecutor, std::default_delete<stream_executor::StreamExecutor> > > ()> const&)'
/usr/bin/ld: bazel-out/k8-opt/bin/xla/stream_executor/cuda/libcuda_platform_cuda_only.lo(cuda_platform.o): in function `_GLOBAL__sub_I_cuda_platform.cc':
cuda_platform.cc:(.text.startup+0x6b): undefined reference to `stream_executor::ExecutorCache::ExecutorCache()'
clang: error: linker command failed with exit code 1 (use -v to see invocation)
[37,322 / 48,603] Compiling src/cpu/x64/gemm/f32/jit_avx512_core_f32_copy_at_kern_part1_autogen.cpp [for tool]; 128s processwrapper-sandbox ... (39 actions running)
INFO: Elapsed time: 2286.145s, Critical Path: 211.76s
INFO: 37361 processes: 18996 internal, 1 local, 18364 processwrapper-sandbox.
FAILED: Build did NOT complete successfully
from xla.
I just now found issue #10616.
Ok, somehow I missed the other issues. I'll close this as a duplicate.
I ran into a similar build issue:
ERROR: /users/neeld2/xla/xla/service/gpu/BUILD:652:8: Compiling xla/service/gpu/ir_emitter_triton_mem_utils_test.cc failed: (Exit 1): crosstool_wrapper_driver_is_not_gcc failed: error executing command (from target //xla/service/gpu:ir_emitter_triton_mem_utils_test) external/local_config_cuda/crosstool/clang/bin/crosstool_wrapper_driver_is_not_gcc -MD -MF bazel-out/k8-opt/bin/xla/service/gpu/_objs/ir_emitter_triton_mem_utils_test/ir_emitter_triton_mem_utils_test.d ... (remaining 459 arguments skipped)
Use --sandbox_debug to see verbose messages from the sandbox and retain the sandbox build root for debugging
clang: warning: argument unused during compilation: '--cuda-path=/usr/local/cuda-12.3' [-Wunused-command-line-argument]
In file included from xla/service/gpu/ir_emitter_triton_mem_utils_test.cc:41:
In file included from ./xla/service/gpu/ir_emitter_triton.h:38:
In file included from ./xla/hlo/ir/hlo_computation.h:31:
external/com_google_absl/absl/log/log.h:199:9: warning: 'LOG' macro redefined [-Wmacro-redefined]
199 | #define LOG(severity) ABSL_LOG_INTERNAL_LOG_IMPL(_##severity)
| ^
external/tsl/tsl/platform/default/logging.h:165:9: note: previous definition is here
165 | #define LOG(severity) _TF_LOG_##severity
| ^
In file included from xla/service/gpu/ir_emitter_triton_mem_utils_test.cc:41:
In file included from ./xla/service/gpu/ir_emitter_triton.h:38:
In file included from ./xla/hlo/ir/hlo_computation.h:31:
external/com_google_absl/absl/log/log.h:237:9: warning: 'LOG_EVERY_N' macro redefined [-Wmacro-redefined]
237 | #define LOG_EVERY_N(severity, n) \
| ^
external/tsl/tsl/platform/default/logging.h:278:9: note: previous definition is here
278 | #define LOG_EVERY_N(severity, n) \
| ^
In file included from xla/service/gpu/ir_emitter_triton_mem_utils_test.cc:41:
In file included from ./xla/service/gpu/ir_emitter_triton.h:38:
In file included from ./xla/hlo/ir/hlo_computation.h:31:
external/com_google_absl/absl/log/log.h:245:9: warning: 'LOG_FIRST_N' macro redefined [-Wmacro-redefined]
245 | #define LOG_FIRST_N(severity, n) \
| ^
external/tsl/tsl/platform/default/logging.h:284:9: note: previous definition is here
284 | #define LOG_FIRST_N(severity, n) \
| ^
In file included from xla/service/gpu/ir_emitter_triton_mem_utils_test.cc:41:
In file included from ./xla/service/gpu/ir_emitter_triton.h:38:
In file included from ./xla/hlo/ir/hlo_computation.h:31:
external/com_google_absl/absl/log/log.h:253:9: warning: 'LOG_EVERY_POW_2' macro redefined [-Wmacro-redefined]
253 | #define LOG_EVERY_POW_2(severity) \
| ^
external/tsl/tsl/platform/default/logging.h:290:9: note: previous definition is here
290 | #define LOG_EVERY_POW_2(severity) \
| ^
In file included from xla/service/gpu/ir_emitter_triton_mem_utils_test.cc:41:
In file included from ./xla/service/gpu/ir_emitter_triton.h:38:
In file included from ./xla/hlo/ir/hlo_computation.h:31:
external/com_google_absl/absl/log/log.h:265:9: warning: 'LOG_EVERY_N_SEC' macro redefined [-Wmacro-redefined]
265 | #define LOG_EVERY_N_SEC(severity, n_seconds) \
| ^
external/tsl/tsl/platform/default/logging.h:300:9: note: previous definition is here
300 | #define LOG_EVERY_N_SEC(severity, n_seconds) \
| ^
In file included from xla/service/gpu/ir_emitter_triton_mem_utils_test.cc:46:
In file included from ./xla/tests/hlo_test_base.h:29:
In file included from ./xla/service/backend.h:29:
In file included from ./xla/service/compiler.h:36:
In file included from ./xla/service/buffer_assignment.h:38:
In file included from ./xla/service/memory_space_assignment/memory_space_assignment.h:188:
./xla/service/memory_space_assignment/cost_analysis.h:118:3: warning: explicitly defaulted default constructor is implicitly deleted [-Wdefaulted-function-deleted]
118 | HloCostAnalysisCosts() = default;
| ^
./xla/service/memory_space_assignment/cost_analysis.h:120:26: note: default constructor of 'HloCostAnalysisCosts' is implicitly deleted because field 'hlo_cost_analysis_' of reference type 'const HloCostAnalysis &' would not be initialized
120 | const HloCostAnalysis& hlo_cost_analysis_;
| ^
./xla/service/memory_space_assignment/cost_analysis.h:118:28: note: replace 'default' with 'delete'
118 | HloCostAnalysisCosts() = default;
| ^~~~~~~
| delete
xla/service/gpu/ir_emitter_triton_mem_utils_test.cc:48:10: fatal error: 'third_party/triton/include/triton/Dialect/Triton/IR/Dialect.h' file not found
48 | #include "third_party/triton/include/triton/Dialect/Triton/IR/Dialect.h"
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
6 warnings and 1 error generated.
INFO: Elapsed time: 4452.964s, Critical Path: 266.78s
INFO: 46179 processes: 20516 internal, 25661 linux-sandbox, 2 local.
FAILED: Build did NOT complete successfully
@sei-jgwohlbier can you please share your workaround?
- I am using commit 8ffe362
- CUDA 12.3.2, cuDNN 9, clang 17 (same as yours)
@neeldani @sei-jgwohlbier did you figure out what the problem was? I am running into the same problem.
I am on today's top of tree: 213d931a012a74c581cd0509f3c9391f453bb986
docker exec xla_gpu ./configure.py --backend=CUDA --nccl
docker exec xla_gpu bazel build --spawn_strategy=sandboxed //xla/... --config=cuda --config=monolithic
265 | #define LOG_EVERY_N_SEC(severity, n_seconds) \
| ^
external/tsl/tsl/platform/default/logging.h:300:9: note: previous definition is here
300 | #define LOG_EVERY_N_SEC(severity, n_seconds) \
| ^
xla/service/gpu/execution_stream_assignment_test.cc:100:30: error: no matching constructor for initialization of 'AsyncExecutionStreamIds' (aka 'xla::gpu::ExecutionStreamAssignment::AsyncExecutionStreamIds')
100 | IsOkAndHolds(AsyncExecutionStreamIds(
| ^
101 | /*source_stream_id=*/ExecutionStreamId(0),
| ~~~~~~~~~~~~~~~~~~~~~
102 | /*destination_stream_id=*/ExecutionStreamId(1))));
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
external/com_google_googletest/googlemock/include/gmock/gmock-matchers.h:5437:62: note: expanded from macro 'EXPECT_THAT'
5437 | ::testing::internal::MakePredicateFormatterFromMatcher(matcher), value)
| ^~~~~~~
external/com_google_googletest/googletest/include/gtest/gtest_pred_impl.h:109:23: note: expanded from macro 'EXPECT_PRED_FORMAT1'
109 | GTEST_PRED_FORMAT1_(pred_format, v1, GTEST_NONFATAL_FAILURE_)
| ^~~~~~~~~~~
external/com_google_googletest/googletest/include/gtest/gtest_pred_impl.h:100:17: note: expanded from macro 'GTEST_PRED_FORMAT1_'
100 | GTEST_ASSERT_(pred_format(#v1, v1), on_failure)
| ^~~~~~~~~~~
external/com_google_googletest/googletest/include/gtest/gtest_pred_impl.h:79:52: note: expanded from macro 'GTEST_ASSERT_'
79 | if (const ::testing::AssertionResult gtest_ar = (expression)) \
| ^~~~~~~~~~
./xla/service/gpu/execution_stream_assignment.h:52:10: note: candidate constructor (the implicit copy constructor) not viable: requires 1 argument, but 2 were provided
52 | struct AsyncExecutionStreamIds {
| ^~~~~~~~~~~~~~~~~~~~~~~
./xla/service/gpu/execution_stream_assignment.h:52:10: note: candidate constructor (the implicit move constructor) not viable: requires 1 argument, but 2 were provided
52 | struct AsyncExecutionStreamIds {
| ^~~~~~~~~~~~~~~~~~~~~~~
./xla/service/gpu/execution_stream_assignment.h:52:10: note: candidate constructor (the implicit default constructor) not viable: requires 0 arguments, but 2 were provided
xla/service/gpu/execution_stream_assignment_test.cc:107:30: error: no matching constructor for initialization of 'AsyncExecutionStreamIds' (aka 'xla::gpu::ExecutionStreamAssignment::AsyncExecutionStreamIds')
107 | IsOkAndHolds(AsyncExecutionStreamIds(
| ^
108 | /*source_stream_id=*/ExecutionStreamId(0),
| ~~~~~~~~~~~~~~~~~~~~~
109 | /*destination_stream_id=*/ExecutionStreamId(2))));
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
external/com_google_googletest/googlemock/include/gmock/gmock-matchers.h:5437:62: note: expanded from macro 'EXPECT_THAT'
5437 | ::testing::internal::MakePredicateFormatterFromMatcher(matcher), value)
| ^~~~~~~
external/com_google_googletest/googletest/include/gtest/gtest_pred_impl.h:109:23: note: expanded from macro 'EXPECT_PRED_FORMAT1'
109 | GTEST_PRED_FORMAT1_(pred_format, v1, GTEST_NONFATAL_FAILURE_)
| ^~~~~~~~~~~
external/com_google_googletest/googletest/include/gtest/gtest_pred_impl.h:100:17: note: expanded from macro 'GTEST_PRED_FORMAT1_'
100 | GTEST_ASSERT_(pred_format(#v1, v1), on_failure)
| ^~~~~~~~~~~
external/com_google_googletest/googletest/include/gtest/gtest_pred_impl.h:79:52: note: expanded from macro 'GTEST_ASSERT_'
79 | if (const ::testing::AssertionResult gtest_ar = (expression)) \
| ^~~~~~~~~~
./xla/service/gpu/execution_stream_assignment.h:52:10: note: candidate constructor (the implicit copy constructor) not viable: requires 1 argument, but 2 were provided
52 | struct AsyncExecutionStreamIds {
| ^~~~~~~~~~~~~~~~~~~~~~~
./xla/service/gpu/execution_stream_assignment.h:52:10: note: candidate constructor (the implicit move constructor) not viable: requires 1 argument, but 2 were provided
52 | struct AsyncExecutionStreamIds {
| ^~~~~~~~~~~~~~~~~~~~~~~
./xla/service/gpu/execution_stream_assignment.h:52:10: note: candidate constructor (the implicit default constructor) not viable: requires 0 arguments, but 2 were provided
5 warnings and 2 errors generated.
[41,612 / 44,077] Compiling xla/tests/reduce_window_test.cc; 28s processwrapper-sandbox ... (14 actions running)
INFO: Elapsed time: 47.853s, Critical Path: 29.29s
INFO: 19 processes: 17 internal, 2 processwrapper-sandbox.
FAILED: Build did NOT complete successfully
Building for GPU is turning out to be a very bad experience so far (I ran into all the reported issues on NCCL configs as well). Any help from the right folks would be super helpful. I am happy to draft a PR to update the README after a successful build. Thanks.