google / gematria
Machine learning for machine code.
License: Apache License 2.0
There are other ways to make syscalls on x86, including int 0x80 and sysenter, that we should probably also be sanitizing for security reasons.
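For reference, the encodings are syscall = 0F 05, sysenter = 0F 34, and int 0x80 = CD 80. A minimal sketch of a byte-level scan over a hex-encoded block (a heuristic only: these byte pairs can also occur inside immediates or displacements, so a real check should look at decoded instructions):

```python
# Byte patterns for the three x86 syscall entry instructions.
# NOTE: scanning raw bytes is only a heuristic; the same byte pairs can
# occur inside immediates, so a real check should decode instructions.
SYSCALL_PATTERNS = {
    bytes.fromhex("0f05"): "syscall",
    bytes.fromhex("0f34"): "sysenter",
    bytes.fromhex("cd80"): "int 0x80",
}

def find_syscall_bytes(block_hex: str) -> list[str]:
    """Returns the names of syscall-entry byte patterns found in a block."""
    code = bytes.fromhex(block_hex)
    return [name for pattern, name in SYSCALL_PATTERNS.items()
            if pattern in code]

# A block containing `mov eax, 60` followed by a raw `syscall`.
print(find_syscall_bytes("b83c0000000f05"))  # → ['syscall']
```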
We also need to make sure that we're handling all terminator instructions. We seem to be, but switching to MCInstrDesc::isTerminator probably makes a lot of sense and avoids us needing to handle the cases manually.
Title says it all. Reduce dependencies on Abseil libraries from the C++ code.
The current script in ./gematria/datasets/convert_bhive_to_exegesis_inputs.cc runs sequentially. This is a problem for using the Exegesis annotator, which isn't particularly fast. The work can easily be parallelized, as we don't care about the timings at all while running the annotations. This should be doable with some refactoring and use of LLVM's threading APIs.
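The actual fix would presumably use LLVM's threading APIs from the C++ tool; the intended pattern, sketched in Python with a hypothetical annotate_block stand-in for the slow Exegesis annotation step:

```python
from concurrent.futures import ThreadPoolExecutor

def annotate_block(block_hex: str) -> str:
    # Stand-in for the (slow) Exegesis annotation of one block; the real
    # work would run the annotator on `block_hex`.
    return block_hex.lower()

def annotate_all(blocks: list[str], max_workers: int = 8) -> list[str]:
    # Completion order doesn't matter for correctness (we don't time
    # anything here), but `map` conveniently preserves input order.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(annotate_block, blocks))

print(annotate_all(["AABB", "CCDD"]))  # → ['aabb', 'ccdd']
```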
After a while (around ~1000 blocks in my testing), the annotator begins to fail on every block with the following message:
Failed to find addresses for block '488B442410488B7808837C240C00': INTERNAL: Failed to create child process: Resource temporarily unavailable
Block disassembly:
movq 16(%rsp), %rax
movq 8(%rax), %rdi
cmpl $0, 12(%rsp)
This is presumably because the underlying Exegesis code is keeping child processes around (although I have yet to confirm that hypothesis). More debugging is needed.
As of now, the GRANITE model is basic-block-oriented (resp. trace-oriented), i.e. it doesn't use any information about code executed before or after the basic block. We believe that adding such information may provide additional context and improve prediction precision.
The GRANITE model can be extended to cover such context.
This modification will also require extending the data collection methodology to collect basic blocks and their throughputs together with the execution context.
Traceback:
Traceback (most recent call last):
File "/usr/local/lib/python3.11/dist-packages/tensorflow/python/client/session.py", line 1379, in _do_call
return fn(*args)
^^^^^^^^^
File "/usr/local/lib/python3.11/dist-packages/tensorflow/python/client/session.py", line 1362, in _run_fn
return self._call_tf_sessionrun(options, feed_dict, fetch_list,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/dist-packages/tensorflow/python/client/session.py", line 1455, in _call_tf_sessionrun
return tf_session.TF_SessionRun_wrapper(self._session, options, feed_dict,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
tensorflow.python.framework.errors_impl.InvalidArgumentError: indices[0] = 8 is not in [0, 8)
[[{{node encoder_1/edge_model/embed/embedding_lookup}}]]
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/com_google_gematria/gematria/granite/python/run_granite_model.py", line 109, in <module>
app.run(main)
File "/usr/local/lib/python3.11/dist-packages/absl/app.py", line 308, in run
_run_main(main, args)
File "/usr/local/lib/python3.11/dist-packages/absl/app.py", line 254, in _run_main
sys.exit(main(argv))
^^^^^^^^^^
File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/com_google_gematria/gematria/granite/python/run_granite_model.py", line 48, in main
main_function.run_gematria_model_from_command_line_flags(
File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/com_google_gematria/gematria/model/python/main_function.py", line 871, in run_gematria_model_from_command_line_flags
model.train(
File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/com_google_gematria/gematria/model/python/model_base.py", line 1535, in train
stats = run_one_epoch()
^^^^^^^^^^^^^^^
File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/com_google_gematria/gematria/model/python/model_base.py", line 1500, in run_one_epoch
return self.train_mini_batch(
^^^^^^^^^^^^^^^^^^^^^^
File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/com_google_gematria/gematria/model/python/model_base.py", line 1628, in train_mini_batch
return self.train_batch(sess, train_schedule)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/com_google_gematria/gematria/model/python/model_base.py", line 1590, in train_batch
(_, stats) = sess.run((self._train_step, stats_ops), feed_dict=schedule)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/dist-packages/tensorflow/python/training/monitored_session.py", line 778, in run
return self._sess.run(
^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/dist-packages/tensorflow/python/training/monitored_session.py", line 1307, in run
return self._sess.run(
^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/dist-packages/tensorflow/python/training/monitored_session.py", line 1397, in run
return self._sess.run(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/dist-packages/tensorflow/python/training/monitored_session.py", line 1464, in run
outputs = _WrappedSession.run(
^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/dist-packages/tensorflow/python/training/monitored_session.py", line 1228, in run
return self._sess.run(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/dist-packages/tensorflow/python/client/session.py", line 969, in run
result = self._run(None, fetches, feed_dict, options_ptr,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/dist-packages/tensorflow/python/client/session.py", line 1192, in _run
results = self._do_run(handle, final_targets, final_fetches,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/dist-packages/tensorflow/python/client/session.py", line 1372, in _do_run
return self._do_call(_run_fn, feeds, fetches, targets, options,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/dist-packages/tensorflow/python/client/session.py", line 1398, in _do_call
raise type(e)(node_def, op, message) # pylint: disable=no-value-for-parameter
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
tensorflow.python.framework.errors_impl.InvalidArgumentError: Graph execution error:
Detected at node 'encoder_1/edge_model/embed/embedding_lookup' defined at (most recent call last):
File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/com_google_gematria/gematria/granite/python/run_granite_model.py", line 109, in <module>
app.run(main)
File "/usr/local/lib/python3.11/dist-packages/absl/app.py", line 308, in run
_run_main(main, args)
File "/usr/local/lib/python3.11/dist-packages/absl/app.py", line 254, in _run_main
sys.exit(main(argv))
File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/com_google_gematria/gematria/granite/python/run_granite_model.py", line 48, in main
main_function.run_gematria_model_from_command_line_flags(
File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/com_google_gematria/gematria/model/python/main_function.py", line 803, in run_gematria_model_from_command_line_flags
model.initialize()
File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/com_google_gematria/gematria/model/python/model_base.py", line 391, in initialize
self._create_tf_graph()
File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/com_google_gematria/gematria/granite/python/graph_builder_model_base.py", line 170, in _create_tf_graph
super()._create_tf_graph()
File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/com_google_gematria/gematria/model/python/token_model.py", line 200, in _create_tf_graph
super()._create_tf_graph()
File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/com_google_gematria/gematria/granite/python/gnn_model_base.py", line 238, in _create_tf_graph
self._graphs_tuple_outputs = self._create_graph_network()
File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/com_google_gematria/gematria/granite/python/gnn_model_base.py", line 353, in _create_graph_network
graphs_tuple = layer.module(graphs_tuple)
File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/sonnet_repo/sonnet/python/modules/base.py", line 397, in __call__
return self._call(*args, **kwargs)
File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/sonnet_repo/sonnet/python/modules/base.py", line 419, in _call
outputs, subgraph_name_scope = self._template(*args, **kwargs)
File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/sonnet_repo/sonnet/python/modules/base.py", line 227, in _build_wrapper
output = self._build(*args, **kwargs)
File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/graph_nets_repo/graph_nets/modules.py", line 409, in _build
edges=self._edge_model(graph.edges, **edge_model_kwargs),
File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/sonnet_repo/sonnet/python/modules/base.py", line 397, in __call__
return self._call(*args, **kwargs)
File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/sonnet_repo/sonnet/python/modules/base.py", line 419, in _call
outputs, subgraph_name_scope = self._template(*args, **kwargs)
File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/sonnet_repo/sonnet/python/modules/base.py", line 227, in _build_wrapper
output = self._build(*args, **kwargs)
File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/graph_nets_repo/graph_nets/_base.py", line 112, in _build
return self._model(*args, **kwargs)
File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/sonnet_repo/sonnet/python/modules/base.py", line 397, in __call__
return self._call(*args, **kwargs)
File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/sonnet_repo/sonnet/python/modules/base.py", line 419, in _call
outputs, subgraph_name_scope = self._template(*args, **kwargs)
File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/sonnet_repo/sonnet/python/modules/base.py", line 227, in _build_wrapper
output = self._build(*args, **kwargs)
File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/sonnet_repo/sonnet/python/modules/embed.py", line 182, in _build
return tf.nn.embedding_lookup(embeddings, ids, name="embedding_lookup")
Node: 'encoder_1/edge_model/embed/embedding_lookup'
indices[0] = 8 is not in [0, 8)
[[{{node encoder_1/edge_model/embed/embedding_lookup}}]]
Original stack trace for 'encoder_1/edge_model/embed/embedding_lookup':
File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/com_google_gematria/gematria/granite/python/run_granite_model.py", line 109, in <module>
app.run(main)
File "/usr/local/lib/python3.11/dist-packages/absl/app.py", line 308, in run
_run_main(main, args)
File "/usr/local/lib/python3.11/dist-packages/absl/app.py", line 254, in _run_main
sys.exit(main(argv))
File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/com_google_gematria/gematria/granite/python/run_granite_model.py", line 48, in main
main_function.run_gematria_model_from_command_line_flags(
File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/com_google_gematria/gematria/model/python/main_function.py", line 803, in run_gematria_model_from_command_line_flags
model.initialize()
File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/com_google_gematria/gematria/model/python/model_base.py", line 391, in initialize
self._create_tf_graph()
File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/com_google_gematria/gematria/granite/python/graph_builder_model_base.py", line 170, in _create_tf_graph
super()._create_tf_graph()
File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/com_google_gematria/gematria/model/python/token_model.py", line 200, in _create_tf_graph
super()._create_tf_graph()
File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/com_google_gematria/gematria/granite/python/gnn_model_base.py", line 238, in _create_tf_graph
self._graphs_tuple_outputs = self._create_graph_network()
File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/com_google_gematria/gematria/granite/python/gnn_model_base.py", line 353, in _create_graph_network
graphs_tuple = layer.module(graphs_tuple)
File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/sonnet_repo/sonnet/python/modules/base.py", line 397, in __call__
return self._call(*args, **kwargs)
File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/sonnet_repo/sonnet/python/modules/base.py", line 419, in _call
outputs, subgraph_name_scope = self._template(*args, **kwargs)
File "/usr/local/lib/python3.11/dist-packages/tensorflow/python/ops/template.py", line 398, in __call__
return self._call_func(args, kwargs)
File "/usr/local/lib/python3.11/dist-packages/tensorflow/python/ops/template.py", line 368, in _call_func
result = self._func(*args, **kwargs)
File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/sonnet_repo/sonnet/python/modules/base.py", line 227, in _build_wrapper
output = self._build(*args, **kwargs)
File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/graph_nets_repo/graph_nets/modules.py", line 409, in _build
edges=self._edge_model(graph.edges, **edge_model_kwargs),
File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/sonnet_repo/sonnet/python/modules/base.py", line 397, in __call__
return self._call(*args, **kwargs)
File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/sonnet_repo/sonnet/python/modules/base.py", line 419, in _call
outputs, subgraph_name_scope = self._template(*args, **kwargs)
File "/usr/local/lib/python3.11/dist-packages/tensorflow/python/ops/template.py", line 398, in __call__
return self._call_func(args, kwargs)
File "/usr/local/lib/python3.11/dist-packages/tensorflow/python/ops/template.py", line 368, in _call_func
result = self._func(*args, **kwargs)
File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/sonnet_repo/sonnet/python/modules/base.py", line 227, in _build_wrapper
output = self._build(*args, **kwargs)
File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/graph_nets_repo/graph_nets/_base.py", line 112, in _build
return self._model(*args, **kwargs)
File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/sonnet_repo/sonnet/python/modules/base.py", line 397, in __call__
return self._call(*args, **kwargs)
File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/sonnet_repo/sonnet/python/modules/base.py", line 419, in _call
outputs, subgraph_name_scope = self._template(*args, **kwargs)
File "/usr/local/lib/python3.11/dist-packages/tensorflow/python/ops/template.py", line 398, in __call__
return self._call_func(args, kwargs)
File "/usr/local/lib/python3.11/dist-packages/tensorflow/python/ops/template.py", line 368, in _call_func
result = self._func(*args, **kwargs)
File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/sonnet_repo/sonnet/python/modules/base.py", line 227, in _build_wrapper
output = self._build(*args, **kwargs)
File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/sonnet_repo/sonnet/python/modules/embed.py", line 182, in _build
return tf.nn.embedding_lookup(embeddings, ids, name="embedding_lookup")
File "/usr/local/lib/python3.11/dist-packages/tensorflow/python/util/traceback_utils.py", line 150, in error_handler
return fn(*args, **kwargs)
File "/usr/local/lib/python3.11/dist-packages/tensorflow/python/util/dispatch.py", line 1176, in op_dispatch_handler
return dispatch_target(*args, **kwargs)
File "/usr/local/lib/python3.11/dist-packages/tensorflow/python/ops/embedding_ops.py", line 326, in embedding_lookup
return _embedding_lookup_and_transform(
File "/usr/local/lib/python3.11/dist-packages/tensorflow/python/ops/embedding_ops.py", line 145, in _embedding_lookup_and_transform
array_ops.gather(params[0], ids, name=name), ids, max_norm)
File "/usr/local/lib/python3.11/dist-packages/tensorflow/python/util/traceback_utils.py", line 150, in error_handler
return fn(*args, **kwargs)
File "/usr/local/lib/python3.11/dist-packages/tensorflow/python/util/dispatch.py", line 1176, in op_dispatch_handler
return dispatch_target(*args, **kwargs)
File "/usr/local/lib/python3.11/dist-packages/tensorflow/python/util/deprecation.py", line 576, in new_func
return func(*args, **kwargs)
File "/usr/local/lib/python3.11/dist-packages/tensorflow/python/ops/array_ops.py", line 5138, in gather
return gen_array_ops.gather_v2(params, indices, axis, name=name)
File "/usr/local/lib/python3.11/dist-packages/tensorflow/python/ops/gen_array_ops.py", line 3982, in gather_v2
_, _, _op, _outputs = _op_def_library._apply_op_helper(
File "/usr/local/lib/python3.11/dist-packages/tensorflow/python/framework/op_def_library.py", line 795, in _apply_op_helper
op = g._create_op_internal(op_type_name, inputs, dtypes=None,
File "/usr/local/lib/python3.11/dist-packages/tensorflow/python/framework/ops.py", line 3381, in _create_op_internal
ret = Operation.from_node_def(
With the following command line invocation:
bazel run //gematria/granite/python:run_granite_model -- --gematria_action=train --gematria_checkpoint_dir=/tmp/test_model/ --gematria_learning_rate=0.001 --gematria_loss_type=mean_absolute_error --gematria_training_num_epochs=100000 --gematria_tokens_file=/data/vocab_10u7.txt --gematria_input_file=/tmp/test.tfrecord --gematria_max_blocks_in_batch=100 --gematria_learning_rate_schedule=cosine --gematria_decay_steps=100000
With the tfrecord dataset produced from the following csv:
f3b801000000,1
With the patch from #107 applied.
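The failing lookup, indices[0] = 8 is not in [0, 8), means the edge-type embedding table has 8 rows but the batch contains an id of 8, one past the end; this typically indicates a mismatch between the tokens/edge types the model was built with and the ones present in the dataset. The failure mode in isolation (a plain-Python stand-in for the TF gather):

```python
# An "embedding table" with 8 rows, so the valid ids are 0..7.
EMBEDDING_ROWS = 8
embeddings = [[float(i)] * 4 for i in range(EMBEDDING_ROWS)]

def lookup(ids):
    """Minimal stand-in for tf.nn.embedding_lookup's bounds check."""
    for i in ids:
        if not 0 <= i < EMBEDDING_ROWS:
            # The condition TF reports as "indices[0] = 8 is not in [0, 8)":
            # an id one past the end of the table, usually meaning the
            # dataset contains a token/edge type the model wasn't built with.
            raise IndexError(f"indices[0] = {i} is not in [0, {EMBEDDING_ROWS})")
    return [embeddings[i] for i in ids]

lookup([0, 3, 7])  # fine: all ids within the table
# lookup([8]) would raise, matching the error in the traceback above.
```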
Hello, after building tflite, there is an error with llvm-cm:
(env) $ ninja check-llvm-tools-llvm-cm
[0/1] Running llvm-cm tests
llvm-lit: /home/hrong1/llvm-src/llvm-project/llvm/utils/lit/lit/llvm/config.py:502: note: using yaml2obj: /home/hrong1/llvm-src/cmake-build/bin/yaml2obj
llvm-lit: /home/hrong1/llvm-src/llvm-project/llvm/utils/lit/lit/llvm/config.py:502: note: using llvm-cm: /home/hrong1/llvm-src/cmake-build/bin/llvm-cm
llvm-lit: /home/hrong1/llvm-src/llvm-project/llvm/utils/lit/lit/llvm/config.py:502: note: using split-file: /home/hrong1/llvm-src/cmake-build/bin/split-file
llvm-lit: /home/hrong1/llvm-src/llvm-project/llvm/utils/lit/lit/llvm/config.py:502: note: using llvm-mc: /home/hrong1/llvm-src/cmake-build/bin/llvm-mc
FAIL: llvm-cm :: X86/multi_func.s (11 of 11)
******************** TEST 'llvm-cm :: X86/multi_func.s' FAILED ********************
Exit Code: 1
Command Output (stdout):
--
# RUN: at line 2
/home/hrong1/llvm-src/cmake-build/bin/llvm-mc -o /home/hrong1/llvm-src/cmake-build/X86/Output/multi_func.s.tmp.o --filetype=obj -triple=x86_64-unknown-linux-gnu /home/hrong1/gematria/llvm_cm/test/X86/multi_func.s
# executed command: /home/hrong1/llvm-src/cmake-build/bin/llvm-mc -o /home/hrong1/llvm-src/cmake-build/X86/Output/multi_func.s.tmp.o --filetype=obj -triple=x86_64-unknown-linux-gnu /home/hrong1/gematria/llvm_cm/test/X86/multi_func.s
# RUN: at line 3
/home/hrong1/llvm-src/cmake-build/bin/llvm-cm /home/hrong1/llvm-src/cmake-build/X86/Output/multi_func.s.tmp.o -csv=/home/hrong1/gematria/llvm_cm/test/X86/Inputs/multi-func.csv -granite_model=/home/hrong1/gematria/llvm_cm/test/X86/Inputs/gb-token-mit-2022_12_02.tflite -evaluator=granite | /home/hrong1/llvm-src/cmake-build/bin/FileCheck /home/hrong1/gematria/llvm_cm/test/X86/multi_func.s
# executed command: /home/hrong1/llvm-src/cmake-build/bin/llvm-cm /home/hrong1/llvm-src/cmake-build/X86/Output/multi_func.s.tmp.o -csv=/home/hrong1/gematria/llvm_cm/test/X86/Inputs/multi-func.csv -granite_model=/home/hrong1/gematria/llvm_cm/test/X86/Inputs/gb-token-mit-2022_12_02.tflite -evaluator=granite
# .---command stderr------------
# | Unexpected node token: 'RIP'
# `-----------------------------
# executed command: /home/hrong1/llvm-src/cmake-build/bin/FileCheck /home/hrong1/gematria/llvm_cm/test/X86/multi_func.s
# .---command stderr------------
# | /home/hrong1/gematria/llvm_cm/test/X86/multi_func.s:8:15: error: CHECK-NEXT: expected string not found in input
# | # CHECK-NEXT: Calculated Frequency: 8.342712e+03
# | ^
# | <stdin>:1:11: note: scanning from here
# | <reverse>:
# | ^
# | <stdin>:2:1: note: possible intended match here
# | Calculated Frequency: 8.342695e+03
# | ^
# |
# | Input file: <stdin>
# | Check file: /home/hrong1/gematria/llvm_cm/test/X86/multi_func.s
# |
# | -dump-input=help explains the following input dump.
# |
# | Input was:
# | <<<<<<
# | 1: <reverse>:
# | next:8'0 X~ error: no match found
# | 2: Calculated Frequency: 8.342695e+03
# | next:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# | next:8'1 ? possible intended match
# | 3: <tallestBillboard>:
# | next:8'0 ~~~~~~~~~~~~~~~~~~~~~
# | 4: Calculated Frequency: 2.928508e+05
# | next:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# | 5: <isMatch>:
# | next:8'0 ~~~~~~~~~~~~
# | 6: Calculated Frequency: 8.204262e+02
# | next:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# | 7: <bubbleSort>:
# | next:8'0 ~~~~~~~~~~~~~~~
# | .
# | .
# | .
# | >>>>>>
# `-----------------------------
# error: command failed with exit status: 1
--
********************
********************
Failed Tests (1):
llvm-cm :: X86/multi_func.s
Testing Time: 0.59s
Total Discovered Tests: 11
Passed: 10 (90.91%)
Failed: 1 (9.09%)
FAILED: tools/gematria/llvm_cm/CMakeFiles/check-llvm-tools-llvm-cm /home/hrong1/llvm-src/cmake-build/tools/gematria/llvm_cm/CMakeFiles/check-llvm-tools-llvm-cm
cd /home/hrong1/llvm-src/cmake-build/tools/gematria/llvm_cm && /home/hrong1/gematria/env/bin/python3 /home/hrong1/llvm-src/cmake-build/./bin/llvm-lit -sv /home/hrong1/llvm-src/cmake-build/tools/gematria/llvm_cm
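For what it's worth, the mismatch is only in the last digits: FileCheck expected 8.342712e+03 and got 8.342695e+03, a relative difference of roughly 2e-6. That pattern suggests floating-point variation between TFLite builds or hardware rather than a functional regression. A quick check of how close the two values are:

```python
import math

expected = 8.342712e+03  # value hard-coded in multi_func.s
actual = 8.342695e+03    # value produced by this llvm-cm build

rel_diff = abs(expected - actual) / abs(expected)
print(f"{rel_diff:.2e}")  # about 2e-06
assert math.isclose(expected, actual, rel_tol=1e-5)
```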
Hi, after building gematria (bazel build ...), I ran bazel test ..., but it failed at bhive_importer_test.test_x86_parse_csv_line. The error log (e_importer_test/test.log) indicates:
exec ${PAGER:-/usr/bin/less} "$0" || exit 1
Executing tests from //gematria/datasets/python:bhive_importer_test
-----------------------------------------------------------------------------
Running tests under Python 3.10.0: /home/gematria/gematria_env/bin/python3
[ RUN ] BhiveImporterTest.test_x86_basic_block_proto_from_bytes
[ OK ] BhiveImporterTest.test_x86_basic_block_proto_from_bytes
[ RUN ] BhiveImporterTest.test_x86_basic_block_proto_from_hex
[ OK ] BhiveImporterTest.test_x86_basic_block_proto_from_hex
[ RUN ] BhiveImporterTest.test_x86_nonstandard_columns
[ OK ] BhiveImporterTest.test_x86_nonstandard_columns
[ RUN ] BhiveImporterTest.test_x86_parse_csv_line
[ FAILED ] BhiveImporterTest.test_x86_parse_csv_line
======================================================================
ERROR: test_x86_parse_csv_line (__main__.BhiveImporterTest)
BhiveImporterTest.test_x86_parse_csv_line
----------------------------------------------------------------------
Traceback (most recent call last):
File "/home/lukez/.cache/bazel/_bazel_lukez/6ec059981b607312b48b2c4811597fe7/sandbox/linux-sandbox/6/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/datasets/python/bhive_importer_test.runfiles/com_google_gematria/gematria/datasets/python/bhive_importer_test.py", line 203, in test_x86_parse_csv_line
block_proto = importer.basic_block_with_throughput_proto_from_csv_line(
TypeError: basic_block_with_throughput_proto_from_csv_line(): incompatible function arguments. The following argument types are supported:
1. (self: gematria.datasets.python.bhive_importer.BHiveImporter, source_name: str, line: str, machine_code_hex_column_index: int, throughput_column_index: int, throughput_scaling: float = 1.0, base_address: int = 0) -> gematria::BasicBlockWithThroughputProto
Invoked with: <gematria.datasets.python.bhive_importer.BHiveImporter object at 0x7ffb153321f0>; kwargs: source_name='test: made-up', line='4829d38b44246c8b54246848c1fb034829d04839c3,10', base_address=600, throughput_scaling=2.0
----------------------------------------------------------------------
Ran 4 tests in 0.069s
FAILED (errors=1)
Do you know how to address this issue? The machine I am using is an Intel Broadwell x86 machine.
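Judging by the supported signature printed in the TypeError, the test's call is missing the required machine_code_hex_column_index and throughput_column_index arguments. An illustrative Python stub (not the real pybind11 binding) that mirrors that signature and reproduces the mismatch:

```python
# Illustrative stub only; the real method lives on
# gematria.datasets.python.bhive_importer.BHiveImporter. The signature is
# copied from the TypeError above and requires the two column indices.
def basic_block_with_throughput_proto_from_csv_line(
    source_name: str,
    line: str,
    machine_code_hex_column_index: int,
    throughput_column_index: int,
    throughput_scaling: float = 1.0,
    base_address: int = 0,
):
    fields = line.split(",")
    return fields[machine_code_hex_column_index], fields[throughput_column_index]

# The failing call omits the two required column indices and raises TypeError:
try:
    basic_block_with_throughput_proto_from_csv_line(
        source_name="test: made-up",
        line="4829d38b44246c8b54246848c1fb034829d04839c3,10",
        base_address=600,
        throughput_scaling=2.0,
    )
except TypeError:
    print("TypeError, as in the test log")

# Supplying them makes the call well-formed:
basic_block_with_throughput_proto_from_csv_line(
    source_name="test: made-up",
    line="4829d38b44246c8b54246848c1fb034829d04839c3,10",
    machine_code_hex_column_index=0,
    throughput_column_index=1,
)
```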
Main goals are to:
We can start with @boomanaiden154's very simple decompression benchmark and then @virajbshah's cache-miss benchmarks - totally fine if models overfit initially.
Hello, I'm trying to train the GRANITE model and get an error about mismatched tensors:
(env) ~/gematria$ env USE_BAZEL_VERSION=6.4.0 ../bazelisk-linux-amd64 run //gematria/granite/python:run_granite_model -- --gematria_action=train --gematria_checkpoint_dir=/tmp/test_model/ --gematria_training_num_epochs=10 --gematria_input_file=/tmp/basic_blocks_with_throughput.tfrecord --gematria_tokens_file=/tmp/tokens.txt
INFO: Analyzed target //gematria/granite/python:run_granite_model (0 packages loaded, 85 targets configured).
INFO: Found 1 target...
Target //gematria/granite/python:run_granite_model up-to-date:
bazel-bin/gematria/granite/python/run_granite_model
INFO: Elapsed time: 0.168s, Critical Path: 0.00s
INFO: 1 process: 1 internal.
INFO: Build completed successfully, 1 total action
INFO: Running command line: bazel-bin/gematria/granite/python/run_granite_model '--gematria_action=train' '--gematria_checkpoint_dir=/tmp/test_model/' '--gematria_training_num_epochs=10' '--gematria_input_file=/tmp/basic_blocks_with_throughput.tfrecord' '--gematria_tokens_file=/tmp/tokens.txt'
2024-06-07 11:42:19.573765: I tensorflow/core/util/port.cc:113] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2024-06-07 11:42:19.575373: I external/local_tsl/tsl/cuda/cudart_stub.cc:31] Could not find cuda drivers on your machine, GPU will not be used.
2024-06-07 11:42:19.603053: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:9261] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-06-07 11:42:19.603109: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:607] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-06-07 11:42:19.604100: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1515] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2024-06-07 11:42:19.609623: I external/local_tsl/tsl/cuda/cudart_stub.cc:31] Could not find cuda drivers on your machine, GPU will not be used.
2024-06-07 11:42:19.609868: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-06-07 11:42:20.147141: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
WARNING:tensorflow:From /home/hrong1/gematria/env/lib/python3.11/site-packages/tensorflow/python/compat/v2_compat.py:108: disable_resource_variables (from tensorflow.python.ops.variable_scope) is deprecated and will be removed in a future version.
Instructions for updating:
non-resource variables are not supported in the long term
W0607 11:42:21.881521 139849213219904 model_base.py:683] ModelBase._output_tensor has invalid name. Expected ModelBase.output_tensor, found concat/concat:0.
W0607 11:42:22.258888 139849213219904 model_base.py:902] ModelBase._synchronous_training is True with a single worker.
I0607 11:42:23.650717 139849213219904 timer.py:61] Creating model: TokenGraphBuilderModel: 2.525757s
I0607 11:42:23.650949 139849213219904 timer.py:61] Loading basic blocks: 0.000080s
WARNING:tensorflow:From /home/hrong1/.cache/bazel/_bazel_hrong1/32246067180bfaeac7e17e4449bcdc84/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/com_google_gematria/gematria/model/python/main_function.py:545: StopAtStepHook.__init__ (from tensorflow.python.training.basic_session_run_hooks) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.keras instead.
W0607 11:42:23.651067 139849213219904 deprecation.py:50] From /home/hrong1/.cache/bazel/_bazel_hrong1/32246067180bfaeac7e17e4449bcdc84/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/com_google_gematria/gematria/model/python/main_function.py:545: StopAtStepHook.__init__ (from tensorflow.python.training.basic_session_run_hooks) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.keras instead.
WARNING:tensorflow:From /home/hrong1/gematria/env/lib/python3.11/site-packages/tensorflow/python/training/monitored_session.py:579: StepCounterHook.__init__ (from tensorflow.python.training.basic_session_run_hooks) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.keras instead.
W0607 11:42:23.882867 139849213219904 deprecation.py:50] From /home/hrong1/gematria/env/lib/python3.11/site-packages/tensorflow/python/training/monitored_session.py:579: StepCounterHook.__init__ (from tensorflow.python.training.basic_session_run_hooks) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.keras instead.
WARNING:tensorflow:From /home/hrong1/gematria/env/lib/python3.11/site-packages/tensorflow/python/training/basic_session_run_hooks.py:686: SecondOrStepTimer.__init__ (from tensorflow.python.training.basic_session_run_hooks) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.keras instead.
W0607 11:42:23.883046 139849213219904 deprecation.py:50] From /home/hrong1/gematria/env/lib/python3.11/site-packages/tensorflow/python/training/basic_session_run_hooks.py:686: SecondOrStepTimer.__init__ (from tensorflow.python.training.basic_session_run_hooks) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.keras instead.
WARNING:tensorflow:From /home/hrong1/gematria/env/lib/python3.11/site-packages/tensorflow/python/training/monitored_session.py:586: SummarySaverHook.__init__ (from tensorflow.python.training.basic_session_run_hooks) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.keras instead.
W0607 11:42:23.883132 139849213219904 deprecation.py:50] From /home/hrong1/gematria/env/lib/python3.11/site-packages/tensorflow/python/training/monitored_session.py:586: SummarySaverHook.__init__ (from tensorflow.python.training.basic_session_run_hooks) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.keras instead.
WARNING:tensorflow:From /home/hrong1/gematria/env/lib/python3.11/site-packages/tensorflow/python/training/monitored_session.py:597: CheckpointSaverHook.__init__ (from tensorflow.python.training.basic_session_run_hooks) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.keras instead.
W0607 11:42:23.883212 139849213219904 deprecation.py:50] From /home/hrong1/gematria/env/lib/python3.11/site-packages/tensorflow/python/training/monitored_session.py:597: CheckpointSaverHook.__init__ (from tensorflow.python.training.basic_session_run_hooks) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.keras instead.
INFO:tensorflow:Create CheckpointSaverHook.
I0607 11:42:23.883271 139849213219904 basic_session_run_hooks.py:557] Create CheckpointSaverHook.
INFO:tensorflow:Graph was finalized.
I0607 11:42:25.717828 139849213219904 monitored_session.py:240] Graph was finalized.
INFO:tensorflow:Restoring parameters from /tmp/test_model/model.ckpt-0
I0607 11:42:25.724106 139849213219904 saver.py:1413] Restoring parameters from /tmp/test_model/model.ckpt-0
2024-06-07 11:42:25.754281: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:388] MLIR V1 optimization pass is not enabled
Traceback (most recent call last):
File "/home/hrong1/gematria/env/lib/python3.11/site-packages/tensorflow/python/client/session.py", line 1402, in _do_call
return fn(*args)
^^^^^^^^^
File "/home/hrong1/gematria/env/lib/python3.11/site-packages/tensorflow/python/client/session.py", line 1385, in _run_fn
return self._call_tf_sessionrun(options, feed_dict, fetch_list,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/hrong1/gematria/env/lib/python3.11/site-packages/tensorflow/python/client/session.py", line 1478, in _call_tf_sessionrun
return tf_session.TF_SessionRun_wrapper(self._session, options, feed_dict,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
tensorflow.python.framework.errors_impl.InvalidArgumentError: Assign requires shapes of both tensors to match. lhs shape= [53,128] rhs shape= [9,128]
[[{{node save/Assign_12}}]]
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/hrong1/gematria/env/lib/python3.11/site-packages/tensorflow/python/training/saver.py", line 1418, in restore
sess.run(self.saver_def.restore_op_name,
File "/home/hrong1/gematria/env/lib/python3.11/site-packages/tensorflow/python/client/session.py", line 972, in run
result = self._run(None, fetches, feed_dict, options_ptr,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/hrong1/gematria/env/lib/python3.11/site-packages/tensorflow/python/client/session.py", line 1215, in _run
results = self._do_run(handle, final_targets, final_fetches,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/hrong1/gematria/env/lib/python3.11/site-packages/tensorflow/python/client/session.py", line 1395, in _do_run
return self._do_call(_run_fn, feeds, fetches, targets, options,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/hrong1/gematria/env/lib/python3.11/site-packages/tensorflow/python/client/session.py", line 1421, in _do_call
raise type(e)(node_def, op, message) # pylint: disable=no-value-for-parameter
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
tensorflow.python.framework.errors_impl.InvalidArgumentError: Graph execution error:
Detected at node 'save/Assign_12' defined at (most recent call last):
File "/home/hrong1/.cache/bazel/_bazel_hrong1/32246067180bfaeac7e17e4449bcdc84/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/com_google_gematria/gematria/granite/python/run_granite_model.py", line 109, in <module>
File "/home/hrong1/gematria/env/lib/python3.11/site-packages/absl/app.py", line 308, in run
File "/home/hrong1/gematria/env/lib/python3.11/site-packages/absl/app.py", line 254, in _run_main
File "/home/hrong1/.cache/bazel/_bazel_hrong1/32246067180bfaeac7e17e4449bcdc84/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/com_google_gematria/gematria/granite/python/run_granite_model.py", line 48, in main
File "/home/hrong1/.cache/bazel/_bazel_hrong1/32246067180bfaeac7e17e4449bcdc84/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/com_google_gematria/gematria/model/python/main_function.py", line 893, in run_gematria_model_from_command_line_flags
File "/home/hrong1/.cache/bazel/_bazel_hrong1/32246067180bfaeac7e17e4449bcdc84/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/com_google_gematria/gematria/model/python/main_function.py", line 565, in _monitored_training_session_from_flags
Node: 'save/Assign_12'
Assign requires shapes of both tensors to match. lhs shape= [53,128] rhs shape= [9,128]
[[{{node save/Assign_12}}]]
Original stack trace for 'save/Assign_12':
File "/home/hrong1/.cache/bazel/_bazel_hrong1/32246067180bfaeac7e17e4449bcdc84/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/com_google_gematria/gematria/granite/python/run_granite_model.py", line 109, in <module>
File "/home/hrong1/gematria/env/lib/python3.11/site-packages/absl/app.py", line 308, in run
File "/home/hrong1/gematria/env/lib/python3.11/site-packages/absl/app.py", line 254, in _run_main
File "/home/hrong1/.cache/bazel/_bazel_hrong1/32246067180bfaeac7e17e4449bcdc84/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/com_google_gematria/gematria/granite/python/run_granite_model.py", line 48, in main
File "/home/hrong1/.cache/bazel/_bazel_hrong1/32246067180bfaeac7e17e4449bcdc84/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/com_google_gematria/gematria/model/python/main_function.py", line 893, in run_gematria_model_from_command_line_flags
File "/home/hrong1/.cache/bazel/_bazel_hrong1/32246067180bfaeac7e17e4449bcdc84/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/com_google_gematria/gematria/model/python/main_function.py", line 565, in _monitored_training_session_from_flags
File "/home/hrong1/gematria/env/lib/python3.11/site-packages/tensorflow/python/training/saver.py", line 934, in __init__
File "/home/hrong1/gematria/env/lib/python3.11/site-packages/tensorflow/python/training/saver.py", line 946, in build
File "/home/hrong1/gematria/env/lib/python3.11/site-packages/tensorflow/python/training/saver.py", line 974, in _build
File "/home/hrong1/gematria/env/lib/python3.11/site-packages/tensorflow/python/training/saver.py", line 543, in _build_internal
File "/home/hrong1/gematria/env/lib/python3.11/site-packages/tensorflow/python/training/saver.py", line 383, in _AddRestoreOps
File "/home/hrong1/gematria/env/lib/python3.11/site-packages/tensorflow/python/training/saving/saveable_object_util.py", line 86, in restore
File "/home/hrong1/gematria/env/lib/python3.11/site-packages/tensorflow/python/ops/state_ops.py", line 353, in assign
File "/home/hrong1/gematria/env/lib/python3.11/site-packages/tensorflow/python/ops/gen_state_ops.py", line 61, in assign
File "/home/hrong1/gematria/env/lib/python3.11/site-packages/tensorflow/python/framework/op_def_library.py", line 796, in _apply_op_helper
File "/home/hrong1/gematria/env/lib/python3.11/site-packages/tensorflow/python/framework/ops.py", line 2652, in _create_op_internal
File "/home/hrong1/gematria/env/lib/python3.11/site-packages/tensorflow/python/framework/ops.py", line 1160, in from_node_def
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/hrong1/.cache/bazel/_bazel_hrong1/32246067180bfaeac7e17e4449bcdc84/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/com_google_gematria/gematria/granite/python/run_granite_model.py", line 109, in <module>
app.run(main)
File "/home/hrong1/gematria/env/lib/python3.11/site-packages/absl/app.py", line 308, in run
_run_main(main, args)
File "/home/hrong1/gematria/env/lib/python3.11/site-packages/absl/app.py", line 254, in _run_main
sys.exit(main(argv))
^^^^^^^^^^
File "/home/hrong1/.cache/bazel/_bazel_hrong1/32246067180bfaeac7e17e4449bcdc84/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/com_google_gematria/gematria/granite/python/run_granite_model.py", line 48, in main
main_function.run_gematria_model_from_command_line_flags(
File "/home/hrong1/.cache/bazel/_bazel_hrong1/32246067180bfaeac7e17e4449bcdc84/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/com_google_gematria/gematria/model/python/main_function.py", line 893, in run_gematria_model_from_command_line_flags
session = _monitored_training_session_from_flags(model, is_chief)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/hrong1/.cache/bazel/_bazel_hrong1/32246067180bfaeac7e17e4449bcdc84/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/com_google_gematria/gematria/model/python/main_function.py", line 571, in _monitored_training_session_from_flags
return tf.train.MonitoredTrainingSession(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/hrong1/gematria/env/lib/python3.11/site-packages/tensorflow/python/training/monitored_session.py", line 606, in MonitoredTrainingSession
return MonitoredSession(
^^^^^^^^^^^^^^^^^
File "/home/hrong1/gematria/env/lib/python3.11/site-packages/tensorflow/python/training/monitored_session.py", line 1050, in __init__
super(MonitoredSession, self).__init__(
File "/home/hrong1/gematria/env/lib/python3.11/site-packages/tensorflow/python/training/monitored_session.py", line 753, in __init__
self._sess = _RecoverableSession(self._coordinated_creator)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/hrong1/gematria/env/lib/python3.11/site-packages/tensorflow/python/training/monitored_session.py", line 1259, in __init__
_WrappedSession.__init__(self, self._create_session())
^^^^^^^^^^^^^^^^^^^^^^
File "/home/hrong1/gematria/env/lib/python3.11/site-packages/tensorflow/python/training/monitored_session.py", line 1264, in _create_session
return self._sess_creator.create_session()
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/hrong1/gematria/env/lib/python3.11/site-packages/tensorflow/python/training/monitored_session.py", line 906, in create_session
self.tf_sess = self._session_creator.create_session()
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/hrong1/gematria/env/lib/python3.11/site-packages/tensorflow/python/training/monitored_session.py", line 665, in create_session
return self._get_session_manager().prepare_session(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/hrong1/gematria/env/lib/python3.11/site-packages/tensorflow/python/training/session_manager.py", line 320, in prepare_session
sess, is_loaded_from_checkpoint = self._restore_checkpoint(
^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/hrong1/gematria/env/lib/python3.11/site-packages/tensorflow/python/training/session_manager.py", line 254, in _restore_checkpoint
_restore_checkpoint_and_maybe_run_saved_model_initializers(
File "/home/hrong1/gematria/env/lib/python3.11/site-packages/tensorflow/python/training/session_manager.py", line 71, in _restore_checkpoint_and_maybe_run_saved_model_initializers
saver.restore(sess, path)
File "/home/hrong1/gematria/env/lib/python3.11/site-packages/tensorflow/python/training/saver.py", line 1454, in restore
raise _wrap_restore_error_with_msg(
tensorflow.python.framework.errors_impl.InvalidArgumentError: Restoring from checkpoint failed. This is most likely due to a mismatch between the current graph and the graph from the checkpoint. Please ensure that you have not altered the graph expected based on the checkpoint. Original error:
Graph execution error:
Detected at node 'save/Assign_12' defined at (most recent call last):
File "/home/hrong1/.cache/bazel/_bazel_hrong1/32246067180bfaeac7e17e4449bcdc84/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/com_google_gematria/gematria/granite/python/run_granite_model.py", line 109, in <module>
File "/home/hrong1/gematria/env/lib/python3.11/site-packages/absl/app.py", line 308, in run
File "/home/hrong1/gematria/env/lib/python3.11/site-packages/absl/app.py", line 254, in _run_main
File "/home/hrong1/.cache/bazel/_bazel_hrong1/32246067180bfaeac7e17e4449bcdc84/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/com_google_gematria/gematria/granite/python/run_granite_model.py", line 48, in main
File "/home/hrong1/.cache/bazel/_bazel_hrong1/32246067180bfaeac7e17e4449bcdc84/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/com_google_gematria/gematria/model/python/main_function.py", line 893, in run_gematria_model_from_command_line_flags
File "/home/hrong1/.cache/bazel/_bazel_hrong1/32246067180bfaeac7e17e4449bcdc84/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/com_google_gematria/gematria/model/python/main_function.py", line 565, in _monitored_training_session_from_flags
Node: 'save/Assign_12'
Assign requires shapes of both tensors to match. lhs shape= [53,128] rhs shape= [9,128]
[[{{node save/Assign_12}}]]
Original stack trace for 'save/Assign_12':
File "/home/hrong1/.cache/bazel/_bazel_hrong1/32246067180bfaeac7e17e4449bcdc84/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/com_google_gematria/gematria/granite/python/run_granite_model.py", line 109, in <module>
File "/home/hrong1/gematria/env/lib/python3.11/site-packages/absl/app.py", line 308, in run
File "/home/hrong1/gematria/env/lib/python3.11/site-packages/absl/app.py", line 254, in _run_main
File "/home/hrong1/.cache/bazel/_bazel_hrong1/32246067180bfaeac7e17e4449bcdc84/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/com_google_gematria/gematria/granite/python/run_granite_model.py", line 48, in main
File "/home/hrong1/.cache/bazel/_bazel_hrong1/32246067180bfaeac7e17e4449bcdc84/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/com_google_gematria/gematria/model/python/main_function.py", line 893, in run_gematria_model_from_command_line_flags
File "/home/hrong1/.cache/bazel/_bazel_hrong1/32246067180bfaeac7e17e4449bcdc84/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/com_google_gematria/gematria/model/python/main_function.py", line 565, in _monitored_training_session_from_flags
File "/home/hrong1/gematria/env/lib/python3.11/site-packages/tensorflow/python/training/saver.py", line 934, in __init__
File "/home/hrong1/gematria/env/lib/python3.11/site-packages/tensorflow/python/training/saver.py", line 946, in build
File "/home/hrong1/gematria/env/lib/python3.11/site-packages/tensorflow/python/training/saver.py", line 974, in _build
File "/home/hrong1/gematria/env/lib/python3.11/site-packages/tensorflow/python/training/saver.py", line 543, in _build_internal
File "/home/hrong1/gematria/env/lib/python3.11/site-packages/tensorflow/python/training/saver.py", line 383, in _AddRestoreOps
File "/home/hrong1/gematria/env/lib/python3.11/site-packages/tensorflow/python/training/saving/saveable_object_util.py", line 86, in restore
File "/home/hrong1/gematria/env/lib/python3.11/site-packages/tensorflow/python/ops/state_ops.py", line 353, in assign
File "/home/hrong1/gematria/env/lib/python3.11/site-packages/tensorflow/python/ops/gen_state_ops.py", line 61, in assign
File "/home/hrong1/gematria/env/lib/python3.11/site-packages/tensorflow/python/framework/op_def_library.py", line 796, in _apply_op_helper
File "/home/hrong1/gematria/env/lib/python3.11/site-packages/tensorflow/python/framework/ops.py", line 2652, in _create_op_internal
File "/home/hrong1/gematria/env/lib/python3.11/site-packages/tensorflow/python/framework/ops.py", line 1160, in from_node_def
The input tfrecord was converted from gematria/testing/testdata/basic_blocks_with_throughput.pbtxt.
Any suggestions?
Or is there a working example I could try instead? I just want to see how to train the model at this stage.
Thanks,
Hongbo
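A likely cause (an inference from the error message, not confirmed from the logs alone): /tmp/test_model/ already contained a checkpoint written by an earlier run that used a different tokens file, so the token-embedding variable in the checkpoint has a different vocabulary dimension than the current graph. Deleting the checkpoint directory, or reusing the tokens file from the original run, should let training proceed. The shape conflict behind `save/Assign_12` can be illustrated with plain numpy:

```python
import numpy as np

# Hypothetical illustration, not Gematria code: an embedding table has shape
# [vocab_size, embedding_dim], so a checkpoint saved with a 9-token vocabulary
# cannot be restored into a graph built for 53 tokens.
old_vocab, new_vocab, dim = 9, 53, 128
checkpoint_value = np.zeros((old_vocab, dim))  # rhs shape = [9, 128]
graph_variable = np.zeros((new_vocab, dim))    # lhs shape = [53, 128]

try:
    graph_variable[...] = checkpoint_value  # mirrors the failing assign op
except ValueError as err:
    print("restore fails:", err)
```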
Hello, after building gematria, there is 1 failure with FindAccessedAddrsExegesisTest:
(env)$ env USE_BAZEL_VERSION=6.4.0 ../bazelisk-linux-amd64 test ...
...
//gematria/datasets:find_accessed_addrs_exegesis_test FAILED in 0.8s
/home/hrong1/.cache/bazel/_bazel_hrong1/32246067180bfaeac7e17e4449bcdc84/execroot/com_google_gematria/bazel-out/k8-fastbuild/testlogs/gematria/datasets/find_accessed_addrs_exegesis_test/test.log
Executed 1 out of 51 tests: 50 tests pass and 1 fails locally.
Here is the content of find_accessed_addrs_exegesis_test/test.log:
Executing tests from //gematria/datasets:find_accessed_addrs_exegesis_test
-----------------------------------------------------------------------------
Running main() from gmock_main.cc
[==========] Running 5 tests from 1 test suite.
[----------] Global test environment set-up.
[----------] 5 tests from FindAccessedAddrsExegesisTest
[ RUN ] FindAccessedAddrsExegesisTest.ExegesisNoAccess
Failure value returned from cantFail wrapped call
can't run 'latency' mode, sched model does not define a cycle counter. You can pass --benchmark-phase=... to skip the actual benchmarking or --use-dummy-perf-counters to not query the kernel for real event counts.
UNREACHABLE executed at external/llvm-project/llvm/include/llvm/Support/Error.h:790!
Is this expected? This looks like an important test to fix, as it seems to measure instruction cycle counts, which I assume is basic functionality of gematria.
Thanks!
llvm-cm needs to be updated to reflect some recent LLVM changes:
- llvm-cm needs to support basic block sections and split machine functions (and should at the very least have test coverage for them).
- -mbb-profile-dump no longer exists; PGOAnalysisMap should be used instead. Tests/code need to be updated for this.
This (at least the second part) is sort of a prerequisite for #55.
Hello, I am trying the example at https://github.com/google/gematria/blob/main/g3doc/obtaining-training-data.md to convert bhive to tfrecord, and get the following error:
(env) ~/gematria$ curl -L https://raw.githubusercontent.com/ithemal/bhive/5f1d50077ac0779fd227b261dcf517862c7104bd/benchmark/throughput/skl.csv > skl.csv
(env) ~/gematria$ env USE_BAZEL_VERSION=6.4.0 ../bazelisk-linux-amd64 run //gematria/datasets/python:import_from_bhive -- \
--gematria_input_csv=skl.csv \
--gematria_output_tfrecord=skl.tfrecord \
--gematria_throughput_source_name="bhive: skl"
INFO: Analyzed target //gematria/datasets/python:import_from_bhive (9 packages loaded, 3974 targets configured).
INFO: Found 1 target...
INFO: From Compiling llvm/lib/Support/Process.cpp:
In file included from external/llvm-project/llvm/lib/Support/Process.cpp:123:
external/llvm-project/llvm/lib/Support/Unix/Process.inc:101:10: warning: 'mallinfo' is deprecated [-Wdeprecated-declarations]
mi = ::mallinfo();
^
/usr/include/malloc.h:114:48: note: 'mallinfo' has been explicitly marked deprecated here
extern struct mallinfo mallinfo (void) __THROW __MALLOC_DEPRECATED;
^
/usr/include/malloc.h:32:30: note: expanded from macro '__MALLOC_DEPRECATED'
# define __MALLOC_DEPRECATED __attribute_deprecated__
^
/usr/include/x86_64-linux-gnu/sys/cdefs.h:339:51: note: expanded from macro '__attribute_deprecated__'
# define __attribute_deprecated__ __attribute__ ((__deprecated__))
^
1 warning generated.
INFO: From Compiling llvm/lib/Support/Process.cpp [for tool]:
In file included from external/llvm-project/llvm/lib/Support/Process.cpp:123:
external/llvm-project/llvm/lib/Support/Unix/Process.inc:101:10: warning: 'mallinfo' is deprecated [-Wdeprecated-declarations]
mi = ::mallinfo();
^
/usr/include/malloc.h:114:48: note: 'mallinfo' has been explicitly marked deprecated here
extern struct mallinfo mallinfo (void) __THROW __MALLOC_DEPRECATED;
^
/usr/include/malloc.h:32:30: note: expanded from macro '__MALLOC_DEPRECATED'
# define __MALLOC_DEPRECATED __attribute_deprecated__
^
/usr/include/x86_64-linux-gnu/sys/cdefs.h:339:51: note: expanded from macro '__attribute_deprecated__'
# define __attribute_deprecated__ __attribute__ ((__deprecated__))
^
1 warning generated.
Target //gematria/datasets/python:import_from_bhive up-to-date:
bazel-bin/gematria/datasets/python/import_from_bhive
INFO: Elapsed time: 144.058s, Critical Path: 31.45s
INFO: 782 processes: 3 internal, 779 linux-sandbox.
INFO: Build completed successfully, 782 total actions
INFO: Running command line: bazel-bin/gematria/datasets/python/import_from_bhive '--gematria_input_csv=skl.csv' '--gematria_output_tfrecord=skl.tfrecord' '--gematria_throughput_source_name=bhive: skl'
2024-06-11 11:24:53.336711: I tensorflow/core/util/port.cc:113] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2024-06-11 11:24:53.338850: I external/local_tsl/tsl/cuda/cudart_stub.cc:31] Could not find cuda drivers on your machine, GPU will not be used.
2024-06-11 11:24:53.372145: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:9261] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-06-11 11:24:53.372199: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:607] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-06-11 11:24:53.373893: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1515] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2024-06-11 11:24:53.382875: I external/local_tsl/tsl/cuda/cudart_stub.cc:31] Could not find cuda drivers on your machine, GPU will not be used.
2024-06-11 11:24:53.383089: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
Traceback (most recent call last):
File "/home/hrong1/.cache/bazel/_bazel_hrong1/32246067180bfaeac7e17e4449bcdc84/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/datasets/python/import_from_bhive.runfiles/com_google_gematria/gematria/datasets/python/import_from_bhive.py", line 36, in <module>
import tensorflow as tf
File "/home/hrong1/gematria/env/lib/python3.11/site-packages/tensorflow/__init__.py", line 48, in <module>
from tensorflow._api.v2 import __internal__
File "/home/hrong1/gematria/env/lib/python3.11/site-packages/tensorflow/_api/v2/__internal__/__init__.py", line 8, in <module>
from tensorflow._api.v2.__internal__ import autograph
File "/home/hrong1/gematria/env/lib/python3.11/site-packages/tensorflow/_api/v2/__internal__/autograph/__init__.py", line 8, in <module>
from tensorflow.python.autograph.core.ag_ctx import control_status_ctx # line: 34
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/hrong1/gematria/env/lib/python3.11/site-packages/tensorflow/python/autograph/core/ag_ctx.py", line 21, in <module>
from tensorflow.python.autograph.utils import ag_logging
File "/home/hrong1/gematria/env/lib/python3.11/site-packages/tensorflow/python/autograph/utils/__init__.py", line 17, in <module>
from tensorflow.python.autograph.utils.context_managers import control_dependency_on_returns
File "/home/hrong1/gematria/env/lib/python3.11/site-packages/tensorflow/python/autograph/utils/context_managers.py", line 19, in <module>
from tensorflow.python.framework import ops
File "/home/hrong1/gematria/env/lib/python3.11/site-packages/tensorflow/python/framework/ops.py", line 44, in <module>
from tensorflow.python.client import pywrap_tf_session
File "/home/hrong1/gematria/env/lib/python3.11/site-packages/tensorflow/python/client/pywrap_tf_session.py", line 25, in <module>
from tensorflow.python.util import tf_stack
File "/home/hrong1/gematria/env/lib/python3.11/site-packages/tensorflow/python/util/tf_stack.py", line 22, in <module>
from tensorflow.python.util import _tf_stack
ImportError: generic_type: cannot initialize type "StatusCode": an object with that name is already defined
I'm wondering if this is expected or if there is something wrong with my environment. My environment uses the same requirements.in as master, except that tensorflow-probability>=0.19.0 is changed to tensorflow-probability==0.23.0.
Any suggestion? Thanks!
Hongbo
There is currently no Python formatting tooling in the repository, although the code presumably follows the Google Python style guide.
I'd like to propose using yapf:
The only issue is that yapf currently doesn't support match-case statements due to some third-party dependencies being unmaintained. This is being tracked in this issue, and some work has been going on recently in this PR, but nothing has been finalized yet. This means yapf currently fails when formatting gematria. Building from a fork seems like it should work in the meantime.
Noting here that other formatters like black or pyink don't have this issue.
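If yapf were adopted, a minimal configuration could be checked in at the repo root (a sketch, assuming the project wants to match the Google Python style guide; `.style.yapf` is yapf's standard per-project config file):

```ini
[style]
based_on_style = google
```

Formatting would then be a matter of running `yapf -ir gematria/` (`-i` formats in place, `-r` recurses into the directory).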
Title says it all. Remove references to absl::StatusOr from the C++ code.
When adding any protobuf library as a dependency to compile_modules_lib, we get a segmentation fault (at least in compile_modules_lib_test) rather than the expected behavior:
Fatal Python error: Segmentation fault
Thread 0x00007f3a20e00640 (most recent call first):
File "/usr/lib/python3.10/threading.py", line 324 in wait
File "/usr/lib/python3.10/threading.py", line 607 in wait
File "/usr/local/lib/python3.10/dist-packages/apache_beam/runners/worker/data_plane.py", line 255 in run
File "/usr/lib/python3.10/threading.py", line 1016 in _bootstrap_inner
File "/usr/lib/python3.10/threading.py", line 973 in _bootstrap
Current thread 0x00007f3ad2b9d1c0 (most recent call first):
File "/usr/local/lib/python3.10/dist-packages/apache_beam/runners/portability/fn_api_runner/fn_runner.py", line 1048 in _run_bundle
File "/usr/local/lib/python3.10/dist-packages/apache_beam/runners/portability/fn_api_runner/fn_runner.py", line 811 in _execute_bundle
File "/usr/local/lib/python3.10/dist-packages/apache_beam/runners/portability/fn_api_runner/fn_runner.py", line 483 in run_stages
File "/usr/local/lib/python3.10/dist-packages/apache_beam/runners/portability/fn_api_runner/fn_runner.py", line 228 in run_via_runner_api
File "/usr/local/lib/python3.10/dist-packages/apache_beam/runners/portability/fn_api_runner/fn_runner.py", line 204 in run_pipeline
File "/usr/local/lib/python3.10/dist-packages/apache_beam/runners/direct/direct_runner.py", line 128 in run_pipeline
File "/usr/local/lib/python3.10/dist-packages/apache_beam/pipeline.py", line 587 in run
File "/usr/local/lib/python3.10/dist-packages/apache_beam/pipeline.py", line 563 in run
File "/usr/local/lib/python3.10/dist-packages/apache_beam/testing/test_pipeline.py", line 115 in run
File "/usr/local/lib/python3.10/dist-packages/apache_beam/pipeline.py", line 613 in __exit__
File "/root/.cache/bazel/_bazel_root/5b63f27bc35a3d0572c069ebf1768159/sandbox/linux-sandbox/8748/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/datasets/pipelines/compile_modules_lib_test.runfiles/com_google_gematria/gematria/datasets/pipelines/compile_modules_lib_test.py", line 129 in test_get_bbs
File "/usr/lib/python3.10/unittest/case.py", line 549 in _callTestMethod
File "/usr/lib/python3.10/unittest/case.py", line 591 in run
File "/usr/lib/python3.10/unittest/case.py", line 650 in __call__
File "/usr/lib/python3.10/unittest/suite.py", line 122 in run
File "/usr/lib/python3.10/unittest/suite.py", line 84 in __call__
File "/usr/lib/python3.10/unittest/suite.py", line 122 in run
File "/usr/lib/python3.10/unittest/suite.py", line 84 in __call__
File "/usr/lib/python3.10/unittest/runner.py", line 184 in run
File "/usr/lib/python3.10/unittest/main.py", line 271 in runTests
File "/usr/lib/python3.10/unittest/main.py", line 101 in __init__
File "/usr/local/lib/python3.10/dist-packages/absl/testing/absltest.py", line 2653 in _run_and_get_tests_result
File "/usr/local/lib/python3.10/dist-packages/absl/testing/absltest.py", line 2689 in run_tests
File "/usr/local/lib/python3.10/dist-packages/absl/testing/absltest.py", line 2234 in main_function
File "/usr/local/lib/python3.10/dist-packages/absl/app.py", line 254 in _run_main
File "/usr/local/lib/python3.10/dist-packages/absl/app.py", line 308 in run
File "/usr/local/lib/python3.10/dist-packages/absl/testing/absltest.py", line 2236 in _run_in_app
File "/usr/local/lib/python3.10/dist-packages/absl/testing/absltest.py", line 2131 in main
File "/root/.cache/bazel/_bazel_root/5b63f27bc35a3d0572c069ebf1768159/sandbox/linux-sandbox/8748/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/datasets/pipelines/compile_modules_lib_test.runfiles/com_google_gematria/gematria/datasets/pipelines/compile_modules_lib_test.py", line 142 in <module>
Extension modules: google.protobuf.pyext._message, google3.net.proto2.python.internal.cpp._message, apache_beam.coders.stream, grpc._cython.cygrpc, apache_beam.utils.windowed_value, numpy.core._multiarray_umath, numpy.core._multiarray_tests, numpy.linalg._umath_linalg, numpy.fft._pocketfft_internal, numpy.random._common, numpy.random.bit_generator, numpy.random._bounded_integers, numpy.random._mt19937, numpy.random.mtrand, numpy.random._philox, numpy.random._pcg64, numpy.random._sfc64, numpy.random._generator, fastavro._logical_readers, fastavro._schema, zstandard.backend_c, fastavro._read, fastavro._logical_writers, fastavro._validation, fastavro._write, pyarrow.lib, pandas._libs.tslibs.ccalendar, pandas._libs.tslibs.np_datetime, pandas._libs.tslibs.dtypes, pandas._libs.tslibs.base, pandas._libs.tslibs.nattype, pandas._libs.tslibs.timezones, pandas._libs.tslibs.fields, pandas._libs.tslibs.timedeltas, pandas._libs.tslibs.tzconversion, pandas._libs.tslibs.timestamps, pandas._libs.properties, pandas._libs.tslibs.offsets, pandas._libs.tslibs.strptime, pandas._libs.tslibs.parsing, pandas._libs.tslibs.conversion, pandas._libs.tslibs.period, pandas._libs.tslibs.vectorized, pandas._libs.ops_dispatch, pandas._libs.missing, pandas._libs.hashtable, pandas._libs.algos, pandas._libs.interval, pandas._libs.lib, pyarrow._compute, pandas._libs.ops, pandas._libs.hashing, pandas._libs.arrays, pandas._libs.tslib, pandas._libs.sparse, pandas._libs.internals, pandas._libs.indexing, pandas._libs.index, pandas._libs.writers, pandas._libs.join, pandas._libs.window.aggregations, pandas._libs.window.indexers, pandas._libs.reshape, pandas._libs.groupby, pandas._libs.json, pandas._libs.parsers, pandas._libs.testing, apache_beam.coders.coder_impl, apache_beam.transforms.cy_dataflow_distribution_counter, apache_beam.transforms.cy_combiners, charset_normalizer.md, apache_beam.utils.counters, apache_beam.runners.common, apache_beam.transforms.stats, apache_beam.metrics.cells, 
apache_beam.runners.worker.statesampler_fast, apache_beam.metrics.execution, bson._cbson, pymongo._cmessage, pyarrow._parquet, pyarrow._fs, pyarrow._azurefs, pyarrow._hdfs, pyarrow._gcsfs, pyarrow._s3fs, crcmod._crcfunext, regex._regex, apache_beam.runners.worker.opcounters, apache_beam.runners.worker.operations (total: 89)
The main difference between the two setups seems to be that without any proto dependency defined in Bazel, we use the system-installed protobuf version, whereas when we have a proto dependency specified, we use the Bazel-managed protobuf version and automatically use the cpp backend. I have not been able to check whether the issue reproduces with the system-installed protobuf and the cpp backend, as that backend is not installed by default.
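One way to narrow this down is to pin the protobuf Python backend explicitly and see whether the segfault disappears. The sketch below uses the standard `PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION` environment variable; treating this as a workable knob in the bazel test environment is an assumption.

```python
import os

# Force the pure-Python protobuf backend. This must be set before the first
# `google.protobuf` import anywhere in the process, so in a bazel test it
# would typically go into the test target's `env` attribute instead.
os.environ["PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION"] = "python"

# After protobuf is imported, the active backend can be confirmed with:
#   from google.protobuf.internal import api_implementation
#   api_implementation.Type()  # "cpp", "python", or "upb"
```

If the crash only reproduces with the cpp backend, that points at the C++ message implementation rather than the Beam pipeline code itself.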
Python 3.12 is supported only by TensorFlow 2.16, which drops the `tf.estimator` API used by the version of `tensorflow-ranking` currently used in Gematria. Until `tensorflow-ranking` gains compatibility with TensorFlow 2.16 (and hence Python 3.12), we'll have to stick with Python 3.11 and TensorFlow 2.15.
Hello, is it accurate to say that inverse_throughput_cycles refers to CPI (cycles per instruction)?
I do not understand the description of prefix_inverse_throughput_cycles in throughput.py: "The number of cycles of inverse throughput of the prefixes of the basic block.". There can be prefixes like REP for instructions, but what are the prefixes of a basic block?
Thanks!
Hongbo
Hello, this is not an issue but a question: how can one reproduce the experiments in Table 5 or 6 of the IISWC'22 GRANITE paper? Specific instructions would be greatly appreciated.
Hongbo
Title says it all. Reduce dependencies on Abseil libraries in the C++ code; use LLVM command-line parsing facilities instead.
Hello, I followed the instructions in the README to install gematria, but the build failed with the error: no such attribute 'exec_tools' in 'genrule' rule. Below are the details.
This is the version of bazelisk:
(env) [gematria]$ ../bazelisk-linux-amd64 version
Bazelisk version: v1.19.0
WARNING: Output base '/data/nfs_home/hrong1/.cache/bazel/_bazel_hrong1/fe3956cd0667122177c09d0692bd5c86' is on NFS. This may lead to surprising failures and undetermined behavior.
Build label: 7.1.1
Build target: @@//src/main/java/com/google/devtools/build/lib/bazel:BazelServer
Build time: Thu Mar 21 18:08:37 2024 (1711044517)
Build timestamp: 1711044517
Build timestamp as int: 1711044517
And this is the build:
(env) [gematria]$ ../bazelisk-linux-amd64 build ...
WARNING: Output base '/data/nfs_home/hrong1/.cache/bazel/_bazel_hrong1/fe3956cd0667122177c09d0692bd5c86' is on NFS. This may lead to surprising failures and undetermined behavior.
Starting local Bazel server and connecting to it...
WARNING: --enable_bzlmod is set, but no MODULE.bazel file was found at the workspace root. Bazel will create an empty MODULE.bazel file. Please consider migrating your external dependencies from WORKSPACE to MODULE.bazel. For more details, please refer to https://github.com/bazelbuild/bazel/issues/18958.
DEBUG: Rule 'com_google_protobuf' indicated that a canonical reproducible form can be obtained by modifying arguments commit = "a74f54b724bdc2fe0bfc271f4dc0ceb159805625" and dropping ["tag"]
DEBUG: Repository com_google_protobuf instantiated at:
/data/nfs_home/hrong1/gematria/WORKSPACE:16:15: in <toplevel>
Repository rule git_repository defined at:
/data/nfs_home/hrong1/.cache/bazel/_bazel_hrong1/fe3956cd0667122177c09d0692bd5c86/external/bazel_tools/tools/build_defs/repo/git.bzl:189:33: in <toplevel>
ERROR: /data/nfs_home/hrong1/.cache/bazel/_bazel_hrong1/fe3956cd0667122177c09d0692bd5c86/external/com_google_protobuf/python/BUILD.bazel:123:13: @@com_google_protobuf//python:aarch64_test_genrule: no such attribute 'exec_tools' in 'genrule' rule (did you mean 'executable'?)
ERROR: /data/nfs_home/hrong1/.cache/bazel/_bazel_hrong1/fe3956cd0667122177c09d0692bd5c86/external/com_google_protobuf/python/BUILD.bazel:131:12: @@com_google_protobuf//python:x86_64_test_genrule: no such attribute 'exec_tools' in 'genrule' rule (did you mean 'executable'?)
ERROR: /data/nfs_home/hrong1/.cache/bazel/_bazel_hrong1/fe3956cd0667122177c09d0692bd5c86/external/com_google_protobuf/python/BUILD.bazel:17:11: errors encountered resolving select() keys for @@com_google_protobuf//python:protobuf_python
ERROR: Analysis of target '//gematria/proto:canonicalized_instruction_py_pb2' failed; build aborted: Analysis failed
INFO: Elapsed time: 123.120s, Critical Path: 0.03s
INFO: 1 process: 1 internal.
ERROR: Build did NOT complete successfully
FAILED:
Fetching repository @@pybind11; Cloning tags/v2.10.3 of https://github.com/pybind/pybind11.git
Fetching repository @@pybind11_abseil_repo; Cloning 1caf1890443e8e303bf88850d3c27d5422903168 of https://github.com/pybind/pybind11_abseil.git
Fetching repository @@sonnet_repo; Cloning cd5b5fa48e15e4d020f744968f5209949ebe750f of https://github.com/deepmind/sonnet.git
Fetching repository @@graph_nets_repo; Cloning adf25162ba21bb0ae176c35483a74fb0c9dff576 of https://github.com/deepmind/graph_nets.git
Fetching repository @@rules_license~; starting
Fetching repository @@protobuf~; starting
Fetching repository @@rules_java~; starting
Fetching repository @@apple_support~; starting
Does anyone have any idea? Thanks!
With the large scale of our datasets (potentially 10^8 BBs), we will need a reasonably fast way to benchmark basic blocks. Parallelizing this is an obvious first step. This needs a couple of things implemented on the LLVM side, in `llvm-exegesis` (and there might be more needed on the `llvm-exegesis` side).
Then, we need to do the following:
Hi,
I'm trying to follow the g3doc inference-api.md documentation, but when I run the command I'm missing the /tmp/tokens.txt file. Could you please let me know how to generate this file?
Thanks,
Z
It would be good to validate that the benchmarking numbers we're getting match previous results (like BHive and uica-eval) to ensure that we aren't doing anything egregiously wrong. To do this, we need to do a couple of things:
Using the parallelized annotator:
Bus error (core dumped)
Need to see if this is reproducible and debug why it is happening. It seemed like this happened in the parent process, rather than being a signal received in the child process that would have been handled through ptrace.
Failed to annotate block: INTERNAL: Failed to create a pipe for interprocess communication between llvm-exegesis and the benchmarking subprocess: Too many open files
More investigation is needed. Probably an issue on the LLVM side, but opening here first in case there is some complicated interaction.
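To help tell a genuine fd leak apart from a merely low default limit, a small diagnostic can be run alongside the annotator. This is a hypothetical debugging sketch (not part of the Gematria codebase), and the fd counting is Linux-specific:

```python
import os
import resource

def open_fd_count() -> int:
    # Linux-specific: each entry in /proc/self/fd is one open descriptor.
    # A steadily growing count while annotating would point at a pipe/fd
    # leak rather than a too-low default limit.
    return len(os.listdir("/proc/self/fd"))

# As a mitigation (not a fix), raise the soft open-file limit to the hard
# limit; if descriptors are leaking, this only delays the failure.
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
if soft < hard:
    resource.setrlimit(resource.RLIMIT_NOFILE, (hard, hard))
```

Logging `open_fd_count()` every N annotated blocks should make it obvious whether the limit is being approached linearly with the number of blocks processed.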
Using the following CSV, `test.csv`:
85c044897c2460,98.000000
3b31,45.000000
With the following command-line invocation, assuming `./json` exists:
./bazel-bin/gematria/datasets/convert_bhive_to_llvm_exegesis_input --json_output_dir=./json --bhive_csv=./test.csv --blocks_per_json_file=1
We get the following in `./json`:
0.json 1.json 2.json
Note that we should only get two files.
`0.json`:
[
{
"Hex": "85c044897c2460",
"MemoryDefinitions": [
{
"Name": "MEM",
"Size": 4096,
"Value": 305419776
}
],
"MemoryMappings": [
{
"Address": 65536,
"Value": "MEM"
}
]
}
]
`1.json`:
[
{
"Hex": "3b31",
"MemoryDefinitions": [
{
"Name": "MEM",
"Size": 4096,
"Value": 305419776
}
],
"MemoryMappings": [
{
"Address": 65536,
"Value": "MEM"
}
]
}
]
`2.json`:
[]
One of the blocks is duplicated (the second block shows up twice), we get an extra file, and the extra file is empty. This needs to be fixed.
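The intended sharding behavior can be sketched in Python; `shard_blocks` is a hypothetical stand-in for the tool's internal loop, not the actual C++ code. The key invariant is that the number of output files should be ceil(number of blocks / `blocks_per_json_file`), with no empty trailing file:

```python
import math

def shard_blocks(blocks, blocks_per_file):
    """Yields (file_index, blocks_for_that_file) with no empty trailing shard."""
    num_files = math.ceil(len(blocks) / blocks_per_file)
    for i in range(num_files):
        yield i, blocks[i * blocks_per_file : (i + 1) * blocks_per_file]

# The repro case from the CSV above: two blocks, one block per file.
blocks = ["85c044897c2460", "3b31"]
shards = list(shard_blocks(blocks, blocks_per_file=1))
# Exactly two shards, one block each, and no empty third shard.
```

With this structure the off-by-one "emit one extra empty file" failure mode cannot occur, since the file count is derived from the block count rather than from a flush-at-end loop.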
In order to construct large-scale BB datasets, we need a script that can perform these benchmarking runs, taking in annotated basic blocks from the annotation script (most likely in JSON), and then returning them with throughput information.
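A minimal sketch of what such a driver could look like, assuming the actual measurement is delegated to a pluggable `benchmark_fn` (in the real script this would wrap an llvm-exegesis invocation; here it is purely hypothetical). Since the runs are independent per block, a pool maps cleanly over the input:

```python
import json
from concurrent.futures import ThreadPoolExecutor

def benchmark_blocks(annotated_blocks, benchmark_fn, max_workers=4):
    """Attaches a throughput value to each annotated block.

    `benchmark_fn` takes one annotated-block dict and returns a float; in a
    real script it would shell out to the benchmarking tool.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        throughputs = list(pool.map(benchmark_fn, annotated_blocks))
    return [
        dict(block, Throughput=t)
        for block, t in zip(annotated_blocks, throughputs)
    ]

# Usage with a stub measurement function standing in for the benchmarker:
blocks = json.loads('[{"Hex": "3b31"}, {"Hex": "85c044897c2460"}]')
results = benchmark_blocks(blocks, benchmark_fn=lambda b: float(len(b["Hex"])))
```

The output keeps the input JSON shape and adds a `Throughput` field per block, so downstream dataset-construction tooling can consume it directly; the field name is an assumption.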
# LLVM-EXEGESIS-DEFREG EFLAGS 12345600
# LLVM-EXEGESIS-DEFREG RCX 12345600
# LLVM-EXEGESIS-DEFREG RDI 12345600
# LLVM-EXEGESIS-DEFREG RIP 12345600
# LLVM-EXEGESIS-DEFREG XMM2 12345600
# LLVM-EXEGESIS-LOOP-REGISTER RDX
movzbl (%rcx), %eax
movd %edi, %xmm0
pshufd $0, %xmm0, %xmm0
movdqa (%rip), %xmm1
pand %xmm0, %xmm1
pand (%rip), %xmm0
pxor %xmm2, %xmm2
movdqa %xmm0, %xmm3
pcmpeqd %xmm2, %xmm3
movdqa %xmm1, %xmm4
pcmpeqd %xmm2, %xmm4
packssdw %xmm3, %xmm4
packsswb %xmm4, %xmm4
movdqa %xmm4, %xmm3
pandn (%rip), %xmm3
movb %al, (%rip)
pand (%rip), %xmm4
por %xmm3, %xmm4
movq %xmm4, (%rip)
testb $1, %dil
movl $45, %eax
movl $120, %ecx
cmovel %eax, %ecx
movb %cl, (%rip)
testl $2048, %edi
This snippet ends up making the exegesis annotator map the same address over and over (but eventually, it moves on to another page). Not sure why this behavior is occurring and more investigation is needed.
The `get_bbs` function composes Beam pipelines. We should adopt the idiomatic Beam approach, which is to make this a composite transform.
Adding `features = ["layering_check"]` to all our packages incrementally (and eventually somewhere common) would be ideal. It would prevent transitive dependency issues and would also make imports into google3 easier, where this check is enabled by default.
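As a per-package sketch, this would look like the following in each BUILD file (note that the check requires a Clang-based toolchain with module-map support; hoisting it "somewhere common" could later be done via a repo-wide default):

```starlark
# Enable the Clang layering check for every target in this package, so that
# each #include must be satisfied by a direct dependency of the target.
package(features = ["layering_check"])
```

This catches headers reached only through transitive deps at build time instead of at google3 import time.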
Hello, for a sequence of instructions executed on an emulator, how does one generate the right llvm_mnemonic and memory alias group IDs? In the emulator, I can see the opcode, operands, and memory locations accessed.
Also, can we get rid of llvm_mnemonic entirely? Its information should already be expressed by the opcode and operands. In basic_block.cc, Instruction::AddTokensToList() does not seem to treat llvm_mnemonic as a token either, so getting rid of it should not affect the sonnet.Embed layer, I guess?
Thanks!
Hongbo
As discussed in the review of #132, we're patching `tf_keras.layers.LayerNormalization` with itself in some of our tests, e.g. in gematria/granite/python/gnn_model_base_test.py. We should investigate whether this mocking is still needed, and if not, remove it.
I'm observing the following failures:
FAIL: //gematria/datasets/convert_bhive_to_llvm_exegesis_input_tests:conversion.test_lit_test (see /root/.cache/bazel/_bazel_root/5b63f27bc35a3d0572c069ebf1768159/execroot/com_google_gematria/bazel-out/k8-opt/testlogs/gematria/datasets/convert_bhive_to_llvm_exegesis_input_tests/conversion.test_lit_test/test.log)
FAIL: //gematria/datasets/convert_bhive_to_llvm_exegesis_input_tests:loop_register.test_lit_test (see /root/.cache/bazel/_bazel_root/5b63f27bc35a3d0572c069ebf1768159/execroot/com_google_gematria/bazel-out/k8-opt/testlogs/gematria/datasets/convert_bhive_to_llvm_exegesis_input_tests/loop_register.test_lit_test/test.log)
FAIL: //gematria/datasets/convert_bhive_to_llvm_exegesis_input_tests:max_bb_count.test_lit_test (see /root/.cache/bazel/_bazel_root/5b63f27bc35a3d0572c069ebf1768159/execroot/com_google_gematria/bazel-out/k8-opt/testlogs/gematria/datasets/convert_bhive_to_llvm_exegesis_input_tests/max_bb_count.test_lit_test/test.log)
specifically when I run `blaze test -c opt ...`. These failures notably do not appear when running `blaze test ...`.
Hello, I checked out the latest gematria (5409714) on Ubuntu 22.04.4 LTS in WSL, on a 12th Gen Intel(R) Core(TM) i7-1270P CPU.
I made no change except the following in `requirements.in`:
-tensorflow-probability>=0.19.0
+tensorflow-probability>=0.23.0
-tensorflow>=2.11.0; sys_platform=='linux'
+tensorflow>=2.15.1; sys_platform=='linux'
Then I built and tested:
(env) (base) ~/gematria$ env USE_BAZEL_VERSION=6.4.0 ../bazelisk-linux-amd64 test ...
.....
FAIL: //gematria/datasets:find_accessed_addrs_exegesis_test (see /home/hrong1/.cache/bazel/_bazel_hrong1/32246067180bfaeac7e17e4449bcdc84/execroot/com_google_gematria/bazel-out/k8-fastbuild/testlogs/gematria/datasets/find_accessed_addrs_exegesis_test/test.log)
[6,387 / 6,636] 22 / 52 tests, 1 failed; 16 actions running; last test: //gematria/llvm/python:canonicalizer_test
INFO: From ProtoCompile external/com_google_protobuf/python/google/protobuf/compiler/plugin_pb2.py:
external/com_google_protobuf/.: warning: directory does not exist.
INFO: From ProtoCompile external/com_google_protobuf/python/google/protobuf/any_pb2.py:
external/com_google_protobuf/.: warning: directory does not exist.
INFO: From ProtoCompile external/com_google_protobuf/python/google/protobuf/duration_pb2.py:
external/com_google_protobuf/.: warning: directory does not exist.
INFO: From ProtoCompile external/com_google_protobuf/python/google/protobuf/descriptor_pb2.py:
external/com_google_protobuf/.: warning: directory does not exist.
INFO: From ProtoCompile external/com_google_protobuf/python/google/protobuf/api_pb2.py:
external/com_google_protobuf/.: warning: directory does not exist.
INFO: From ProtoCompile external/com_google_protobuf/python/google/protobuf/empty_pb2.py:
external/com_google_protobuf/.: warning: directory does not exist.
INFO: From ProtoCompile external/com_google_protobuf/python/google/protobuf/field_mask_pb2.py:
external/com_google_protobuf/.: warning: directory does not exist.
INFO: From ProtoCompile external/com_google_protobuf/python/google/protobuf/wrappers_pb2.py:
external/com_google_protobuf/.: warning: directory does not exist.
INFO: From ProtoCompile external/com_google_protobuf/python/google/protobuf/source_context_pb2.py:
external/com_google_protobuf/.: warning: directory does not exist.
INFO: From ProtoCompile external/com_google_protobuf/python/google/protobuf/struct_pb2.py:
external/com_google_protobuf/.: warning: directory does not exist.
INFO: From ProtoCompile external/com_google_protobuf/python/google/protobuf/timestamp_pb2.py:
external/com_google_protobuf/.: warning: directory does not exist.
INFO: From ProtoCompile external/com_google_protobuf/python/google/protobuf/type_pb2.py:
external/com_google_protobuf/.: warning: directory does not exist.
The log file of the failed test, /home/hrong1/.cache/bazel/_bazel_hrong1/32246067180bfaeac7e17e4449bcdc84/execroot/com_google_gematria/bazel-out/k8-fastbuild/testlogs/gematria/datasets/find_accessed_addrs_exegesis_test/test.log:
Executing tests from //gematria/datasets:find_accessed_addrs_exegesis_test
-----------------------------------------------------------------------------
Running main() from gmock_main.cc
[==========] Running 5 tests from 1 test suite.
[----------] Global test environment set-up.
[----------] 5 tests from FindAccessedAddrsExegesisTest
[ RUN ] FindAccessedAddrsExegesisTest.ExegesisNoAccess
event not found - cannot create event unhalted_core_cycles
gematria/datasets/find_accessed_addrs_exegesis_test.cc:91: Failure
Value of: static_cast<bool>(AddrsOrErr)
Actual: false
Expected: true
Any idea? Thanks!
Hongbo
Looking at the following block:
basic_block {
machine_instructions {
assembly: "\tmovl\t$7, %eax"
machine_code: "\270\007\000\000\000"
}
machine_instructions {
address: 5
assembly: "\trep\t\tmovl\t$1, %eax"
machine_code: "\363\270\001\000\000\000"
}
canonicalized_instructions {
mnemonic: "MOV"
llvm_mnemonic: "MOV32ri"
output_operands {
register_name: "EAX"
}
input_operands {
immediate_value: 7
}
}
canonicalized_instructions {
mnemonic: "MOV\tEAX,"
prefixes: "REP"
llvm_mnemonic: "MOV32ri"
output_operands {
register_name: "EAX"
}
input_operands {
immediate_value: 1
}
}
}
inverse_throughputs {
source: "zen2"
inverse_throughput_cycles: 100.0
}
The mnemonic for the second canonicalized instruction is incorrect: for some reason it also includes the register. This causes issues when trying to train a model, as there ends up being an out-of-bounds embedding table access, which causes the job to fail.
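Until the canonicalization bug itself is fixed, a defensive check when building the token vocabulary could at least fail fast with a clear message instead of producing an out-of-bounds embedding lookup much later. This is a hypothetical sketch, not the actual Gematria tokenization code:

```python
def validate_mnemonic(mnemonic: str) -> str:
    """Rejects mnemonic tokens that smuggled in operand text (e.g. "MOV\tEAX,")."""
    if "," in mnemonic or any(ch.isspace() for ch in mnemonic):
        raise ValueError(f"malformed mnemonic token: {mnemonic!r}")
    return mnemonic

ok = validate_mnemonic("MOV")  # a well-formed mnemonic passes through
try:
    validate_mnemonic("MOV\tEAX,")  # the bad token from the proto above
    caught = False
except ValueError:
    caught = True
```

Running this at dataset-ingestion time would surface the malformed block's hex (and hence a repro) immediately, rather than as a crash deep inside the training job.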