open-telemetry / build-tools
Building tools provided by OpenTelemetry
Home Page: https://opentelemetry.io
License: Apache License 2.0
Currently there is a way to refer to another attribute while clarifying its usage in a downstream semantic convention file, for example as the aws-sdk conventions do.
With that approach, the code generator does not seem to use the base definition correctly. Perhaps this was too hacky an approach, but I think we need a way to refer to conventions from other files without breaking generation.
Also see:
open-telemetry/opentelemetry-specification#1607
/cc @weyert
I think I realized there was no tracking bug for adding this feature, even though there's a PR: #79
I'm not as familiar with all the requirements we had so if someone with better memory can flesh out this ticket, please do!
In the semantic conventions, some fields will be strictly integers. Having an "integer" enum rather than just number for attributes may also enable better linting / enforcement in the auto-generated schemas.
Would it make sense to add this enum?
The note listing possible attribute values (and whether custom values are allowed) is not populated when an attribute is referenced.
E.g. I have
- id: messaging.destination
  prefix: messaging.destination
  brief: 'Semantic convention for attributes that describe messaging destination on broker'
  attributes:
    - id: kind
      type:
        allow_custom_values: true
        members:
          - id: queue
            value: "queue"
            brief: "A message sent to a queue"
          - id: topic
            value: "topic"
            brief: "A message sent to a topic"
If I generate md from it using `<!-- semconv messaging.destination-->`, I'd see an attribute in the table and a note describing possible values:
| `messaging.destination.kind` | string | The kind of message destination | `queue` | Recommended |
`messaging.destination.kind` has the following list of well-known values. If one of them applies, then the respective value MUST be used, otherwise a custom value MAY be used.
| Value | Description |
|---|---|
| `queue` | A message sent to a queue |
| `topic` | A message sent to a topic |
Then if I reference this attribute
- id: messaging.producer
  prefix: messaging
  type: span
  extends: messaging
  span_kind: producer
  brief: 'Semantic convention for producers of messages sent to a messaging system.'
  attributes:
    - ref: messaging.destination.kind
      requirement_level:
        conditionally_required: If the message destination is either a `queue` or a `topic`.
If I generate md from it using `<!-- semconv messaging.producer-->`, I'd expect a similar note to be populated, but it's not.
This stems from open-telemetry/opentelemetry-specification#2972 and https://github.com/open-telemetry/opentelemetry-specification/pull/3158/files
"The HTTP metrics semantic convention spec should provide clarity on 'what is considered a compliant implementation', e.g. 'an implementation is considered compliant if it has implemented all of the metrics described by this spec'."
The work required in build-tools is:
We need a new release to use the latest version of schemas check tool in the spec repo before spec release 0.13.0.
There are no release guidelines in this repo, so I am not sure how to make a release. @open-telemetry/specs-approvers who can help with this and/or write a paragraph in CONTRIBUTING.md to explain how to make the release?
Recently in semantic conventions, two new types of id were introduced:
The const names generated for those ids are `ALIBABA CLOUD` and `1XRTT` respectively. Those generated values are not valid variable identifiers in most languages.
In opentelemetry-js, we use `to_const_name` to generate keys for those ids: https://github.com/open-telemetry/opentelemetry-js/blob/main/scripts/semconv/templates/SemanticAttributes.ts.j2#L57. In this case, the keys should be valid variable identifiers.
Should we change the semantic conventions or update the generator to generate valid identifiers?
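If the generator side were changed, the sanitizing step might look roughly like the sketch below. This is not the existing `to_const_name` filter: `to_identifier` is a hypothetical helper, and its two rules (collapse anything that is not a letter, digit, or underscore into `_`, and prefix a leading digit with `_`) are assumptions about what "valid identifier" should mean across languages.

```python
import re


def to_identifier(name: str) -> str:
    """Turn a generated const name into a valid identifier (hypothetical helper)."""
    # Collapse every run of characters that is not a letter, digit, or underscore.
    ident = re.sub(r"[^0-9A-Za-z_]+", "_", name.strip()).upper()
    # Most languages do not allow an identifier to start with a digit.
    if not ident or ident[0].isdigit():
        ident = "_" + ident
    return ident


print(to_identifier("ALIBABA CLOUD"))  # ALIBABA_CLOUD
print(to_identifier("1XRTT"))          # _1XRTT
```

A prefix other than `_` may suit some languages better; the point is only that the fix can live in one filter rather than in every template.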
We're using the `semconvgen` tooling to generate documentation (and eventually code) for our internal conventions. We would like to be able to associate various metadata with individual attributes and possibly groups of attributes so that we could have all the data about our internal conventions in a single source. Some examples of data we'd like to be able to associate with attributes are:
I suspect that at least some of the fields we would like to annotate on attributes and groups are unique to our internal use cases and probably do not make sense to be integrated with the `semconvgen` tooling. Instead I was wondering if there might be interest in adding a generic mechanism for associating arbitrary metadata with groups and attributes? If so, would you have any guidance on how we might go about implementing this?
Currently the markdown generator (probably others too) uses the platform line endings, which are CRLF on Windows. It should be changed to use LF, always.
This can be controlled with the `newline='\n'` option to `open`: https://docs.python.org/3/library/functions.html#open
It would probably make sense to use a wrapper function for open() that specifies both the utf-8 encoding and the LF line endings.
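A minimal sketch of such a wrapper follows. The name `open_text` is hypothetical and not part of build-tools; the only behavior it relies on is the documented `encoding` and `newline` parameters of the built-in `open`.

```python
import os
import tempfile
from pathlib import Path
from typing import IO, Union


def open_text(path: Union[str, Path], mode: str = "r") -> IO[str]:
    """Open a text file as UTF-8 and never translate "\n" on write (hypothetical helper)."""
    if "b" in mode:
        raise ValueError("open_text is for text mode only")
    # newline="\n" stops Python from writing os.linesep (CRLF on Windows).
    # When reading, it also leaves any existing "\r\n" in the data untouched.
    return open(path, mode, encoding="utf-8", newline="\n")


# Usage: the bytes on disk contain LF only, regardless of platform.
with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "table.md")
    with open_text(target, "w") as handle:
        handle.write("| Value | Description |\n|---|---|\n")
    with open(target, "rb") as raw:
        print(b"\r" in raw.read())  # False
```

Routing every generator's file access through one helper keeps the encoding and line-ending policy in a single place.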
We need to tag the new release.
Schema check needs to be updated to support file format 1.1.0 which supports splitting metrics by attributes. See open-telemetry/opentelemetry-specification#2653 for more details.
I am running into this issue when trying to use `otel/semconvgen` to generate semantic convention code. How do I resolve this?
The image released on docker hub should use the version contained in https://github.com/open-telemetry/build-tools/blob/master/semantic-conventions/src/opentelemetry/semconv/version.py
A mistake was fixed in spec#1595 in which it was spotted that the exact same brief was used, incorrectly, by multiple attributes.
I believe a warning/error should be introduced to the tooling to spot duplications like this in either the `brief` or `note` value.
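As a rough illustration of such a check, the sketch below groups attributes by their normalized text and reports any text used more than once. It works on plain `(attribute id, text)` pairs rather than on the tool's real model classes, and both `find_duplicate_texts` and the sample attribute ids are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def find_duplicate_texts(attributes: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Map each text shared by several attributes to the ids that use it."""
    ids_by_text: Dict[str, List[str]] = defaultdict(list)
    for attr_id, text in attributes:
        # Ignore case and whitespace differences when comparing texts.
        normalized = " ".join(text.split()).lower()
        if normalized:
            ids_by_text[normalized].append(attr_id)
    return {text: ids for text, ids in ids_by_text.items() if len(ids) > 1}


# Hypothetical input: two attributes accidentally share one brief.
briefs = [
    ("example.first", "The name of the thing."),
    ("example.second", "The name  of the thing."),
    ("example.third", "The size of the thing."),
]
for text, ids in find_duplicate_texts(briefs).items():
    print(f"warning: brief {text!r} is shared by {', '.join(ids)}")
```

The same function could be run a second time over the `note` values. Whether a hit should be a warning or a hard error is left open here, as in the issue.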
Hey folks! We have existing code that emits attributes with IDs that do not match the semconv regex (for example, some of them are Pascal case). We want to use the semconvgen tool with a custom template, but ideally we would like to be able to disable the regex validation on the attribute IDs via an option flag for semconvgen. In our case, relabelling would be too slow and costly, as we have a huge codebase in multiple languages to comb through.
From open-telemetry/opentelemetry-specification#1096:
YAML model for attributes seems to only have number type. But we have int or double as possible attribute types.
https://github.com/open-telemetry/opentelemetry-specification/blob/master/specification/common/common.md#attributes
https://github.com/open-telemetry/opentelemetry-specification/blob/master/semantic_conventions/syntax.md
From open-telemetry/opentelemetry-specification#926 (comment)
Attribute types that have a finite set of options, e.g. network.transport, should not report their type with the `enum` suffix.
Attributes can appear on different signals (metrics, traces, events, links, logs), but we don't have a way to describe attributes without the signal in semantic conventions.
I'd like to have a mechanism, where I'd define attributes separately and then refer to them from spans/links/events definitions.
E.g. in messaging we can have attributes set on links or on spans, maybe events:
- id: messaging.message
  prefix: messaging
  type: attribute_group # this is the proposal. We currently use 'span' in cases like this
  brief: 'Semantic convention describing per-message attributes populated on messaging spans or links.'
  attributes:
    - id: message.id
      type: string
      brief: message id
      note: when sending individual messages, it should be populated on span, otherwise should be populated on links
- id: messaging.publish
  prefix: messaging
  type: span
  brief: ...
  attributes:
    - ref: messaging.message.id
- id: messaging.link
  prefix: messaging
  type: link # out of scope of this proposal
  brief: ...
  attributes:
    - ref: messaging.message.id
We'd also benefit from it in span-general.yaml which essentially describes unrelated groups of attributes, but not a span per se.
Additional attribute requirements: At least one of the following sets of attributes is required:
net.peer.name
net.peer.ip
Obviously the set here is `{"net.peer.name", "net.peer.ip"}`, so you need both. Except: This is wrong! There are two sets, one contains only `net.peer.name` and the other only `net.peer.ip`.
I think if each set contains only one member, the text above the list MUST change to "At least one of the following attributes is required:" (without the "sets of").
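The proposed wording rule can be sketched as follows. `render_any_of` is a hypothetical function, not the markdown generator's actual code, and the bullet layout it emits is an assumption made only to keep the example complete.

```python
from typing import List, Sequence


def render_any_of(attribute_sets: Sequence[Sequence[str]]) -> List[str]:
    """Render an "at least one of" constraint, choosing the heading by set size."""
    if all(len(attribute_set) == 1 for attribute_set in attribute_sets):
        # Every alternative is a single attribute, so "sets of" would mislead.
        heading = "At least one of the following attributes is required:"
    else:
        heading = "At least one of the following sets of attributes is required:"
    lines = [heading, ""]
    for attribute_set in attribute_sets:
        lines.append("* " + ", ".join(f"`{name}`" for name in attribute_set))
    return lines


print("\n".join(render_any_of([["net.peer.name"], ["net.peer.ip"]])))
```

With a multi-member alternative such as `[["net.peer.name", "net.peer.port"], ["net.peer.ip"]]`, the heading falls back to the "sets of" wording and the first bullet lists both attributes together.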
The code generator readme (https://github.com/open-telemetry/build-tools/tree/main/semantic-conventions#code-generator) is outdated and contains a couple of broken links. The generation steps in opentelemetry-java were changed and the generated constants moved.
The new locations are:
https://github.com/open-telemetry/opentelemetry-java/tree/main/buildscripts/semantic-convention
https://github.com/open-telemetry/opentelemetry-java/tree/main/semconv
Hi! We use the `otel/build-protobuf` image in Tempo; it is run as part of `make vendor-check`:
https://github.com/grafana/tempo/blob/055573362709a02e2ca6aa98400a3de38c91c4a1/Makefile#L123
I'm the first person on the team using the Apple M1 and using this image results in an error (full log at the end):
qemu: uncaught target signal 11 (Segmentation fault) - core dumped
The recommended way to solve this is by providing multi-arch Docker images. Similar work was done in Tempo itself: this requires the pipeline to be adapted so it builds both amd64 and arm64 images and the Dockerfile is parameterised to accept a target arch.
Is this work you would consider accepting?
Full log:
$ make vendor-check
...
--
-- Gen proto --
--
docker run --rm -u 501 -v/Users/koenraad/Repositories/grafana/tempo:/Users/koenraad/Repositories/grafana/tempo -w/Users/koenraad/Repositories/grafana/tempo otel/build-protobuf:0.2.1 --proto_path=/Users/koenraad/Repositories/grafana/tempo -Ipkg/.patched-proto --gogofaster_out=plugins=grpc,paths=source_relative:./pkg/tempopb/ pkg/.patched-proto/common/v1/common.proto
WARNING: The requested image's platform (linux/amd64) does not match the detected host platform (linux/arm64/v8) and no specific platform was requested
qemu: uncaught target signal 11 (Segmentation fault) - core dumped
--gogofaster_out: protoc-gen-gogofaster: Plugin killed by signal 11.
make: *** [gen-proto] Error 1
Using `opentelemetry-specification` version 1.16.0.
In particular, the spec for `browser.platform` contains, in the note:
Note that some (but not all) of these values can overlap with values
in the [`os.type` and `os.name` attributes](./os.md).
Using `build-tools` version 0.14.0
Using `opentelemetry-cpp` version 1.8.1, with the following generation script:
After generation, the C++ code for `browser.platform` looks like:
/**
* The platform on which the browser is running
*
* <p>Notes:
<ul> <li>This value is intended to be taken from the <a
href="https://wicg.github.io/ua-client-hints/#interface">UA client hints API</a> ({@code
navigator.userAgentData.platform}). If unavailable, the legacy {@code navigator.platform} API SHOULD
NOT be used instead and this attribute SHOULD be left unset in order for the values to be
consistent. The list of possible values is defined in the <a
href="https://wicg.github.io/ua-client-hints/#sec-ch-ua-platform">W3C User-Agent Client Hints
specification</a>. Note that some (but not all) of these values can overlap with values in the <a
href="./os.md">{@code os.type} and {@code os.name} attributes</a>. However, for consistency, the
values in the {@code browser.platform} attribute should capture the exact value that the user agent
provides.</li> </ul>
*/
static constexpr const char *kBrowserPlatform = "browser.platform";
Note in particular how the original link to `os.md` is encoded:
Note that some (but not all) of these values can overlap with values in the <a
href="./os.md">{@code os.type} and {@code os.name} attributes</a>.
Using `doxygen` version 1.9.5, it reports the following warnings:
/.../semantic_conventions.h:49: warning: Illegal command @iliteral found as part of a <a>..</a> block
/.../semantic_conventions.h:49: warning: Illegal command @endiliteral found as part of a <a>..</a> block
/.../semantic_conventions.h:49: warning: Illegal command @iliteral found as part of a <a>..</a> block
/.../semantic_conventions.h:49: warning: Illegal command @endiliteral found as part of a <a>..</a> block
There are a few similar warnings for other semantic conventions.
Doxygen complains about the `{@code os.type}` and `{@code os.name}` seen inside an `<a>anchor</a>`.
The expected result is to have a complete chain that generates clean doxygen documentation, without warnings.
About the point of failure, not sure where the root cause actually is:
- `@code` inside an `<a></a>` tag?
- Spec syntax: [`os.type` and `os.name` attributes](./os.md)
Please clarify and provide guidance here, in particular on which part (spec, generator, generator template) should be fixed, with suggestions if known.
If this turns out to be a doxygen limitation, what is the best way to implement a workaround?
Thanks.
Please make a new release of build-tools - 0.9.1.
My parquet generator changes would benefit the work happening under open-telemetry/opentelemetry-proto#346
Many tests that use separate input files are not written very cleanly. The most important points:
Further ideas, not as important:
Hi, I want to translate the OpenTelemetry specification into Japanese. However, I found that some text is hard-coded in English: here, and in the table header (perhaps only those two; Status, Type, and Description are OK for me).
So I think it would be better to use some i18n mechanism for those texts. If the table header is difficult because of character length, I would like to add only the "MUST be one of the ..." text to the translation.
Hi,
I use the Docker image to generate Java code, but I only get the Java bean code. How do I generate gRPC Java client/server code?
I (and apparently another engineer) had trouble seeing why exactly the build process was failing, the error message only being
File {blah} contains a table that would be reformatted.
I've since found the tool is very particular about whitespace and even the number of dashes in a table header.
This isn't a blocker, as we can add the note manually to the markdown file for now, but it would be nice to add the note in the yaml definition.
Related to open-telemetry/opentelemetry-specification#3418 and open-telemetry/opentelemetry-specification#3352 (comment)
From open-telemetry/opentelemetry-specification#925 (comment):
"Required: Conditional No, but recommended" looks odd and might be misleading.
The MD generator should only print the condition itself without prepending the word Conditional in the table, unless it references a footnote.
I think this project has enough contributions to be its own SIG that has approvers and maintainers. We should consider some of the instrumentation SIG members for the semantic convention tools as well as people that made contributions @arminru, @Oberon00, @atoulme, @lmolkova, myself for the proto docker.
@open-telemetry/technical-committee @open-telemetry/specs-approvers @open-telemetry/instr-wg thoughts on this?
We would like to enforce formatting for enum values in semantic conventions. (See open-telemetry/opentelemetry-specification#1519 for more details.)
In order to enforce the requirement of only allowing lower case characters and underscores (_), we need linting rules in the markdown generator. The generator should error if there are enum values that don't meet these criteria.
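A lint of this kind could be as small as the sketch below. It takes the stated rule literally, so digits and dots are rejected as well; that reading is an assumption, and both function names are hypothetical rather than existing build-tools APIs.

```python
import re
from typing import Iterable, List

# Assumption: "lower case characters and underscores" means exactly [a-z_].
ENUM_VALUE_PATTERN = re.compile(r"[a-z_]+")


def invalid_enum_values(values: Iterable[str]) -> List[str]:
    """Return the enum values that break the naming rule, in input order."""
    return [value for value in values if not ENUM_VALUE_PATTERN.fullmatch(value)]


def check_enum_values(attribute_id: str, values: Iterable[str]) -> None:
    """Raise ValueError so that generation fails instead of emitting bad output."""
    bad = invalid_enum_values(values)
    if bad:
        raise ValueError(f"{attribute_id}: invalid enum values {bad}")


check_enum_values("messaging.destination.kind", ["queue", "topic"])  # passes silently
print(invalid_enum_values(["queue", "Topic", "ip_tcp", "a-b"]))  # ['Topic', 'a-b']
```

If digits turn out to be acceptable in enum values, only the pattern needs to change.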
I noticed that we have duplicated the list of well-known cloud providers in both the `cloud.provider` resource attribute and the `faas.invoked_provider` attribute.
With this we keep approaching XSD.
Repo opentelemetry-cpp successfully used the semconv code generator already on 1.13.0 specs.
See the generate script used:
Now trying to upgrade to 1.14.0 opentelemetry-specification:
SEMCONV_VERSION=1.14.0
The build fails with this error message:
opentelemetry.semconv.model.exceptions.ValidationError: Semantic Convention trace-exception reference `exception.type` but it cannot be found! - @2:5
In the 1.14.0 specs, indeed, the file `semantic_conventions/trace/trace-exception.yaml` uses the exception defined in `semantic_conventions/exception.yaml`.
Because the docker mount point used is `semantic_conventions/trace/`, the docker image does not see the file `exception.yaml`:
docker run --rm \
-v ${SCRIPT_DIR}/opentelemetry-specification/semantic_conventions/trace:/source \
...
I hope this helps to narrow down the root cause.
No idea about a fix however.
Please fix and adjust the example to work with the latest semantic conventions from the specs,
as this is needed for each language SIG in general.
[resolved] Blocking for open-telemetry/opentelemetry-cpp#1671
See open-telemetry/opentelemetry-specification#1759 (comment)
The PR in question has two enums in a single convention, and the first one has a note on one of the members. As a result, two consecutive newlines are printed between the footnote and the next enum table.
See open-telemetry/opentelemetry-specification#1192
The markdown generator would need to be passed a URL prefix that is treated as the root, so that links below it can be rewritten as relative links. Semantic conventions would then contain absolute links (i.e. revert that part of open-telemetry/opentelemetry-specification#1192).
Other generators should support replacing a URL prefix, e.g. `https://github.com/open-telemetry/opentelemetry-specification/tree/master/` with `https://github.com/open-telemetry/opentelemetry-specification/tree/v0.6/`.
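Both behaviors can be sketched in a few lines. The function names `pin_links` and `relativize_link` are hypothetical, as is the `current_doc` path used in the example; only the two URL prefixes come from the text above.

```python
import posixpath

MASTER = "https://github.com/open-telemetry/opentelemetry-specification/tree/master/"
TAGGED = "https://github.com/open-telemetry/opentelemetry-specification/tree/v0.6/"


def pin_links(text: str, old_prefix: str, new_prefix: str) -> str:
    """Code generators: point every absolute link at a fixed release."""
    return text.replace(old_prefix, new_prefix)


def relativize_link(url: str, root_prefix: str, current_doc: str) -> str:
    """Markdown generator: make a link under root_prefix relative to current_doc."""
    if not url.startswith(root_prefix):
        return url  # external link, leave untouched
    target = url[len(root_prefix):]
    return posixpath.relpath(target, posixpath.dirname(current_doc) or ".")


link = MASTER + "specification/common/common.md"
print(pin_links(link, MASTER, TAGGED))
# Hypothetical location of the markdown file being generated:
print(relativize_link(link, MASTER, "specification/trace/semantic_conventions/http.md"))
# ../../common/common.md
```

`posixpath` is used instead of `os.path` so that the relative links always use forward slashes, including on Windows.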
open-telemetry/opentelemetry-specification#3183 declares attributes separately from traces and metrics and then reuses attributes in corresponding semconvs. There are two issues:
- Grandparent attributes are not populated in the table. E.g.
  - the `http.common` group defines `http.method`
  - the `http.server` group extends `http.common` and defines `http.route`
  - `metric.http.server.duration` extends `http.server`

  When the `metric.http.server.duration(full)` semconv table is rendered, it does not include `http.method`, but includes `http.route`.
- Referenced attributes description should come from extended semconv first. E.g.
  - `net.peer.name` is defined in the span-general semconv
  - `http.client.common` references it, providing a long HTTP-specific description
  - `sampling_relevant: true` is set for tracing (and none of it for metrics), so the attribute has to be referenced again in `trace.http.client` and the long HTTP-specific description has to be repeated

  For `net.peer.name` in the `trace.http.client` semconv, the generator should prioritize properties from `trace.http.client`, then read them from the parent (`http.client.common`) and finally fill in the gaps from the original span-general spec.

The code generator uses a flexible mechanism based on Jinja2 to generate output. Yet the markdown output is generated with hard-coded Python code.
Consider introducing more flexibility into markdown generation with Jinja. Ideally, completely replace the Python code for markdown with Jinja templates, which allow good structuring and are quite powerful. Only the replacement mechanism (finding the HTML comments) should stay in Python.
From #70 (comment)
As a new contributor to opentelemetry, I had to debug some stuff in build-tools, and it was a bit vague how we go about building the docker container. It would be great if we could formalize these steps into a `make` command or similar so that it's easier to onboard new contributors.
From open-telemetry/opentelemetry-specification#928 (comment)
Having an example per row makes it harder to understand that they are multiple examples and not a single one. The MD generator should use "or" instead of "<br>" to separate examples.
See this suggested semantic convention for HTTP headers: open-telemetry/opentelemetry-specification#1061.
Right now, I think it is not possible to specify this in the semantic convention generator. It would be useful however to generate a constant for the `http.request.header` prefix (or `http.request.header.` with a trailing dot), or a function `string httpRequestHeaderName(string key) { return 'http.request.header.' + key; }`.
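In Python, the generated output for such a prefix attribute might look like the sketch below. The constant and function names are hypothetical, chosen only to mirror the pseudocode above; nothing like this is emitted by the generator today.

```python
# Hypothetical generated constant: the attribute prefix including the trailing dot.
HTTP_REQUEST_HEADER_PREFIX = "http.request.header."


def http_request_header_name(key: str) -> str:
    """Build the full attribute name for one HTTP request header (case preserving)."""
    return HTTP_REQUEST_HEADER_PREFIX + key


print(http_request_header_name("Content-Type"))  # http.request.header.Content-Type
```

Keeping the trailing dot in the constant avoids the easy mistake of concatenating the prefix and key without a separator.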
In the markdown, this should probably be designated by adding `.*` to the name of the attribute. E.g. like this:
| Attribute | Type | Description | Examples | Required |
|---|---|---|---|---|
| `http.request.header.<key>` | string[] | HTTP request headers, `<key>` being the HTTP Header name (case preserving), the value being the header values. | `http.request.header.Content-Type=["application/json"]`; `http.request.header.X-Forwarded-for=["1.2.3.4", "1.2.3.5"]` | No |
In the YAML source: `type: prefix_map: string[]`.
For example, see the Exception semantic convention
(split from #79)
See https://github.com/open-telemetry/build-tools/pull/79/files#r996803040 for the previous discussion and proposals.
Two possible options would be:
- A table of metric names with a (then shared) table of attribute names that apply.
- A table of the metric where "empty" rows are created while attributes are expanded.
Both of them are currently manually applied in the hand-crafted metrics tables in the spec repo.
From open-telemetry/opentelemetry-specification#1497 (comment) there are cases we may want to stabilize a subset of attributes in a given category. It would be nice if we can just mark attributes as stable so the generator handles this for us. This is less important for MD generation and more for code, where stable attributes will need to be rendered differently (different packages for example).
Unable to build the opentelemetry-go-proto release for OTLP v0.19.0 on an arm chip because the build tools are for amd64.
The release steps fail like so:
% make sync VERSION=v0.19.0
upgrading opentelemetry-proto submodule to v0.19.0
HEAD is now at 6459e1a Prepare for v0.19.0 release (#420)
rm -rf gen otlp
rm -rf ./gen/go
mkdir -p ./gen/go
docker run --rm -u 502 -v/Users/josh.macdonald/src/opentelemetry/proto-go:/Users/josh.macdonald/src/opentelemetry/proto-go -w/Users/josh.macdonald/src/opentelemetry/proto-go otel/build-protobuf:0.11.0 --proto_path="gen/proto" --go_out=./gen/go gen/proto/opentelemetry/proto/common/v1/common.proto
Unable to find image 'otel/build-protobuf:0.11.0' locally
0.11.0: Pulling from otel/build-protobuf
Digest: sha256:a1f16b31cb70dca3e486afad20c268cf9d3ee776b0cd6656120c8e013c27a52c
Status: Downloaded newer image for otel/build-protobuf:0.11.0
WARNING: The requested image's platform (linux/amd64) does not match the detected host platform (linux/arm64/v8) and no specific platform was requested
qemu: uncaught target signal 11 (Segmentation fault) - core dumped
--go_out: protoc-gen-go: Plugin killed by signal 11.
make: *** [gen-otlp-protobuf] Error 1
From #110:
There are no release guidelines in this repo, so I am not sure how to make a release. specs-approvers, who can help with this and/or write a paragraph in CONTRIBUTING.md to explain how to make the release?
I would like to request a new release, please, to ship the changes in #88.
See open-telemetry/opentelemetry-specification#3299.
E.g. `user_agent.original` is defined within an `attribute_group` in the opentelemetry-specification/semantic_conventions folder and referenced in the resource semconv here: https://github.com/open-telemetry/opentelemetry-specification/blob/main/semantic_conventions/resource/browser.yaml
The code generator allows specifying a source folder and generates all semconv from there, applying the `--only` or `--exclude` flags to the source. It uses the same subset of files to resolve attributes and to generate code for them.
When the browser resource semconv is generated, we need to use all resource and cross-signal attributes to resolve attributes, but generate only those that are in resource. There is no such option today.