yelp / pyramid_zipkin Goto Github PK

Pyramid tween to add Zipkin service spans

License: Apache License 2.0

Makefile 0.72% Python 99.28%

pyramid_zipkin's People

mjbryant prat0318 kaisen sunmoonone yzhang226 rmoorman bplotnick lorinbeer kleopatra999 bbotte sokac acer618 drolando benbariteau

pyramid_zipkin's Issues

pre-commit hooks don't work

There are a few issues with the pre-commit config right now.

The first is that when you run pre-commit with the current .pre-commit-config.yaml, you get an error saying that the sha is invalid. We should just use a version number in the sha field anyway.

The second is that there are a number of issues that pre-commit catches when you run it with --all-files, which mainly tells me that we should run "pre-commit run --all-files" in travis.

We currently support logging of only IPv4 addresses. Starting with Zipkin 1.4, endpoints can omit IPv4 (by setting Endpoint.ipv4 to 0), and optionally log Endpoint.ipv6 as the raw 16-byte address.
https://github.com/openzipkin/zipkin-api/blob/master/thrift/zipkinCore.thrift#L276

In JSON, both are formatted as strings (see http://zipkin.io/zipkin-api):

ipv4 (string): The text representation of an IPv4 address associated with this endpoint. Ex. 192.168.99.100
ipv6 (string): The text representation of an IPv6 address associated with this endpoint. Ex. 2001:db8::c001
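
A minimal Python sketch of the two encodings described above, using only the standard library. The helper name `encode_endpoint_address` and its return shape are hypothetical and not part of pyramid_zipkin; the sketch assumes the Thrift ipv4 field is a signed 32-bit integer and that ipv6 carries the raw 16 bytes in network byte order.

```python
import ipaddress
import struct


def encode_endpoint_address(host):
    """Return (ipv4, ipv6) for a textual IP address.

    Hypothetical helper: ipv4 is a signed 32-bit integer (0 when omitted),
    ipv6 is the raw 16-byte address (None when the host is IPv4).
    """
    addr = ipaddress.ip_address(host)
    if addr.version == 4:
        # Reinterpret the 4 packed bytes as a signed big-endian int.
        return struct.unpack('!i', addr.packed)[0], None
    # IPv6: omit ipv4 by setting it to 0 and send the raw 16 bytes.
    return 0, addr.packed


ipv4, ipv6 = encode_endpoint_address('2001:db8::c001')
# For the JSON form, the textual representation is simply str(address).
json_ipv6 = str(ipaddress.ip_address(ipv6))
```

For JSON output no byte-level encoding is needed, since both fields are plain strings.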

API for adding binary annotations

Users of the pyramid-zipkin library might want to add custom annotations. Currently there is a workaround by logging messages in a certain format to the "pyramid_zipkin.logger" log. However, it'd be nice if instead the user could call something like pyramid_zipkin.add_custom_annotation({'blah': 'foo'}).

Add support for py3

implement "b3 single" header format

I believe X-B3- headers are implemented here, not in py_zipkin. Depending on the course of Yelp/py_zipkin#98 "b3" support might be there not here.
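
For reference, the "b3 single" format packs the propagation fields into one `b3` header of the form `{TraceId}-{SpanId}-{SamplingState}-{ParentSpanId}`, where the last two fields are optional and a lone `0`, `1` or `d` carries only the sampling decision. A rough parsing sketch follows; `parse_b3_single` is a hypothetical name, not an existing pyramid_zipkin or py_zipkin function, and it does no validation of the id lengths.

```python
def parse_b3_single(value):
    """Split a `b3` header value into its fields (illustrative only)."""
    parts = value.strip().split('-')
    if len(parts) > 4:
        raise ValueError('too many fields in b3 header: %r' % value)
    if len(parts) == 1:
        # Sampling-only form: "0" (deny), "1" (accept) or "d" (debug).
        return {'trace_id': None, 'span_id': None,
                'sampled': parts[0], 'parent_span_id': None}
    return {
        'trace_id': parts[0],
        'span_id': parts[1],
        # Sampling state and parent span id are optional.
        'sampled': parts[2] if len(parts) > 2 else None,
        'parent_span_id': parts[3] if len(parts) > 3 else None,
    }
```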

pyramid_zipkin should be over=EXCVIEW instead of over=MAIN

As @lauris pointed out, if non-200 responses are handled through excview, pyramid_zipkin can only handle these if it is over EXCVIEW instead of being the closest to MAIN.

invalid trace and span ids when always_emit_zipkin_headers is False

If always_emit_zipkin_headers is set to False and the current request is not traced, create_zipkin_attr returns a ZipkinAttr with empty trace and span ids.

If firehose is enabled, this breaks py_zipkin as those ids are invalid. The always_emit_zipkin_headers setting should only control setting trace_id in the request object and nothing else.

Make span name use route if possible

Let's basically re-implement what was done in Brave to reduce the cardinality of the span name: https://github.com/openzipkin/brave/blob/master/instrumentation/http/src/main/java/brave/http/HttpParser.java#L70

Deprecate "is_client" boolean in favor of "service_name" key

See #21 (comment) for context. Basically, if a logged client span has a "service_name" key, that should mean that is_client=True, and if is_client=True, then we should always have a new service_name.

Batch all service annotations together

Adding an annotation like so:

zipkin_logger.debug({'annotations': {'start_db_call': time.time()}})

results in warning in Collector console:

19:00:58.772 [pool-2-thread-2] WARN  o.t.z.storage.cassandra.Repository - Span 1_547073285_-1325006353 in trace 4848644533043884200 had no timestamp. If this happens a lot consider switching back to SizeTieredCompactionStrategy for zipkin.traces

This is because the above debug call creates a new span message with neither a cs, cr pair nor an ss, sr pair. These spans can be merged with the service span created here, by collecting the annotations and binary annotations from all non-client spans in the loop.

Also, we should mention in the docs that client spans SHOULD contain a cs, cr pair.

Is it possible to refactor code and just depend on py_zipkin for the core

Now that we have py_zipkin, let's slim down this package and keep the core logic unified.

Unset parentId for root span instead of setting to '0'*16

As discussed in openzipkin/zipkin#1085, parentId should be left unset for the root span rather than set to '0'*16.
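
A small sketch of the intended behaviour, assuming a JSON-style span dictionary with the Zipkin field names traceId, id, name and parentId. The function `build_span_dict` is a hypothetical illustration, not the library's actual encoder.

```python
def build_span_dict(trace_id, span_id, name, parent_id=None):
    """Build a span dict, omitting parentId entirely for root spans."""
    span = {'traceId': trace_id, 'id': span_id, 'name': name}
    if parent_id is not None:
        # Only child spans carry a parentId; root spans leave it out
        # instead of using an all-zero placeholder such as '0' * 16.
        span['parentId'] = parent_id
    return span
```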

Test against python3.5

This will require some changing of the Makefile and .travis.yml files to make sure they use tox >= 2.

deprecate python2.6

Are there any users of py26? If not, we should remove it.

Fix pyramid_zipkin build failure on py_zipkin 0.9.0

py_zipkin 0.9.0 changed the API slightly, so the builds are failing now:

https://travis-ci.org/Yelp/pyramid_zipkin/builds/260325546

Support kafka as well as scribe transport.

Non-BC change.

Leave the transport layer up to the user and just return the bytes.

should_not_sample_route and should_not_sample_path should not override X-B3-Sampled

With the current logic, paths and routes that are blacklisted won't generate a span (and actually interrupt the trace at that point) even if the trace is being sampled. Imo that's not what we usually want.

We usually blacklist paths like /status, /swagger.json or /status/metrics. We do that because we don't want healthchecks and other automated calls to be traced.
But imo if we're in the middle of a trace I'm interested to know if my code is calling the status endpoint of another service and spending time on that.
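
A rough sketch of the proposed ordering, where an incoming X-B3-Sampled header wins over the blacklist. All names here (`is_tracing`, `blacklisted_paths`, `sample_new_trace`) are hypothetical, and the blacklist is simplified to exact path matching rather than whatever matching the library actually uses.

```python
def is_tracing(headers, path, blacklisted_paths, sample_new_trace):
    """Decide whether to trace a request (illustrative ordering only)."""
    sampled = headers.get('X-B3-Sampled')
    if sampled is not None:
        # We are in the middle of an existing trace: honour the
        # upstream decision, even for blacklisted paths.
        return sampled == '1'
    if path in blacklisted_paths:
        # No upstream decision: healthchecks etc. never start a trace.
        return False
    # Otherwise fall back to the local sampling decision.
    return sample_new_trace()
```

With this ordering, a bare healthcheck hitting /status is still skipped, but a /status call made as part of a sampled trace shows up in that trace.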

@adriancole @bplotnick thoughts?

Using zipkin.tracing_percent > 50.0 results in 100% sampling

This is the same issue as Yelp/py_zipkin#8

The problem is that int((1.0 / x) * 100) is always 1 for x > 50.
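
The arithmetic can be checked directly. The sketch below assumes the resulting integer is then used as a "1 in n" sampling interval, so n == 1 means every request is sampled. `interval_from_percent` just wraps the expression from the issue, and `should_sample` is one possible alternative that compares a random draw against the percentage; both names are made up for illustration.

```python
import random


def interval_from_percent(x):
    """The expression from the issue: truncates to 1 for any x above 50."""
    return int((1.0 / x) * 100)


def should_sample(tracing_percent, rng=random.random):
    """Alternative sketch: sample with probability tracing_percent / 100."""
    if tracing_percent <= 0.0:
        return False
    if tracing_percent >= 100.0:
        return True
    # rng() is uniform in [0, 1), so this is true tracing_percent% of the time.
    return rng() * 100 < tracing_percent


# 60% and 75% both collapse to an interval of 1, i.e. 100% sampling.
intervals = [interval_from_percent(x) for x in (30, 60, 75, 100)]
```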

Support http.route

This has been formalized in Zipkin 2.5. We can get the route via https://docs.pylonsproject.org/projects/pyramid/en/latest/api/request.html#pyramid.request.Request.matched_route

This issue is effectively the same as #56, but with a different name.
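
A sketch of how the route could be read from the request. Pyramid sets `request.matched_route` to None when no route matched, and a matched route exposes its `pattern`. The helper names and the "METHOD route" span-name scheme are assumptions for illustration, not necessarily what Brave or pyramid_zipkin do. The stand-in request objects only exist to keep the example self-contained.

```python
from types import SimpleNamespace


def get_http_route(request):
    """Return the matched route pattern, or None if no route matched."""
    route = getattr(request, 'matched_route', None)
    return route.pattern if route is not None else None


def get_span_name(request):
    """Low-cardinality span name: method plus route when available."""
    route = get_http_route(request)
    if route:
        return '{} {}'.format(request.method, route)
    # No route matched: fall back to the method alone rather than the
    # raw path, which could contain ids and explode the cardinality.
    return request.method


# Stand-ins for a Pyramid request, used only for this example.
matched = SimpleNamespace(
    method='GET',
    matched_route=SimpleNamespace(name='user', pattern='/users/{id}'),
)
unmatched = SimpleNamespace(method='GET', matched_route=None)
```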

spans getting squashed with same span name within a single trace

The culprit is this logic, which needs to change.

Report span.timestamp and duration

While possible to derive a span's timestamp and duration at query time, it is best to report them to zipkin. This reduces the guesswork on the server side, and particularly makes the cassandra storage implementation more effective. This also makes the duration query operable.

Do we need to do anything?

First, we'd need to report span.timestamp and duration in the first place :) This might start with an upstream change to py_zipkin to capture this. Here's the normal doc on that: zipkin.io/pages/instrumenting.html

Then, we'd need to ensure that the special-case of continuing a client-originated span on the server is addressed (where B3 headers are read).

Here are tests that should be in place to ensure that span.timestamp and duration are set properly.

- client creates a span and propagates it to a server
  - client propagates b3 headers for trace id, span id, and sampled=1
  - server neither reports span.timestamp nor duration
  - client reports span.timestamp and duration
- server creates a new trace
  - no b3 headers are detected
  - server reports span.timestamp and duration
- server creates a new trace with externally supplied ids
  - caller propagates b3 headers for trace id, span id, but no sampled header
  - server reports span.timestamp and duration
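
The three cases above could be expressed as a single server-side decision, sketched below. The function name `server_reports_timing` is hypothetical, and the sketch only encodes the listed cases; other header combinations (for example X-B3-Sampled set to 0) are not covered by the list and are not handled specially here.

```python
def server_reports_timing(headers):
    """Should the server report span.timestamp and duration?"""
    has_ids = 'X-B3-TraceId' in headers and 'X-B3-SpanId' in headers
    if not has_ids:
        # No b3 headers detected: the server creates a new trace.
        return True
    if headers.get('X-B3-Sampled') == '1':
        # Client-originated span: the client reports timestamp/duration.
        return False
    # Externally supplied ids without a sampled header: the server
    # still starts the trace, so it reports timestamp/duration.
    return True
```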

Does this make sense? Can anyone here help implement this?

See openzipkin/brave#277 (comment)

create_zipkin_attr
is_tracing

This will support generating zipkin attributes and determining whether a request is traced in a flexible and customized manner.
