yelp / pyramid_zipkin
Pyramid tween to add Zipkin service spans
License: Apache License 2.0
There are a few issues with the pre-commit config right now.
The first is that when you run pre-commit with the current .pre-commit-config.yaml, it reports that the sha is invalid. We should just use a version number in the sha field anyway.
The second is that pre-commit catches a number of issues when you run it with --all-files, which mainly tells me that we should run "pre-commit run --all-files" in travis.
We currently support logging of only IPv4 addresses. Starting with Zipkin 1.4, endpoints can omit IPv4 (by setting Endpoint.ipv4 to 0) and optionally log Endpoint.ipv6 as the raw 16-byte address.
https://github.com/openzipkin/zipkin-api/blob/master/thrift/zipkinCore.thrift#L276
In JSON, both are formatted as strings (see http://zipkin.io/zipkin-api):
ipv4 (string): The text representation of an IPv4 address associated with this endpoint. Ex. 192.168.99.100
ipv6 (string): The text representation of an IPv6 address associated with this endpoint. Ex. 2001:db8::c001
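A minimal sketch of how a host string could be mapped to the two thrift fields described above, using only the standard library. The function name `encode_endpoint_address` is hypothetical and is not part of pyramid_zipkin or py_zipkin; the signed 32-bit packing assumes the thrift `ipv4` field is an `i32`.

```python
import ipaddress
import struct


def encode_endpoint_address(host):
    """Return (ipv4, ipv6) for a thrift-style endpoint (hypothetical helper).

    ipv4 is a signed 32-bit integer, or 0 when the host is not IPv4.
    ipv6 is the raw 16-byte address, or None when the host is not IPv6.
    """
    address = ipaddress.ip_address(host)
    if address.version == 4:
        # Reinterpret the 4 packed bytes as a big-endian signed int.
        return struct.unpack('!i', address.packed)[0], None
    return 0, address.packed
```

For the JSON format no packing is needed, since both fields are plain text: `str(ipaddress.ip_address(host))` gives the normalized string form.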
Users of the pyramid-zipkin library might want to add custom annotations. Currently there is a workaround by logging messages in a certain format to the "pyramid_zipkin.logger" log. However, it'd be nice if instead the user could call something like pyramid_zipkin.add_custom_annotation({'blah': 'foo'}).
I believe X-B3-* headers are implemented here, not in py_zipkin. Depending on the course of Yelp/py_zipkin#98, "b3" support might land there, not here.
As @lauris pointed out, if non-200 responses are handled through excview, pyramid_zipkin can only handle them if it is placed over EXCVIEW instead of being the tween closest to MAIN.
If always_emit_zipkin_headers is set to False and the current request is not traced, create_zipkin_attr returns a ZipkinAttr with empty trace and span ids.
If firehose is enabled, this breaks py_zipkin, as those ids are invalid. The always_emit_zipkin_headers setting should only control setting trace_id in the request object and nothing else.
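One way the proposed behaviour could look, as a rough sketch rather than the library's actual code: ids are always generated, and the setting only decides whether the trace id is exposed on the request. `ZipkinAttrs` here is a local stand-in namedtuple, and `generate_random_id`, `create_attrs`, `maybe_expose_trace_id` and the `zipkin_trace_id` attribute name are all assumptions made for illustration.

```python
import os
from collections import namedtuple

# Local stand-in for the attrs container; the field names are assumptions.
ZipkinAttrs = namedtuple(
    'ZipkinAttrs', 'trace_id span_id parent_span_id flags is_sampled')


def generate_random_id():
    """Return a random 64-bit id as 16 lowercase hex characters."""
    return os.urandom(8).hex()


def create_attrs(headers, is_sampled):
    """Always return valid ids, whether or not the request is sampled."""
    return ZipkinAttrs(
        trace_id=headers.get('X-B3-TraceId') or generate_random_id(),
        span_id=headers.get('X-B3-SpanId') or generate_random_id(),
        parent_span_id=headers.get('X-B3-ParentSpanId'),
        flags=headers.get('X-B3-Flags', '0'),
        is_sampled=is_sampled,
    )


def maybe_expose_trace_id(request, attrs, always_emit_zipkin_headers):
    """The setting only controls whether the request carries the trace id."""
    if attrs.is_sampled or always_emit_zipkin_headers:
        request.zipkin_trace_id = attrs.trace_id
```

With this split, a firehose handler would always receive well-formed ids, regardless of the header setting.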
Let's basically re-implement what was done in Brave to reduce the cardinality of the span name: https://github.com/openzipkin/brave/blob/master/instrumentation/http/src/main/java/brave/http/HttpParser.java#L70
See #21 (comment) for context. Basically, if a logged client span has a "service_name" key, that should mean that is_client=True, and if is_client=True, then we should always have a new service_name.
Adding an annotation like so:
zipkin_logger.debug({'annotations': {'start_db_call': time.time()}})
results in a warning in the Collector console:
19:00:58.772 [pool-2-thread-2] WARN o.t.z.storage.cassandra.Repository - Span 1_547073285_-1325006353 in trace 4848644533043884200 had no timestamp. If this happens a lot consider switching back to SizeTieredCompactionStrategy for zipkin.traces
This is because the above debug call creates a new span message with no cs, cr or ss, sr pair. These spans can be merged with the service span created here (by getting all non-client spans from the loop and extracting their annotations and binary annotations).
Also, we should mention in the docs that the client spans SHOULD contain a cs, cr pair.
Now that we have py_zipkin, let's slim down this package and keep the core logic unified.
As discussed in openzipkin/zipkin#1085, parentId should not be set for root span, instead of setting to '0'*16.
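As a small illustration of the point above, a span serializer can simply leave the parent field out for a root span instead of filling it with zeros. The helper name `span_to_json_dict` is hypothetical; the keys follow the Zipkin JSON field names `traceId`, `id`, `name` and `parentId`.

```python
def span_to_json_dict(trace_id, span_id, name, parent_id=None):
    """Build a span dict, omitting parentId entirely for a root span."""
    span = {'traceId': trace_id, 'id': span_id, 'name': name}
    if parent_id is not None:
        # Only child spans carry a parentId; never emit '0' * 16.
        span['parentId'] = parent_id
    return span
```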
This will require some changes to the Makefile and .travis.yml files to make sure they use tox >= 2.
Are there any users of py26? If not, we should remove it.
py_zipkin 0.9.0 changed the API slightly, so the builds are failing now.
Non BC change.
With the current logic, paths and routes that are blacklisted won't generate a span (and actually interrupt the trace at that point) even if the trace is being sampled. Imo that's not what we usually want.
We usually blacklist paths like /status, /swagger.json or /status/metrics. We do that because we don't want healthchecks and other automated calls to be traced.
But imo if we're in the middle of a trace I'm interested to know if my code is calling the status endpoint of another service and spending time on that.
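The proposed behaviour could be sketched as a single decision function: the blacklist only applies when the request is not already part of a sampled trace. The function name and its parameters are hypothetical, and treating blacklist entries as regular expressions is an assumption.

```python
import re


def should_skip_tracing(path, blacklisted_path_patterns, is_sampled_upstream):
    """Return True when no span should be generated for this request.

    A request that belongs to an already sampled trace is never skipped,
    even if its path is blacklisted.
    """
    if is_sampled_upstream:
        return False
    return any(re.match(pattern, path) for pattern in blacklisted_path_patterns)
```

A standalone healthcheck hitting /status is still dropped, while a /status call made in the middle of a sampled trace shows up as a span.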
@adriancole @bplotnick thoughts?
This is the same issue as Yelp/py_zipkin#8
The problem is that int((1.0 / x) * 100) is always 1 for x > 50.
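The truncation can be checked directly. The sketch below assumes x is a tracing percentage in the range (0, 100]; both function names are hypothetical, and the second function is only one possible alternative, not the fix that was actually merged.

```python
import random


def truncated_inverse_frequency(percent):
    """The expression from the issue: truncates toward zero."""
    return int((1.0 / percent) * 100)


def should_sample(percent, rng=random.random):
    """Alternative without truncation: compare a uniform draw to the rate."""
    return rng() * 100.0 < percent


# Every rate above 50% collapses to an inverse frequency of 1,
# which means "sample every request".
collapsed = {truncated_inverse_frequency(x) for x in range(51, 101)}
```

Comparing a uniform random draw against the percentage keeps rates such as 75% distinct from 100%.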
This has been formalized in Zipkin 2.5. We can get the route via https://docs.pylonsproject.org/projects/pyramid/en/latest/api/request.html#pyramid.request.Request.matched_route
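A minimal sketch of reading the route template from a Pyramid request. It relies on `request.matched_route` and the route's `pattern` attribute; the helper name `route_pattern` and the empty-string fallback are assumptions.

```python
def route_pattern(request):
    """Return the matched route pattern, or '' when no route matched."""
    route = getattr(request, 'matched_route', None)
    return route.pattern if route is not None else ''
```

The result (for example `/users/{id}`) has much lower cardinality than the raw URI, which makes it a better fit for an `http.route` style tag.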
This issue is effectively the same as #56, but with a different name.
The culprit is this logic, which needs to change.
While possible to derive a span's timestamp and duration at query time, it is best to report them to zipkin. This reduces the guesswork on the server side, and particularly makes the cassandra storage implementation more effective. This also makes the duration query operable.
First, we'd need to report span.timestamp and duration in the first place :) This might start with an upstream change to py_zipkin to capture this. Here's the normal doc on that zipkin.io/pages/instrumenting.html
Then, we'd need to ensure that the special-case of continuing a client-originated span on the server is addressed (where B3 headers are read).
Here are tests that should be in place to ensure that span.timestamp and duration are set properly:
- client creates a span and propagates it to a server
- server creates a new trace
- server creates a new trace with externally supplied ids
Does this make sense? Can anyone here help implement this?
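For reference, capturing the two values could look roughly like the sketch below. Zipkin expects `timestamp` as epoch microseconds and `duration` in microseconds; the function name `run_with_timing`, the injectable `clock` parameter and the returned dict shape are assumptions for illustration, not py_zipkin's API.

```python
import time


def run_with_timing(name, func, clock=time.time):
    """Call func() and return (result, span fields with timestamp/duration)."""
    start = clock()
    result = func()
    end = clock()
    span = {
        'name': name,
        # Epoch microseconds at the start of the span.
        'timestamp': int(start * 1000000),
        # Elapsed microseconds, with a floor of 1 so the value is never 0.
        'duration': max(1, int((end - start) * 1000000)),
    }
    return result, span
```

Injecting the clock keeps the timing logic testable without sleeping.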
In gevent, threading_local always creates a new object and scraps the requests attribute attached to it during module load. Reassign an empty list if the attribute is not found.
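The fix described above amounts to a lazy default on the thread-local object. A minimal sketch with the standard library follows; the `requests` attribute name comes from the issue, while `_thread_local` and `get_requests_stack` are hypothetical names.

```python
import threading

_thread_local = threading.local()


def get_requests_stack():
    """Return the per-thread requests list, creating it when missing.

    An attribute set at module load exists only in the thread (or greenlet)
    that imported the module, so every access goes through this fallback.
    """
    stack = getattr(_thread_local, 'requests', None)
    if stack is None:
        stack = []
        _thread_local.requests = stack
    return stack
```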
Actually, I think it will happen with any exception. The update_binary_annotations function isn't called when the handler raises an exception, so we don't get annotations like response_status_code and http.route.
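One possible shape for a fix is to move the annotation update into a `finally` block so it also runs on the exception path. This is only a sketch: `call_handler_with_annotations` is a hypothetical wrapper, and the `(request, response)` signature used for `update_binary_annotations` is an assumption.

```python
def call_handler_with_annotations(handler, request, update_binary_annotations):
    """Run the handler and always record annotations, even on failure."""
    response = None
    try:
        response = handler(request)
        return response
    finally:
        # response is None when the handler raised; request-derived
        # annotations such as http.route can still be recorded.
        update_binary_annotations(request, response)
```

The exception still propagates to the caller; only the annotation step is no longer skipped.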
The child spans right now get the same service_name as the service. There should be a way to provide a different one when creating a child span.
Here, value should also be converted to str (and fail if that is not possible) so that the collector doesn't crash on non-conformant values.
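A minimal sketch of that coercion, assuming binary annotations are passed around as a plain dict. The helper name `stringify_binary_annotations` is hypothetical; any exception raised by `str()` propagates to the caller, which matches the "fail if not able to" requirement.

```python
def stringify_binary_annotations(annotations):
    """Coerce every key and value to str before handing off to the collector."""
    return {str(key): str(value) for key, value in annotations.items()}
```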
It would be nice to have another binary annotation that doesn't include the query string, so that all the traces can be searched using the web UI search box.
This pyramid construct is often different than the http.uri annotation we include by default.
In addition to this it'd be kinda cool to allow users to turn binary annotations added by default on/off. Something like setting a 'default_binary_annotations' list in the pyramid registry. Just an idea.
It'd be nice to know how much time individual calls spend logging zipkin spans.
support for custom versions of
this will support generating zipkin attributes and determining if a request is traced in a flexible and customized manner