GithubHelp home page GithubHelp logo

streadway / ngx_txid Goto Github PK

View Code? Open in Web Editor NEW
120.0 9.0 22.0 184 KB

$txid variable for nginx - a sortable unique id in 20 case-insensitive characters

License: BSD 2-Clause "Simplified" License

Makefile 19.37% C 53.97% Perl 18.71% Perl 6 7.95%

ngx_txid's Introduction

Overview

ngx_txid is a module that exposes $txid: a cached, request scoped, 20 character, base32hex encoded, temporally and lexically sortable, case insensitive, 96 bit identifier.

Use $txid to correlate logs or upstream requests.

Build

  • Configure and nginx with --add-module=path/to/ngx_txid

Example

Assuming you want to to store your logs in column format. You'll need a join key for each request even across connections regardless of how accurate the scheduler is from your timer_resolution directive. The remaining columns can be specific per data logged.

    log_format agent     '$txid $http_user_agent';
    log_format referrer  '$txid $http_referer';
    log_format peer      '$txid $remote_addr';
    log_format status    '$txid $status';
    log_format request   '$txid $request';
    log_format conn      '$txid conn:$connection reqs:$connection_requests pipe:$pipe';

    server {
        listen       80;
        server_name  example.com;
        access_log   logs/example.com/conns.log conn;
        access_log   logs/example.com/agents.log agent;

        location / {
            remove_header    X-Request-Id;
            add_header       X-Request-Id $txid;
            proxy_set_header X-Request-Id $txid;
            proxy_pass       http://localhost:8080;
        }
    }

Background

The design of this transaction ID should meet the following requirements:

  • Be roughly numerically temporally sortable with ~second granularity.
  • Have a representation that is roughly lexically sortable with ~second granularity.
  • Have a probability of less than 1e-9 for collision at 1 million transactions per second.
  • Be efficient and easy to decode into fixed size C types
  • Always be available at the risk of higher collision probability
  • Use as few bytes as possible
  • Work with IPv4 and IPv6 networks

Technique

Use a monotonic millisecond resolution clock in the high 42 bits and system entropy for the low 54 bits. Use enough entropy bits to satisfy a collision probability at a desired global request rate.

+------------- 64 bits------------+--- 32 bits ----+
+------ 42 bits ------+--22 bits--|----------------+
| msec since 1970-1-1 | random    | random         |
+---------------------+-----------+----------------+

A request rate of 1 million per second across all servers means 1000 random values per millisecond. Estimating the collision probability using the birthday paradox can be done with this formula: 1 - e^(-((m^2)/(2*n))) where m is the number of ids and n is the number of random values possible.

When using 54 bits of entropy:

1mil req/s  = 1 - exp(-((1000^2) /(2*2^54))) = 2.775558e-11
10mil req/s = 1 - exp(-((10000^2)/(2*2^54))) = 2.775558e-09

The odds of collision are small even at 10 million requests per second.

Nginx keeps track of the current clock in increments of the configuration directive timer_resolution. The clock resolution for $txid is 1ms, so a timer resolution greater than 1ms means that the probability of collision will increase. If you have a timer_resolution of 10ms, 1 million requests per second would require 10,000 random values per second in the worst case.

Encoding

Base32hex is used with a lower case alphabet and without padding characters is chosen for the following reasons:

  • lexically sort order equivalent to numeric sort order
  • case insensitive equality
  • lower case is easer for visual compares
  • denser than hex encoding by 4 bytes

Other techniques

  • snowflake: uses time(41) + unique id(10) + sequence(12).

    • Pro: guaranteed unique sequences
    • Pro: fits in 63 bits
    • Cons: requires unique id coordination for each server - 16 workers processes per host means a limit of 64 instances of nginx
    • Cons: only 11 bits available for unique id, needs monitoring
    • Cons: total ordering only possible in the same process
    • Cons: service interruption possible when clocks lose synchronization
  • flake: uses time + mac id + sequence.

    • Pro: guaranteed unique sequences
    • Cons: uses 128 bits
    • Cons: wastes 22 bits of timestamp data
    • Cons: only a single process per host can generate ids - needs to synchronize access to the sequence from each worker process
    • Cons: service interruption possible when clocks lose synchronization
    • Cons: needs cross platform MAC Address lookup.
  • UUIDv4: 122 bits of entropy

    • Pro: very low probability of collision
    • Cons: unsortable
  • UUID with timestamp: 48 bits of time + 74 bits entropy

    • Pro: very low probability of collision
    • Cons: string representation is not temporally local
  • httpd mod_unique_id: host ip(32) + pid(32) + time(32) + sequence (16) + thread id (32)

    • Pro: deterministic
    • Cons: uses 144 bits
    • Cons: assumes unique IPv4 for the hostnamme's interface
    • Cons: unsortable case-sensitive custom representation - base64 with a custom alphabet
    • Cons: hard limit of 65535 ids per second per pid - small tolerance for clock steps

Testing

This module uses the perl Test::Nginx module with tests in the t directory with a specialized build of nginx of the version specified in Makefile. A local install of dependent CPAN modules will be downloaded with cpanminus into t/ext. Sudo/root access is not required.

make test

Memory profiling of the tests can be performed when valgrind is installed.

make grind

ngx_txid's People

Contributors

gui avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

ngx_txid's Issues

Access $txid inside lua script

Hi There,

I'm trying to access the $txid variable inside Lua script and for some reason it doesn't exist.
Example:
guid = ngx.var.txid

It is crashing because no txid variable exist.

right shift count >= width of type

When compiling with warnings as errors which is default on launchpad:

cc -c -pipe  -O -W -Wall -Wpointer-arith -Wno-unused-parameter -Werror -g   -I src/core -I src/event -I src/event/modules -I src/os/unix -I /build/buildd/nginx-1.4.7/debian/modules/nginx_upstream_check_module -I /build/buildd/nginx-1.4.7/debian/openssl-1.0.1g/.openssl/include -I objs -I src/http -I src/http/modules -I src/mail \
        -o objs/addon/ngx_txid/ngx_txid_module.o \
        /build/buildd/nginx-1.4.7/debian/modules/ngx_txid/ngx_txid_module.c
cc1: warnings being treated as errors
/build/buildd/nginx-1.4.7/debian/modules/ngx_txid/ngx_txid_module.c: In function 'ngx_txid_make':
/build/buildd/nginx-1.4.7/debian/modules/ngx_txid/ngx_txid_module.c:68: error: right shift count >= width of type
/build/buildd/nginx-1.4.7/debian/modules/ngx_txid/ngx_txid_module.c:69: error: right shift count >= width of type
/build/buildd/nginx-1.4.7/debian/modules/ngx_txid/ngx_txid_module.c:70: error: right shift count >= width of type
make[3]: *** [objs/addon/ngx_txid/ngx_txid_module.o] Error 1

Full build: https://launchpadlibrarian.net/176957261/buildlog_ubuntu-lucid-i386.nginx_1.4.7-1~ppa5_FAILEDTOBUILD.txt.gz

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.