juspay / fencer Goto Github PK

Fencer is a port of https://github.com/lyft/ratelimit into Haskell.

Home Page: https://hub.docker.com/r/juspayin/fencer

License: Other

Haskell 82.25% Nix 13.98% Go 3.09% Shell 0.68%

fencer's People

iammrinal0 mdimjasevic

fencer's Issues

Add statistics

(will expand later)

Allow configuring the gRPC port

lyft/ratelimit already has an environment variable for this; ideally we should use the same variable name. The variable should also be documented as per #11.
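A minimal sketch of reading the port from the environment, using only base. The variable name GRPC_PORT is an assumption (the issue does not name lyft/ratelimit's variable), and parsePort, getGrpcPort and defaultGrpcPort are hypothetical names. The default of 8081 mirrors the localhost:8081 address used in the grpcurl examples elsewhere on this page.

```haskell
import System.Environment (lookupEnv)
import Text.Read (readMaybe)

-- | Default port, mirroring the localhost:8081 used in the grpcurl examples.
defaultGrpcPort :: Int
defaultGrpcPort = 8081

-- | Interpret the raw value of the environment variable.
-- An unset variable falls back to the default; an unparsable or
-- out-of-range value is reported as an error instead of being ignored.
parsePort :: Maybe String -> Either String Int
parsePort Nothing = Right defaultGrpcPort
parsePort (Just raw) =
  case readMaybe raw of
    Just p | p > 0 && p < 65536 -> Right p
    _ -> Left ("invalid gRPC port: " ++ show raw)

-- | Read the port from the environment.
-- GRPC_PORT is an assumed variable name, not confirmed by the issue.
getGrpcPort :: IO (Either String Int)
getGrpcPort = parsePort <$> lookupEnv "GRPC_PORT"
```

Keeping parsePort pure makes the fallback and error cases testable without touching the process environment.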

Rewrite the README without assuming knowledge of lyft/ratelimit

Implement tests for configuration parsing

Configuration parsing has no tests, so they should be added. Because the project has no tests at all yet, this work will also set up the project's test configuration.

Replicate counter identity rules

Looks like we do not need to remove counters when rules are reloaded. The behavior of lyft/ratelimit is as follows:

When rules are reloaded, existing counters are not touched.
When the reloaded rules are the same but one limit has changed, the counter is not reset, and the new limit is taken into account.
When the reloaded rules are the same but one unit has changed, a new counter is used, and the old counter is not removed.

So we need to:

add ratelimit unit to the CounterKey
document in code that counters don't need to be removed, and explain why
document the behavior in README/elsewhere
write tests
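The identity rules above can be sketched with a key type that includes the unit but not the limit. This is an illustration only: the field names, the TimeUnit type and the hit/hits helpers are hypothetical, and the real CounterKey in Fencer may look different.

```haskell
import qualified Data.Map.Strict as Map

data TimeUnit = Second | Minute | Hour | Day
  deriving (Eq, Ord, Show)

-- | Hypothetical counter key. The unit is part of the identity;
-- the limit (requests per unit) deliberately is not.
data CounterKey = CounterKey
  { ckDomain :: String
  , ckDescriptor :: [(String, String)]
  , ckUnit :: TimeUnit
  } deriving (Eq, Ord, Show)

type Counters = Map.Map CounterKey Int

-- | Record one hit against a counter, creating it if needed.
hit :: CounterKey -> Counters -> Counters
hit key = Map.insertWith (+) key 1

-- | Number of hits recorded so far for a key (0 if the counter is absent).
hits :: CounterKey -> Counters -> Int
hits = Map.findWithDefault 0
```

Under this scheme a reload that only changes the limit keeps using the same counter, while a reload that changes the unit produces a different key, so a fresh counter appears and the old one is simply left in the map.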

Ensure that Fencer.Proto does not go out of date

Currently Fencer/Proto.hs is generated from a protobuf file in the repo. It also depends on proto3-suite. So, when updating the version of proto3-suite we might forget to rerun compile-proto-file – this just happened to me, for instance.

Possible solutions:

Do nothing.
Don't check Fencer/Proto.hs into the repo, always generate it during the build.
Check Fencer/Proto.hs into the repo but error out during the build if the generated file differs.

Decrease Nix closure size

Currently our closure takes about 700 MB:

$ storepath=$(nix-build)
$ nix-store --export $(nix-store -qR $storepath) > out-fencer
$ ls -lah out-fencer
-rw-r--r--  1 yom  staff   708M Sep 27 17:22 out-fencer

The biggest offenders are:

740 MB  └── nix
740 MB      └── store
393 MB          ├─⊕ i2nf6hslid0ak8fzgpj2m082sprihi92-clang-7.1.0
133 MB          ├─⊕ hfy6ibq1hwayv24yrh98f3avc47nmr00-llvm-7.1.0
 63 MB          ├─⊕ 0cifmxw0r6lijng796a3z3nwq67ma5b3-llvm-7.1.0-lib
 27 MB          ├─⊕ 5qk8vydayrvkq313s1srxhrnblhqvahk-grpc-1.2.0-e2cfe9d
 27 MB          ├─⊕ q959m66x0cfryybzapv7c4v0ki2jfr1a-clang-7.1.0-lib
 18 MB          ├─⊕ bhywddq18j79xr274n45byvqjb8fs52j-Libsystem-osx-10.12.6
 11 MB          ├─⊕ i7hz49am5vla34lmpkw5aqkjgd2b98hb-binutils-2.31.1
...

This primarily affects our Docker image size.

Write a manual for running lyft/ratelimit

Write a manual for running lyft/ratelimit. It should also include examples of submitting requests to the service.

"LOG_LEVEL=info" is not supported

We should support the same log level names as lyft/ratelimit. Currently we support "LOG_LEVEL=Info" but not "LOG_LEVEL=info".
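One way to accept both spellings is to lowercase the value before matching. In the sketch below the LogLevel type and the set of accepted names are assumptions; the actual names should be checked against lyft/ratelimit.

```haskell
import Data.Char (toLower)

-- | Hypothetical log level type; the real set of levels may differ.
data LogLevel = Debug | Info | Warning | Error
  deriving (Eq, Show)

-- | Parse a LOG_LEVEL value case-insensitively.
-- The accepted names here are illustrative, not taken from lyft/ratelimit.
parseLogLevel :: String -> Maybe LogLevel
parseLogLevel raw =
  case map toLower raw of
    "debug" -> Just Debug
    "info" -> Just Info
    "warn" -> Just Warning
    "warning" -> Just Warning
    "error" -> Just Error
    _ -> Nothing
```

With this, "Info", "info" and "INFO" all parse to the same level, and an unknown name yields Nothing so the caller can report it.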

Implement near_limit for statsd

Fix branch name template in contribution guidelines

At https://github.com/juspay/fencer/blob/master/CONTRIBUTING.md#git-policy, the branch name template is not rendered properly.

Test that descriptor key can not be empty

https://github.com/lyft/ratelimit/blob/master/test/config/config_test.go#L185-L194

Migrate to DockerHub

Set up Travis CI

Set up the Travis continuous integration for Fencer:

Add a .travis.yml file.
Port secrets from the GitLab CI settings. Variables should be encrypted and stored in the .travis.yml file, and made available only to jobs that actually require them.
Consult the Travis CI documentation to see how to support Nix.

DomainDefinition: rate limit descriptor list must not be empty

$ grpcurl -proto proto/rls.proto -plaintext -d '{"domain":"hi", "descriptors":[]}' localhost:8081 envoy.service.ratelimit.v2.RateLimitService.ShouldRateLimit

ERROR:
  Code: Unknown
  Message: rate limit descriptor list must not be empty

Make sure that the domainDefinitionDescriptors field has at least one list element and that JSON parsing fails otherwise. Use the NonEmpty list type constructor.
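A minimal sketch of the NonEmpty approach, using only base. The field domainDefinitionDescriptors is named in the issue; everything else (the DescriptorDefinition stand-in, domainDefinitionId, mkDomainDefinition) is hypothetical, and the JSON layer is left out. The point is only that an empty list cannot be represented once the field has type NonEmpty.

```haskell
import Data.List.NonEmpty (NonEmpty, nonEmpty)

-- | Hypothetical stand-in for Fencer's descriptor definition type.
newtype DescriptorDefinition = DescriptorDefinition String
  deriving (Eq, Show)

data DomainDefinition = DomainDefinition
  { domainDefinitionId :: String
  , domainDefinitionDescriptors :: NonEmpty DescriptorDefinition
  } deriving (Eq, Show)

-- | Build a definition from a plain list, failing if the list is empty.
-- A JSON parser could call this and turn the Left into a parse failure.
mkDomainDefinition
  :: String -> [DescriptorDefinition] -> Either String DomainDefinition
mkDomainDefinition domainId descriptors =
  case nonEmpty descriptors of
    Nothing -> Left "rate limit descriptor list must not be empty"
    Just ds -> Right (DomainDefinition domainId ds)
```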

Document all env vars, including logging vars

DomainDefinition: domain id must not be empty

$ grpcurl -proto proto/rls.proto -plaintext -d '{"domain":"", "descriptors":[]}' localhost:8081 envoy.service.ratelimit.v2.RateLimitService.ShouldRateLimit

ERROR:
  Code: Unknown
  Message: rate limit domain must not be empty

Make sure that parsing fails if the domain id is empty.

Document known difference: lyft/ratelimit merely concatenates domains and keys, we don't

lyft/ratelimit treats domain: mongo, key: cps_database and domain: mongo_cps, key: database as literally the same. I don't think we should do the same, but we should document it.
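The collision can be reproduced with a small sketch. The "_" separator is an assumption inferred from the example above, and flatKey and ScopedKey are hypothetical names, not code from either project.

```haskell
-- | Flattened key in the style the issue attributes to lyft/ratelimit.
-- The "_" separator is an assumption inferred from the example.
flatKey :: String -> String -> String
flatKey domain key = domain ++ "_" ++ key

-- | Structured key that keeps the domain and the descriptor key apart,
-- so no choice of names can make two different pairs compare equal.
data ScopedKey = ScopedKey
  { skDomain :: String
  , skKey :: String
  } deriving (Eq, Ord, Show)
```

Here flatKey "mongo" "cps_database" and flatKey "mongo_cps" "database" both give "mongo_cps_database", whereas the corresponding ScopedKey values are distinct.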

Use CLOCK_MONOTONIC to avoid leap seconds problems?

getTimestamp will report the same timestamp twice on a leap second. We might want to use the clock package instead, either with Monotonic or MonotonicCoarse timestamps.
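A minimal sketch of reading the monotonic clock with the clock package mentioned above. getMonotonicSeconds is a hypothetical name and is not a drop-in replacement for getTimestamp: the monotonic clock has an arbitrary origin, so it can measure elapsed time but cannot name a wall-clock second.

```haskell
import Data.Int (Int64)
import System.Clock (Clock (Monotonic), TimeSpec (sec), getTime)

-- | Whole seconds on the monotonic clock.
-- The origin is unspecified, so only differences between
-- two readings are meaningful.
getMonotonicSeconds :: IO Int64
getMonotonicSeconds = sec <$> getTime Monotonic
```

Swapping Monotonic for MonotonicCoarse in the import and the call gives the cheaper, lower-resolution variant on platforms that support it.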

Deploy Fencer on Juspay infrastructure

Depends on #101.

Write more configuration parsing tests

With issue #2 a few configuration parsing tests were added. Write more of such tests.

Fix, test, and document behavior for configuration loading

What happens if at startup, one rule (out of several) in one domain (out of several) can't be parsed?
Same, but during config reloading, not during startup.
What happens when configuration contains duplicate domains?
What happens when a domain contains duplicate rules?
What happens with YAML files containing several documents?
What is the minimal accepted configuration file? Is it the same for Fencer and for lyft/ratelimit?

Also:

(done in #46, not documented) Turns out that lyft/ratelimit loads all files, not just .yml files.
(done in #46, not documented) And it loads them recursively.
Would it look at .foo/bar.yaml if RUNTIME_IGNOREDOTFILES is enabled?
RUNTIME_IGNOREDOTFILES in general is not tested by our testsuite, but should be.
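The open question about .foo/bar.yaml comes down to which of two predicates the loader applies. The sketch below spells out both readings so that a test can pin down the actual behavior; neither is claimed to be what lyft/ratelimit does, and all names are hypothetical. Paths are assumed to be relative to the config directory.

```haskell
import System.FilePath (splitDirectories, takeFileName)

-- | True for names such as ".foo", but not for "." or "..".
isDot :: String -> Bool
isDot name = take 1 name == "." && name `notElem` [".", ".."]

-- | Reading A: only the file's own name decides.
ignoredByFileName :: FilePath -> Bool
ignoredByFileName = isDot . takeFileName

-- | Reading B: a dot anywhere along the path decides.
ignoredByAnyComponent :: FilePath -> Bool
ignoredByAnyComponent = any isDot . splitDirectories
```

The two readings agree on foo/.bar.yaml and disagree on .foo/bar.yaml, which makes the latter a useful test case.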

Also:

When lyft/ratelimit finds no configs in the dir, it responds to requests with OK.
When lyft/ratelimit doesn't find the dir symlink, it responds to requests with OK.
(partly done in #35, not tested properly, not documented) When lyft/ratelimit finds the config dir but even one file is corrupted (broken YAML or correct YAML but broken field names), it responds to requests with ERROR.
When lyft/ratelimit can't read a file (e.g. chmod 0-ed), other files are loaded correctly and it responds to requests with OK.

And more:

Q: What happens to symlinks to files outside the config dir?
A: I checked and they are loaded just fine.
Q: Symlinks to directories outside the config dir?
A: They are not followed.
Q: Symlink cycles?
A: They are detected ("Too many levels of symbolic links") and other files are loaded correctly.

When are headers returned by lyft/ratelimit?

rls.proto:

  // A list of headers to add to the response
  repeated HeaderValue headers = 3;

When are those headers returned? Do we ever have to return them? If not, add a comment in Fencer.Server.

Disallow duplicates in configs

Duplicate domains are not allowed: https://github.com/lyft/ratelimit/blob/master/test/config/config_test.go#L174-L183
Duplicate keys are not allowed: https://github.com/lyft/ratelimit/blob/master/test/config/config_test.go#L196-L205 (can use https://hackage.haskell.org/package/yaml-0.11.2.0/docs/Data-Yaml.html#v:decodeFileWithWarnings)
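For duplicates that survive parsing as separate list entries (for example, two domains with the same id), a check along these lines could run after loading. It is a sketch using containers; duplicates, checkUniqueDomains and the error text are hypothetical. Duplicate keys inside a single YAML mapping are a separate matter that this does not cover.

```haskell
import qualified Data.Map.Strict as Map

-- | Values that occur more than once, each reported once, in ascending order.
duplicates :: Ord a => [a] -> [a]
duplicates xs = Map.keys (Map.filter (> 1) counts)
  where
    counts = Map.fromListWith (+) [(x, 1 :: Int) | x <- xs]

-- | Reject a list of domain ids that contains repeats.
-- The error message is illustrative only.
checkUniqueDomains :: [String] -> Either String [String]
checkUniqueDomains ids =
  case duplicates ids of
    [] -> Right ids
    dups -> Left ("duplicate domains: " ++ unwords dups)
```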

Split out contribution rules and the styleguide into CONTRIBUTING.md

Investigate using cabal2nix

Currently we can't use cabal2nix because of these flags:

, configureFlags ? [], enableSharedExecutables ? true, enableSharedLibraries ? true

We can probably use overrides instead to the same effect.

Upgrade to newer gRPC

grpc-haskell has recently been upgraded to support grpc-1.22. I updated .nix files in branch artyom/grpc, but lyft/ratelimit integration tests are crashing Fencer:

fencer(2509,0x70000b9b1000) malloc: *** error for object 0x7f98d3c0cae0: pointer being freed was not allocated
fencer(2509,0x70000b9b1000) malloc: *** set a breakpoint in malloc_error_break to debug
Abort trap: 6

Filed as awakesecurity/gRPC-haskell#91.

Rewrite lyft/ratelimit integration tests from Go to Haskell

There are a number of integration tests in Lyft's ratelimit implementation in Go. They should be rewritten in Haskell in Juspay's implementation of ratelimit. In particular, the following tests should be rewritten:

https://github.com/lyft/ratelimit/blob/master/test/service/ratelimit_test.go#L88-L212

Creating Docker image with Nix defaults created time to EPOCH time

After running docker load -i fencer.tar.gz, the image's creation time is shown as 49 years ago. To get the exact time, run:

$ docker inspect -f '{{ .Created }}' juspay/fencer
1970-01-01T00:00:01Z

nix-build shouldn't be using 'withHoogle' at all

Currently nix-build builds with withHoogle. What I'd like is for nix-build to have withHoogle = false by default.

CI should push both normal dependencies and shell-only dependencies to Cachix

Cachix is for developers as well as for the CI, so even if shell-only dependencies are useless for CI, we should still build them and push to Cachix.

Make sure we don't forget any exported tests

Currently it's possible that a .Test module will export some tests but not all of them will be used in test/Main.hs.

To prevent this, let's use a different scheme for test exports:

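-- | Tests for "Fencer.Logic".
module Fencer.Logic.Test (tests) where

import Test.Tasty (TestTree, testGroup)

-- Individual tests such as test_logicLimitUnitChange stay in this
-- module but are no longer exported; only the group is.
tests :: TestTree
tests = testGroup "Logic tests" [test_logicLimitUnitChange]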

-- test/Main.hs
module Main (main) where

import Test.Tasty
  (DependencyType (AllFinish), TestTree, after, defaultMain, testGroup)

import qualified Fencer.Logic.Test
import qualified Fencer.Rules.Test
import qualified Fencer.Server.Test
import qualified Fencer.Types.Test

main :: IO ()
main = defaultMain tests

tests :: TestTree
tests = testGroup "All tests"
  [ Fencer.Types.Test.tests
  , Fencer.Logic.Test.tests
  , Fencer.Rules.Test.tests
  -- 'after' is needed to avoid running the 'logic' and 'server' tests
  -- concurrently. Running them concurrently is problematic because
  -- both create a server (binding the same port) so if they create it
  -- at the same time, one of the test groups will fail. The 'after'
  -- function makes the 'server' tests run after the 'logic' tests.
  , after AllFinish "test_logic" Fencer.Server.Test.tests
  ]

Rework the counter logic test

In a comment to a pull request, @neongreen suggested several changes to the test that checks the counter logic. Implement what has been suggested there.

Intermediate keys should have no rate limits

Currently the implementation allows for intermediate rule keys to have rate limits, even though in the domain logic it makes sense only for leaf keys to have rate limits. Fix the logic to allow only leaf rule keys to have rate limits and add a test if necessary.
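A check for this could walk the rule tree and report every intermediate key that carries a limit. The Rule and RateLimit types below are hypothetical simplifications, not Fencer's actual definitions.

```haskell
-- | Hypothetical, simplified rate limit.
data RateLimit = RateLimit
  { rlRequestsPerUnit :: Int
  , rlUnit :: String
  } deriving (Eq, Show)

-- | Hypothetical rule tree: a key, an optional limit, and nested rules.
data Rule = Rule
  { ruleKey :: String
  , ruleLimit :: Maybe RateLimit
  , ruleChildren :: [Rule]
  } deriving (Eq, Show)

-- | Keys of intermediate rules (rules with children) that carry a limit.
-- An empty result means only leaf rules have limits.
offendingKeys :: Rule -> [String]
offendingKeys rule = here ++ concatMap offendingKeys (ruleChildren rule)
  where
    isIntermediate = not (null (ruleChildren rule))
    here = [ruleKey rule | isIntermediate, ruleLimit rule /= Nothing]
```

Returning the offending keys rather than a Bool lets the configuration loader name the rule at fault in its error message.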

Build and publish a Docker image for Fencer

Test gRPC error responses

when making a request with an empty domain, "rate limit domain must not be empty"
when making a request with an empty descriptor list, "rate limit descriptor list must not be empty"
when no config loaded, "no rate limit configuration loaded"
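The three cases can be captured in a pure function that tests can call without starting a server. Request and requestError are hypothetical names. Checking the domain before the descriptor list follows the grpcurl example with an empty domain and an empty list; placing the configuration check last is an assumption.

```haskell
-- | Hypothetical, simplified request: a domain and a list of descriptors,
-- each descriptor being a list of key/value entries.
data Request = Request
  { reqDomain :: String
  , reqDescriptors :: [[(String, String)]]
  }

-- | The error message for a request, or Nothing if it is acceptable.
-- The first argument says whether any configuration has been loaded.
-- The position of the configuration check is an assumption.
requestError :: Bool -> Request -> Maybe String
requestError configLoaded req
  | null (reqDomain req) =
      Just "rate limit domain must not be empty"
  | null (reqDescriptors req) =
      Just "rate limit descriptor list must not be empty"
  | not configLoaded =
      Just "no rate limit configuration loaded"
  | otherwise = Nothing
```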

Platform portability: use a temporary file instead of /dev/null

In tests we write to /dev/null to make sure the testing output isn't garbled. However, Windows doesn't have such a file. The suggestion is therefore to replace /dev/null with a temporary file (created with, e.g., the platform-agnostic temporary package).
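A minimal sketch with the temporary package. withSinkHandle, discardLine and the file name template are hypothetical; the file is created in the system temporary directory and removed when the action finishes.

```haskell
import System.IO (Handle, hPutStrLn)
import System.IO.Temp (withSystemTempFile)

-- | Run an action with a handle to a throwaway file.
-- The file is deleted once the action returns or throws.
withSinkHandle :: (Handle -> IO a) -> IO a
withSinkHandle action =
  withSystemTempFile "fencer-test.log" (\_path handle -> action handle)

-- | Example use: write a line that nobody will read.
discardLine :: String -> IO ()
discardLine line = withSinkHandle (\handle -> hPutStrLn handle line)
```

A test would pass the handle to whatever logger it currently points at /dev/null.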

Document the difference regarding returned `limitRemaining`

limitRemaining is returned if it's not zero:

$ grpcurl -proto proto/rls.proto -plaintext -d '{"domain":"mongo_cps", "descriptors":[{"entries":[{"key":"database","value":"default"}]}]}' localhost:8081 envoy.service.ratelimit.v2.RateLimitService.ShouldRateLimit
{
  "overallCode": "OK",
  "statuses": [
    {
      "code": "OK",
      "currentLimit": {
        "requestsPerUnit": 2,
        "unit": "MINUTE"
      },
      "limitRemaining": 1
    }
  ]
}

limitRemaining is omitted if it's zero:

$ grpcurl -proto proto/rls.proto -plaintext -d '{"domain":"mongo_cps", "descriptors":[{"entries":[{"key":"database","value":"default"}]}]}' localhost:8081 envoy.service.ratelimit.v2.RateLimitService.ShouldRateLimit
{
  "overallCode": "OK",
  "statuses": [
    {
      "code": "OK",
      "currentLimit": {
        "requestsPerUnit": 2,
        "unit": "MINUTE"
      }
    }
  ]
}

I think our behavior is nicer (and unlikely to break any consumers), but we should document the difference.

Add instructions for Hlint

The Contribution Guidelines could use instructions for how to add a Git hook for Hlint. Add them.

Server test failing non-deterministically

The only server test we have so far, Fencer.Server.Test.test_responseNoRules, fails non-deterministically with:

    When no rules have been loaded, all requests error out: FAIL
      test/Fencer/Server/Test.hs:44:
      Got wrong gRPC error response
      expected: ClientIOError (GRPCIOBadStatusCode StatusUnknown (StatusDetails {unStatusDetails = "rate limit descriptor list must not be empty"}))
       but got: ClientIOError (GRPCIOBadStatusCode StatusUnavailable (StatusDetails {unStatusDetails = "Endpoint read failed"}))

or with:

    When no rules have been loaded, all requests error out: FAIL
      test/Fencer/Server/Test.hs:48:
      Expected an error response, got a normal response: status = StatusOk, result = RateLimitResponse {rateLimitResponseOverallCode = Enumerated {enumerated = Right RateLimitResponse_CodeOK}, rateLimitResponseStatuses = [], rateLimitResponseHeaders = []}

or with:

      Got wrong gRPC error response
      expected: ClientIOError (GRPCIOBadStatusCode StatusUnknown (StatusDetails {unStatusDetails = "rate limit descriptor list must not be empty"}))
       but got: ClientIOError GRPCIOTimeout

Pin down the root cause and why it fails only sometimes.

Contribution rules – should be figured out together with writing a release blogpost
CLA? – not needed
Copyright, maintainer email, etc

Don't build docs on CI

It takes several minutes to download doc packages and build the Hoogle index, so we should skip that.
