
A Guzzle middleware that can throttle requests according to (multiple) defined rules. It is also possible to define a caching strategy, e.g. get the response from cache when the rate limit is exceeded or always get a cached value to spare your rate limits. Using wildcards in host names is also supported.

License: MIT License

PHP 100.00%
guzzle api middleware throttle throttle-requests rate-limiting rate-limit rate-limiter request-handler advanced

guzzle-advanced-throttle's Introduction

hamburgscleanest/guzzle-advanced-throttle


A Guzzle middleware that throttles requests according to (multiple) defined rules.

It is also possible to define a caching strategy. For example, the response can be read from a cache when exceeding rate limits. The cached value can also be preferred to spare your rate limits (force-cache).

Using wildcards in hostnames is also supported.

Install

Via Composer

composer require hamburgscleanest/guzzle-advanced-throttle

Usage

General use

Let's say you wanted to implement the following rules:

20 requests every 1 second

100 requests every 2 minutes


  1. First, you have to define the rules in a hamburgscleanest\GuzzleAdvancedThrottle\RequestLimitRuleset:
use hamburgscleanest\GuzzleAdvancedThrottle\RequestLimitRuleset;

$rules = new RequestLimitRuleset([
    'https://www.google.com' => [
        [
            'max_requests'     => 20,
            'request_interval' => 1 // seconds
        ],
        [
            'max_requests'     => 100,
            'request_interval' => 120 // seconds (2 minutes)
        ]
    ]
]);

  2. Your handler stack might look like this:
use GuzzleHttp\Handler\CurlHandler;
use GuzzleHttp\HandlerStack;
$stack = new HandlerStack();
$stack->setHandler(new CurlHandler());

  3. Push hamburgscleanest\GuzzleAdvancedThrottle\Middleware\ThrottleMiddleware to the stack.

It should always be the first middleware on the stack.

use hamburgscleanest\GuzzleAdvancedThrottle\Middleware\ThrottleMiddleware;

$throttle = new ThrottleMiddleware($rules);

// Invoke the middleware
$stack->push($throttle());

// OR: alternatively call the handle method directly
$stack->push($throttle->handle());

  4. Pass the stack to the client:
use GuzzleHttp\Client;
$client = new Client(['base_uri' => 'https://www.google.com', 'handler' => $stack]);

Either the base_uri has to be the same as the defined host in the rules array or you have to request absolute URLs for the middleware to have an effect.

// relative
$response = $client->get('test');

// absolute
$response = $client->get('https://www.google.com/test');
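To make the effect of combining several rules concrete, here is a minimal sketch in plain PHP. It does not use the library and is not its implementation: the function name isRequestAllowed and the sliding-window bookkeeping are assumptions made for illustration. Only the keys max_requests and request_interval come from the rules above.

```php
<?php

/**
 * Returns true when one more request may be sent at time $now without
 * breaking any rule. Each rule uses the keys from the ruleset above:
 * 'max_requests' and 'request_interval' (seconds).
 *
 * @param float[] $sentAt Unix timestamps of requests already sent to the host
 */
function isRequestAllowed(array $sentAt, array $rules, float $now): bool
{
    foreach ($rules as $rule) {
        $windowStart = $now - $rule['request_interval'];

        // Count the requests that still fall inside this rule's window.
        $inWindow = count(array_filter(
            $sentAt,
            static fn (float $t): bool => $t > $windowStart
        ));

        if ($inWindow >= $rule['max_requests']) {
            return false; // one exhausted rule is enough to block the request
        }
    }

    return true;
}

$rules = [
    ['max_requests' => 20, 'request_interval' => 1],
    ['max_requests' => 100, 'request_interval' => 120],
];

// 20 requests within the last second: the first rule blocks the next one.
$burst = array_fill(0, 20, 999.5);
var_dump(isRequestAllowed($burst, $rules, 1000.0)); // bool(false)

// Two seconds later the one-second window is empty again.
var_dump(isRequestAllowed($burst, $rules, 1002.0)); // bool(true)
```

A request has to satisfy every rule at once, which is why the second call succeeds only after the short window has emptied while the long window still has room.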

Caching


Beforehand

Responses with an error status code (4xx or 5xx) are not cached, even with force-cache enabled! Note: Redirect responses (3xx) are currently not cached either.
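The rule can be pictured as a small status check. This is only a sketch: isCacheableStatus is a hypothetical helper, and treating exactly the 2xx range as cacheable is an assumption, since the text only rules out 3xx, 4xx and 5xx.

```php
<?php

// Assumption: only 2xx responses count as cacheable. The README rules out
// 3xx, 4xx and 5xx; how 1xx responses are treated is not stated.
function isCacheableStatus(int $statusCode): bool
{
    return $statusCode >= 200 && $statusCode < 300;
}

var_dump(isCacheableStatus(200)); // bool(true)
var_dump(isCacheableStatus(301)); // bool(false)
var_dump(isCacheableStatus(429)); // bool(false)
```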


Available storage adapters

array (default)

This adapter works out of the box. However, it does not persist anything: cached values are only available within the same scope. It is the default because it needs no extra configuration.

The recommended adapter is the laravel one.


laravel (Illuminate/Cache) - recommended

You need to provide a config (Illuminate\Config\Repository) for this adapter.


custom (Implements hamburgscleanest\GuzzleAdvancedThrottle\Cache\Interfaces\StorageInterface)

When you create a new implementation, pass the class name to the RequestLimitRuleset::create method. You'll also need to implement any sort of configuration parsing your instance needs. Please see LaravelAdapter for an example.

Usage
$rules = new RequestLimitRuleset(
    [ ... ], 
    'force-cache', // caching strategy
    MyCustomAdapter::class // storage adapter
    );
    
$throttle = new ThrottleMiddleware($rules);

// Invoke the middleware
$stack->push($throttle());  

Laravel Drivers

General settings

These values can be set for every adapter.

    'cache' => [
        'ttl' => 900, // How long should responses be cached for (in seconds)?
        'allow_empty' => true // When this is set to false, empty responses won't be cached.
    ]
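How the two settings might be interpreted can be sketched as follows. The helper names shouldStore and isExpired are hypothetical, and reading "empty" as an empty response body is an assumption; the library may define it differently.

```php
<?php

// The two general settings from above, here with empty responses excluded.
$settings = ['ttl' => 900, 'allow_empty' => false];

// Assumption: an "empty response" is one with an empty body string.
function shouldStore(string $body, array $settings): bool
{
    return $settings['allow_empty'] || $body !== '';
}

// A cached entry is stale once it is at least 'ttl' seconds old.
function isExpired(int $storedAt, int $now, array $settings): bool
{
    return ($now - $storedAt) >= $settings['ttl'];
}

var_dump(shouldStore('', $settings));         // bool(false)
var_dump(shouldStore('{"ok":1}', $settings)); // bool(true)
var_dump(isExpired(1000, 1899, $settings));   // bool(false)
var_dump(isExpired(1000, 1900, $settings));   // bool(true)
```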

File
    'cache' => [
        'driver'  => 'file',
        'options' => [
            'path' => './cache'
        ],
        ...
    ]

Redis
    'cache' => [
        'driver'  => 'redis',
        'options' => [
            'database' => [
                'cluster' => false,
                'default' => [
                    'host'     => '127.0.0.1',
                    'port'     => 6379,
                    'database' => 0,
                ],
            ]
        ],
        ...
    ]

Memcached
    'cache' => [
        'driver'  => 'memcached',
        'options' => [
            'servers' => [
                [
                    'host'   => '127.0.0.1',
                    'port'   => 11211,
                    'weight' => 100,
                ],
            ]
        ],
        ...
    ]

Pass the config repository in the constructor of RequestLimitRuleset
$rules = new RequestLimitRuleset(
    [ ... ], 
    'cache', // caching strategy
    'laravel', // storage adapter
    new Repository(require '../config/laravel-guzzle-limiter.php') // config repository
    );

The same adapter will be used to store the internal request timers.


The adapters can be defined in the ruleset
$rules = new RequestLimitRuleset(
    [ ... ], 
    'cache', // caching strategy
    'array' // storage adapter
    );

Without caching - no-cache

Just throttle the requests. The responses are not cached. Exceeding the rate limits results in a 429 - Too Many Requests exception.

$rules = new RequestLimitRuleset(
    [ ... ], 
    'no-cache', // caching strategy
    'array' // storage adapter
    );

With caching (default) - cache

The middleware tries to fall back to a cached value when the rate limits are exceeded before throwing a 429 - Too Many Requests exception.

$rules = new RequestLimitRuleset(
    [ ... ], 
    'cache', // caching strategy
    'array' // storage adapter
    );

With forced caching - force-cache

Always use cached responses when available to spare your rate limits. As long as there is a response in the cache for the current request, the middleware returns the cached response. It only sends the actual request when no response is in the cache. If nothing is cached and the rate limits are exceeded, it throws a 429 - Too Many Requests exception.

You might want to disable the caching of empty responses with this option (see General settings under Laravel Drivers).

$rules = new RequestLimitRuleset(
    [ ... ], 
    'force-cache', // caching strategy
    'array' // storage adapter
    );
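The difference between the three strategies can be summarized as a decision function. The sketch below is plain PHP written from the descriptions above, not taken from the library: resolveResponse, its parameters, and the use of RuntimeException are assumptions for illustration.

```php
<?php

/**
 * @param string      $strategy      'no-cache', 'cache' or 'force-cache'
 * @param bool        $limitExceeded whether a rate limit is currently hit
 * @param string|null $cached        cached response body, null if none
 * @param callable    $send          sends the real request, returns the body
 */
function resolveResponse(string $strategy, bool $limitExceeded, ?string $cached, callable $send): string
{
    // force-cache: a cached response always wins, whether or not a limit is hit.
    if ($strategy === 'force-cache' && $cached !== null) {
        return $cached;
    }

    if (!$limitExceeded) {
        return $send();
    }

    // Limit exceeded: 'cache' may fall back to a stored response.
    // ('force-cache' only reaches this point with an empty cache.)
    if ($strategy === 'cache' && $cached !== null) {
        return $cached;
    }

    throw new RuntimeException('429 - Too Many Requests', 429);
}

$send = static fn (): string => 'fresh';

echo resolveResponse('cache', true, 'stored', $send), PHP_EOL;        // stored
echo resolveResponse('cache', false, 'stored', $send), PHP_EOL;       // fresh
echo resolveResponse('force-cache', false, 'stored', $send), PHP_EOL; // stored
```

With no-cache, the same call with an exceeded limit throws regardless of what is stored.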

Custom caching strategy

The custom caching strategy must implement the CacheStrategy interface. It is advised to use the Cacheable abstraction to implement base functionality. For reference implementations, please check ForceCache and Cache.

To use the new caching strategy, you'll need to pass the fully qualified class name to RequestLimitRuleset.

Usage
$rules = new RequestLimitRuleset([ ... ], 
                                MyCustomCacheStrategy::class, 
                                'array', 
                                new Repository(...));
                                
$throttle = new ThrottleMiddleware($rules);
...                                

Wildcards

If you want to define the same rules for multiple different hosts, you can use wildcards. A possible use case can be subdomains:

$rules = new RequestLimitRuleset([
        'https://www.{subdomain}.mysite.com' => [
            [
                'max_requests'     => 50,
                'request_interval' => 2
            ]
        ]
    ]);

This host matches https://www.en.mysite.com, https://www.de.mysite.com, https://www.fr.mysite.com, etc.
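One way to picture the matching is to translate the placeholder into a regular expression. The sketch below is an assumption about how such matching could work, not the library's code: hostMatchesPattern is a hypothetical helper, and letting a placeholder stand for one or more characters other than "/" is a choice made for this example.

```php
<?php

function hostMatchesPattern(string $pattern, string $host): bool
{
    // Split the pattern at every {placeholder} and escape the literal parts.
    $literalParts = preg_split('/\{[^}]+\}/', $pattern);
    $quoted = array_map(
        static fn (string $part): string => preg_quote($part, '#'),
        $literalParts
    );

    // Each placeholder becomes "one or more characters except a slash".
    $regex = '#^' . implode('[^/]+', $quoted) . '$#';

    return preg_match($regex, $host) === 1;
}

$pattern = 'https://www.{subdomain}.mysite.com';

var_dump(hostMatchesPattern($pattern, 'https://www.en.mysite.com')); // bool(true)
var_dump(hostMatchesPattern($pattern, 'https://www.de.mysite.com')); // bool(true)
var_dump(hostMatchesPattern($pattern, 'https://www.mysite.com'));    // bool(false)
```

Note that this placeholder definition would also accept a value containing dots, such as en.shop; a stricter character class would be needed to limit it to a single label.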


Changes

Please see CHANGELOG for more information on what has changed recently.


Testing

composer test

Contributing

Please see CONTRIBUTING and CODE_OF_CONDUCT for details.


Security

If you discover any security-related issues, please email [email protected] instead of using the issue tracker.


Credits


License

The MIT License (MIT). Please see License File for more information.

guzzle-advanced-throttle's People

Contributors

berenddeboer, dependabot-preview[bot], eduardokum, huisman303, lightguard, scrutinizer-auto-fixer, timopruesse


guzzle-advanced-throttle's Issues

Let the script usleep() before a 429 status shows up

Is it possible to make the PHP script sleep for some milliseconds instead of getting a 429 response? The library calculates the estimated time before a 429, right? Why not sleep for that time instead of throwing a 429 exception?
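What the question asks for could look roughly like the following in user code. This is a sketch of the idea only, not something the library is known to offer: microsecondsUntilAllowed is a hypothetical helper that works on a list of request timestamps for a single rule.

```php
<?php

/**
 * Microseconds to wait until one more request fits into the rule.
 *
 * @param float[] $sentAt ascending Unix timestamps (seconds, with fractions)
 */
function microsecondsUntilAllowed(array $sentAt, int $maxRequests, float $interval, float $now): int
{
    $windowStart = $now - $interval;
    $inWindow = array_values(array_filter(
        $sentAt,
        static fn (float $t): bool => $t > $windowStart
    ));

    if (count($inWindow) < $maxRequests) {
        return 0; // a slot is free right now
    }

    // The request that has to leave the window before a slot opens up.
    $blocking = $inWindow[count($inWindow) - $maxRequests];

    return (int) ceil(($blocking + $interval - $now) * 1000000);
}

// Rule: 2 requests per second. Two requests were sent at 10.0 and 10.5.
$wait = microsecondsUntilAllowed([10.0, 10.5], 2, 1.0, 10.75);
echo $wait, PHP_EOL; // 250000

// In real code the caller would pause before sending: usleep($wait);
```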

Improve configuration (avoid duplication)

Turn this:

$rules = new RequestLimitRuleset([
        [
            'host'             => 'https://www.google.com',
            'max_requests'     => 20,
            'request_interval' => 1
        ],
        [
            'host'             => 'https://www.google.com',
            'max_requests'     => 100,
            'request_interval' => 120
        ]
    ]);

Into this:

$rules = new RequestLimitRuleset([
        'https://www.google.com' => [
            [
                'max_requests'     => 20,
                'request_interval' => 1
            ],
            [
                'max_requests'     => 100,
                'request_interval' => 120
            ]
        ]
    ]);

Essentially the key for each group becomes the host.
As this is a breaking change, it needs to be noted properly.
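For anyone holding a configuration in the old flat format, the conversion is mechanical. The helper below is a hypothetical migration sketch in plain PHP (groupRulesByHost is not part of the library); it moves the host of each rule into the array key.

```php
<?php

function groupRulesByHost(array $flatRules): array
{
    $grouped = [];

    foreach ($flatRules as $rule) {
        $host = $rule['host'];
        unset($rule['host']); // the host moves from the rule into the array key
        $grouped[$host][] = $rule;
    }

    return $grouped;
}

$old = [
    ['host' => 'https://www.google.com', 'max_requests' => 20, 'request_interval' => 1],
    ['host' => 'https://www.google.com', 'max_requests' => 100, 'request_interval' => 120],
];

print_r(groupRulesByHost($old));
```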

Respect request parameters for caching

At the moment, responses are cached according to the requested URI. That's not really accurate and will sometimes return confusing results. The caching mechanism also needs to be based on the query parameters (or the request body for non-GET requests).

So when https://www.test.de/test?query=true is cached it shouldn't be returned for a request to https://www.test.de/test?query=false. A cached response must only be returned when all the query parameters or the request bodies are the same.
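A cache key with these properties could be built as sketched below. This is an illustration of the requirement, not the library's implementation: buildCacheKey is a hypothetical helper, it ignores ports and headers, and sorting the query parameters so that their order does not matter is an extra assumption.

```php
<?php

function buildCacheKey(string $method, string $uri, string $body = ''): string
{
    $method = strtoupper($method);
    $parts = parse_url($uri);

    parse_str($parts['query'] ?? '', $query);
    ksort($query); // parameter order must not change the key

    $base = ($parts['scheme'] ?? '') . '://' . ($parts['host'] ?? '') . ($parts['path'] ?? '/');

    return hash('sha256', implode('|', [
        $method,
        $base,
        http_build_query($query),
        $method === 'GET' ? '' : $body, // the body only matters for non-GET requests
    ]));
}

$a = buildCacheKey('GET', 'https://www.test.de/test?query=true');
$b = buildCacheKey('GET', 'https://www.test.de/test?query=false');

var_dump($a === $b); // bool(false)
```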

Cache driver default doesn't use env CACHE_DRIVER

In the configuration file laravel-guzzle-throttle.php, the driver is set to default as standard. However, the default driver set by CACHE_DRIVER in the .env file was ignored, and a hamburgscleanest\GuzzleAdvancedThrottle\Exceptions\RedisDatabaseNotSetException was raised.

Everything works correctly if I set the driver value to env('CACHE_DRIVER') instead of 'default'.

Why this behaviour?

Allow to use other implementations of StorageInterface and Cacheable

Is your feature request related to a problem? Please describe.
Currently, only the implementations within the project are recognized by Guzzle Advanced Throttle. If I create my own implementations, I can't use them without having to patch/hack RequestLimitRuleset.

Describe the solution you'd like
RequestLimitRuleset to instantiate and use a class if it implements/extends the proper interface/parent without throwing an error. If the class does not, it will throw an exception.
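The requested check can be sketched with PHP's built-in class functions. Everything named here is hypothetical: StorageLike stands in for the real interface, InMemoryStorage for a user implementation, and createAdapter for whatever RequestLimitRuleset would do internally.

```php
<?php

// Hypothetical stand-ins for the library's interface and a custom adapter.
interface StorageLike
{
}

final class InMemoryStorage implements StorageLike
{
}

function createAdapter(string $class, string $requiredInterface): object
{
    // Accept any class that exists and implements the required interface.
    if (!class_exists($class) || !is_subclass_of($class, $requiredInterface)) {
        throw new InvalidArgumentException(
            sprintf('%s must implement %s', $class, $requiredInterface)
        );
    }

    return new $class();
}

$adapter = createAdapter(InMemoryStorage::class, StorageLike::class);
var_dump($adapter instanceof StorageLike); // bool(true)

try {
    createAdapter(stdClass::class, StorageLike::class);
} catch (InvalidArgumentException $e) {
    echo $e->getMessage(), PHP_EOL; // stdClass must implement StorageLike
}
```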

Describe alternatives you've considered
Maintaining my own patch

Additional context
I should get a PR out soon(TM) for this

Problem with Laravel cache drivers

There currently is a problem with the cache drivers for Laravel / Illuminate. I already started working on it yesterday on the bus. Will hopefully be fixed today or tomorrow. It mainly affects Redis and Memcached.

Host definitions with wildcards

At the moment it is only possible to define a concrete host in the configuration file.
It should be possible to do something like this (pseudo wildcard format):

'host' => 'https://www.{locale}.test.com'

It would then match the following domains for example: https://www.en.test.com, https://www.de.test.com

Additionally it should be possible to place the wildcard anywhere. The following also needs to work:

'host' => 'https://www.test.com/{any}'

Would match: https://www.test.com/siteOne, https://www.test.com/siteTwo, etc.
