
krukov / cashews


Cache with async power

License: MIT License

Python 99.92% Makefile 0.08%
python cache cache-control asycnio asynchronous aiohttp caching fastapi redis asyncio

cashews's Introduction

🥔 CASHEWS 🥔

Async cache framework with simple API to build fast and reliable applications

pip install cashews
pip install cashews[redis]
pip install cashews[diskcache]
pip install cashews[dill] # can cache in redis more types of objects
pip install cashews[speedup] # for bloom filters

Why

Cache plays a significant role in modern applications and everybody wants to use all the power of async programming and cache. There are a few advanced techniques with cache and async programming that can help you build simple, fast, scalable and reliable applications. This library intends to make it easy to implement such techniques.

Features

  • Easy to configure and use
  • Decorator-based API, decorate and play
  • Different cache strategies out-of-the-box
  • Support for multiple storage backends (In-memory, Redis, DiskCache)
  • Set TTL as a string ("2h5m"), as timedelta or use a function in case TTL depends on key parameters
  • Transactionality
  • Middlewares
  • Client-side cache (10x faster than simple cache with redis)
  • Bloom filters
  • Different cache invalidation techniques (time-based or tags)
  • Cache any objects securely with pickle (use hash key)
  • 2x faster than aiocache (with client side caching)

Usage Example

from cashews import cache

cache.setup("mem://")  # configure as in-memory cache, but redis/diskcache is also supported

# use a decorator-based API
@cache(ttl="3h", key="user:{request.user.uid}")
async def long_running_function(request):
    ...

# or for fine-grained control, use it directly in a function
async def cache_using_function(request):
    await cache.set(key=request.user.uid, value=request.user, expire="20h")
    ...


Configuration

cashews provides a default cache that you can set up in two different ways:

from cashews import cache

# via url
cache.setup("redis://0.0.0.0/?db=1&socket_connect_timeout=0.5&suppress=0&hash_key=my_secret&enable=1")
# or via kwargs
cache.setup("redis://0.0.0.0/", db=1, wait_for_connection_timeout=0.5, suppress=False, hash_key=b"my_key", enable=True)

Alternatively, you can create a cache instance yourself:

from cashews import Cache

cache = Cache()
cache.setup(...)

Optionally, you can disable the cache with the disable/enable parameters (see Disable Cache):

cache.setup("redis://redis/0?enable=1")
cache.setup("mem://?size=500", disable=True)
cache.setup("mem://?size=500", enable=False)

You can set up different backends based on a key prefix:

cache.setup("redis://redis/0")
cache.setup("mem://?size=500", prefix="user")

await cache.get("accounts")  # will use the redis backend
await cache.get("user:1")  # will use the memory backend
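The prefix routing above can be pictured with a small standalone sketch. It is not cashews' code: the PrefixRouter class is hypothetical, backends are stand-in strings, and the "longest matching prefix wins" rule is an assumption made for the illustration.

```python
class PrefixRouter:
    """Pick a backend by key prefix (hypothetical helper, not part of cashews)."""

    def __init__(self):
        self._backends = {}  # prefix -> backend object

    def setup(self, backend, prefix=""):
        self._backends[prefix] = backend

    def backend_for(self, key):
        # Assumption: the longest matching prefix wins; "" matches every key.
        matches = [p for p in self._backends if key.startswith(p)]
        if not matches:
            raise KeyError(f"no backend configured for {key!r}")
        return self._backends[max(matches, key=len)]


router = PrefixRouter()
router.setup("redis-backend")                  # default backend
router.setup("memory-backend", prefix="user")  # only for keys starting with "user"
```

With this setup, `router.backend_for("accounts")` falls through to the default backend, while `router.backend_for("user:1")` is routed to the memory backend.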

Available Backends

In-memory

The in-memory cache uses a fixed-size LRU dict to store values. It checks expiration on get and periodically purges expired keys.

cache.setup("mem://")
cache.setup("mem://?check_interval=10&size=10000")
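To make the description concrete, here is a minimal sketch of a fixed-size LRU dict with lazy expiration on read and a purge step that a periodic task could call. The class name, the injectable clock and the method names are assumptions for the example; cashews' own in-memory backend may be organised differently.

```python
import time
from collections import OrderedDict


class LRUCacheWithTTL:
    """Illustrative fixed-size LRU store with per-key expiration."""

    def __init__(self, size=10000, clock=time.monotonic):
        self._size = size
        self._clock = clock
        self._data = OrderedDict()  # key -> (value, expires_at or None)

    def set(self, key, value, expire=None):
        expires_at = None if expire is None else self._clock() + expire
        self._data[key] = (value, expires_at)
        self._data.move_to_end(key)
        while len(self._data) > self._size:
            self._data.popitem(last=False)  # evict the least recently used key

    def get(self, key, default=None):
        item = self._data.get(key)
        if item is None:
            return default
        value, expires_at = item
        if expires_at is not None and expires_at <= self._clock():
            del self._data[key]  # expiration is checked on get
            return default
        self._data.move_to_end(key)  # mark as recently used
        return value

    def purge_expired(self):
        """Remove expired keys; a background task would call this every check_interval."""
        now = self._clock()
        expired = [k for k, (_, exp) in self._data.items() if exp is not None and exp <= now]
        for key in expired:
            del self._data[key]
        return len(expired)
```

The clock is injectable only so that the behaviour can be checked without sleeping.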

Redis

Requires the redis package.

This will use Redis as storage.

This backend uses the pickle module to serialize values, but cashews can store values with a sha1-keyed hash.

Use secret and digestmod parameters to protect your application from security vulnerabilities.

The digestmod is the hashing algorithm to use: sum, md5 (default), sha1 or sha256.

The secret is a salt for a hash.
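The idea of protecting pickled values with a keyed hash can be sketched with the standard library. This is an illustration only: the function names and the "signature_payload" byte layout are assumptions, not the format cashews writes to Redis.

```python
import hmac
import pickle


def dumps_signed(value, secret: bytes, digestmod: str = "sha256") -> bytes:
    """Pickle a value and prepend an HMAC of the payload (illustrative layout)."""
    payload = pickle.dumps(value)
    signature = hmac.new(secret, payload, digestmod).hexdigest().encode()
    return signature + b"_" + payload


def loads_signed(data: bytes, secret: bytes, digestmod: str = "sha256"):
    """Verify the HMAC before unpickling; refuse data that was not signed with our secret."""
    signature, _, payload = data.partition(b"_")
    expected = hmac.new(secret, payload, digestmod).hexdigest().encode()
    if not hmac.compare_digest(signature, expected):
        raise ValueError("signature mismatch: refusing to unpickle")
    return pickle.loads(payload)


blob = dumps_signed({"user": 1}, secret=b"my_secret")
restored = loads_signed(blob, secret=b"my_secret")
```

The point of the check is that pickle.loads is never called on bytes that somebody else could have written into the shared storage.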

Pickle can't serialize every type of object. If you need to store more complex types, you can use dill: set pickle_type="dill". Dill is great, but less performant. If you need a complex serializer for sqlalchemy objects, you can set pickle_type="sqlalchemy".

Any connection errors are suppressed. To disable this, use suppress=False: a CacheBackendInteractionError will be raised instead.

If you would like to use the client-side cache, set client_side=True.

The client-side cache adds a cashews: prefix to each key. To customize it, use the client_side_prefix option.

cache.setup("redis://0.0.0.0/?db=1&minsize=10&suppress=false&hash_key=my_secret", prefix="func")
cache.setup("redis://0.0.0.0/2", password="my_pass", socket_connect_timeout=0.1, retry_on_timeout=True, hash_key="my_secret")
cache.setup("redis://0.0.0.0", client_side=True, client_side_prefix="my_prefix:", pickle_type="dill")

To use a secure connection to redis (over ssl), the uri should have rediss as its scheme:

cache.setup("rediss://0.0.0.0/", ssl_ca_certs="path/to/ca.crt", ssl_keyfile="path/to/client.key", ssl_certfile="path/to/client.crt")

DiskCache

Requires diskcache package.

This will use local sqlite databases (with shards) as storage.

It is a good choice if you don't want to use redis, but you need a shared storage, or your cache takes a lot of local memory. Also, it is a good choice for client side local storage.

You can set up the disk cache with FanoutCache parameters.

** Warning ** cache.scan and cache.get_match do not work with this storage (they work only if shards are disabled)

cache.setup("disk://")
cache.setup("disk://?directory=/tmp/cache&timeout=1&shards=0")  # disable shards
Gb = 1073741824
cache.setup("disk://", size_limit=3 * Gb, shards=12)

Basic API

There are a few basic methods to work with cache:

from cashews import cache

cache.setup("mem://")  # configure as in-memory cache

await cache.set(key="key", value=90, expire=60, exist=None)  # -> bool
await cache.set_raw(key="key", value="str")  # -> bool

await cache.get("key", default=None)  # -> Any
await cache.get_raw("key")
await cache.get_many("key1", "key2", default=None)
async for key, value in cache.get_match("pattern:*", batch_size=100):
    ...

await cache.incr("key") # -> int

await cache.delete("key")
await cache.delete_many("key1", "key2")
await cache.delete_match("pattern:*")

async for key in cache.scan("pattern:*"):
    ...

await cache.expire("key", timeout=10)
await cache.get_expire("key")  # -> int seconds to expire

await cache.ping(message=None)  # -> bytes
await cache.clear()

await cache.is_locked("key", wait=60)  # -> bool
async with cache.lock("key", expire=10):
    ...
await cache.set_lock("key", value="value", expire=60)  # -> bool
await cache.unlock("key", "value")  # -> bool

await cache.get_keys_count()  # -> int - total number of keys in cache
await cache.close()

Disable Cache

The cache can be disabled not only at setup, but also at runtime. Cashews allows you to disable/enable any cache call or specific commands:

from cashews import cache, Command

cache.setup("mem://")  # configure as in-memory cache

cache.disable(Command.DELETE)
cache.disable()
cache.enable(Command.GET, Command.SET)
cache.enable()

with cache.disabling():
  ...

Strategies

Simple cache

This is a typical cache strategy: execute, store and return from cache until it expires.

from datetime import timedelta
from cashews import cache

cache.setup("mem://")

@cache(ttl=timedelta(hours=3), key="user:{request.user.uid}")
async def long_running_function(request):
    ...

Fail cache (Failover cache)

Return cache result, if one of the given exceptions is raised (at least one function call should succeed prior to that).

from cashews import cache

cache.setup("mem://")

# note: the key will be "__module__.get_status:name:{name}"
@cache.failover(ttl="2h", exceptions=(ValueError, MyException))
async def get_status(name):
    value = await api_call()
    return {"status": value}

If exceptions is not passed, all exceptions are caught, or the default set is used if it was configured with:

cache.set_default_fail_exceptions(ValueError, MyException)
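The failover idea can be reduced to a few lines of plain asyncio. The sketch below is not the cashews decorator: it ignores ttl, keeps results in a local dict keyed by positional arguments, and the names failover, last_good and demo are chosen for the example.

```python
import asyncio
import functools


def failover(exceptions=(Exception,)):
    """Return the last successful result when the wrapped coroutine raises."""
    def decorator(func):
        last_good = {}  # positional args -> last successful result

        @functools.wraps(func)
        async def wrapper(*args):
            try:
                result = await func(*args)
            except exceptions:
                if args in last_good:
                    return last_good[args]
                raise  # nothing stored yet: at least one call must succeed first
            last_good[args] = result
            return result

        return wrapper
    return decorator


calls = {"n": 0}


@failover(exceptions=(ValueError,))
async def get_status(name):
    calls["n"] += 1
    if calls["n"] > 1:
        raise ValueError("upstream is down")  # every call after the first fails
    return {"status": name}


async def demo():
    first = await get_status("ok")   # computed
    second = await get_status("ok")  # fails inside, served from the stored result
    return first, second
```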

Hit cache

Expire the cache after a given number of hits (cache_hits).

from cashews import cache

cache.setup("mem://")

@cache.hit(ttl="2h", cache_hits=100, update_after=2)
async def get(name):
    value = await api_call()
    return {"status": value}

Early

A cache strategy that tries to solve the cache stampede problem by recalculating the result of a hot cache in the background.

from cashews import cache  # or: from cashews import early

# if you call this function after 7 min, the cache will be updated in the background
@cache.early(ttl="10m", early_ttl="7m")
async def get(name):
    value = await api_call()
    return {"status": value}

Soft

Like a simple cache, but with failure protection based on a soft ttl.

from cashews import cache

cache.setup("mem://")

# if you call this function after 7 min, the cache will be updated and a new result returned.
# If recalculation fails, the current cached value is returned (if it is not more than 10 min old)
@cache.soft(ttl="10m", soft_ttl="7m")
async def get(name):
    value = await api_call()
    return {"status": value}
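A rough sketch of the soft-ttl behaviour, written with plain asyncio and an injectable clock so it can be checked without waiting. It only approximates what the decorator above is described to do: ttl and soft_ttl are plain seconds here, and the store is a local dict.

```python
import asyncio
import functools
import time


def soft(ttl, soft_ttl, clock=time.monotonic):
    """Serve fresh values from cache; after soft_ttl recompute, falling back to the old value on failure."""
    def decorator(func):
        store = {}  # positional args -> (value, stored_at)

        @functools.wraps(func)
        async def wrapper(*args):
            now = clock()
            cached = store.get(args)
            if cached is not None:
                value, stored_at = cached
                age = now - stored_at
                if age < soft_ttl:
                    return value  # still fresh
                if age < ttl:
                    try:
                        value = await func(*args)
                    except Exception:
                        return cached[0]  # recalculation failed: keep serving the old value
                    store[args] = (value, now)
                    return value
            value = await func(*args)  # no value, or older than the hard ttl
            store[args] = (value, now)
            return value

        return wrapper
    return decorator


now = [0.0]
state = {"calls": 0, "fail": False}


@soft(ttl=600, soft_ttl=420, clock=lambda: now[0])
async def get(name):
    state["calls"] += 1
    if state["fail"]:
        raise RuntimeError("backend unavailable")
    return state["calls"]


async def demo():
    results = [await get("a")]      # computed
    now[0] = 100
    results.append(await get("a"))  # fresh, served from cache
    now[0] = 450
    state["fail"] = True
    results.append(await get("a"))  # soft ttl passed, recalculation fails, old value returned
    state["fail"] = False
    results.append(await get("a"))  # recalculation succeeds
    return results
```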

Iterators

All of the decorators above can be used only with coroutines. Caching async iterators works differently. To cache async iterators, use the iterator decorator:

from cashews import cache

cache.setup("mem://")


@cache.iterator(ttl="10m", key="get:{name}")
async def get(name):
    async for item in get_pages(name):
        yield ...

Locked

A decorator that can help you solve the cache stampede problem. It locks subsequent function calls until the first one is finished. This guarantees exactly one function call for the given ttl.

⚠️ Warning: this decorator will not cache the result. To do that, you can combine this decorator with any cache decorator or use the parameter lock=True with @cache()

from cashews import cache

cache.setup("mem://")

@cache.locked(ttl="10s")
async def get(name):
    value = await api_call()
    return {"status": value}
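The effect of locking on concurrent calls can be seen in a small in-process sketch. It uses one asyncio.Lock per argument tuple and a plain dict as the cache; a shared backend such as Redis would need a distributed lock instead, and none of the names below come from cashews.

```python
import asyncio
import functools
from collections import defaultdict


def locked(func):
    """Serialize concurrent calls that share the same positional arguments."""
    locks = defaultdict(asyncio.Lock)  # one lock per argument tuple

    @functools.wraps(func)
    async def wrapper(*args):
        async with locks[args]:
            return await func(*args)

    return wrapper


results_cache = {}
counter = {"calls": 0}


@locked
async def get(name):
    if name in results_cache:  # checked while holding the lock
        return results_cache[name]
    counter["calls"] += 1
    await asyncio.sleep(0.01)  # stand-in for a slow API call
    results_cache[name] = {"status": name}
    return results_cache[name]


async def demo():
    return await asyncio.gather(*[get("a") for _ in range(5)])
```

Five concurrent calls for the same name result in a single execution of the slow part, because the other four wait for the lock and then find the stored value.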

Rate limit

Rate limit for a function call: if the rate limit is reached, a RateLimitError exception is raised.

⚠️ Warning: this decorator will not cache the result. To do that, you can combine this decorator with the failover cache decorator.

from cashews import cache, RateLimitError

cache.setup("mem://")

# no more than 10 calls per minute or ban for 10 minutes - raise RateLimitError
@cache.rate_limit(limit=10, period="1m", ttl="10m")
async def get(name):
    value = await api_call()
    return {"status": value}



# no more than 100 calls in a 10-minute window. If the rate limit is reached -> return from cache
@cache.failover(ttl="10m", exceptions=(RateLimitError, ))
@cache.slice_rate_limit(limit=100, period="10m")
async def get_next(name):
    value = await api_call()
    return {"status": value}
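The "N calls per period, then ban for ttl" rule can be sketched as a fixed-window counter. This is a simplified illustration with hypothetical names (FixedWindowLimiter, RateLimitExceeded); it says nothing about how cashews stores its counters.

```python
import time


class RateLimitExceeded(Exception):
    """Raised when a key has used up its calls (stand-in for RateLimitError)."""


class FixedWindowLimiter:
    def __init__(self, limit, period, ttl, clock=time.monotonic):
        self._limit, self._period, self._ttl = limit, period, ttl
        self._clock = clock
        self._windows = {}       # key -> (window_start, count)
        self._banned_until = {}  # key -> timestamp

    def hit(self, key):
        now = self._clock()
        if self._banned_until.get(key, float("-inf")) > now:
            raise RateLimitExceeded(key)
        start, count = self._windows.get(key, (now, 0))
        if now - start >= self._period:
            start, count = now, 0  # a new window begins
        count += 1
        self._windows[key] = (start, count)
        if count > self._limit:
            self._banned_until[key] = now + self._ttl  # ban for ttl seconds
            raise RateLimitExceeded(key)


def try_hit(limiter, key):
    """Small helper for the example: True if the call is allowed."""
    try:
        limiter.hit(key)
        return True
    except RateLimitExceeded:
        return False
```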

Circuit breaker

Circuit breaker pattern. Counts the number of failed calls, and if the error rate reaches the specified value, raises a CircuitBreakerOpen exception.

⚠️ Warning: this decorator will not cache the result. To do that, you can combine this decorator with the failover cache decorator.

from cashews import cache, CircuitBreakerOpen

cache.setup("mem://")

@cache.circuit_breaker(errors_rate=10, period="1m", ttl="5m")
async def get(name):
    ...


@cache.failover(ttl="10m", exceptions=(CircuitBreakerOpen, ))
@cache.circuit_breaker(errors_rate=10, period="10m", ttl="5m", half_open_ttl="1m")
async def get_next(name):
    ...
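A minimal sketch of the pattern, assuming errors_rate is a percentage of failed calls within the current period and ignoring the half-open state. The CircuitBreaker and CircuitOpen names are local to the example and are not the library's classes.

```python
import asyncio
import time


class CircuitOpen(Exception):
    """Raised while the breaker is open (stand-in for CircuitBreakerOpen)."""


class CircuitBreaker:
    def __init__(self, errors_rate, period, ttl, clock=time.monotonic):
        self._errors_rate, self._period, self._ttl = errors_rate, period, ttl
        self._clock = clock
        self._window_start = clock()
        self._total = 0
        self._failed = 0
        self._open_until = float("-inf")

    async def call(self, func, *args):
        now = self._clock()
        if now < self._open_until:
            raise CircuitOpen()
        if now - self._window_start >= self._period:
            self._window_start, self._total, self._failed = now, 0, 0
        self._total += 1
        try:
            return await func(*args)
        except Exception:
            self._failed += 1
            if 100 * self._failed / self._total >= self._errors_rate:
                self._open_until = now + self._ttl  # stop calling for ttl seconds
            raise


async def demo():
    now = [0.0]
    breaker = CircuitBreaker(errors_rate=50, period=60, ttl=300, clock=lambda: now[0])

    async def ok():
        return "ok"

    async def boom():
        raise RuntimeError("boom")

    outcomes = []
    for func in (ok, ok, boom, boom, ok):
        try:
            outcomes.append(await breaker.call(func))
        except CircuitOpen:
            outcomes.append("open")
        except RuntimeError:
            outcomes.append("error")
    now[0] = 301  # the open period has passed
    outcomes.append(await breaker.call(ok))
    return outcomes
```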

Bloom filter (experimental)

Simple Bloom filter:

from cashews import cache

cache.setup("mem://")

@cache.bloom(capacity=10_000, false_positives=1)
async def email_exists(email: str) -> bool:
    ...

for email in all_users_emails:
    await email_exists.set(email)

await email_exists("[email protected]")
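For readers unfamiliar with the data structure, here is a small standalone Bloom filter. It assumes false_positives is a percentage, as the value 1 in the decorator call suggests, and sizes the bit array with the textbook formulas; it is a sketch of the technique, not the implementation behind @cache.bloom.

```python
import hashlib
import math


class BloomFilter:
    def __init__(self, capacity, false_positives=1.0):
        p = false_positives / 100  # assumption: the parameter is a percentage
        self.size = max(1, math.ceil(-capacity * math.log(p) / math.log(2) ** 2))
        self.hashes = max(1, round(self.size / capacity * math.log(2)))
        self._bits = bytearray((self.size + 7) // 8)

    def _positions(self, item):
        # Double hashing: derive all positions from two 64-bit halves of one digest.
        digest = hashlib.sha256(item.encode()).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.size for i in range(self.hashes)]

    def add(self, item):
        for pos in self._positions(item):
            self._bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(self._bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item))


known = BloomFilter(capacity=1000, false_positives=1)
for i in range(1000):
    known.add(f"user{i}@example.com")
```

A Bloom filter never reports a stored item as missing, but it may occasionally report a missing item as present, which is why the decorated function is still needed to confirm positive answers.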

Cache condition

By default, any successful result of the function call is stored, even if it is None. Caching decorators have a condition parameter, which can be:

  • a callable object that receives the result of a function call or an exception, args, kwargs and a cache key
  • a string: "not_none" or "skip_none", to skip caching None values
from cashews import cache, NOT_NONE

cache.setup("mem://")

@cache(ttl="1h", condition=NOT_NONE)
async def get():
    ...


def skip_test_result(result, args, kwargs, key=None) -> bool:
    return result and result != "test"

@cache(ttl="1h", condition=skip_test_result)
async def get():
    ...

It is also possible to cache an exception that the function can raise. To do so, use the special conditions (only for simple, hit and early):

from cashews import cache, with_exceptions, only_exceptions

cache.setup("mem://")

@cache(ttl="1h", condition=with_exceptions(MyException, TimeoutError))
async def get():
    ...


@cache(ttl="1h", condition=only_exceptions(MyException, TimeoutError))
async def get():
    ...

Caching decorators also have a time_condition parameter: the minimum latency of the function call (in seconds, or in any format accepted for ttl) for the result to be cached.

from cashews import cache

cache.setup("mem://")

@cache(ttl="1h", time_condition="3s")  # to cache for 1 hour if execution takes more than 3 seconds
async def get():
    ...

Template Keys

Often, to compose a cache key, you need all the parameters of the function call. By default, Cashews will generate a key using the function name, module names and parameters:

from datetime import timedelta
from cashews import cache

cache.setup("mem://")

@cache(ttl=timedelta(hours=3))
async def get_name(user, *args, version="v1", **kwargs):
    ...

# a key template will be "__module__.get_name:user:{user}:{__args__}:version:{version}:{__kwargs__}"

await get_name("me", version="v2")
# a key will be "__module__.get_name:user:me::version:v2"
await get_name("me", version="v1", foo="bar")
# a key will be "__module__.get_name:user:me::version:v1:foo:bar"
await get_name("me", "opt", "attr", opt="opt", attr="attr")
# a key will be "__module__.get_name:user:me:opt:attr:version:v1:attr:attr:opt:opt"

For more advanced usage it is better to define a cache key manually:

from cashews import cache

cache.setup("mem://")

@cache(ttl="2h", key="user_info:{user_id}")
async def get_info(user_id: str):
    ...

You may use objects in a key and access their attributes through a template:

@cache(ttl="2h", key="user_info:{user.uuid}")
async def get_info(user: User):
    ...

You may use built-in functions to format template values (lower, upper, len, jwt, hash)

@cache(ttl="2h", key="user_info:{user.name:lower}:{password:hash(sha1)}")
async def get_info(user: User, password: str):
    ...


@cache(ttl="2h", key="user:{token:jwt(client_id)}")
async def get_user_by_token(token: str) -> User:
    ...

Or define your own transformation functions:

from cashews import default_formatter, cache

cache.setup("mem://")

@default_formatter.register("prefix")
def _prefix(value, chars=3):
    return value[:chars].upper()


@cache(ttl="2h", key="servers-user:{user.index:prefix(4)}")  # a key will be "servers-user:DWQS"
async def get_user_servers(user):
    ...

or register type formatters:

from decimal import Decimal
from cashews import default_formatter, cache

@default_formatter.type_format(Decimal)
def _decimal(value: Decimal) -> str:
    return str(value.quantize(Decimal("0.00")))


@cache(ttl="2h", key="price-{item.price}:{item.currency:upper}")  # a key will be "price-10.00:USD"
async def convert_price(item):
    ...
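How a template such as "{user.name:lower}" or "{password:hash(sha1)}" can be resolved is easy to prototype with string.Formatter from the standard library. The KeyFormatter class below is a hypothetical stand-in for the idea, not cashews' default_formatter, and the registered functions are simplified.

```python
import hashlib
from string import Formatter
from types import SimpleNamespace


class KeyFormatter(Formatter):
    """Resolve "{field:func(arg)}" placeholders with registered functions."""

    def __init__(self):
        super().__init__()
        self._functions = {}

    def register(self, name):
        def decorator(func):
            self._functions[name] = func
            return func
        return decorator

    def format_field(self, value, format_spec):
        if not format_spec:
            return str(value)
        name, _, raw_args = format_spec.partition("(")
        args = [a for a in raw_args.rstrip(")").split(",") if a]
        if name not in self._functions:
            return super().format_field(value, format_spec)  # ordinary format spec
        return self._functions[name](value, *args)


formatter = KeyFormatter()


@formatter.register("lower")
def _lower(value):
    return str(value).lower()


@formatter.register("hash")
def _hash(value, algorithm="md5"):
    return hashlib.new(algorithm, str(value).encode()).hexdigest()


@formatter.register("prefix")
def _prefix(value, chars=3):
    return str(value)[: int(chars)].upper()  # arguments arrive as strings


user = SimpleNamespace(name="Bob", index="dwqsxx")
key = formatter.format("user_info:{user.name:lower}:{password:hash(sha1)}", user=user, password="x")
servers_key = formatter.format("servers-user:{user.index:prefix(4)}", user=user)
```

Attribute access ("user.name") is handled by string.Formatter itself; only the part after the colon is interpreted by the registered functions.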

Function arguments are not the only values that can take part in key formation. Cashews has a template context, and you may use any variable registered in it:

from cashews import cache, key_context, register_key_context

cache.setup("mem://")
register_key_context("client_id")


@cache(ttl="2h", key="user:{client_id}")
async def get_current_user():
  pass

...
with key_context(client_id=135356):
    await get_current_user()

Template for a class method

from cashews import cache

cache.setup("mem://")

class MyClass:

    @cache(ttl="2h")
    async def get_name(self, user, version="v1"):
         ...

# a key template will be "__module__:MyClass.get_name:self:{self}:user:{user}:version:{version}"

await MyClass().get_name("me", version="v2")
# a key will be "__module__:MyClass.get_name:self:<__module__.MyClass object at 0x105edd6a0>:user:me:version:v2"

As you can see, there is an ugly reference to the instance in the key. That is not what we expect to see. That cache will not work properly. There are 3 solutions to avoid it:

  1. define __str__ magic method in our class
class MyClass:

    def __init__(self, host):
        self._host = host

    @cache(ttl="2h")
    async def get_name(self, user, version="v1"):
         ...

    def __str__(self) -> str:
        return self._host

await MyClass(host="http://example.com").get_name("me", version="v2")
# a key will be "__module__:MyClass.get_name:self:http://example.com:user:me:version:v2"
  2. Set a key template
class MyClass:

    def __init__(self, host):
        self._host = host

    @cache(ttl="2h", key="{self._host}:name:{user}:{version}")
    async def get_name(self, user, version="v1"):
         ...

await MyClass(host="http://example.com").get_name("me", version="v2")
# a key will be "http://example.com:name:me:v2"
  3. Use noself or noself_cache if you want to exclude self from a key
from cashews import cache, noself, noself_cache

cache.setup("mem://")

class MyClass:

    @noself(cache)(ttl="2h")
    async def get_name(self, user, version="v1"):
         ...

    @noself_cache(ttl="2h")  # for python <= 3.8
    async def get_name(self, user, version="v1"):
         ...
# a key template will be "__module__:MyClass.get_name:user:{user}:version:{version}"

await MyClass().get_name("me", version="v2")
# a key will be "__module__:MyClass.get_name:user:me:version:v2"

TTL

Cache time to live (ttl) is a required parameter for all cache decorators. TTL can be:

  • an integer as the number of seconds
  • a timedelta
  • a string like in golang e.g 1d2h3m50s
  • a callable object, such as a function, that receives the args and kwargs of the decorated function and returns a TTL in one of the previous formats

Examples:

from cashews import cache
from datetime import timedelta

cache.setup("mem://")

@cache(ttl=60 * 10)
async def get(item_id: int) -> Item:
    pass

@cache(ttl=timedelta(minutes=10))
async def get(item_id: int) -> Item:
    pass

@cache(ttl="10m")
async def get(item_id: int) -> Item:
    pass

def _ttl(item_id: int) -> str:
    return "2h" if item_id > 10 else "1h"

@cache(ttl=_ttl)
async def get(item_id: int) -> Item:
    pass

What can be cached

Cashews mostly uses the built-in pickle module to store data, but also supports other pickle-like serializers such as dill. Some types of objects are not picklable; for those cases, cashews has an API to define custom encoding/decoding:

from cashews.serialize import register_type


async def my_encoder(value: CustomType, *args, **kwargs) -> bytes:
    ...


async def my_decoder(value: bytes, *args, **kwargs) -> CustomType:
    ...


register_type(CustomType, my_encoder, my_decoder)

Cache invalidation

Cache invalidation is one of the well-known hard problems in computer science.

Sometimes, you want to invalidate the cache after some action is triggered. Consider this example:

from cashews import cache

cache.setup("mem://")

@cache(ttl="1h", key="items:page:{page}")
async def items(page=1):
    ...

@cache.invalidate("items:page:*")
async def create_item(item):
   ...

Here, the cache for items will be invalidated every time create_item is called. There are two problems:

  1. with the redis backend, cashews will scan the full database to find keys that match the pattern (items:page:*), which is bad for performance
  2. what if we do not specify a key for the cache:
@cache(ttl="1h")
async def items(page=1):
    ...

Cashews provides a tag system: you can tag cache keys, and they will be stored in a separate SET to avoid high load on the redis storage. To use tags more efficiently, combine them with the client-side feature.

from cashews import cache

cache.setup("redis://", client_side=True)

@cache(ttl="1h", tags=["items", "page:{page}"])
async def items(page=1):
    ...


await cache.delete_tags("page:1")
await cache.delete_tags("items")

# low level api
cache.register_tag("my_tag", key_template="key{i}")

await cache.set("key1", "value", expire="1d", tags=["my_tag"])
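The reason tags avoid a full scan is that the mapping from tag to keys is kept explicitly. A dictionary-based sketch of that bookkeeping follows; TaggedCache is a hypothetical class for illustration and does not reflect how cashews lays the data out in Redis.

```python
from collections import defaultdict


class TaggedCache:
    """In-memory store that remembers which keys belong to which tag."""

    def __init__(self):
        self._data = {}
        self._tags = defaultdict(set)  # tag -> set of keys

    def set(self, key, value, tags=()):
        self._data[key] = value
        for tag in tags:
            self._tags[tag].add(key)

    def get(self, key, default=None):
        return self._data.get(key, default)

    def delete_tags(self, *tags):
        # Only the keys recorded under the tag are touched; no pattern scan is needed.
        for tag in tags:
            for key in self._tags.pop(tag, set()):
                self._data.pop(key, None)


store = TaggedCache()
store.set("items:page:1", ["a"], tags=["items", "page:1"])
store.set("items:page:2", ["b"], tags=["items", "page:2"])
store.set("settings", {"theme": "dark"})
```

Calling `store.delete_tags("page:1")` removes only the first page, while `store.delete_tags("items")` removes every tagged page and leaves untagged keys alone.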

You can also invalidate the cache for subsequent calls with a context manager:

from typing import List

from cashews import cache, invalidate_further

@cache(ttl="3h")
async def items():
    ...

async def add_item(item: Item) -> List[Item]:
    ...
    with invalidate_further():
        await items()

Cache invalidation on code change

Often, you may face a problem with an invalid cache after the code is changed. For example:

@cache(ttl=timedelta(days=1), key="user:{user_id}")
async def get_user(user_id):
    return {"name": "Dmitry", "surname": "Krykov"}

Then, the returned value was changed to:

-    return {"name": "Dmitry", "surname": "Krykov"}
+    return {"full_name": "Dmitry Krykov"}

Since the function returns a dict, there is no simple way to automatically detect that kind of cache invalidity.

One way to solve the problem is to add a prefix for this cache:

@cache(ttl=timedelta(days=1), prefix="v2")
async def get_user(user_id):
    return {"full_name": "Dmitry Krykov"}

but it is so easy to forget to do it...

The best defense against this problem is to use your own data containers, like dataclasses, with a defined __repr__ method. This makes the structures distinguishable, and cashews can detect changes in them automatically by checking the object representation.

from dataclasses import dataclass

from cashews import cache

cache.setup("mem://")

@dataclass
class User:
    name: str
    surname: str

# or define your own class with __repr__ method

class User:

    def __init__(self, name, surname):
        self.name, self.surname = name, surname

    def __repr__(self):
        return f"{self.name} {self.surname}"

# Will detect changes of a structure
@cache(ttl="1d", prefix="v2")
async def get_user(user_id):
    return User("Dima", "Krykov")

Detect the source of a result

Decorators give us a very simple API but also make it difficult to understand where the result is coming from - cache or direct call.

To solve this problem, cashews has the detect context manager:

from cashews import cache

with cache.detect as detector:
    response = await something_that_use_cache()
    calls = detector.calls

print(calls)
# >>> {"my:key": [{"ttl": 10, "name": "simple", "backend": "redis"}], "fail:key": [{"ttl": 10, "exc": RateLimit, "name": "fail", "backend": "mem"}]}

For example, a simple middleware that uses it in a web app:

@app.middleware("http")
async def add_from_cache_headers(request: Request, call_next):
    with cache.detect as detector:
        response = await call_next(request)
        if detector.calls:
            key = list(detector.calls.keys())[0]
            response.headers["X-From-Cache"] = key
            expire = await cache.get_expire(key)
            response.headers["X-From-Cache-Expire-In-Seconds"] = str(expire)
    return response

Middleware

Cashews provides an interface for the "middleware" pattern:

import logging
from cashews import cache, Command

logger = logging.getLogger(__name__)


async def logging_middleware(call, cmd: Command, backend: "Backend", *args, **kwargs):
    key = args[0] if args else kwargs.get("key", kwargs.get("pattern", ""))
    # "args" is a reserved LogRecord attribute, so a different name is used in `extra`
    logger.info("=> Cache request: %s ", cmd.value, extra={"cache_args": args, "cache_key": key})
    return await call(*args, **kwargs)


cache.setup("mem://", middlewares=(logging_middleware, ))

Callbacks

One of the middlewares preinstalled in a cache instance is CallbackMiddleware. This middleware also adds a new interface to the cache that allows you to register a function that will be called before the given command is triggered:

from cashews import cache, Command


def callback(key, result):
  print(f"GET key={key}")

with cache.callback(callback, cmd=Command.GET):
    await cache.get("test")  # also will print "GET key=test"

Transactional

Applications are often built on a database with transactions (OLTP). Caches usually support transactions poorly. Here is a simple example of how we can make our cache inconsistent:

async def my_handler():
    async with db.transaction():
        await db.insert(user)
        await cache.set(f"key:{user.id}", user)
        await api.service.register(user)

Here the API call may fail and the database transaction will roll back, but the cache will not. Of course, in this code we can solve it by moving the cache call outside the transaction, but in real code it may not be so easy. Another case: we want to make bulk operations on a group of keys and keep them consistent:

async def login(user, token, session):
    ...
    old_session = await cache.get(f"current_session:{user.id}")
    await cache.incr(f"sessions_count:{user.id}")
    await cache.set(f"current_session:{user.id}", session)
    await cache.set(f"token:{token.id}", user)
    return old_session

Here we want some way to protect our code from race conditions and to apply the cache operations together.

Cashews supports transaction operations:

⚠️ Warning: the transactional operations are set, set_many, delete, delete_many, delete_match and incr

from cashews import cache
...

@cache.transaction()
async def my_handler():
    async with db.transaction():
        await db.insert(user)
        await cache.set(f"key:{user.id}", user)
        await api.service.register(user)

# or
async def login(user, token, session):
    async with cache.transaction() as tx:
        old_session = await cache.get(f"current_session:{user.id}")
        await cache.incr(f"sessions_count:{user.id}")
        await cache.set(f"current_session:{user.id}", session)
        await cache.set(f"token:{token.id}", user)
        if ...:
            tx.rollback()
    return old_session

Transactions in cashews support different modes of "isolation":

  • fast (0-7% overhead) - memory based; can't protect from race conditions, but may be used for atomicity
  • locked (default, 4-9% overhead) - uses a kind of shared lock per cache key (in case of the redis or disk backend); protects from race conditions
  • serializable (7-50% overhead) - uses a global shared lock, so only one transaction runs at a time (almost useless)
from cashews import cache, TransactionMode
...

@cache.transaction(TransactionMode.SERIALIZABLE, timeout=1)
async def my_handler():
   ...
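The memory-based idea behind a cache transaction can be sketched as a write buffer that is applied on commit and dropped on rollback. This is only an illustration of the concept with a plain dict as the store; the Transaction class below is hypothetical and offers no locking, so, like the fast mode described above, it gives atomicity but no protection from races.

```python
class Transaction:
    """Buffer writes and deletes; apply them to the store only on commit."""

    _DELETED = object()  # marker for keys deleted inside the transaction

    def __init__(self, store):
        self._store = store
        self._pending = {}

    def set(self, key, value):
        self._pending[key] = value

    def delete(self, key):
        self._pending[key] = self._DELETED

    def get(self, key, default=None):
        # Reads see the transaction's own uncommitted writes first.
        if key in self._pending:
            value = self._pending[key]
            return default if value is self._DELETED else value
        return self._store.get(key, default)

    def commit(self):
        for key, value in self._pending.items():
            if value is self._DELETED:
                self._store.pop(key, None)
            else:
                self._store[key] = value
        self._pending.clear()

    def rollback(self):
        self._pending.clear()

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        if exc_type is None:
            self.commit()
        else:
            self.rollback()  # an exception inside the block discards all writes
        return False
```

Used as `with Transaction(store) as tx:`, the store is left untouched if the block raises, which mirrors the my_handler example where a failed API call should not leave a stale key behind.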

Contrib

This library is framework agnostic, but includes several "batteries" for most popular tools.

Fastapi

You may find a few middlewares useful for controlling the cache in your web application based on fastapi:

  1. CacheEtagMiddleware - adds an Etag and checks the 'If-None-Match' header based on the Etag
  2. CacheRequestControlMiddleware - checks and adds the Cache-Control header
  3. CacheDeleteMiddleware - clears the cache for an endpoint based on the Clear-Site-Data header

Example:

import os

from fastapi import FastAPI, Header, Query
from fastapi.responses import StreamingResponse

from cashews import cache
from cashews.contrib.fastapi import (
    CacheDeleteMiddleware,
    CacheEtagMiddleware,
    CacheRequestControlMiddleware,
    cache_control_ttl,
)

app = FastAPI()
app.add_middleware(CacheDeleteMiddleware)
app.add_middleware(CacheEtagMiddleware)
app.add_middleware(CacheRequestControlMiddleware)
cache.setup(os.environ.get("CACHE_URI", "redis://"))



@app.get("/")
@cache.failover(ttl="1h")
@cache(ttl=cache_control_ttl(default="4m"), key="simple:{user_agent:hash}", time_condition="1s")
async def simple(user_agent: str = Header("No")):
    ...


@app.get("/stream")
@cache(ttl="1m", key="stream:{file_path}")
async def stream(file_path: str = Query(__file__)):
    return StreamingResponse(_read_file(file_path=file_path))


async def _read_file(file_path):
    ...

Cashews can also cache streaming responses.

Prometheus

You can easily provide metrics using the Prometheus middleware.

from cashews import cache
from cashews.contrib.prometheus import create_metrics_middleware

metrics_middleware = create_metrics_middleware(with_tag=False)
cache.setup("redis://", middlewares=(metrics_middleware,))

Development

Setup

  • Clone the project.
  • After creating a virtual environment, install pre-commit:
    pip install pre-commit && pre-commit install --install-hooks

Tests

To run tests you can use tox:

pip install tox
tox -e py  # tests for the in-memory backend
tox -e py-diskcache  # tests for the diskcache backend
tox -e py-redis  # tests for the redis backend - you need to run redis
tox -e py-integration  # tests for integrations with aiohttp and fastapi

tox  # run all tests for every python installed on your machine

Or use pytest. Note that 2 tests always fail; this is OK:

pip install .[tests,redis,diskcache,speedup] fastapi aiohttp requests

pytest  # run all tests with all backends
pytest -m "not redis"  # all tests except the tests for the redis backend

cashews's People

Contributors

0xalcibiades, a-kirami, dependabot[bot], krukov, m4hbod, mnixry, nickderobertis, rebzzel, unmade


cashews's Issues

Cache `bytes` more efficiently

await cache.set("key", b"test")

>>>redis: "SET" "test:key" "\x80\x05\x95\b\x00\x00\x00\x00\x00\x00\x00C\x04test\x94."

Should be

>>>redis: "SET" "test:key" "test"

lock performance

I'm benchmarking my multi-backend cache framework against several similar ones, including cashews. One benchmark makes concurrent calls with stampede/thundering-herd/dog-piling protection. My original benchmark sends 200k requests at a concurrency level of 1000; my framework and aiocache's stampede protection finish in less than 1 minute, but cashews seems to hang forever. So I wrote a simpler one with asyncio.gather, but this one raises an exception:
cashews.exceptions.LockedError: Key lock:__main__:foo:uid:5884 already locked

Is setting lock=True the right way to use cashews with thundering-herd protection? Simple benchmark code:

import random
import redis
import asyncio
from cashews import cache

cache.setup("redis://", max_connections=100, wait_for_connection_timeout=300)


@cache(ttl=None, lock=True)
async def foo(uid):
    await asyncio.sleep(0.1)
    return uid


async def bench():
    r = redis.Redis(host="localhost", port=6379)
    r.flushall()
    await asyncio.gather(*[foo(random.randint(0, 10000)) for _ in range(5000)])


asyncio.run(bench())

Case sensitive keys (do not convert silently all string arguments to lowercase by default)

I was surprised when I accidentally found that all strings are converted to lowercase when building a key.

import asyncio
import logging
from datetime import timedelta

from cashews import cache

logger = logging.getLogger(__name__)
logging.basicConfig(level="DEBUG")


async def logging_middleware(call, *args, backend=None, cmd=None, **kwargs):
    key = args[0] if args else kwargs.get("key", kwargs.get("pattern", ""))
    logger.info(f"{args}; {kwargs}")
    # logger.info("=> Cache request: %s ", cmd, extra={"command": cmd, "cache_key": key})
    return await call(*args, **kwargs)


cache.setup("mem://", middlewares=(logging_middleware, ))


@cache(ttl=timedelta(minutes=1))
async def get_name(user, version="v1"):
    return user


async def main():
    await get_name("Bob")
    result_1 = await get_name("Bob")
    result_2 = await get_name("bob")
    print(result_1)
    print(result_2)

    value = await cache.get("__main__:get_name:user:bob:version:v1")
    print(f"value: {value}")
    value = await cache.get("__main__:get_name:user:Bob:version:v1")
    print(f"value: {value}")


asyncio.run(main())

Question

  1. Is it a bug or a feature?
  2. Is there an easy way to disable lowercasing of strings when building a key? So if I pass my_string=FooBar, it is saved as FooBar, not as foobar.
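
To make the request concrete, here is a small sketch of a key builder with an opt-out for lowercasing. `build_key` and its `lower` flag are hypothetical and only illustrate the behaviour asked for; they are not part of cashews.

```python
def build_key(func_name, lower=True, **params):
    """Hypothetical key builder: joins the name and parameters with ':'."""
    parts = [func_name]
    for name, value in params.items():
        text = str(value)
        # lowercase the value only when the caller has not opted out
        parts += [name, text.lower() if lower else text]
    return ":".join(parts)


lowered = build_key("get_name", user="Bob", version="v1")
preserved = build_key("get_name", lower=False, user="Bob", version="v1")
```

With `lower=False` the calls `get_name("Bob")` and `get_name("bob")` would end up under different keys.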

Check backend setup

Hi, I am currently trying to integrate a Redis cache into my application, but I am not sure how to set up the Redis cache using cashews. For example, after I started a Redis server in a terminal I got
$ redis-cli
127.0.0.1:6379> ping
PONG
I wonder what the correct format of cache.setup is when using cashews.cache.

Any feedback will be greatly appreciated.

Typing support

The library lacks type annotations.

Expectation:
Almost all methods should have full type annotations, and a mypy job should be enabled to check for typing errors.

Method get() doesn't deserialize data after set_raw()

I expect the get() method to deserialize raw pickle data stored in Redis, but it doesn't when the data was written with set_raw().

Code example:

key = 'test:123'
value = [1, 2, 3]

# works fine
await cache.set(key, value)
print(await cache.get(key))
# [1, 2, 3]
print(await cache.get_raw(key))
# b':_\x80\x05\x95\x0b\x00\x00\x00\x00\x00\x00\x00]\x94(K\x01K\x02K\x03e.'

# returns raw data
value2 = pickle.dumps(value)
await cache.set_raw(key, value2)
print(await cache.get(key))
# b'\x80\x04\x95\x0b\x00\x00\x00\x00\x00\x00\x00]\x94(K\x01K\x02K\x03e.'

I see a difference in prefix :_ - what does it mean?

`*args` support is broken: only strings are supported as `*args`

I've been using a fork (3.x "cashews" version), but today I've decided to finally switch to your latest version 4.2.1. After switching I see that some problems were introduced somewhere after the 3.x version.

Here is a reproducible code sample:

import asyncio

from cashews import cache


@cache(ttl="1s")
async def get_name(user, *args, version="v1", **kwargs):
    ...


asyncio.run(get_name("foo", 999.0, spam="eggs"))

Traceback:

Traceback (most recent call last):
  File "/demo/_local.py", line 12, in <module>
    asyncio.run(get_name("foo", 999.0, spam="eggs"))
  File "/usr/lib/python3.10/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "/usr/lib/python3.10/asyncio/base_events.py", line 646, in run_until_complete
    return future.result()
  File "/demo-vUAg5yd0-py3.10/lib/python3.10/site-packages/cashews/wrapper.py", line 248, in _call
    return await decorator(*args, **kwargs)
  File "/demo-vUAg5yd0-py3.10/lib/python3.10/site-packages/cashews/decorators/cache/simple.py", line 37, in _wrap
    _cache_key = get_cache_key(func, _key_template, args, kwargs)
  File "/demo-vUAg5yd0-py3.10/lib/python3.10/site-packages/cashews/key.py", line 69, in get_cache_key
    return template_to_pattern(_key_template, _formatter=default_formatter, **key_values)
  File "/demo-vUAg5yd0-py3.10/lib/python3.10/site-packages/cashews/formatter.py", line 124, in template_to_pattern
    return _formatter.format(template, **values)
  File "/usr/lib/python3.10/string.py", line 161, in format
    return self.vformat(format_string, args, kwargs)
  File "/usr/lib/python3.10/string.py", line 165, in vformat
    result, _ = self._vformat(format_string, args, kwargs, used_args, 2)
  File "/usr/lib/python3.10/string.py", line 218, in _vformat
    result.append(self.format_field(obj, format_spec))
  File "/demo-vUAg5yd0-py3.10/lib/python3.10/site-packages/cashews/formatter.py", line 94, in format_field
    value = super().format_field(value, format_spec if format_spec not in self._functions else "")
  File "/demo-vUAg5yd0-py3.10/lib/python3.10/site-packages/cashews/formatter.py", line 72, in format_field
    return format(self._format_field(value))
  File "/demo-vUAg5yd0-py3.10/lib/python3.10/site-packages/cashews/formatter.py", line 65, in _format_field
    return self.__type_format[_type](value)
  File "/demo-vUAg5yd0-py3.10/lib/python3.10/site-packages/cashews/formatter.py", line 18, in _decode_array
    return ":".join([format_value(value) for value in values])
TypeError: sequence item 0: expected str instance, float found

Process finished with exit code 1

To reproduce the error you can pass any non-string value.

Also, this bug shows a lack of test coverage for the new code.

Use "Faker" package in tests

https://faker.readthedocs.io/en/master/

It is not a good idea to manually hard-code specific values for every test. The first problem is that it is redundant manual work. The second problem is that your test works for the "foobar" input, but you do not know whether it will work for a "spameggs" or "abcdef" string.

Bad

def test_do_something():
    do_something("foo bar")  # Poor test coverage

Good

def test_do_something(faker: Faker):
    do_something(faker.pystr())  # pystr() returns a random string

Good too

def test_do_something(faker: Faker):
    do_something("foo bar")  # It is critical to check exactly this input 
    do_something(faker.pystr())  # pystr() returns a random string

A general rule is to hard-code input values only when these specific values must be tested. E.g. it is fine to hard-code "foo bar" input for your test if you have a feeling that the test may fail for the value and it is critical to cover it.

I have been using Faker for 1-2 years and I am happy with it.
There is also another package https://hypothesis.readthedocs.io/en/latest/ which looks very promising, but I have no experience with it.

@Krukov What is your opinion on this?

Error on decoding value with underscore

If the value contains "_", the sign check is not handled properly.

import asyncio
from cashews import cache


async def func():
    cache.setup('redis://0.0.0.0')

    print(await cache.set('key', 'value_1'))
    print(await cache.get('key'))


asyncio.run(func())

Redis:

1652905105.964530 [0 172.17.0.1:59130] "SET" "key" "\x80\x05\x95\x0b\x00\x00\x00\x00\x00\x00\x00\x8c\avalue_1\x94."
1652905105.966367 [0 172.17.0.1:59130] "GET" "key"
1652905105.968774 [0 172.17.0.1:59130] "UNLINK" "key"

Related to #61
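
A minimal sketch of the failure mode, not the cashews serializer itself: assume a stored value has the form `<sign>_<payload>`. Splitting on every underscore breaks as soon as the payload contains "_", while splitting only once keeps the payload intact. `SECRET`, `dump` and `load` are hypothetical names.

```python
import hashlib
import hmac
import pickle

SECRET = b"hypothetical-secret"  # assumption: any shared secret


def dump(value):
    payload = pickle.dumps(value)
    sign = hmac.new(SECRET, payload, hashlib.sha256).hexdigest().encode()
    return sign + b"_" + payload          # a hex digest never contains b"_"


def load(raw):
    sign, payload = raw.split(b"_", 1)    # maxsplit=1 keeps underscores in the payload
    expected = hmac.new(SECRET, payload, hashlib.sha256).hexdigest().encode()
    if not hmac.compare_digest(sign, expected):
        raise ValueError("signature mismatch")
    return pickle.loads(payload)


restored = load(dump("value_1"))
```

Here the pickled payload of "value_1" contains an underscore, and the round trip still succeeds because only the first separator is used.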

Disabling cache doesn't work reliably in tests

Hi!

We use cashews in the upcoming implementation of Fedora Message Notifications (see the fmn-next branch), e.g. to cache requests to backend services which we don't want to hammer too hard. For testing, we want to disable the cache so every such request would get its own version of (mocked) backend results.

I'm currently working on adding caching to one of these backend services and am struggling because one of the tests gets the cached result of a previously run test (which then fails the test). This is with 5.0.0, I've tested it with 4.7.1 where it works as I expect it. Bisecting the 4.7.1..5.0.0 range, I tracked the change in behavior down to commit 4429f01 which adds the @lru_cache() decorator to Cache._get_backend_and_config(). Commenting out that line fixes the issue for me.

Here's a small script test_cashews_disabled.py reproducing the issue:

import pytest
from cashews import cache

SIDE_EFFECT = "the side-effect"


@pytest.fixture(autouse=True)
def setup_mem_cache():
    cache.setup("mem://")


@pytest.fixture(autouse=True)
def disable_cache(setup_mem_cache):
    with cache.disabling():
        yield


@pytest.fixture
def change_the_side_effect():
    global SIDE_EFFECT

    SIDE_EFFECT = "the changed side-effect"

    yield


@cache(ttl="1h")
async def function_to_test():
    return SIDE_EFFECT


def test_unrelated():
    assert "BOOP"


@pytest.mark.asyncio
async def test_with_original_side_effect():
    assert await function_to_test() == "the side-effect"


@pytest.mark.asyncio
async def test_with_different_side_effect(change_the_side_effect):
    assert await function_to_test() == "the changed side-effect"

To run, e.g. install cashews, pytest and pytest-asyncio into a virtualenv and run pytest -v test_cashews_disabled.py.

Here's the result I get with 5.0.0:

(cashews_disabled_test) nils@makake:~/test/python/cashews_disabled_test> pip show cashews
Name: cashews
Version: 5.0.0
Summary: cache tools with async power
Home-page: https://github.com/Krukov/cashews/
Author: Dmitry Kryukov
Author-email: [email protected]
License: MIT
Location: /home/nils/.virtualenvs/cashews_disabled_test/lib/python3.11/site-packages
Requires: 
Required-by: 
(cashews_disabled_test) nils@makake:~/test/python/cashews_disabled_test> pytest -v test_cashews_disabled.py 
======================================== test session starts =========================================
platform linux -- Python 3.11.1, pytest-7.2.1, pluggy-1.0.0 -- /home/nils/.virtualenvs/cashews_disabled_test/bin/python
cachedir: .pytest_cache
rootdir: /home/nils/test/python/cashews_disabled_test
plugins: asyncio-0.20.3
asyncio: mode=Mode.STRICT
collected 3 items                                                                                    

test_cashews_disabled.py::test_unrelated PASSED                                                [ 33%]
test_cashews_disabled.py::test_with_original_side_effect PASSED                                [ 66%]
test_cashews_disabled.py::test_with_different_side_effect FAILED                               [100%]

============================================== FAILURES ==============================================
__________________________________ test_with_different_side_effect ___________________________________

change_the_side_effect = None

    @pytest.mark.asyncio
    async def test_with_different_side_effect(change_the_side_effect):
>       assert await function_to_test() == "the changed side-effect"
E       AssertionError: assert 'the side-effect' == 'the changed side-effect'
E         - the changed side-effect
E         ?    --------
E         + the side-effect

test_cashews_disabled.py:43: AssertionError
====================================== short test summary info =======================================
FAILED test_cashews_disabled.py::test_with_different_side_effect - AssertionError: assert 'the side-effect' == 'the changed side-effect'
==================================== 1 failed, 2 passed in 0.03s =====================================
(cashews_disabled_test) nils@makake:~/test/python/cashews_disabled_test>

And here's the same with 4.7.1:

(cashews_disabled_test) nils@makake:~/test/python/cashews_disabled_test> pip show cashews
Name: cashews
Version: 4.7.1
Summary: cache tools with async power
Home-page: https://github.com/Krukov/cashews/
Author: Dmitry Kryukov
Author-email: [email protected]
License: MIT
Location: /home/nils/.virtualenvs/cashews_disabled_test/lib/python3.11/site-packages
Requires: 
Required-by: 
(cashews_disabled_test) nils@makake:~/test/python/cashews_disabled_test> pytest -v test_cashews_disabled.py 
======================================== test session starts =========================================
platform linux -- Python 3.11.1, pytest-7.2.1, pluggy-1.0.0 -- /home/nils/.virtualenvs/cashews_disabled_test/bin/python
cachedir: .pytest_cache
rootdir: /home/nils/test/python/cashews_disabled_test
plugins: asyncio-0.20.3
asyncio: mode=Mode.STRICT
collected 3 items                                                                                    

test_cashews_disabled.py::test_unrelated PASSED                                                [ 33%]
test_cashews_disabled.py::test_with_original_side_effect PASSED                                [ 66%]
test_cashews_disabled.py::test_with_different_side_effect PASSED                               [100%]

========================================= 3 passed in 0.02s ==========================================
(cashews_disabled_test) nils@makake:~/test/python/cashews_disabled_test>
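
The general hazard behind the bisect result can be shown with the standard library alone. This is only an illustration, not cashews' actual code: `Registry` and `resolve` are hypothetical stand-ins for an object whose cached lookup depends on state that is not part of the `lru_cache` key.

```python
from functools import lru_cache


class Registry:
    """Toy stand-in for an object that resolves a backend per key (hypothetical)."""

    def __init__(self):
        self.disabled = False

    @lru_cache(maxsize=None)
    def resolve(self, key):
        # The result depends on mutable state that is not part of the cache key.
        return "noop-backend" if self.disabled else "real-backend"


registry = Registry()
first = registry.resolve("k")      # computed while enabled
registry.disabled = True
second = registry.resolve("k")     # served from lru_cache, the state change is ignored

Registry.resolve.cache_clear()     # dropping the memoized entries picks up the new state
third = registry.resolve("k")
```

Once a lookup has been memoized, a later change of the "disabled" state is not visible until the memoized entries are cleared.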

Implement set_many function

I see that the package has get_many, which is very useful. However, for a use case like fetching 50 user objects, we need to set many users in the cache at once. Can we implement it?

Decorator tip

@cache.list(key="user:{}", ttl="10m")
def get_users(ids: List[int]):
    ...
    return users

A decorator like this would be useful to cache multiple objects in one call. If you agree, I can take a stab at a pull request.
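
A rough sketch of the behaviour such a decorator would wrap, assuming a plain dict as the cache backend. `fetch_users`, `get_users` and `store` are hypothetical; the point is that only missing ids hit the slow source and all new objects are written in one pass.

```python
import asyncio

store = {}          # stand-in for a cache backend
fetched = []        # records which ids actually hit the slow source


async def fetch_users(ids):
    """Hypothetical slow source returning one object per id."""
    fetched.extend(ids)
    return {i: {"id": i} for i in ids}


async def get_users(ids, key="user:{}"):
    found = {i: store[key.format(i)] for i in ids if key.format(i) in store}
    missing = [i for i in ids if i not in found]
    if missing:
        fresh = await fetch_users(missing)
        # the "set_many" step: write every newly fetched object in one pass
        store.update({key.format(i): user for i, user in fresh.items()})
        found.update(fresh)
    return [found[i] for i in ids]


async def main():
    await get_users([1, 2])
    return await get_users([1, 2, 3])   # only id 3 is fetched this time


users = asyncio.run(main())
```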

fix: passed 'args' and 'kwargs' are not considered when building a cache key

If I am not wrong, this is a serious bug.

Problem

A function is cached always and only once, regardless of passed arguments.

import asyncio

from cashews import cache

cache.setup("mem://")


@cache(ttl="3h")
async def func(*args, **kwargs):
    print(f"{args}; {kwargs}")
    return args, kwargs


async def main():
    await func("foo")
    await func("bar")
    await func("spam")
    await func("eggs")

asyncio.run(main())

Result:

('foo',); {}

Expected result:

The function must be called on every unique args and kwargs.
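
One way to get a distinct key per call, sketched with `inspect.signature`. `make_key` is a hypothetical helper and not the cashews key builder; it binds the call to the signature so that values collected by `*args` and `**kwargs` become part of the key.

```python
import inspect


def make_key(func, args, kwargs):
    """Hypothetical key builder that includes *args and **kwargs values."""
    bound = inspect.signature(func).bind(*args, **kwargs)
    bound.apply_defaults()                 # missing *args / **kwargs become () and {}
    parts = [func.__name__]
    for name, value in bound.arguments.items():
        parts.append(f"{name}={value!r}")
    return ":".join(parts)


async def func(*args, **kwargs):
    return args, kwargs


keys = [make_key(func, (word,), {}) for word in ("foo", "bar", "spam", "eggs")]
```

Each of the four calls from the example above maps to its own key, so the function would run once per unique argument.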

UnSecureDataError when reading JSON value

Affects version: 3.3.0

I have the following key defining a stringified JSON as value stored in Redis:

$ redis-cli -h localhost -p 16379
localhost:16379> get packing:product_scan_220000000111
"{\"shipping_information_id\": 1, \"scanned_product\": {\"id\": 43, \"barcode\": \"220000000111\", \"dimensions\": [334, 36]}, \"added_products\": [], \"packing_auto\": true}"
localhost:16379> 

Using this code:

from cashews import cache

...
cache.setup(f"redis://{cfg.redis.host}:{cfg.redis.port}/0",
            password=redis_password,
            prefix="packing")

result = await cache.get("packing:product_scan_220000000111")

I get an UnSecureDataError:

Traceback (most recent call last):
  File "/home/automation/lib/python3.7/site-packages/aiorun.py", line 212, in new_coro
    await coro
  File "/home/automation/packing/application.py", line 136, in main
    ":product_scan_220000000111")
  File "/home/automation/packing/application.py", line 39, in cache_logging_middleware
    return await call(*args, **kwargs)
  File "/home/automation/lib/python3.7/site-packages/cashews/validation.py", line 63, in _invalidate_middleware
    return await call(*args, key=key, **kwargs)
  File "/home/automation/lib/python3.7/site-packages/cashews/wrapper.py", line 50, in _auto_init
    return await call(*args, **kwargs)
  File "/home/automation/lib/python3.7/site-packages/cashews/disable_control.py", line 12, in _is_disable_middleware
    return await call(*args, **kwargs)
  File "/home/automation/lib/python3.7/site-packages/cashews/serialize.py", line 35, in get
    return await self._get_value(await super().get(key), key, default=default)
  File "/home/automation/lib/python3.7/site-packages/cashews/serialize.py", line 39, in _get_value
    return self._process_value(value, key, default=default)
  File "/home/automation/lib/python3.7/site-packages/cashews/serialize.py", line 59, in _process_value
    raise UnSecureDataError()
cashews.serialize.UnSecureDataError

It seems the serializer parses the value and splits it on '_' for some check, but I am completely missing the background for this.

Is this expected?
How can I read this value properly in cashews?

Cache with performance condition

What if I want to cache a function's result only if the call takes more than 1 second (for example)? Currently we can use the condition parameter of the decorator, but then we need to measure the latency ourselves and return it so that the condition function can use it.

Plz make it simple
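
A minimal sketch of what this could look like, using only the standard library. `cache_if_slower_than` is a hypothetical decorator, not an existing cashews feature; it measures the call itself, so the decorated function does not have to return its latency.

```python
import asyncio
import functools
import time


def cache_if_slower_than(threshold):
    """Hypothetical decorator: keep a result only if computing it took > threshold seconds."""
    def decorator(func):
        results = {}

        @functools.wraps(func)
        async def wrapper(*args):
            if args in results:
                return results[args]
            started = time.perf_counter()
            value = await func(*args)
            if time.perf_counter() - started > threshold:
                results[args] = value      # slow call: worth caching
            return value

        wrapper.results = results          # exposed for inspection only
        return wrapper
    return decorator


@cache_if_slower_than(0.05)
async def work(delay):
    await asyncio.sleep(delay)
    return delay


async def main():
    await work(0)      # fast: not cached
    await work(0.1)    # slow: cached


asyncio.run(main())
```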

Connecting Redis on Kubernetes with backend using cashews

Hi, thank you a lot for this repo, and I really appreciate it a lot. I have tried to use cashews with Redis on my local environment, and it works pretty well thanks to your previous assistance.

Currently, I deployed both the Redis service and the application on separate Kubernetes containers, and I wonder if it is possible to use cashews on backend to get access to the Redis server on another Kubernetes container. Any feedback will be greatly appreciated.

fix: functions without parameters are not cached

Reproducible code sample

import asyncio

from cashews import cache

cache.setup("mem://")


@cache(ttl="10m")
async def get():
    print("Start!")
    await asyncio.sleep(2)
    print("End!")
    return "foobar"


async def func():
    tasks = [get(), get()]
    await asyncio.gather(*tasks)


asyncio.run(func())

Expected behavior

Functions without parameters must be cached, the same way Python's 'functools.cache' decorator works.

fix: client side cache always thinks that `None` value is already in cache

Here is the wrong part in the code:

class BcastClientSide(Redis):


    async def set(self, key: str, value, *args, **kwargs):
        if await self._local_cache.get(key) == value:
            # If value in current client_cache - skip resetting
            return 0
        # the rest of the code...

When a function returns None, the result will not be saved in the cache because the code thinks that the result is already saved.

Solution:

        if await self._local_cache.get(key, default=_empty) == value:
            # If value in current client_cache - skip resetting
            return 0
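
The effect of the sentinel default can be checked in isolation. In this sketch a plain dict stands in for the local cache and `should_skip_set` is a hypothetical helper mirroring the condition above.

```python
_empty = object()   # unique sentinel: can never equal a real cached value

local_cache = {}


def should_skip_set(key, value):
    # Without a sentinel default, a missing key and a stored None look the same.
    return local_cache.get(key, _empty) == value


skip_when_missing = should_skip_set("k", None)   # key absent: must not skip
local_cache["k"] = None
skip_when_present = should_skip_set("k", None)   # None really stored: skip
```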

Method keys_match() returns different keys types for Redis and memory

Method keys_match() returns keys in bytes type for Redis cache and in str type for memory cache. Is it a bug or feature?

Code example:

# setup two caches - for redis & memory
cache.setup(f'redis://{host}:{port}', password=pwd, prefix='test_redis')
cache.setup('mem://', prefix='test_mem')

# set values for memory and redis
await cache.set('test_redis:abc', 123)
await cache.set('test_mem:abc', 321)

# keys_match result
async def foo(mask):
  async for key in cache.keys_match(mask):
    print(key, type(key))

await foo('test_redis:*')
# b'test_redis:abc' <class 'bytes'>
await foo('test_mem:*')
# test_mem:abc <class 'str'>

docs: add a note that '@cache.locked' must be used in chain with '@cache'

Confusing, because someone may think that a result will be cached:

@cache.locked(ttl="10m")
async def get(name):
    value = await api_call()
    return {"status": value}

I thought that the value will be cached until I've looked at the source code.

More clear example:

@cache.locked(ttl="10m")
@cache(ttl="10m")
async def get(name):
    value = await api_call()
    return {"status": value}

I do not think that someone will install a caching package only to lock a function, so better to give a ready-to-use and not confusing example. What do you think?

Also, someone may want to write a custom decorator to avoid copy/pasting the same ttl over multiple decorators. E.g. @cache(ttl="10m", lock=True).

size and key of in memory cache

Hi,

I wonder what the unit of the in-memory cache size is. For example, I used cache.setup("mem://"), but when I called the function 3 times with different arguments, it seems that only the previous 2 arguments are stored.

In addition, I don't pass any key argument to the cache decorator, and I wonder whether cashews hashes the function name and the function arguments to form the key?

Any feedback will be greatly appreciated.

fix: functions without 'return' are not cached

Reproducible code sample

import asyncio

from cashews import cache

cache.setup("mem://")


@cache(ttl="10m")
async def get():
    print("Start!")
    await asyncio.sleep(2)
    print("End!")
    # return "foobar"


async def func():
    await get()
    await get()


asyncio.run(func())

Expected behavior

Functions without 'return' must be cached, the same way Python's 'functools.cache' decorator works.

futurecoder

Hi, sorry if this is a bit weird, wasn't sure how else to reach you.

I saw that you liked several of my projects on GitHub (birdseye, snoop, heartrate, sorcery) so I thought you might be interested by my latest and most ambitious project: https://futurecoder.io/

It uses several of my libraries, including birdseye and snoop.

get_many doesn't work with several backends

Seems that this method uses only one backend:

# create different backends - for redis and memory
cache.setup('mem://', prefix='mem')
cache.setup('redis://', prefix='redis')

# set values
await cache.set('mem:123', 'memory')
await cache.set('redis:123', 'redis')

# call get_many

await cache.get_many('mem:123', 'redis:123')
# returns ('memory', None)
await cache.get_many('redis:123', 'mem:123')
# returns ('redis', None)
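
The expected behaviour could be sketched as follows, with plain dicts standing in for the two backends. `backends` and this `get_many` are hypothetical; the idea is to route every key by its prefix and return the values in the order the keys were given.

```python
backends = {
    "mem": {"mem:123": "memory"},     # stand-ins for two configured backends
    "redis": {"redis:123": "redis"},
}


def get_many(*keys):
    """Hypothetical router: look each key up in the backend matching its prefix."""
    values = []
    for key in keys:
        prefix = key.split(":", 1)[0]
        values.append(backends.get(prefix, {}).get(key))
    return tuple(values)


result = get_many("mem:123", "redis:123", "redis:missing")
```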

Alias/tags for cache keys for invalidation

from cashews import cache

@cache(ttl="2h", tag="users")
async def get_users(space):
    ...
    

await cache.invalidate(tags=["users"], space="test")  # remove a key that belongs to the "test" space
await cache.invalidate(tags=["users"])  # remove all keys with tag users

`client_side=True` in setup adds always prefix "cashews:" to key

When setting up a cache with client_side=True like this:

cache.setup(f"redis://{cfg.redis.endpoint}:{cfg.redis.port}/0",
            middlewares=(
                      add_prefix('my_wanted_prefix:'),
            ),
            password=redis_password,
            client_side=True,
            retry_on_timeout=True)

await cache.set("test", "testvalue")

the Redis Client shows:

$ redis-cli
127.0.0.1:6379> keys *
1) "cashews:my_wanted_prefix:test"

Without the option client_side=True the result is as expected:

$ redis-cli
127.0.0.1:6379> keys *
1) "my_wanted_prefix:test"

Why is this default prefix used here, how to avoid it?

Transaction for cache changes

Most web frameworks use databases and transactions: usually each request handler is wrapped in a transaction, and the transaction is committed only if processing succeeds. Cashews does not have any transactions, which may lead to an inconsistent cache:

async def handler(request):
    async with db.transaction as tx:
        ...
        await tx.insert(....)
        await cache.set("key", "value")
        ...
        await api.set(...)  # in case of an error the database transaction rolls back its changes, but the cache does not

The idea is to add some kind of transaction: a temporary state that is available only inside an async context, collects changes in memory, and commits them only on an explicit call to commit.
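
The idea could be sketched roughly as follows. `CacheTransaction` and its methods are hypothetical names, not an existing cashews API, and a plain dict stands in for the backend.

```python
import asyncio


class CacheTransaction:
    """Hypothetical buffer: writes become visible in `store` only on commit."""

    def __init__(self, store):
        self._store = store
        self._pending = {}

    async def __aenter__(self):
        return self

    async def __aexit__(self, exc_type, exc, tb):
        self._pending.clear()      # anything not committed is dropped
        return False               # never swallow the exception

    async def set(self, key, value):
        self._pending[key] = value

    async def commit(self):
        self._store.update(self._pending)
        self._pending.clear()


store = {}


async def handler(fail):
    async with CacheTransaction(store) as tx:
        await tx.set("lost" if fail else "key", "value")
        if fail:
            raise RuntimeError("downstream call failed")
        await tx.commit()


async def main():
    await handler(fail=False)
    try:
        await handler(fail=True)
    except RuntimeError:
        pass


asyncio.run(main())
```

The write made by the failing handler never reaches the store, which matches what a rolled-back database transaction would do.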

'cache.incr' and 'cache.keys_match' API commands examples from documentation do not work

Copy/pasted from the documentation:

import asyncio
from cashews import cache

cache.setup("mem://")  # configure as in-memory cache

async def func():
    await cache.set(key="key", value={"any": True}, expire=60, exist=None)  # -> bool
    await cache.get("key")  # -> Any
    await cache.get_many("key1", "key2")
    await cache.incr("key") # -> int
    await cache.delete("key")
    await cache.delete_match("pattern:*")
    await cache.keys_match("pattern:*") # -> List[str]
    await cache.expire("key", timeout=10)
    await cache.get_expire("key")  # -> int seconds to expire
    await cache.ping(message=None)  # -> bytes
    await cache.clear()
    await cache.is_locked("key", wait=60)  # -> bool
    async with cache.lock("key", expire=10):
       ...
    await cache.set_lock("key", value="value", expire=60)  # -> bool
    await cache.unlock("key", "value")  # -> bool


asyncio.run(func())

Run result:

Traceback (most recent call last):
  File "/local.py", line 46, in <module>
    asyncio.run(func())
  File "/usr/local/lib/python3.10/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "/usr/local/lib/python3.10/asyncio/base_events.py", line 641, in run_until_complete
    return future.result()
  File "/local.py", line 31, in func
    await cache.incr("key") # -> int
  File "/venv/lib/python3.10/site-packages/cashews/validation.py", line 63, in _invalidate_middleware
    return await call(*args, key=key, **kwargs)
  File "/venv/lib/python3.10/site-packages/cashews/wrapper.py", line 45, in _auto_init
    return await call(*args, **kwargs)
  File "/venv/lib/python3.10/site-packages/cashews/disable_control.py", line 12, in _is_disable_middleware
    return await call(*args, **kwargs)
  File "/venv/lib/python3.10/site-packages/cashews/backends/memory.py", line 75, in incr
    value = int(self._get(key, 0)) + 1
TypeError: int() argument must be a string, a bytes-like object or a real number, not 'dict'

Process finished with exit code 1

If I comment out await cache.incr("key") # -> int to skip the problematic command:

Traceback (most recent call last):
  File "/_local.py", line 46, in <module>
    asyncio.run(func())
  File "/usr/local/lib/python3.10/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "/usr/local/lib/python3.10/asyncio/base_events.py", line 641, in run_until_complete
    return future.result()
  File "/_local.py", line 34, in func
    await cache.keys_match("pattern:*") # -> List[str]
  File "/venv/lib/python3.10/site-packages/cashews/validation.py", line 62, in _invalidate_middleware
    return await call(*args, **kwargs)
  File "/venv/lib/python3.10/site-packages/cashews/wrapper.py", line 45, in _auto_init
    return await call(*args, **kwargs)
  File "/venv/lib/python3.10/site-packages/cashews/disable_control.py", line 12, in _is_disable_middleware
    return await call(*args, **kwargs)
TypeError: object async_generator can't be used in 'await' expression

Process finished with exit code 1

I do not know how to fix await cache.keys_match("pattern:*"), because [key async for key in cache.keys_match("pattern:*")] does not work either.
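
For reference, this is how an async generator is consumed in plain Python. The `keys_match` below is a stand-in with hypothetical data; that the real `cache.keys_match` behaves the same way is an assumption based on the second traceback.

```python
import asyncio


async def keys_match(pattern):
    """Stand-in async generator (hypothetical data), like any `async def` with `yield`."""
    for key in ("pattern:a", "pattern:b", "other:c"):
        if key.startswith(pattern.rstrip("*")):
            yield key


async def main():
    # An async generator is iterated with `async for`, not awaited.
    return [key async for key in keys_match("pattern:*")]


keys = asyncio.run(main())
```

Note that the comprehension has to be inside a coroutine; at module level it is a syntax error.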

Support soft ttl

along the lines of: aio-libs/aiocache#256

and as a use case: when the cached operation has a transient failure (for example when fetching data from a 3rd party). In this case I may want to return the prior value if it is within a more generous ttl.

As an example:

cached_value, age = read_from_cache(key)

if cached_value and age < ttl_soft:
   return cached_value

try:
   new_value = expensive_operation(key)
   save_cache(new_value, now)
   return new_value
except Exception as ex:
   if age < ttl:
      return cached_value
   raise ex
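
A runnable version of the same idea, as a sketch under stated assumptions: the cache is a plain dict, the clock is injectable so the behaviour can be checked without waiting, and `get_with_soft_ttl` is a hypothetical helper rather than a cashews or aiocache API.

```python
import time

_cache = {}   # key -> (value, stored_at)


def get_with_soft_ttl(key, compute, ttl_soft, ttl, now=time.monotonic):
    """Hypothetical helper: refresh after ttl_soft, fall back to stale data until ttl."""
    entry = _cache.get(key)
    if entry is not None:
        value, stored_at = entry
        age = now() - stored_at
        if age < ttl_soft:
            return value               # still fresh
    try:
        fresh = compute(key)
    except Exception:
        if entry is not None and age < ttl:
            return value               # transient failure: serve the stale value
        raise
    _cache[key] = (fresh, now())
    return fresh


def failing(key):
    raise ConnectionError("upstream down")


clock = [0.0]                          # fake clock, in seconds
first = get_with_soft_ttl("k", lambda k: "v1", 10, 60, now=lambda: clock[0])
clock[0] = 30.0                        # past ttl_soft, still within ttl
stale = get_with_soft_ttl("k", failing, 10, 60, now=lambda: clock[0])
```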

sqlalchemy object caching issue

Greetings! I ran into sqlalchemy object caching issues. From time to time I get errors like this:

"  File \"/home/boot/stellar/backend/./app/v1/security/auth.py\", line 127, in get_current_user\n    permissions_v2 = await service_permissions.get_permissions_by_role_cache(\n",
      "  File \"/home/boot/stellar/backend/.venv/lib/python3.10/site-packages/cashews/wrapper.py\", line 272, in _call\n    return await decorator(*args, **kwargs)\n",
      "  File \"/home/boot/stellar/backend/.venv/lib/python3.10/site-packages/cashews/decorators/cache/simple.py\", line 43, in _wrap\n    await backend.set(_cache_key, result, expire=ttl)\n",
      "  File \"/home/boot/stellar/backend/.venv/lib/python3.10/site-packages/cashews/validation.py\", line 62, in _invalidate_middleware\n    return await call(*args, key=key, **kwargs)\n",
      "  File \"/home/boot/stellar/backend/.venv/lib/python3.10/site-packages/cashews/wrapper.py\", line 34, in _auto_init\n    return await call(*args, **kwargs)\n",
      "  File \"/home/boot/stellar/backend/.venv/lib/python3.10/site-packages/cashews/disable_control.py\", line 12, in _is_disable_middleware\n    return await call(*args, **kwargs)\n",
      "  File \"/home/boot/stellar/backend/.venv/lib/python3.10/site-packages/cashews/backends/client_side.py\", line 132, in set\n    return await super().set(self._prefix + key, value, *args, **kwargs)\n",
      "  File \"/home/boot/stellar/backend/.venv/lib/python3.10/site-packages/cashews/serialize.py\", line 107, in set\n    value = pickle.dumps(value, protocol=pickle.HIGHEST_PROTOCOL, fix_imports=False)\n",
      "_pickle.PicklingError: Can't pickle <function __init__ at 0x7fc386fbb7f0>: it's not the same object as sqlalchemy.orm.instrumentation.__init__\n"

I found the answer that such objects should be serialized and deserialized differently:
https://docs.sqlalchemy.org/en/14/core/serializer.html

Could you tell me how good or bad an idea it is to cache SQLAlchemy objects rather than the endpoint itself?

If it is not a bad idea, what do you think about adding functionality that would fix this error?

High increasing load on the CPU

Found on the master branch.

After updating cashews to the master branch in our production deployment, the CPU load increased.

image

The earlier readings, for comparison:

image

I use:

cashews = {git = "https://github.com/Krukov/cashews.git", rev = "master"}
cache = Cache()
cache.setup(
    settings_redis.dsn,
    client_side=True,
    retry_on_timeout=True,
    hash_key=settings_redis.hash,
    pickle_type="sqlalchemy",
)

Update README with migration notes: simple decorator usage is broken after the introduction of `NotConfiguredError`

Broken example from the README (a little modified to show that this code does not work anymore)

import asyncio
from cashews import cache

@cache(ttl=1)
async def long_running_function(foo):
    print("Hello")

asyncio.run(long_running_function(foo="bar"))

Error: cashews.exceptions.NotConfiguredError: run cache.setup(...) before using cache

Versions

cashews 5.0.0
Python 3.10

TODO

  • Fix decorators usage examples in the README
  • Explain migration to 5.0.0 when only low-level API or a @cache decorator are required.

Migration of aiocache RedLock

Hello,

Thanks for this great, well-maintained library. I'm migrating my code from aiocache, which appears to be no longer maintained.

I just stumbled upon the implementation of the Redlock:

Former aiocache implementation:

async with RedLock(cache, key, lease=LEASE_LOCK_FUNC_CACHE):
    result = await cache.get(key)
    if result is not None:
        # Reset TTL on existing key
        await cache.expire(key, ttl=EXPIRE_FUNC_CACHE)
        # After a minimum waiting time resent answer to
        # repeating request
        logging.debug(
            '%s (%s): Cache hit, subsequently sending answer',
            subj_str, body)
        return bytes.fromhex(result)

    logging.debug('%s (%s): Cache missed, first call of message '
                  'handler', subj_str, body)
    result = await handler_class.answer(body)
    await cache.set(key, result.hex(), ttl=EXPIRE_FUNC_CACHE)
    return result

Comparable cashews implementation:

async with cache.lock(key + '-lock', expire=LEASE_LOCK_FUNC_CACHE):
    result = await cache.get(key)
    if result is not None:
        # Reset TTL on existing key
        await cache.expire(key, timeout=EXPIRE_FUNC_CACHE)
        # After a minimum waiting time resent answer to
        # repeating request
        logging.debug(
            '%s (%s): Cache hit, subsequently sending answer',
            subj_str, body)
        return bytes.fromhex(result)

    logging.debug('%s (%s): Cache missed, first call of message '
                  'handler', subj_str, body)
    result = await handler_class.answer(body)
    await cache.set(key, result.hex(), expire=EXPIRE_FUNC_CACHE)
    return result

Thus, for a key that is assembled from, for example, the function name and its arguments, I need to add the suffix -lock to the lock key so that it does not interfere with the key itself. The RedLock in aiocache added this suffix implicitly, see https://github.com/aio-libs/aiocache/blob/9c8b07fe759990dcb2d4d5f4e40d13d2cc36d58f/aiocache/lock.py#L68.

Most probably this changed behavior is intended; I just wanted to ask whether you meant to implement it this way, and to point this out for anybody else migrating from aiocache.

auto strategy to cache

The idea is to have a decorator that decides what and how to cache based on statistics.
The statistics are collected for a few (10 - 100) keys with a step (e.g. every 3rd call):

  • execution latency per key and average
  • call rate per key ( result deviation will include it)
  • calling parameters deviation/histogram and results crossings
  • results deviation/histogram per key (with time between result change deviation)

Based on high latency we can suggest using the cache; based on the call rate and the time between changes we can predict the time to live for the cache; based on the deviation of calling parameters and the correlation between results we can guess the key template.

fix: UnicodeDecodeError: 'utf-8' codec can't decode byte 0xbd in position 2: invalid start byte

_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
../../venv/lib/python3.10/site-packages/cashews/wrapper.py:244: in _call
    result = await decorator(func)(*args, **kwargs)
../../venv/lib/python3.10/site-packages/cashews/decorators/cache/simple.py:37: in _wrap
    _cache_key = get_cache_key(func, _key_template, args, kwargs)
../../venv/lib/python3.10/site-packages/cashews/key.py:54: in get_cache_key
    return _get_cache_key(func, template, args, kwargs)
../../venv/lib/python3.10/site-packages/cashews/key.py:73: in _get_cache_key
    key_values = get_call_values(func, args, kwargs)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

func = <function RestfulApiABC._check_request_params at 0x7ff7d84c1c80>
args = (<src.apis.restful_api.RestfulApiABC object at 0x7ff7ce876a30>, b'\x16F\xbd\xb0\xcf\xcdN\xd7Y)\xfa\x1d\x96\xb1u\x81')
kwargs = {}

    def get_call_values(func: Callable, args, kwargs) -> Dict:
        """
        Return dict with arguments and their values for function call with given positional and keywords arguments
        :param func: Target function
        :param args: call positional arguments
        :param kwargs: call keyword arguments
        :param func_args: arguments that will be included in results (transformation function for values if passed as dict)
        """
        key_values = {}
        for _key, _value in _get_call_values(func, args, kwargs).items():
            key_values[_key] = _value
            if isinstance(key_values[_key], bytes):
>               key_values[_key] = key_values[_key].decode()
E               UnicodeDecodeError: 'utf-8' codec can't decode byte 0xbd in position 2: invalid start byte

../../venv/lib/python3.10/site-packages/cashews/key.py:151: UnicodeDecodeError

Why do you need to call .decode() at all? Why not keep the values as they are? I think a caching package must not perform data validation, so any data passed to it must be processed successfully.
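The failure can be reproduced without cashews, using the bytes value from the traceback. The helper to_key_part is a hypothetical illustration of one lossless alternative (hex encoding), not the fix applied in the library.

```python
raw = b"\x16F\xbd\xb0\xcf\xcdN\xd7Y)\xfa\x1d\x96\xb1u\x81"  # argument from the traceback

try:
    raw.decode()  # strict UTF-8, as in the failing line
    failed = False
except UnicodeDecodeError:
    failed = True  # arbitrary binary data is not guaranteed to be valid UTF-8


def to_key_part(value):
    """Hypothetical helper: turn any value into text usable inside a cache key."""
    if isinstance(value, bytes):
        return value.hex()  # lossless and never raises
    return str(value)


key_part = to_key_part(raw)
```

Hex encoding keeps distinct byte strings distinct, whereas decoding with errors="replace" could map different inputs to the same key.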

Invalid register tag with optional arguments

If the function accepts optional arguments, tag registration does not work correctly.

For example, the code below is a slightly modified version of this example: https://github.com/Krukov/cashews/blob/8fb81ca97bb548587fd59cb657f06e1664751189/examples/invalidation_by_tags.py

import asyncio
import random
import typing as t

from cashews import cache

redis_url = "redis://"
cache.setup(redis_url)


@cache(ttl="1h", tags=["items", "user_data:{user_id}"])
async def get_items(user_id: int, some_id: t.Optional[int] = None):  # new Optional argument some_id
    return [f"{user_id}_{random.randint(1, 10)}" for i in range(10)]


FIRST_USER = 1
SECOND_USER = 2


async def main():
    first_user_items = await get_items(FIRST_USER)
    second_user_items = await get_items(SECOND_USER)

    # check that results were cached
    assert await get_items(FIRST_USER) == first_user_items
    assert await get_items(SECOND_USER) == second_user_items

    # invalidate cache first user
    await cache.delete_tags(f"user_data:{FIRST_USER}")
    assert await get_items(FIRST_USER) != first_user_items  # raises AssertionError
    assert await get_items(SECOND_USER) == second_user_items 

if __name__ == "__main__":
    asyncio.run(main())

The key is generated as follows:
__main__:get_items:user_id:1:some_id:

The pattern is generated as follows:
re.compile('^__main__:get_items:user_id:(?P<user_id>.+):some_id:(?P<some_id>.+)$', re.MULTILINE)

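The key does not fully match this pattern in the _match_patterns function, because some_id is empty in the key while (?P<some_id>.+) requires at least one character: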

@staticmethod
def _match_patterns(key: Key, patterns: List[Pattern]) -> Optional[Match]:
    for pattern in patterns:
        match = pattern.fullmatch(key)  # returns None: the key does not fully match
        if match:
            return match
    return None

If you run my example code, the first run raises an error:
cashews.exceptions.TagNotRegisteredError: tag: {'user_data:1', 'items'} not registered: call cache.register_tag before using tags

If you run it again, this error disappears, but the cache is not invalidated and an AssertionError is raised.
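The mismatch between the generated key and the generated pattern can be checked in isolation with the standard re module. The strict pattern is the one quoted above; the relaxed pattern, with .* instead of .+, is only an assumed variant for comparison and not necessarily how the library should be fixed.

```python
import re

key = "__main__:get_items:user_id:1:some_id:"  # key from the issue, some_id is empty

# Pattern as reported: every placeholder must match at least one character.
strict = re.compile(
    r"^__main__:get_items:user_id:(?P<user_id>.+):some_id:(?P<some_id>.+)$",
    re.MULTILINE,
)
# Assumed variant for comparison: placeholders may also be empty.
relaxed = re.compile(
    r"^__main__:get_items:user_id:(?P<user_id>.*):some_id:(?P<some_id>.*)$",
    re.MULTILINE,
)

strict_match = strict.fullmatch(key)    # None, because some_id has no characters
relaxed_match = relaxed.fullmatch(key)  # matches with an empty some_id group
```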

invalidate cache by key

What did I do:

@cache.invalidate("foo:client_id:{client_id}", args_map={"client_id": "client_id"})
async def bar(self, client_id):
    ...

What do I expect:

It works ! =)

What do I receive:

File ".../python3.9/site-packages/cashews/validation.py", line 40, in _wrap
    backend.delete_match(target.format({k: str(v) if v is not None else "" for k, v in _args.items()}))
KeyError: 'client_id'

What do I suggest:

backend.delete_match(target.format(**{k: str(v) if v is not None else "" for k, v in _args.items()}))
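The difference between the two calls comes from how str.format treats its arguments, and can be shown without cashews. The value 42 is a made-up client_id; target and _args mirror the names in the traceback.

```python
target = "foo:client_id:{client_id}"
_args = {"client_id": 42}
values = {k: str(v) if v is not None else "" for k, v in _args.items()}

try:
    target.format(values)  # dict passed positionally: no keyword named client_id
    error = None
except KeyError as exc:
    error = exc

formatted = target.format(**values)  # dict unpacked into keyword arguments
```

Passed positionally, the dict is only available as field {0}, so the named field {client_id} cannot be resolved; unpacking it with ** supplies client_id as a keyword.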

Minor code clean up

  • Change from
    return await self._client.mget(keys[0], *keys[1:])
    to
    return await self._client.mget(*keys)

  • Change from
    if isinstance(value, int) or value.isdigit():
        return int(value)
    to
    if isinstance(value, int):
        return value
    if value.isdigit():
        return int(value)

  • Refactor the hidden conversion from seconds to milliseconds that happens under the hood based only on the variable's type:
           if isinstance(expire, float):
               pexpire = int(expire * 1000)
               expire = None
  • This seems to be unnecessary, because when __del__ is called self._client will be deleted too:

      def close(self):
          del self._client
          self._client = None
          self.__is_init = False

      __del__ = close

  • Using class none: pass seems to be redundant, because None is pickled and unpickled back unchanged. I do not understand the purpose of
          value = pickle.loads(value, fix_imports=False, encoding="bytes")
    
          if value is none:
              return None
    because you can remove the class none: pass entirely and just return value.
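The claim about None in the last item can be checked with the standard pickle module alone. This sketch only shows the round trip; it does not cover whether the none sentinel serves another purpose elsewhere in the library.

```python
import pickle

payload = pickle.dumps(None, protocol=pickle.HIGHEST_PROTOCOL)
# Same loads() arguments as in the snippet quoted above.
restored = pickle.loads(payload, fix_imports=False, encoding="bytes")
```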
