
krukov / cashews


Cache with async power

License: MIT License


cashews's Issues

fix: functions without parameters are not cached

Reproducible code sample

import asyncio

from cashews import cache

cache.setup("mem://")


@cache(ttl="10m")
async def get():
    print("Start!")
    await asyncio.sleep(2)
    print("End!")
    return "foobar"


async def func():
    tasks = [get(), get()]
    await asyncio.gather(*tasks)


asyncio.run(func())

Expected behavior

Functions without parameters should be cached, the same way Python's functools.cache decorator handles them.
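
For comparison, the standard library's functools.cache handles zero-argument functions; a minimal synchronous illustration:

```python
from functools import cache

calls = []

@cache
def get():
    calls.append("hit")
    return "foobar"

assert get() == "foobar"
assert get() == "foobar"
assert len(calls) == 1  # the second call was served from the cache
```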

Support soft ttl

along the lines of: aio-libs/aiocache#256

and as a use case: when the cached operation has a transient failure (for example, when fetching data from a third party), I may want to return the prior value if it is still within a more generous TTL.

As an example:

cached_value, age = read_from_cache(key)

if cached_value and age < ttl_soft:
    return cached_value

try:
    new_value = expensive_operation(key)
    save_cache(new_value, now)
    return new_value
except Exception:
    if age < ttl:
        return cached_value
    raise
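
A runnable version of this sketch (a plain dict store standing in for the cache, not cashews API; `refresh` stands in for the expensive operation):

```python
import time

# Soft-TTL sketch: store (value, saved_at); serve a stale value only when the
# refresh fails and the entry is still within the hard TTL.
_store = {}

def get_with_soft_ttl(key, refresh, ttl_soft=60, ttl=3600):
    entry = _store.get(key)
    now = time.monotonic()
    if entry is not None and now - entry[1] < ttl_soft:
        return entry[0]  # fresh enough, no refresh needed
    try:
        value = refresh(key)
    except Exception:
        if entry is not None and now - entry[1] < ttl:
            return entry[0]  # transient failure: fall back to the stale value
        raise
    _store[key] = (value, now)
    return value

def flaky(key):
    raise RuntimeError("transient failure")

_store.clear()
assert get_with_soft_ttl("k", lambda k: "v1") == "v1"     # fresh fetch
assert get_with_soft_ttl("k", flaky, ttl_soft=0) == "v1"  # stale fallback on error
```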

fix: client side cache always thinks that `None` value is already in cache

Here is the wrong part in the code:

class BcastClientSide(Redis):


    async def set(self, key: str, value, *args, **kwargs):
        if await self._local_cache.get(key) == value:
            # If value in current client_cache - skip resetting
            return 0
        # the rest of the code...

When a function returns None, the result will not be saved in the cache because the code thinks the result is already saved.

Solution:

        if await self._local_cache.get(key, default=_empty) == value:
            # If value in current client_cache - skip resetting
            return 0
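
The fix works because a dedicated sentinel default never compares equal to a cached None; a minimal illustration (a plain dict standing in for the client-side cache):

```python
# Why `get(key) == value` is wrong for None: a missing key and a cached None
# are indistinguishable unless a sentinel default is used.
_EMPTY = object()

local = {}  # stands in for the local client-side cache

def already_cached(key, value):
    return local.get(key, _EMPTY) == value

assert not already_cached("k", None)  # key missing: must NOT look cached
local["k"] = None
assert already_cached("k", None)      # now it really is cached
```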

Invalid register tag with optional arguments

If a function accepts optional arguments, tag registration does not work correctly.

For example, the code below slightly modifies this example: https://github.com/Krukov/cashews/blob/8fb81ca97bb548587fd59cb657f06e1664751189/examples/invalidation_by_tags.py

import asyncio
import random
import typing as t

from cashews import cache

redis_url = "redis://"
cache.setup(redis_url)


@cache(ttl="1h", tags=["items", "user_data:{user_id}"])
async def get_items(user_id: int, some_id: t.Optional[int] = None):  # new Optional argument some_id
    return [f"{user_id}_{random.randint(1, 10)}" for i in range(10)]


FIRST_USER = 1
SECOND_USER = 2


async def main():
    first_user_items = await get_items(FIRST_USER)
    second_user_items = await get_items(SECOND_USER)

    # check that results were cached
    assert await get_items(FIRST_USER) == first_user_items
    assert await get_items(SECOND_USER) == second_user_items

    # invalidate cache first user
    await cache.delete_tags(f"user_data:{FIRST_USER}")
    assert await get_items(FIRST_USER) != first_user_items  # raises AssertionError
    assert await get_items(SECOND_USER) == second_user_items 

if __name__ == "__main__":
    asyncio.run(main())

The key is generated as follows:
__main__:get_items:user_id:1:some_id:

The pattern is generated as follows:
re.compile('^__main__:get_items:user_id:(?P<user_id>.+):some_id:(?P<some_id>.+)$', re.MULTILINE)
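
The mismatch can be reproduced directly with the key and pattern above: independent of the match/fullmatch question, the `(?P<some_id>.+)` group requires at least one character, so it cannot match the empty optional argument:

```python
import re

pattern = re.compile(r"^__main__:get_items:user_id:(?P<user_id>.+):some_id:(?P<some_id>.+)$")
key = "__main__:get_items:user_id:1:some_id:"

# The generated key ends with an empty some_id, so neither match nor
# fullmatch can succeed against the generated pattern.
assert pattern.match(key) is None
assert pattern.fullmatch(key) is None
```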

Full matching is not performed in the _match_patterns function:

@staticmethod
def _match_patterns(key: Key, patterns: List[Pattern]) -> Optional[Match]:
    for pattern in patterns:
        match = pattern.match(key)  # should be pattern.fullmatch(key)
        if match:
            return match
    return None

If you run my example code, the first run raises an error:
cashews.exceptions.TagNotRegisteredError: tag: {'user_data:1', 'items'} not registered: call cache.register_tag before using tags

If you run it again, this error disappears, but the cache is not invalidated and an AssertionError is raised.

fix: UnicodeDecodeError: 'utf-8' codec can't decode byte 0xbd in position 2: invalid start byte

_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
../../venv/lib/python3.10/site-packages/cashews/wrapper.py:244: in _call
    result = await decorator(func)(*args, **kwargs)
../../venv/lib/python3.10/site-packages/cashews/decorators/cache/simple.py:37: in _wrap
    _cache_key = get_cache_key(func, _key_template, args, kwargs)
../../venv/lib/python3.10/site-packages/cashews/key.py:54: in get_cache_key
    return _get_cache_key(func, template, args, kwargs)
../../venv/lib/python3.10/site-packages/cashews/key.py:73: in _get_cache_key
    key_values = get_call_values(func, args, kwargs)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

func = <function RestfulApiABC._check_request_params at 0x7ff7d84c1c80>
args = (<src.apis.restful_api.RestfulApiABC object at 0x7ff7ce876a30>, b'\x16F\xbd\xb0\xcf\xcdN\xd7Y)\xfa\x1d\x96\xb1u\x81')
kwargs = {}

    def get_call_values(func: Callable, args, kwargs) -> Dict:
        """
        Return dict with arguments and their values for function call with given positional and keywords arguments
        :param func: Target function
        :param args: call positional arguments
        :param kwargs: call keyword arguments
        :param func_args: arguments that will be included in results (transformation function for values if passed as dict)
        """
        key_values = {}
        for _key, _value in _get_call_values(func, args, kwargs).items():
            key_values[_key] = _value
            if isinstance(key_values[_key], bytes):
>               key_values[_key] = key_values[_key].decode()
E               UnicodeDecodeError: 'utf-8' codec can't decode byte 0xbd in position 2: invalid start byte

../../venv/lib/python3.10/site-packages/cashews/key.py:151: UnicodeDecodeError

Why call .decode() at all? Why not keep values as-is? A caching package should not perform data validation, so any passed data should be processed successfully.
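
One decode-free alternative (a sketch, not the library's actual behavior) is to represent raw bytes losslessly, for example via hex, instead of calling .decode() and failing on non-UTF-8 input:

```python
# Bytes from the failing call above (truncated to the interesting part).
raw = b"\x16F\xbd\xb0"

try:
    raw.decode()  # this is the reported failure
except UnicodeDecodeError:
    pass

key_part = raw.hex()  # lossless and always succeeds
assert key_part == "1646bdb0"
```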

Disabling cache doesn’t work reliably in tests

Hi!

We use cashews in the upcoming implementation of Fedora Message Notifications (see the fmn-next branch), e.g. to cache requests to backend services which we don’t want to hammer too hard. For testing, we want to disable the cache so every such request would get its own version of (mocked) backend results.

I'm currently working on adding caching to one of these backend services and am struggling because one of the tests gets the cached result of a previously run test (which then makes the test fail). This is with 5.0.0; with 4.7.1 it works as I expect. Bisecting the 4.7.1..5.0.0 range, I tracked the change in behavior down to commit 4429f01, which adds the @lru_cache() decorator to Cache._get_backend_and_config(). Commenting out that line fixes the issue for me.

Here’s a small script test_cashews_disabled.py reproducing the issue:

import pytest
from cashews import cache

SIDE_EFFECT = "the side-effect"


@pytest.fixture(autouse=True)
def setup_mem_cache():
    cache.setup("mem://")


@pytest.fixture(autouse=True)
def disable_cache(setup_mem_cache):
    with cache.disabling():
        yield


@pytest.fixture
def change_the_side_effect():
    global SIDE_EFFECT

    SIDE_EFFECT = "the changed side-effect"

    yield


@cache(ttl="1h")
async def function_to_test():
    return SIDE_EFFECT


def test_unrelated():
    assert "BOOP"


@pytest.mark.asyncio
async def test_with_original_side_effect():
    assert await function_to_test() == "the side-effect"


@pytest.mark.asyncio
async def test_with_different_side_effect(change_the_side_effect):
    assert await function_to_test() == "the changed side-effect"

To run, e.g. install cashews, pytest and pytest-asyncio into a virtualenv and run pytest -v test_cashews_disabled.py.

Here’s the result I get with 5.0.0:

(cashews_disabled_test) nils@makake:~/test/python/cashews_disabled_test> pip show cashews
Name: cashews
Version: 5.0.0
Summary: cache tools with async power
Home-page: https://github.com/Krukov/cashews/
Author: Dmitry Kryukov
Author-email: [email protected]
License: MIT
Location: /home/nils/.virtualenvs/cashews_disabled_test/lib/python3.11/site-packages
Requires: 
Required-by: 
(cashews_disabled_test) nils@makake:~/test/python/cashews_disabled_test> pytest -v test_cashews_disabled.py 
======================================== test session starts =========================================
platform linux -- Python 3.11.1, pytest-7.2.1, pluggy-1.0.0 -- /home/nils/.virtualenvs/cashews_disabled_test/bin/python
cachedir: .pytest_cache
rootdir: /home/nils/test/python/cashews_disabled_test
plugins: asyncio-0.20.3
asyncio: mode=Mode.STRICT
collected 3 items                                                                                    

test_cashews_disabled.py::test_unrelated PASSED                                                [ 33%]
test_cashews_disabled.py::test_with_original_side_effect PASSED                                [ 66%]
test_cashews_disabled.py::test_with_different_side_effect FAILED                               [100%]

============================================== FAILURES ==============================================
__________________________________ test_with_different_side_effect ___________________________________

change_the_side_effect = None

    @pytest.mark.asyncio
    async def test_with_different_side_effect(change_the_side_effect):
>       assert await function_to_test() == "the changed side-effect"
E       AssertionError: assert 'the side-effect' == 'the changed side-effect'
E         - the changed side-effect
E         ?    --------
E         + the side-effect

test_cashews_disabled.py:43: AssertionError
====================================== short test summary info =======================================
FAILED test_cashews_disabled.py::test_with_different_side_effect - AssertionError: assert 'the side-effect' == 'the changed side-effect'
==================================== 1 failed, 2 passed in 0.03s =====================================
(cashews_disabled_test) nils@makake:~/test/python/cashews_disabled_test>

And here's the same with 4.7.1:

(cashews_disabled_test) nils@makake:~/test/python/cashews_disabled_test> pip show cashews
Name: cashews
Version: 4.7.1
Summary: cache tools with async power
Home-page: https://github.com/Krukov/cashews/
Author: Dmitry Kryukov
Author-email: [email protected]
License: MIT
Location: /home/nils/.virtualenvs/cashews_disabled_test/lib/python3.11/site-packages
Requires: 
Required-by: 
(cashews_disabled_test) nils@makake:~/test/python/cashews_disabled_test> pytest -v test_cashews_disabled.py 
======================================== test session starts =========================================
platform linux -- Python 3.11.1, pytest-7.2.1, pluggy-1.0.0 -- /home/nils/.virtualenvs/cashews_disabled_test/bin/python
cachedir: .pytest_cache
rootdir: /home/nils/test/python/cashews_disabled_test
plugins: asyncio-0.20.3
asyncio: mode=Mode.STRICT
collected 3 items                                                                                    

test_cashews_disabled.py::test_unrelated PASSED                                                [ 33%]
test_cashews_disabled.py::test_with_original_side_effect PASSED                                [ 66%]
test_cashews_disabled.py::test_with_different_side_effect PASSED                               [100%]

========================================= 3 passed in 0.02s ==========================================
(cashews_disabled_test) nils@makake:~/test/python/cashews_disabled_test>
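
The mechanism behind this can be illustrated generically (this is not cashews itself, just a sketch of why memoizing a backend lookup on a long-lived singleton is problematic): once lru_cache memoizes the result, later setup() calls become invisible to the lookup.

```python
from functools import lru_cache

class Cache:
    def __init__(self):
        self._backend = None

    def setup(self, url):
        self._backend = url

    @lru_cache()  # memoized per (self, key): survives reconfiguration
    def _get_backend(self, key):
        return self._backend

c = Cache()
c.setup("mem://")
assert c._get_backend("k") == "mem://"

c.setup("other://")  # reconfigure, as fixtures do between tests
assert c._get_backend("k") == "mem://"  # stale memoized result
```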

Use "Faker" package in tests

https://faker.readthedocs.io/en/master/

It is not a good idea to hard-code specific values manually for every test. The first problem is that it is redundant manual work. The second is that your test works for the "foobar" input, but you do not know whether it will work for a "spameggs" or "abcdef" string.

Bad

def test_do_something():
    do_something("foo bar")  # Poor test coverage

Good

def test_do_something(faker: Faker):
    do_something(faker.pystr())

Good too

def test_do_something(faker: Faker):
    do_something("foo bar")  # It is critical to check exactly this input
    do_something(faker.pystr())

A general rule is to hard-code input values only when those specific values must be tested. E.g. it is fine to hard-code the "foo bar" input if you suspect the test may fail for that value and it is critical to cover it.

I have been using Faker for 1-2 years and I am happy with it.
There is also the https://hypothesis.readthedocs.io/en/latest/ package, which looks very promising, but I have no experience with it.

@Krukov What is your opinion on this?

`*args` support is broken: only strings are supported as `*args`

I've been using a fork (a 3.x cashews version), but today I decided to finally switch to your latest version, 4.2.1. After switching I see that some problems were introduced somewhere after the 3.x line.

Here is a reproducible code sample:

import asyncio

from cashews import cache


@cache(ttl="1s")
async def get_name(user, *args, version="v1", **kwargs):
    ...


asyncio.run(get_name("foo", 999.0, spam="eggs"))

Traceback:

Traceback (most recent call last):
  File "/demo/_local.py", line 12, in <module>
    asyncio.run(get_name("foo", 999.0, spam="eggs"))
  File "/usr/lib/python3.10/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "/usr/lib/python3.10/asyncio/base_events.py", line 646, in run_until_complete
    return future.result()
  File "/demo-vUAg5yd0-py3.10/lib/python3.10/site-packages/cashews/wrapper.py", line 248, in _call
    return await decorator(*args, **kwargs)
  File "/demo-vUAg5yd0-py3.10/lib/python3.10/site-packages/cashews/decorators/cache/simple.py", line 37, in _wrap
    _cache_key = get_cache_key(func, _key_template, args, kwargs)
  File "/demo-vUAg5yd0-py3.10/lib/python3.10/site-packages/cashews/key.py", line 69, in get_cache_key
    return template_to_pattern(_key_template, _formatter=default_formatter, **key_values)
  File "/demo-vUAg5yd0-py3.10/lib/python3.10/site-packages/cashews/formatter.py", line 124, in template_to_pattern
    return _formatter.format(template, **values)
  File "/usr/lib/python3.10/string.py", line 161, in format
    return self.vformat(format_string, args, kwargs)
  File "/usr/lib/python3.10/string.py", line 165, in vformat
    result, _ = self._vformat(format_string, args, kwargs, used_args, 2)
  File "/usr/lib/python3.10/string.py", line 218, in _vformat
    result.append(self.format_field(obj, format_spec))
  File "/demo-vUAg5yd0-py3.10/lib/python3.10/site-packages/cashews/formatter.py", line 94, in format_field
    value = super().format_field(value, format_spec if format_spec not in self._functions else "")
  File "/demo-vUAg5yd0-py3.10/lib/python3.10/site-packages/cashews/formatter.py", line 72, in format_field
    return format(self._format_field(value))
  File "/demo-vUAg5yd0-py3.10/lib/python3.10/site-packages/cashews/formatter.py", line 65, in _format_field
    return self.__type_format[_type](value)
  File "/demo-vUAg5yd0-py3.10/lib/python3.10/site-packages/cashews/formatter.py", line 18, in _decode_array
    return ":".join([format_value(value) for value in values])
TypeError: sequence item 0: expected str instance, float found

Process finished with exit code 1

To reproduce the error you can pass any non-string value.

Also, this bug shows a lack of test coverage for the new code.
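
The traceback's failure reduces to joining non-string items; converting each item to str first would avoid it:

```python
values = (999.0,)  # a non-string *args value, as in the example above

# str.join requires str items, matching the reported TypeError.
try:
    ":".join(values)
except TypeError as exc:
    assert "float" in str(exc)

# Converting each item first avoids the error:
assert ":".join(str(v) for v in values) == "999.0"
```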

Method get() doesn't deserialize data after set_raw()

I expect the get() method to deserialize raw pickled data stored in Redis, but it doesn't work in the case of set_raw().

Code example:

key = 'test:123'
value = [1, 2, 3]

# works fine
await cache.set(key, value)
print(await cache.get(key))
# [1, 2, 3]
print(await cache.get_raw(key))
# b':_\x80\x05\x95\x0b\x00\x00\x00\x00\x00\x00\x00]\x94(K\x01K\x02K\x03e.'

# returns raw data
value2 = pickle.dumps(value)
await cache.set_raw(key, value2)
print(await cache.get(key))
# b'\x80\x04\x95\x0b\x00\x00\x00\x00\x00\x00\x00]\x94(K\x01K\x02K\x03e.'

I see the stored value differs by the ":_" prefix - what does it mean?

get_many doesn't work with several backends

It seems that this method uses only one backend:

# create different backends - for redis and memory
cache.setup('mem://', prefix='mem')
cache.setup('redis://', prefix='redis')

# set values
await cache.set('mem:123', 'memory')
await cache.set('redis:123', 'redis')

# call get_many

await cache.get_many('mem:123', 'redis:123')
# returns ('memory', None)
await cache.get_many('redis:123', 'mem:123')
# returns ('redis', None)

Implement set_many function

I see that the package has get_many, which is very useful. However, for a use case like fetching 50 user objects, we also need to set many users in the cache at once. Can we implement it?

Decorator tip

@cache.list(key="user:{}", ttl="10m")
def get_users(ids: List[int]):
    ...
    return users

A decorator like this would be useful to cache multiple objects in one call. If you agree I can take a stab at pull request.
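
A possible shape for set_many, sketched on top of the existing per-key set() call (hypothetical helper; assumes a backend exposing `async set(key, value, expire=None)` as in other examples in this tracker):

```python
import asyncio

async def set_many(backend, pairs, expire=None):
    # Write all key/value pairs concurrently via the existing set() API.
    await asyncio.gather(
        *(backend.set(key, value, expire=expire) for key, value in pairs.items())
    )

class FakeBackend:
    """Dict-backed stand-in for a cache backend, for demonstration only."""
    def __init__(self):
        self.data = {}
    async def set(self, key, value, expire=None):
        self.data[key] = value

backend = FakeBackend()
asyncio.run(set_many(backend, {"u:1": "alice", "u:2": "bob"}))
assert backend.data == {"u:1": "alice", "u:2": "bob"}
```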

Method keys_match() returns different keys types for Redis and memory

The keys_match() method returns keys as bytes for the Redis cache and as str for the memory cache. Is it a bug or a feature?

Code example:

# setup two caches - for redis & memory
cache.setup(f'redis://{host}:{port}', password=pwd, prefix='test_redis')
cache.setup('mem://', prefix='test_mem')

# set values for memory and redis
await cache.set('test_redis:abc', 123)
await cache.set('test_mem:abc', 321)

# keys_match result
async def foo(mask):
  async for key in cache.keys_match(mask):
    print(key, type(key))

await foo('test_redis:*')
# b'test_redis:abc' <class 'bytes'>
await foo('test_mem:*')
# test_mem:abc <class 'str'>
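
Until the return types are unified, a small normalization helper can serve as a workaround (a sketch, not part of cashews):

```python
def as_str(key):
    # Normalize keys_match() results to str regardless of backend.
    return key.decode() if isinstance(key, (bytes, bytearray)) else key

assert as_str(b"test_redis:abc") == "test_redis:abc"
assert as_str("test_mem:abc") == "test_mem:abc"
```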

Update README with migration notes: simple decorator usage is broken after the introduction of `NotConfiguredError`

Broken example from the README (slightly modified to show that this code no longer works):

import asyncio
from cashews import cache

@cache(ttl=1)
async def long_running_function(foo):
    print("Hello")

asyncio.run(long_running_function(foo="bar"))

Error: cashews.exceptions.NotConfiguredError: run cache.setup(...) before using cache

Versions

cashews 5.0.0
Python 3.10

TODO

  • Fix decorators usage examples in the README
  • Explain migration to 5.0.0 for users who only need the low-level API or the @cache decorator.

Cache with performance condition

What if I want to cache function results only when the call takes more than 1 second (for example)? Currently we can use the decorator's condition parameter, but we would need to measure latency ourselves and return it in order to use it in the condition function.

Please make this simple.
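
As a workaround sketch (not a cashews feature; the name measure_latency is hypothetical), latency can be recorded on the wrapper so a condition function could read it afterwards:

```python
import asyncio
import time

def measure_latency(func):
    # Record how long each call took as an attribute on the wrapper.
    async def wrapper(*args, **kwargs):
        start = time.monotonic()
        result = await func(*args, **kwargs)
        wrapper.last_latency = time.monotonic() - start
        return result
    wrapper.last_latency = None
    return wrapper

@measure_latency
async def slow():
    await asyncio.sleep(0.05)
    return "done"

assert asyncio.run(slow()) == "done"
assert slow.last_latency >= 0.04  # roughly the sleep duration
```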

Case sensitive keys (do not convert silently all string arguments to lowercase by default)

I was surprised when I accidentally discovered that all strings are converted to lowercase when building a key.

import asyncio
import logging
from datetime import timedelta

from cashews import cache

logger = logging.getLogger(__name__)
logging.basicConfig(level="DEBUG")


async def logging_middleware(call, *args, backend=None, cmd=None, **kwargs):
    key = args[0] if args else kwargs.get("key", kwargs.get("pattern", ""))
    logger.info(f"{args}; {kwargs}")
    # logger.info("=> Cache request: %s ", cmd, extra={"command": cmd, "cache_key": key})
    return await call(*args, **kwargs)


cache.setup("mem://", middlewares=(logging_middleware, ))


@cache(ttl=timedelta(minutes=1))
async def get_name(user, version="v1"):
    return user


async def main():
    await get_name("Bob")
    result_1 = await get_name("Bob")
    result_2 = await get_name("bob")
    print(result_1)
    print(result_2)

    value = await cache.get("__main__:get_name:user:bob:version:v1")
    print(f"value: {value}")
    value = await cache.get("__main__:get_name:user:Bob:version:v1")
    print(f"value: {value}")


asyncio.run(main())

Question

  1. Is it a bug or a feature?
  2. Is there an easy way to disable lowercasing of strings when building a key? So if I pass my_string=FooBar, it will be saved as FooBar, not as foobar.

Error on decoding value with underscore

If the value contains "_", the sign check cannot be handled properly:

import asyncio
from cashews import cache


async def func():
    cache.setup('redis://0.0.0.0')

    print(await cache.set('key', 'value_1'))
    print(await cache.get('key'))


asyncio.run(func())

Redis:

1652905105.964530 [0 172.17.0.1:59130] "SET" "key" "\x80\x05\x95\x0b\x00\x00\x00\x00\x00\x00\x00\x8c\avalue_1\x94."
1652905105.966367 [0 172.17.0.1:59130] "GET" "key"
1652905105.968774 [0 172.17.0.1:59130] "UNLINK" "key"

Related to #61

auto strategy to cache

The idea is to have a decorator that decides what and how to cache based on statistics.
The statistics would be collected for a few (10-100) keys, sampled with a step (e.g. every 3rd call):

  • execution latency per key and average
  • call rate per key ( result deviation will include it)
  • calling parameters deviation/histogram and results crossings
  • results deviation/histogram per key (with time between result change deviation)

Based on high latency we can suggest using a cache; based on call rate and time between changes we can predict a time to live; based on calling-parameter deviation and its correlation with results we can guess the key template.

invalidate cache by key

What did I do:

@cache.invalidate("foo:client_id:{client_id}", args_map={"client_id": "client_id"})
async def bar(self, client_id):
    ...

What do I expect:

It works ! =)

What do I receive:

File ".../python3.9/site-packages/cashews/validation.py", line 40, in _wrap
    backend.delete_match(target.format({k: str(v) if v is not None else "" for k, v in _args.items()}))
KeyError: 'client_id'

What do I suggest:

backend.delete_match(target.format(**{k: str(v) if v is not None else "" for k, v in _args.items()}))
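
The suggested fix matches how str.format works: a dict passed positionally does not fill named fields, while unpacking it as keyword arguments does:

```python
target = "foo:client_id:{client_id}"
values = {"client_id": "42"}

# Passing the dict positionally leaves the named field unfilled -> KeyError,
# exactly as in the traceback above.
try:
    target.format(values)
except KeyError:
    pass

# Unpacking it as keyword arguments works:
assert target.format(**values) == "foo:client_id:42"
```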

Migration of aiocache RedLock

Hello,

thanks for this great, well-maintained library. I'm migrating my code from aiocache, which appears to be no longer maintained.

I just stumbled upon the implementation of the Redlock:

Former aiocache implementation:

async with RedLock(cache, key, lease=LEASE_LOCK_FUNC_CACHE):
    result = await cache.get(key)
    if result is not None:
        # Reset TTL on existing key
        await cache.expire(key, ttl=EXPIRE_FUNC_CACHE)
        # After a minimum waiting time resent answer to
        # repeating request
        logging.debug(
            '%s (%s): Cache hit, subsequently sending answer',
            subj_str, body)
        return bytes.fromhex(result)

    logging.debug('%s (%s): Cache missed, first call of message '
                  'handler', subj_str, body)
    result = await handler_class.answer(body)
    await cache.set(key, result.hex(), ttl=EXPIRE_FUNC_CACHE)
    return result

Comparable cashews implementation:

async with cache.lock(key + '-lock', expire=LEASE_LOCK_FUNC_CACHE):
    result = await cache.get(key)
    if result is not None:
        # Reset TTL on existing key
        await cache.expire(key, timeout=EXPIRE_FUNC_CACHE)
        # After a minimum waiting time resent answer to
        # repeating request
        logging.debug(
            '%s (%s): Cache hit, subsequently sending answer',
            subj_str, body)
        return bytes.fromhex(result)

    logging.debug('%s (%s): Cache missed, first call of message '
                  'handler', subj_str, body)
    result = await handler_class.answer(body)
    await cache.set(key, result.hex(), expire=EXPIRE_FUNC_CACHE)
    return result

Thus, for the key, which is assembled from the function name and its arguments for example, I need to add the suffix -lock to the lock key so it doesn't interfere with the key itself. The RedLock in aiocache added this suffix implicitly, see https://github.com/aio-libs/aiocache/blob/9c8b07fe759990dcb2d4d5f4e40d13d2cc36d58f/aiocache/lock.py#L68.

Most probably this changed behavior is intentional; I just wanted to ask whether you meant to implement it this way, and to point it out for anybody else migrating from aiocache.

UnSecureDataError when reading JSON value

Affects version: 3.3.0

I have the following key defining a stringified JSON as value stored in Redis:

$ redis-cli -h localhost -p 16379
localhost:16379> get packing:product_scan_220000000111
"{\"shipping_information_id\": 1, \"scanned_product\": {\"id\": 43, \"barcode\": \"220000000111\", \"dimensions\": [334, 36]}, \"added_products\": [], \"packing_auto\": true}"
localhost:16379> 

Using this code:

from cashews import cache

...
cache.setup(f"redis://{cfg.redis.host}:{cfg.redis.port}/0",
            password=redis_password,
            prefix="packing")

result = await cache.get("packing:product_scan_220000000111")

I get an UnSecureDataError:

Traceback (most recent call last):
  File "/home/automation/lib/python3.7/site-packages/aiorun.py", line 212, in new_coro
    await coro
  File "/home/automation/packing/application.py", line 136, in main
    ":product_scan_220000000111")
  File "/home/automation/packing/application.py", line 39, in cache_logging_middleware
    return await call(*args, **kwargs)
  File "/home/automation/lib/python3.7/site-packages/cashews/validation.py", line 63, in _invalidate_middleware
    return await call(*args, key=key, **kwargs)
  File "/home/automation/lib/python3.7/site-packages/cashews/wrapper.py", line 50, in _auto_init
    return await call(*args, **kwargs)
  File "/home/automation/lib/python3.7/site-packages/cashews/disable_control.py", line 12, in _is_disable_middleware
    return await call(*args, **kwargs)
  File "/home/automation/lib/python3.7/site-packages/cashews/serialize.py", line 35, in get
    return await self._get_value(await super().get(key), key, default=default)
  File "/home/automation/lib/python3.7/site-packages/cashews/serialize.py", line 39, in _get_value
    return self._process_value(value, key, default=default)
  File "/home/automation/lib/python3.7/site-packages/cashews/serialize.py", line 59, in _process_value
    raise UnSecureDataError()
cashews.serialize.UnSecureDataError

It seems the serializer parses the value and splits it on '_' for some check, but I completely miss the background for this.

Is this expected?
How can I read this value properly in cashews?

Typing support

There is a lack of type annotations in the library.

Expectation:
Almost all methods should have full type annotations; enable a mypy job to check for typing errors.

Check backend setup

Hi, I am currently trying to integrate a Redis cache into my application, but I am not quite sure how to set it up using cashews. For example, after starting a Redis server, in the terminal I get:
redis-cli 127.0.0.1:6379> ping PONG
I wonder what the correct format of cache.setup is when using cashews.cache.

Any feedback will be greatly appreciated.

Minor code clean up

  • Change from
    return await self._client.mget(keys[0], *keys[1:])
    to the
    return await self._client.mget(*keys)

  • Change from
    if isinstance(value, int) or value.isdigit():
       return int(value)
    to the
    if isinstance(value, int):
        return value
    if value.isdigit():
        return int(value)

  • Refactor the magic hidden conversion under the hood from seconds to milliseconds based only on the variable type:
           if isinstance(expire, float):
               pexpire = int(expire * 1000)
               expire = None
  • This seems to be unnecessary because on __del__ call self._client will be deleted too:

      def close(self):
         del self._client
         self._client = None
         self.__is_init = False
    
     __del__ = close

  • Using class none: pass seems redundant, because None can be pickled and unpickled directly. I do not understand the purpose of
          value = pickle.loads(value, fix_imports=False, encoding="bytes")

          if value is none:
              return None
    since you could remove the class none: pass entirely and just return value.
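
For the last point, the pickle round-trip itself indeed preserves None as-is:

```python
import pickle

# None survives a pickle round-trip, so a sentinel class is not needed
# just to represent it in the serialized payload.
assert pickle.loads(pickle.dumps(None)) is None
```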

fix: functions without 'return' are not cached

Reproducible code sample

import asyncio

from cashews import cache

cache.setup("mem://")


@cache(ttl="10m")
async def get():
    print("Start!")
    await asyncio.sleep(2)
    print("End!")
    # return "foobar"


async def func():
    await get()
    await get()


asyncio.run(func())

Expected behavior

Functions without a 'return' statement should be cached, the same way Python's functools.cache decorator handles them.
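
For comparison, functools.cache caches a None result; a minimal synchronous illustration:

```python
from functools import cache

calls = []

@cache
def get():
    calls.append("hit")
    # no return statement: implicitly returns None

assert get() is None
assert get() is None
assert len(calls) == 1  # the None result was cached after the first call
```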

docs: add a note that '@cache.locked' must be used in chain with '@cache'

This is confusing, because someone may think that the result will be cached:

@cache.locked(ttl="10m")
async def get(name):
    value = await api_call()
    return {"status": value}

I thought the value would be cached until I looked at the source code.

A clearer example:

@cache.locked(ttl="10m")
@cache(ttl="10m")
async def get(name):
    value = await api_call()
    return {"status": value}

I do not think anyone installs a caching package only to lock a function, so it's better to give a ready-to-use, non-confusing example. What do you think?

Also, someone may want to write a custom decorator to avoid copy/pasting the same ttl over multiple decorators. E.g. @cache(ttl="10m", lock=True).

lock performance

I'm benchmarking my multi-backend cache framework against several similar ones, including cashews. One benchmark is a concurrent call with stampede/thundering-herd/dog-piling protection. My original benchmark sends 200k requests at a concurrency level of 1000; my framework and aiocache's stampede finish in less than a minute, but cashews seems to hang forever. So I wrote a simpler one with asyncio.gather, but it raises an exception:
cashews.exceptions.LockedError: Key lock:__main__:foo:uid:5884 already locked

Is this the right way (set lock=True) to use cashews with thundering-herd protection? Simple benchmark code:

import random
import redis
import asyncio
from cashews import cache

cache.setup("redis://", max_connections=100, wait_for_connection_timeout=300)


@cache(ttl=None, lock=True)
async def foo(uid):
    await asyncio.sleep(0.1)
    return uid


async def bench():
    r = redis.Redis(host="localhost", port=6379)
    r.flushall()
    await asyncio.gather(*[foo(random.randint(0, 10000)) for _ in range(5000)])


asyncio.run(bench())

`client_side=True` in setup adds always prefix "cashews:" to key

When setting up a cache with client_side=True like this:

cache.setup(f"redis://{cfg.redis.endpoint}:{cfg.redis.port}/0",
            middlewares=(
                      add_prefix('my_wanted_prefix:'),
            ),
            password=redis_password,
            client_side=True,
            retry_on_timeout=True)

await cache.set("test", "testvalue")

the Redis Client shows:

$ redis-cli
127.0.0.1:6379> keys *
1) "cashews:my_wanted_prefix:test"

Without the option client_side=True the result is as expected:

$ redis-cli
127.0.0.1:6379> keys *
1) "my_wanted_prefix:test"

Why is this default prefix used here, and how can it be avoided?

Alias/tags for cache keys for invalidation

from cashews import cache

@cache(ttl="2h", tag="users")
async def get_users(space):
    ...
    

await cache.invalidate(tags=["users"], space="test")  # remove keys tagged "users" that belong to the "test" space
await cache.invalidate(tags=["users"])  # remove all keys with the "users" tag

Connecting Redis on Kubernetes with backend using cashews

Hi, thank you a lot for this repo, and I really appreciate it a lot. I have tried to use cashews with Redis on my local environment, and it works pretty well thanks to your previous assistance.

Currently, I have deployed both the Redis service and the application in separate Kubernetes containers, and I wonder if it is possible to use cashews in the backend to access the Redis server in the other container. Any feedback will be greatly appreciated.

Transaction for cache changes

Most web frameworks use databases with transactions: typically each request handler is wrapped in a transaction that is committed only when processing succeeds. Cashews has no transaction support, which can leave the cache inconsistent:

async def handler(request):
    async with db.transaction as tx:
        ...
        await tx.insert(...)
        await cache.set("key", "value")
        ...
        await api.set(...)  # on error the database transaction rolls back, but the cache write stays

The idea is to add a kind of transaction: a temporary state, available only inside an async context, that collects changes in memory and applies them only on an explicit commit call.
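A minimal sketch of that idea, buffering writes and flushing them to the backend only on commit. The class names and API are illustrative, not a spec for cashews:

```python
import asyncio

class DictBackend:
    """Stand-in for a real cache backend."""
    def __init__(self):
        self.data = {}

    async def set(self, key, value):
        self.data[key] = value

class CacheTransaction:
    """Buffer cache writes in memory; apply them only on commit()."""
    def __init__(self, backend):
        self._backend = backend
        self._buffer = []

    async def set(self, key, value):
        self._buffer.append((key, value))

    async def commit(self):
        for key, value in self._buffer:
            await self._backend.set(key, value)
        self._buffer.clear()

    def rollback(self):
        self._buffer.clear()

async def main():
    backend = DictBackend()
    tx = CacheTransaction(backend)
    await tx.set("key", "value")
    assert backend.data == {}  # nothing hits the backend before commit
    await tx.commit()
    return backend.data

data = asyncio.run(main())
print(data)  # {'key': 'value'}
```

A real implementation would also have to decide how reads inside the transaction see buffered-but-uncommitted writes.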

High increasing load on the CPU

Found on the master branch.

After updating our production deployment to the cashews master branch, the CPU load increased.

(screenshot: CPU load after the update)

Metrics before the update:

(screenshot: CPU load before the update)

I use:

cashews = {git = "https://github.com/Krukov/cashews.git", rev = "master"}
cache = Cache()
cache.setup(
    settings_redis.dsn,
    client_side=True,
    retry_on_timeout=True,
    hash_key=settings_redis.hash,
    pickle_type="sqlalchemy",
)

sqlalchemy object caching issue

Greetings! I ran into issues when caching sqlalchemy objects. From time to time I get errors like this:

  File "/home/boot/stellar/backend/./app/v1/security/auth.py", line 127, in get_current_user
    permissions_v2 = await service_permissions.get_permissions_by_role_cache(
  File "/home/boot/stellar/backend/.venv/lib/python3.10/site-packages/cashews/wrapper.py", line 272, in _call
    return await decorator(*args, **kwargs)
  File "/home/boot/stellar/backend/.venv/lib/python3.10/site-packages/cashews/decorators/cache/simple.py", line 43, in _wrap
    await backend.set(_cache_key, result, expire=ttl)
  File "/home/boot/stellar/backend/.venv/lib/python3.10/site-packages/cashews/validation.py", line 62, in _invalidate_middleware
    return await call(*args, key=key, **kwargs)
  File "/home/boot/stellar/backend/.venv/lib/python3.10/site-packages/cashews/wrapper.py", line 34, in _auto_init
    return await call(*args, **kwargs)
  File "/home/boot/stellar/backend/.venv/lib/python3.10/site-packages/cashews/disable_control.py", line 12, in _is_disable_middleware
    return await call(*args, **kwargs)
  File "/home/boot/stellar/backend/.venv/lib/python3.10/site-packages/cashews/backends/client_side.py", line 132, in set
    return await super().set(self._prefix + key, value, *args, **kwargs)
  File "/home/boot/stellar/backend/.venv/lib/python3.10/site-packages/cashews/serialize.py", line 107, in set
    value = pickle.dumps(value, protocol=pickle.HIGHEST_PROTOCOL, fix_imports=False)
_pickle.PicklingError: Can't pickle <function __init__ at 0x7fc386fbb7f0>: it's not the same object as sqlalchemy.orm.instrumentation.__init__

I found that such objects should be serialized and deserialized differently:
https://docs.sqlalchemy.org/en/14/core/serializer.html

How good or bad an idea is it to cache sqlalchemy objects rather than the endpoint response itself?

If it is not a bad idea, what do you think about adding functionality that could fix this error?
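One common workaround is to cache plain builtins instead of ORM instances, so pickling never touches sqlalchemy internals. A sketch, where the `Permission` model and its columns are hypothetical stand-ins:

```python
# Cache plain dicts, not ORM instances: only builtin types get pickled.
def to_dict(obj, fields):
    return {name: getattr(obj, name) for name in fields}

class Permission:  # stand-in for a hypothetical ORM model
    def __init__(self, id, name):
        self.id = id
        self.name = name

row = Permission(1, "read")
permissions = to_dict(row, ("id", "name"))
print(permissions)  # {'id': 1, 'name': 'read'}
```

The cached dicts can then be returned directly or used to rebuild lightweight objects, avoiding pickling of instrumented ORM state entirely.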

futurecoder

Hi, sorry if this is a bit weird, wasn't sure how else to reach you.

I saw that you liked several of my projects on GitHub (birdseye, snoop, heartrate, sorcery) so I thought you might be interested by my latest and most ambitious project: https://futurecoder.io/

It uses several of my libraries, including birdseye and snoop.

'cache.incr' and 'cache.keys_match' API commands examples from documentation do not work

Copy/pasted from the documentation:

import asyncio
from cashews import cache

cache.setup("mem://")  # configure as in-memory cache

async def func():
    await cache.set(key="key", value={"any": True}, expire=60, exist=None)  # -> bool
    await cache.get("key")  # -> Any
    await cache.get_many("key1", "key2")
    await cache.incr("key") # -> int
    await cache.delete("key")
    await cache.delete_match("pattern:*")
    await cache.keys_match("pattern:*") # -> List[str]
    await cache.expire("key", timeout=10)
    await cache.get_expire("key")  # -> int seconds to expire
    await cache.ping(message=None)  # -> bytes
    await cache.clear()
    await cache.is_locked("key", wait=60)  # -> bool
    async with cache.lock("key", expire=10):
       ...
    await cache.set_lock("key", value="value", expire=60)  # -> bool
    await cache.unlock("key", "value")  # -> bool


asyncio.run(func())

Run result:

Traceback (most recent call last):
  File "/local.py", line 46, in <module>
    asyncio.run(func())
  File "/usr/local/lib/python3.10/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "/usr/local/lib/python3.10/asyncio/base_events.py", line 641, in run_until_complete
    return future.result()
  File "/local.py", line 31, in func
    await cache.incr("key") # -> int
  File "/venv/lib/python3.10/site-packages/cashews/validation.py", line 63, in _invalidate_middleware
    return await call(*args, key=key, **kwargs)
  File "/venv/lib/python3.10/site-packages/cashews/wrapper.py", line 45, in _auto_init
    return await call(*args, **kwargs)
  File "/venv/lib/python3.10/site-packages/cashews/disable_control.py", line 12, in _is_disable_middleware
    return await call(*args, **kwargs)
  File "/venv/lib/python3.10/site-packages/cashews/backends/memory.py", line 75, in incr
    value = int(self._get(key, 0)) + 1
TypeError: int() argument must be a string, a bytes-like object or a real number, not 'dict'

Process finished with exit code 1

If I comment out `await cache.incr("key")` to skip the problematic command:

Traceback (most recent call last):
  File "/_local.py", line 46, in <module>
    asyncio.run(func())
  File "/usr/local/lib/python3.10/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "/usr/local/lib/python3.10/asyncio/base_events.py", line 641, in run_until_complete
    return future.result()
  File "/_local.py", line 34, in func
    await cache.keys_match("pattern:*") # -> List[str]
  File "/venv/lib/python3.10/site-packages/cashews/validation.py", line 62, in _invalidate_middleware
    return await call(*args, **kwargs)
  File "/venv/lib/python3.10/site-packages/cashews/wrapper.py", line 45, in _auto_init
    return await call(*args, **kwargs)
  File "/venv/lib/python3.10/site-packages/cashews/disable_control.py", line 12, in _is_disable_middleware
    return await call(*args, **kwargs)
TypeError: object async_generator can't be used in 'await' expression

Process finished with exit code 1

I do not know how to fix `await cache.keys_match("pattern:*")`, because `[key async for key in cache.keys_match("pattern:*")]` does not work either.
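Note that the incr failure itself is arguably by design rather than a bug: like Redis INCR, it only works when the key is missing or holds an integer, and the doc snippet stores a dict under "key" first. A simplified sketch of the logic shown in the traceback (not the actual cashews source):

```python
# Simplified incr: it coerces the stored value with int(), so a dict
# stored under the key raises TypeError, as in the traceback above.
store = {}

def incr(key):
    value = int(store.get(key, 0)) + 1
    store[key] = value
    return value

incr("counter")
incr("counter")
print(store["counter"])  # 2

store["key"] = {"any": True}
try:
    incr("key")
except TypeError:
    print("incr on a dict value fails")
```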

size and key of in memory cache

Hi,

I wonder what the unit of the in-memory cache size is. For example, I used cache.setup("mem://"), but when I called the function three times with different arguments, it seems that only the results for the previous two arguments are stored.

In addition, I don't pass any key argument to the cache decorator, and I wonder whether cashews hashes the function name and the arguments to build the key.

Any feedback will be greatly appreciated.
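For what it's worth, the in-memory backend appears to accept its limits via the DSN query string, where the size is an entry count rather than bytes. The parameter names below follow the README and may differ by version:

```python
from cashews import cache

# size: maximum number of stored entries (not bytes);
# check_interval: how often, in seconds, expired entries are purged.
cache.setup("mem://?size=1000&check_interval=10")
```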

fix: passed 'args' and 'kwargs' are not considered when building a cache key

If I am not wrong, this is a serious bug.

Problem

A function is cached always and only once, regardless of passed arguments.

import asyncio

from cashews import cache

cache.setup("mem://")


@cache(ttl="3h")
async def func(*args, **kwargs):
    print(f"{args}; {kwargs}")
    return args, kwargs


async def main():
    await func("foo")
    await func("bar")
    await func("spam")
    await func("eggs")

asyncio.run(main())

Result:

('foo',); {}

Expected result:

The function must be called on every unique args and kwargs.
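A plausible explanation, sketched below as an illustration of the mechanism rather than the actual cashews source: if the default key template is built from the signature's named parameters, a bare `(*args, **kwargs)` signature yields a constant key, so every call hits the same cache entry.

```python
import inspect

# A key template built only from named parameters has nothing to
# interpolate for *args/**kwargs, so every call maps to the same key.
def default_key_template(func):
    names = [
        name for name, p in inspect.signature(func).parameters.items()
        if p.kind not in (p.VAR_POSITIONAL, p.VAR_KEYWORD)
    ]
    return func.__name__ + "".join(":{%s}" % name for name in names)

async def func(*args, **kwargs): ...
async def get_user(user_id): ...

print(default_key_template(func))      # 'func' -- constant for every call
print(default_key_template(get_user))  # 'get_user:{user_id}'
```

Declaring named parameters (or passing an explicit key template to the decorator) would sidestep the collision until the bug is fixed.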

Cache `bytes` more efficient

await cache.set("key", b"test")

>>>redis: "SET" "test:key" "\x80\x05\x95\b\x00\x00\x00\x00\x00\x00\x00C\x04test\x94."

Should be

>>>redis: "SET" "test:key" "test"
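The overhead is easy to measure: a bytes payload survives a pickle round trip unchanged, so a backend could detect bytes values and store them raw, skipping pickle entirely.

```python
import pickle

raw = b"test"
pickled = pickle.dumps(raw, protocol=pickle.HIGHEST_PROTOCOL)
# The pickle frame adds framing overhead to a tiny 4-byte payload.
print(len(raw), len(pickled))
assert pickle.loads(pickled) == raw  # round trip is lossless either way
```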
