explosion / preshed

💥 Cython hash tables that assume keys are pre-hashed

License: MIT License

Python 24.28% C 18.26% Shell 0.82% Cython 56.64%
cython hashing hash-table hash-tables python

preshed's People

Contributors

adrianeboyd, danieldk, henningpeters, honnibal, ines, jtmoulia, polm, svlandeg, syllog1sm

preshed's Issues

pip install preshed fails for official python:3.7 docker container

Hi, I'm trying to install spaCy in the official python:3.7 Docker container, but I'm getting an error: Failed building wheel for cymem.

Here's the command and the full log:

docker run -it --rm python:3.7 bash -c 'pip install preshed'

Note that running this command using the python:3.6 container works fine.

Let me know if any additional information would be useful!

preshed 3.0.5 began failing pip installation

Preshed is a nested dependency in my project, pulled in via spaCy. My builds just picked up version 3.0.5 and have begun failing with the following error. When I pin back to 3.0.4, pip install works again.

Was there an expected breaking change in 3.0.5 that I am missing?

2020-12-09T19:24:59.6110019Z Collecting preshed==3.0.5
2020-12-09T19:24:59.6210606Z Downloading preshed-3.0.5.tar.gz (14 kB)
2020-12-09T19:24:59.6399340Z Preparing wheel metadata: started
2020-12-09T19:25:00.1628380Z Preparing wheel metadata: finished with status 'error'
2020-12-09T19:25:00.1645557Z ERROR: Command errored out with exit status 1:
2020-12-09T19:25:00.1646588Z command: /usr/bin/python3.7 /usr/local/lib/python3.7/dist-packages/pip/_vendor/pep517/_in_process.py prepare_metadata_for_build_wheel /tmp/tmpf5g9mv31
2020-12-09T19:25:00.1653575Z cwd: /tmp/pip-download-0u0d7ve0/preshed
2020-12-09T19:25:00.1654277Z Complete output (530 lines):
2020-12-09T19:25:00.1654674Z
2020-12-09T19:25:00.1655035Z Error compiling Cython file:
2020-12-09T19:25:00.1656451Z ------------------------------------------------------------
2020-12-09T19:25:00.1657242Z ...
2020-12-09T19:25:00.1657641Z from libc.stdint cimport uint64_t, uint32_t
2020-12-09T19:25:00.1658129Z from cymem.cymem cimport Pool
2020-12-09T19:25:00.1662967Z ^
2020-12-09T19:25:00.1664461Z ------------------------------------------------------------
2020-12-09T19:25:00.1665029Z
2020-12-09T19:25:00.1665822Z preshed/bloom.pxd:2:0: 'cymem/cymem.pxd' not found
2020-12-09T19:25:00.1666259Z

readthedocs build fails because of cymem version mismatch

Preshed is a dependency of textacy by way of spaCy. Unfortunately there seems to be a mismatch in cymem versions that prevents readthedocs from compiling the documentation for my package. The build process returns error: cymem 1.31.0 is installed but cymem<1.31.0,>=1.30 is required by {'preshed'}. Looks like spaCy requires cymem>=1.30,<1.32, which I guess is where this mismatch arises. Do you have any recommendations?
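
The conflict can be checked by hand. The sketch below compares the installed version against the two ranges quoted above. It uses a deliberately simplified comparison (plain integer tuples, only a lower and an upper bound) and is not how pip resolves requirements; the helper names parse and satisfies are made up for illustration.

```python
def parse(version):
    """Turn '1.31.0' into (1, 31, 0); shorter versions are zero-padded."""
    parts = [int(p) for p in version.split(".")]
    return tuple(parts + [0] * (3 - len(parts)))


def satisfies(version, lower, upper):
    """True if lower <= version < upper."""
    return parse(lower) <= parse(version) < parse(upper)


installed = "1.31.0"
# Range reported for preshed in the error message: cymem>=1.30,<1.31.0
print(satisfies(installed, "1.30", "1.31.0"))  # False
# Range quoted for spaCy: cymem>=1.30,<1.32
print(satisfies(installed, "1.30", "1.32"))    # True
```

Under these assumptions, cymem 1.31.0 is acceptable to spaCy but sits exactly on the excluded upper bound of preshed's range, so only a version in the 1.30 series would satisfy both.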

PreshMap hangs on small numbers of paired insertions and deletions

The first test below hangs on the getitem in the assert; the second test passes. The first test hangs as soon as the range is at least 10. (I don't know whether other small numbers of paired insertions and deletions, such as adding and removing two at a time, cause the same problem, but repeatedly adding one item and then removing it is exactly what happens in the tokenizer cache when special cases are added.)

My first thought was a resizing bug, but a quick (possibly flawed) test suggested that resizing is not the cause, so I don't know what's going on.

Test cases written as new pytest tests:

from ..maps import PreshMap


def test_one_and_empty():
    table = PreshMap()
    for i in range(10):
        table[i] = i
        del table[i]
    assert table[0] is None


def test_many_and_empty():
    table = PreshMap()
    for i in range(10):
        table[i] = i
    for i in range(10):
        del table[i]
    assert table[0] is None
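
One way such a hang can arise in an open-addressing table, offered as a guess and not as a confirmed diagnosis of PreshMap, is tombstone build-up: if deletions leave markers that are never reclaimed and a lookup only stops at a never-used slot, a table full of tombstones gives the probe loop nothing to stop on. The toy class below is hypothetical and unrelated to preshed's actual code; its capacity and the number of pairs needed are not meant to match the 10 reported above, and it bounds the probe loop so that it reports the condition instead of hanging.

```python
EMPTY = object()    # slot that has never been used
DELETED = object()  # tombstone left behind by a deletion


class ToyMap:
    """Fixed-size linear-probing table; a toy, not preshed's implementation."""

    def __init__(self, capacity=8):
        self.keys = [EMPTY] * capacity
        self.values = [None] * capacity

    def __setitem__(self, key, value):
        n = len(self.keys)
        for step in range(n):
            i = (key + step) % n
            # Only a never-used slot or a matching key is accepted, so
            # tombstones are not reused and keep accumulating.
            if self.keys[i] is EMPTY or self.keys[i] == key:
                self.keys[i], self.values[i] = key, value
                return
        raise RuntimeError("no EMPTY slot left")

    def __delitem__(self, key):
        n = len(self.keys)
        for step in range(n):
            i = (key + step) % n
            if self.keys[i] is EMPTY:
                return  # key was never stored
            if self.keys[i] == key:
                self.keys[i], self.values[i] = DELETED, None
                return

    def probes_for_missing(self, key):
        """Probes needed before an EMPTY slot ends the search, or None if there is none."""
        n = len(self.keys)
        for step in range(n):
            if self.keys[(key + step) % n] is EMPTY:
                return step + 1
        return None  # an unbounded probe loop would spin forever here


fresh = ToyMap(capacity=8)
print(fresh.probes_for_missing(0))  # 1

table = ToyMap(capacity=8)
for i in range(8):
    table[i] = i
    del table[i]
print(table.probes_for_missing(0))  # None: every slot is a tombstone
```

In this toy, the table holds no live entries after the loop, yet a lookup for a missing key can no longer terminate on its own. Counting tombstones toward the resize threshold, or reusing them on insertion, are the usual ways to avoid this.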

Missing git tag for 1.0.0 release

The latest PyPI release is v1.0.0, but there is no corresponding git tag. It would be great to have such a tag for reference purposes.

PreshMap cannot contain 0 as a value

Is it a feature or a bug that a PreshMap cannot contain 0 as a value?

Consider this code:

self._alias_index = PreshMap()
self._alias_index[342] = 1
if 342 in self._alias_index:
    print("yay")
else:
    print("not so yay")

which prints yay

self._alias_index = PreshMap()
self._alias_index[342] = 0
if 342 in self._alias_index:
    print("yay")
else:
    print("not so yay")

which prints not so yay
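
A possible explanation, assumed here and not confirmed from the source, is that values are stored as pointer-sized integers and a null value doubles as the "not found" signal, which would make a stored 0 indistinguishable from a missing key. The class below is a hypothetical pure-Python imitation of that design, not PreshMap itself.

```python
class NullSentinelMap:
    """Toy map where a raw value of 0 doubles as 'not found' (hypothetical)."""

    def __init__(self):
        self._cells = {}

    def _raw_get(self, key):
        # Mimics a C-style lookup that returns 0 (NULL) for a missing key.
        return self._cells.get(key, 0)

    def __setitem__(self, key, value):
        self._cells[key] = value

    def __contains__(self, key):
        # Membership is inferred from the value, so a stored 0 looks absent.
        return self._raw_get(key) != 0


m = NullSentinelMap()
m[342] = 1
print(342 in m)  # True
m[342] = 0
print(342 in m)  # False, although the key was stored
```

If this is what happens, storing value + 1 and subtracting 1 on retrieval is a workaround on the caller's side.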

Tests fail to run: No module named 'preshed.bloom'

cd /usr/ports/devel/py-preshed/work-py39/preshed-4.0.0 && /usr/bin/env XDG_DATA_HOME=/usr/ports/devel/py-preshed/work-py39  XDG_CONFIG_HOME=/usr/ports/devel/py-preshed/work-py39  XDG_CACHE_HOME=/usr/ports/devel/py-preshed/work-py39/.cache  HOME=/usr/ports/devel/py-preshed/work-py39 PATH=/usr/local/libexec/ccache:/usr/ports/devel/py-preshed/work-py39/.bin:/home/yuri/bin:/sbin:/bin:/usr/sbin:/usr/bin:/usr/local/sbin:/usr/local/bin PKG_CONFIG_LIBDIR=/usr/ports/devel/py-preshed/work-py39/.pkgconfig:/usr/local/libdata/pkgconfig:/usr/local/share/pkgconfig:/usr/libdata/pkgconfig MK_DEBUG_FILES=no MK_KERNEL_SYMBOLS=no SHELL=/bin/sh NO_LINT=YES PREFIX=/usr/local  LOCALBASE=/usr/local  CC="cc" CFLAGS="-O2 -pipe  -fstack-protector-strong -fno-strict-aliasing "  CPP="cpp" CPPFLAGS=""  LDFLAGS=" -fstack-protector-strong " LIBS=""  CXX="c++" CXXFLAGS="-O2 -pipe -fstack-protector-strong -fno-strict-aliasing  "  MANPREFIX="/usr/local" CCACHE_DIR="/tmp/.ccache" BSD_INSTALL_PROGRAM="install  -s -m 555"  BSD_INSTALL_LIB="install  -s -m 0644"  BSD_INSTALL_SCRIPT="install  -m 555"  BSD_INSTALL_DATA="install  -m 0644"  BSD_INSTALL_MAN="install  -m 444" PYTHONPATH=/usr/ports/devel/py-preshed/work-py39/stage/usr/local/lib/python3.9/site-packages /usr/local/bin/python3.9 -m pytest
========================================================================================== test session starts ==========================================================================================
platform freebsd13 -- Python 3.9.16, pytest-7.2.2, pluggy-1.0.0
Using --randomly-seed=2340475823
rootdir: /usr/ports/devel/py-preshed/work-py39/preshed-4.0.0
plugins: forked-1.6.0, hypothesis-6.72.0, cov-2.9.0, mypy-plugins-1.10.1, randomly-3.12.0, timeout-2.1.0, rerunfailures-11.1.2, flaky-3.7.0, xdist-2.5.0, env-0.6.2, mock-3.10.0
collected 0 items / 4 errors                                                                                                                                                                            

================================================================================================ ERRORS =================================================================================================
_____________________________________________________________________________ ERROR collecting preshed/tests/test_bloom.py ______________________________________________________________________________
ImportError while importing test module '/usr/ports/devel/py-preshed/work-py39/preshed-4.0.0/preshed/tests/test_bloom.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
/usr/local/lib/python3.9/importlib/__init__.py:127: in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
preshed/tests/test_bloom.py:5: in <module>
    from preshed.bloom import BloomFilter
E   ModuleNotFoundError: No module named 'preshed.bloom'
____________________________________________________________________________ ERROR collecting preshed/tests/test_counter.py _____________________________________________________________________________
ImportError while importing test module '/usr/ports/devel/py-preshed/work-py39/preshed-4.0.0/preshed/tests/test_counter.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
/usr/local/lib/python3.9/importlib/__init__.py:127: in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
preshed/tests/test_counter.py:4: in <module>
    from preshed.counter import PreshCounter
E   ModuleNotFoundError: No module named 'preshed.counter'
____________________________________________________________________________ ERROR collecting preshed/tests/test_hashing.py _____________________________________________________________________________
ImportError while importing test module '/usr/ports/devel/py-preshed/work-py39/preshed-4.0.0/preshed/tests/test_hashing.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
/usr/local/lib/python3.9/importlib/__init__.py:127: in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
preshed/tests/test_hashing.py:3: in <module>
    from preshed.maps import PreshMap
E   ModuleNotFoundError: No module named 'preshed.maps'
______________________________________________________________________________ ERROR collecting preshed/tests/test_pop.py _______________________________________________________________________________
ImportError while importing test module '/usr/ports/devel/py-preshed/work-py39/preshed-4.0.0/preshed/tests/test_pop.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
/usr/local/lib/python3.9/importlib/__init__.py:127: in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
preshed/tests/test_pop.py:1: in <module>
    from ..maps import PreshMap
E   ModuleNotFoundError: No module named 'preshed.maps'
======================================================================================== short test summary info ========================================================================================
ERROR preshed/tests/test_bloom.py
ERROR preshed/tests/test_counter.py
ERROR preshed/tests/test_hashing.py
ERROR preshed/tests/test_pop.py
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! Interrupted: 4 errors during collection !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
=========================================================================================== 4 errors in 0.32s ===========================================================================================
*** Error code 2

Python-3.9
FreeBSD 13.2
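
All four errors share one symptom: the preshed package itself is importable, but its compiled submodules are not. One possible explanation, which the log does not confirm, is that running python -m pytest from the source directory puts that directory first on sys.path, so the unbuilt preshed/ tree shadows the staged install listed in PYTHONPATH. A small diagnostic such as the sketch below (the helper name locate is made up) shows which file each module would be loaded from.

```python
import importlib.util


def locate(module_name):
    """Return the file a module would be loaded from, or None if it cannot be found."""
    try:
        spec = importlib.util.find_spec(module_name)
    except ImportError:
        # Raised when a parent package (here it would be preshed) cannot be imported.
        return None
    return spec.origin if spec is not None else None


for name in ("preshed", "preshed.bloom", "preshed.maps"):
    print(name, "->", locate(name))
```

If preshed resolves to the source checkout while preshed.bloom resolves to nothing, running the tests from another directory, for example with pytest --pyargs preshed, or building the extensions in place would be worth trying.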

conda skeleton build fails

Building conda packages via conda skeleton pypi preshed fails with the following error:

Traceback (most recent call last):
  File "setup.py", line 107, in <module>
    main(MOD_NAMES, use_cython)
  File "setup.py", line 100, in main
    run_setup(exts)
  File "setup.py", line 88, in run_setup
    import headers_workaround
ImportError: No module named 'headers_workaround'

Tests fail on s390x

I am building preshed v4.0.0 on Alpine Linux, and the following tests fail on the s390x architecture (the tests pass on all other architectures) [1]:

============================= test session starts ==============================
platform linux -- Python 3.11.3, pytest-7.3.1, pluggy-1.0.0
rootdir: /builds/otlabs/aports/testing/py3-preshed/src/preshed-4.0.0
collected 20 items
preshed/tests/test_bloom.py .....FF.                                     [ 40%]
preshed/tests/test_counter.py ....                                       [ 60%]
preshed/tests/test_hashing.py .......                                    [ 95%]
preshed/tests/test_pop.py .                                              [100%]
=================================== FAILURES ===================================
_________________________ test_bloom_from_bytes_legacy _________________________
    def test_bloom_from_bytes_legacy():
        # This is the output from the tests in the legacy format
        data = "0200000000000000600000000000000000000000000000003300000000000000e100000000000000b200000000000000da00000000000000e700000000000000e600000000000000ff000000000000004700000000000000e7000000000000004c000000000000003b00000000000000f700000000000000"
        data = bytes.fromhex(data)
        bf = BloomFilter().from_bytes(data)
        for ii in range(0, 1000, 20):
>           assert ii in bf
E           assert 0 in <preshed.bloom.BloomFilter object at 0x3ff8f39b650>
preshed/tests/test_bloom.py:70: AssertionError
_____________________ test_bloom_from_bytes_legacy_windows _____________________
    def test_bloom_from_bytes_legacy_windows():
        # This is the output from the tests in the legacy Windows format.
        # This is the same as the data in the normal test, but missing the second
        # half of each container.
    
        data = "02000000600000000000000033000000e1000000b2000000da000000e7000000e6000000ff00000047000000e70000004c0000003b000000f7000000"
        data = bytes.fromhex(data)
        bf = BloomFilter().from_bytes(data)
        for ii in range(0, 1000, 20):
>           assert ii in bf
E           assert 0 in <preshed.bloom.BloomFilter object at 0x3ff8f39b730>
preshed/tests/test_bloom.py:87: AssertionError
=========================== short test summary info ============================
FAILED preshed/tests/test_bloom.py::test_bloom_from_bytes_legacy - assert 0 in <preshed.bloom.BloomFilter object at 0x3ff8f39b650>
FAILED preshed/tests/test_bloom.py::test_bloom_from_bytes_legacy_windows - assert 0 in <preshed.bloom.BloomFilter object at 0x3ff8f39b730>
========================= 2 failed, 18 passed in 0.06s =========================
>>> ERROR: py3-preshed: check failed

[1] https://gitlab.alpinelinux.org/otlabs/aports/-/jobs/1021066#L280
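
s390x is a big-endian architecture, and the legacy test data looks like a sequence of little-endian 64-bit integers. Whether from_bytes reads the buffer in native byte order is an assumption here, but if it does, the same bytes decode to very different values on s390x. The sketch below decodes the first eight bytes of the quoted data both ways using only the standard struct module.

```python
import struct

# First eight bytes of the legacy test data quoted above.
first_field = bytes.fromhex("0200000000000000")

little = struct.unpack("<Q", first_field)[0]  # explicit little-endian
big = struct.unpack(">Q", first_field)[0]     # explicit big-endian

print(little)  # 2
print(big)     # 144115188075855872, i.e. 2 << 56
```

A serialization format that fixes the byte order explicitly, as the "<Q" format does here, reads the same on every architecture.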

preshed.maps wheel

Hi,

I am trying to install spaCy from a .whl, and one of its requirements (preshed) is causing a lot of trouble.

I downloaded the preshed wheel from here, but installing it fails: building "preshed.maps" fails because I have no C++ compiler installed.

Are you able to provide a .whl for preshed.maps?

Vlad

Cflag ERROR when installing

The debug output is below. Would it be possible to comment out the CFLAGS handling in setup.py?

Using cached preshed-0.41.tar.gz
Complete output from command python setup.py egg_info:
Traceback (most recent call last):
.....\setup.py",line 22, in rm_cflag
cflags = distutils.sysconfig._config_vars['CFLAGS']
KeyError: 'CFLAGS'
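
The KeyError suggests that the build configuration on this machine has no CFLAGS entry at all, which can happen on platforms whose Python build does not record one. A tolerant version of such a helper could look like the sketch below. It is an illustration only: the function name remove_cflag, its signature, and the example flag are assumptions, not the code in preshed's setup.py.

```python
import sysconfig


def remove_cflag(config_vars, flag):
    """Drop flag from config_vars['CFLAGS'], tolerating a missing key."""
    cflags = config_vars.get("CFLAGS")
    if cflags is None:
        return config_vars  # nothing to strip on this platform
    kept = [f for f in cflags.split() if f != flag]
    config_vars["CFLAGS"] = " ".join(kept)
    return config_vars


# Works whether or not the key is present.
print(remove_cflag({}, "-fno-strict-aliasing"))
print(remove_cflag({"CFLAGS": "-O2 -fno-strict-aliasing"}, "-fno-strict-aliasing"))

# Applied to a copy of the interpreter's real build configuration.
cleaned = remove_cflag(dict(sysconfig.get_config_vars()), "-fno-strict-aliasing")
```

Using dict.get instead of indexing is the only change that matters here; the rest is scaffolding to make the example runnable.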

Wheel support for linux aarch64[arm64]

Summary
Installing preshed on aarch64 with "pip3 install preshed" builds the wheel from source.

Problem description
preshed has no aarch64 wheel on PyPI, so pip builds one from source on that platform, which makes installation slower. Publishing an aarch64 wheel would shorten installation time for aarch64 users.

Expected Output
pip should be able to download a preshed wheel from PyPI rather than building it from source.

@preshed-team, please let me know if I can help with building the wheel or uploading it to PyPI. I would like to make a preshed wheel available for aarch64, and it would be a great opportunity for me to work with you.
