rapidfuzz / levenshtein

The Levenshtein Python C extension module contains functions for fast computation of Levenshtein distance and string similarity

Home Page: https://rapidfuzz.github.io/Levenshtein

License: GNU General Public License v2.0

Python 36.11% Makefile 0.77% Batchfile 0.96% Cython 10.21% CMake 3.81% C++ 47.64% Shell 0.49%
python levenshtein levenshtein-distance string-matching string-similarity string-comparison hacktoberfest

levenshtein's Introduction

Levenshtein

Introduction

The Levenshtein Python C extension module contains functions for fast computation of:

  • Levenshtein (edit) distance, and edit operations
  • string similarity
  • approximate median strings, and generally string averaging
  • string sequence and set similarity

Requirements

  • Python 3.8 or later

Installation

pip install levenshtein

Documentation

The documentation for the current version can be found at https://rapidfuzz.github.io/Levenshtein/

Support the project

If you are using Levenshtein for your work and feel like giving a bit of your own benefit back to support the project, consider sending us money through GitHub Sponsors or PayPal that we can use to buy us free time for the maintenance of this great library, to fix bugs in the software, review and integrate code contributions, to improve its features and documentation, or to just take a deep breath and have a cup of tea every once in a while. Thank you for your support.

License

Levenshtein is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.

See the file COPYING for the full text of GNU General Public License version 2.

levenshtein's People

Contributors

antoinetavant, atomflunder, datguy1, felixonmars, friday, guyrosin, joncasdam, kerstin, maxbachmann, mgorny, miohtama, ojomio, sandrotosi, stromnov, timworx, wor, ztane

levenshtein's Issues

PyUnicode_AS_UNICODE might return NULL

PyUnicode_AS_UNICODE is used without error checking in many places:

    string1 = PyUnicode_AS_UNICODE(arg1);
    string2 = PyUnicode_AS_UNICODE(arg2);
    dist = lev_u_hamming_distance(len1, string1, string2);

However, since Python 3.3 PyUnicode_AS_UNICODE usually has to copy the string, and that copy can fail. In this case the current implementation runs into a segmentation fault, because it dereferences a null pointer inside the string similarity algorithm.

Besides this, PyUnicode_AS_UNICODE has been deprecated since Python 3.3 and is relatively slow because of the copy.

Missing stubs for some functions

Hello, I recently tried to upgrade Levenshtein to version 0.19.1 and noticed that using some functions raises an error in the IDE when using Pylance or Mypy.
Example using distance:

Levenshtein.distance("test", "another test")

The error message:

"distance" is not a known member of module

This is (I think) because the __init__.pyi file has stubs for some but not all available functions.
From what I can tell these are missing from the stub file:

distance, ratio, hamming, jaro, and jaro_winkler.

Would it be possible to add those, or do you recommend silencing Pylance with a # type: ignore comment on each line? Would you be open to a pull request? Thanks in advance for answering!

Make package installable through Conda

Hi there,

Now that you have taken over this package, do you think you could make it installable through Conda as well, maybe publishing it in conda-forge as it was python-levenshtein before?

I would really appreciate this because it would make the package more compatible with pipelines.

Thank you very much in advance.

unexpected behavior with seqratio

@maxbachmann Quick question for you! When I freshly install the module and run the seqratio example from the documentation, I'm getting some unexpected behavior! Running this snippet:

>> import Levenshtein
>> print(Levenshtein.seqratio(['newspaper', 'litter bin', 'tinny', 'antelope'], ['caribou', 'sausage', 'gorn', 'woody']))
>> !python3 --version

gives me:

7.215178571428572
Python 3.8.13

which isn't the expected value according to the documentation!

Make it a rust crate

I compared the speed of this package with 8 Rust crates and found that it is more than ten times faster than any of them. So I really hope you will publish a crate that can calculate the Levenshtein distance.
Alternatively, do you know how to use this package from Rust? I have tried PyO3, but it is still not as fast as running directly in Python.

Fails to build debian11/python 3.9: Could NOT find Python (missing: Interpreter Development.Module)

Similar to #39

According to the pip install logs, the build uses the latest scikit-build version, 0.17.0.

Reproduction:

docker run -it debian:bullseye
root@7421095af216:/# apt update
...
root@7421095af216:/# apt install build-essential python3-setuptools python3-pkg-resources python3-all libffi-dev python3-dev python3-venv cmake ninja-build
...
root@7421095af216:/# python3 -V
Python 3.9.2
root@7421095af216:/# python3 -m venv .env
root@7421095af216:/# . .env/bin/activate
(.env) root@7421095af216:/# pip wheel --no-binary=:all: Levenshtein
Collecting Levenshtein
  Downloading Levenshtein-0.20.9.tar.gz (122 kB)
     |████████████████████████████████| 122 kB 4.7 MB/s 
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
    Preparing wheel metadata ... done
Collecting rapidfuzz<3.0.0,>=2.3.0
  Downloading rapidfuzz-2.15.1.tar.gz (1.2 MB)
     |████████████████████████████████| 1.2 MB 11.1 MB/s 
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
    Preparing wheel metadata ... done
Building wheels for collected packages: Levenshtein, rapidfuzz
  Building wheel for Levenshtein (PEP 517) ... error
  ERROR: Command errored out with exit status 1:
   command: /.env/bin/python3 /tmp/tmpojm6smub_in_process.py build_wheel /tmp/tmpkzy5d2j7
       cwd: /tmp/pip-wheel-hl85_a02/levenshtein_d1ceb410c7a44bafaa1d765e08b55054
  Complete output (80 lines):
  
  
  --------------------------------------------------------------------------------
  -- Trying 'Ninja' generator
  --------------------------------
  ---------------------------
  ----------------------
  -----------------
  ------------
  -------
  --
  Not searching for unused variables given on the command line.
  -- The C compiler identification is GNU 10.2.1
  -- Detecting C compiler ABI info
  -- Detecting C compiler ABI info - done
  -- Check for working C compiler: /usr/bin/cc - skipped
  -- Detecting C compile features
  -- Detecting C compile features - done
  -- The CXX compiler identification is GNU 10.2.1
  -- Detecting CXX compiler ABI info
  -- Detecting CXX compiler ABI info - done
  -- Check for working CXX compiler: /usr/bin/c++ - skipped
  -- Detecting CXX compile features
  -- Detecting CXX compile features - done
  -- Configuring done
  -- Generating done
  -- Build files have been written to: /tmp/pip-wheel-hl85_a02/levenshtein_d1ceb410c7a44bafaa1d765e08b55054/_cmake_test_compile/build
  --
  -------
  ------------
  -----------------
  ----------------------
  ---------------------------
  --------------------------------
  -- Trying 'Ninja' generator - success
  --------------------------------------------------------------------------------
  
  Configuring Project
    Working directory:
      /tmp/pip-wheel-hl85_a02/levenshtein_d1ceb410c7a44bafaa1d765e08b55054/_skbuild/linux-x86_64-3.9/cmake-build
    Command:
      /usr/bin/cmake /tmp/pip-wheel-hl85_a02/levenshtein_d1ceb410c7a44bafaa1d765e08b55054 -G Ninja -DCMAKE_INSTALL_PREFIX:PATH=/tmp/pip-wheel-hl85_a02/levenshtein_d1ceb410c7a44bafaa1d765e08b55054/_skbuild/linux-x86_64-3.9/cmake-install -DPYTHON_VERSION_STRING:STRING=3.9.2 -DSKBUILD:INTERNAL=TRUE -DCMAKE_MODULE_PATH:PATH=/tmp/pip-build-env-a94e4xto/overlay/lib/python3.9/site-packages/skbuild/resources/cmake -DPYTHON_EXECUTABLE:PATH=/.env/bin/python3 -DPYTHON_INCLUDE_DIR:PATH=/usr/include/python3.9 -DPYTHON_LIBRARY:PATH=/usr/lib/x86_64-linux-gnu/libpython3.9.so -DPython_EXECUTABLE:PATH=/.env/bin/python3 -DPython_ROOT_DIR:PATH=/.env -DPython_FIND_REGISTRY:STRING=NEVER -DPython_INCLUDE_DIR:PATH=/usr/include/python3.9 -DPython_LIBRARY:PATH=/usr/lib/x86_64-linux-gnu/libpython3.9.so -DPython3_EXECUTABLE:PATH=/.env/bin/python3 -DPython3_ROOT_DIR:PATH=/.env -DPython3_FIND_REGISTRY:STRING=NEVER -DPython3_INCLUDE_DIR:PATH=/usr/include/python3.9 -DPython3_LIBRARY:PATH=/usr/lib/x86_64-linux-gnu/libpython3.9.so -DCMAKE_BUILD_TYPE:STRING=Release
  
  -- The C compiler identification is GNU 10.2.1
  -- The CXX compiler identification is GNU 10.2.1
  -- Detecting C compiler ABI info
  -- Detecting C compiler ABI info - done
  -- Check for working C compiler: /usr/bin/cc - skipped
  -- Detecting C compile features
  -- Detecting C compile features - done
  -- Detecting CXX compiler ABI info
  -- Detecting CXX compiler ABI info - done
  -- Check for working CXX compiler: /usr/bin/c++ - skipped
  -- Detecting CXX compile features
  -- Detecting CXX compile features - done
  CMake Error at /usr/share/cmake-3.18/Modules/FindPackageHandleStandardArgs.cmake:165 (message):
    Could NOT find Python (missing: Interpreter Development.Module)
  Call Stack (most recent call first):
    /usr/share/cmake-3.18/Modules/FindPackageHandleStandardArgs.cmake:458 (_FPHSA_FAILURE_MESSAGE)
    /usr/share/cmake-3.18/Modules/FindPython.cmake:436 (find_package_handle_standard_args)
    CMakeLists.txt:25 (find_package)
  
  
  -- Configuring incomplete, errors occurred!
  See also "/tmp/pip-wheel-hl85_a02/levenshtein_d1ceb410c7a44bafaa1d765e08b55054/_skbuild/linux-x86_64-3.9/cmake-build/CMakeFiles/CMakeOutput.log".
  Traceback (most recent call last):
    File "/tmp/pip-build-env-a94e4xto/overlay/lib/python3.9/site-packages/skbuild/setuptools_wrap.py", line 662, in setup
      env = cmkr.configure(
    File "/tmp/pip-build-env-a94e4xto/overlay/lib/python3.9/site-packages/skbuild/cmaker.py", line 355, in configure
      raise SKBuildError(msg)
  
  An error occurred while configuring with CMake.
    Command:
      {self._formatArgsForDisplay(cmd)}
    Source directory:
      {os.path.abspath(cmake_source_dir)}
    Working directory:
      {os.path.abspath(CMAKE_BUILD_DIR())}
  Please see CMake's output for more information.
  
  ----------------------------------------
  ERROR: Failed building wheel for Levenshtein
  Building wheel for rapidfuzz (PEP 517) ... done
  Created wheel for rapidfuzz: filename=rapidfuzz-2.15.1-py3-none-any.whl size=67496 sha256=f833a73e3d02842a1086f5245acc72eb7ec24803e43f96f53abb5925ea8cac84
  Stored in directory: /root/.cache/pip/wheels/61/f6/0c/6f43121b33df553ca9e9fabcd158500dec591cc08c6c933a39
Successfully built rapidfuzz
Failed to build Levenshtein
ERROR: Failed to build one or more wheels

jaro_winkler gives values larger than 1.

Hello,

Thanks for the implementation of this library. It is very useful. I was trying to use it to find the root of a word based on similarity. However, I came across some values greater than 1 with these settings:
jaro_winkler('milyarder', 'milyarderlik',prefix_weight=0.5)
1.0833
jaro_winkler('milyarderlik','milyarder',prefix_weight=1)
1.25
The documentation (https://rapidfuzz.github.io/Levenshtein/levenshtein.html#jaro-winkler) says that jaro_winkler should be between 0 and 1. I was wondering whether I passed an invalid prefix weight. If so, I assumed it would raise a ValueError, as mentioned in the documentation.
Thanks in advance.
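
The reported values are consistent with the usual Winkler adjustment, jaro + prefix_len * prefix_weight * (1 - jaro), with the common prefix capped at 4 characters. Under that assumption the result can only stay within [0, 1] when prefix_weight is at most 0.25. The sketch below is not the library's code: winkler_adjust and its validate flag are hypothetical, and the Jaro value 11/12 is worked out by hand (9 matching characters, no transpositions, lengths 9 and 12).

```python
def winkler_adjust(jaro, prefix_len, prefix_weight, validate=False):
    """Apply the Winkler prefix bonus to a precomputed Jaro similarity."""
    if validate and not 0.0 <= prefix_weight <= 0.25:
        raise ValueError("prefix_weight must be between 0 and 0.25")
    prefix_len = min(prefix_len, 4)  # the bonus counts at most 4 prefix characters
    return jaro + prefix_len * prefix_weight * (1.0 - jaro)


# Jaro similarity of 'milyarder' and 'milyarderlik', computed by hand:
# (9/9 + 9/12 + 9/9) / 3 = 11/12
jaro = 11 / 12

print(round(winkler_adjust(jaro, 9, 0.5), 4))   # 1.0833
print(round(winkler_adjust(jaro, 9, 1.0), 4))   # 1.25
print(round(winkler_adjust(jaro, 9, 0.25), 4))  # 1.0
```

With validate=True the same call rejects a prefix weight of 0.5 instead of returning a value above 1.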

reference leak in module init

Inside module init strings are created:

#ifdef LEV_PYTHON3
    opcode_names[i].pystring
      = PyUnicode_InternFromString(opcode_names[i].cstring);
#else
    opcode_names[i].pystring
      = PyString_InternFromString(opcode_names[i].cstring);
#endif
    opcode_names[i].len = strlen(opcode_names[i].cstring);

These strings are not checked for NULL, so if allocation fails, strlen causes a segmentation fault.
In addition, the strings are never deallocated, because their reference count is never released.

Typing stubs are not distributed with the package

Installing the package (0.20.6) with pip only installs the .py files and the shared library. The py.typed marker and the .pyi files do not get installed, and are therefore of no use to type checkers.

Output of check_typedpkg listed in PEP-0561:

$ python -m typed_check Levenshtein
Package Levenshtein does not support typing.
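
A quick way to confirm what actually landed in site-packages is to look for the marker and stub files next to the installed package. This is a minimal sketch using only the standard library; typing_files is a hypothetical helper, not part of Levenshtein or of PEP 561 tooling.

```python
import importlib.util
from pathlib import Path


def typing_files(package_name):
    """Return (has_py_typed, stub_names) for an installed package, or None if not found."""
    spec = importlib.util.find_spec(package_name)
    if spec is None or not spec.submodule_search_locations:
        return None  # not installed, or a single-file module rather than a package
    has_marker = False
    stubs = []
    for location in spec.submodule_search_locations:
        root = Path(location)
        has_marker = has_marker or (root / "py.typed").is_file()
        stubs.extend(sorted(p.name for p in root.glob("*.pyi")))
    return has_marker, stubs


# Prints None when Levenshtein is not installed in the current environment.
print(typing_files("Levenshtein"))
```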

using GPU

Is there a way to make this program use a GPU?

Compatibility with rapidfuzz-cpp 2.0.0

The current version rejects rapidfuzz-cpp 2.0.0 and tries to use the bundled version instead. If I remove the QUIET from find_package() call, I get:

CMake Warning at CMakeLists.txt:44 (find_package):
  Could not find a configuration file for package "rapidfuzz" that is
  compatible with requested version "1.7.0".

  The following configuration files were considered but not accepted:

    /usr/lib64/cmake/rapidfuzz/rapidfuzzConfig.cmake, version: 2.0.0

Distance Normalization

Hi

How is the distance normalized in Levenshtein.ratio(s1, s2)? It would be helpful if you could provide the formula.

Thanks
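
For orientation, the value 0.8571428571428571 quoted elsewhere on this page for ratio("lewenstein", "levenshtein") is consistent with a normalized insert/delete similarity: 1 - indel / (len(s1) + len(s2)), where indel counts insertions and deletions only, so a substitution costs 2. The pure-Python sketch below reproduces that number under this assumption; lcs_length and indel_ratio are illustrative names, not the library's implementation.

```python
def lcs_length(s1, s2):
    """Length of the longest common subsequence, by dynamic programming."""
    previous = [0] * (len(s2) + 1)
    for ch1 in s1:
        current = [0]
        for j, ch2 in enumerate(s2, start=1):
            if ch1 == ch2:
                current.append(previous[j - 1] + 1)
            else:
                current.append(max(previous[j], current[j - 1]))
        previous = current
    return previous[-1]


def indel_ratio(s1, s2):
    """1 - indel_distance / (len(s1) + len(s2)); defined as 1.0 for two empty inputs."""
    total = len(s1) + len(s2)
    if total == 0:
        return 1.0
    indel = total - 2 * lcs_length(s1, s2)  # characters outside the common subsequence
    return 1.0 - indel / total


print(indel_ratio("lewenstein", "levenshtein"))  # about 0.857 (= 18/21)
```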

dependency rapidfuzz 3.0

Hi,

I just noticed when updating my pip packages, that Levenshtein required rapidfuzz >=2.3.0,<3.0.0 and rapidfuzz 3.0 is available.
Will the next release be compatible with version 3.0? If so, do you already have estimates as to when?

Much Thanks!

Citation

Hey,
I use this package to calculate the string distances for a paper. Is there a paper associated with this package that I can cite? Otherwise I am going to cite the GitHub page.

Thanks for your work!
Max

score_cutoff argument not seeming to work for ratio

I just discovered Levenshtein and am trying it out as a hopefully faster alternative to SequenceMatcher. I only care about high level of matches, so I tried to set the score_cutoff parameter and got an error. Here's the simple example from the documentation, running on Anaconda on Windows, with and without the parameter:

ratio("lewenstein", "levenshtein", score_cutoff=0.9)
Traceback (most recent call last):

File "C:\Users\mikem\AppData\Local\Temp\ipykernel_14020\1371377410.py", line 1, in
ratio("lewenstein", "levenshtein", score_cutoff=0.9)

TypeError: ratio() takes no keyword arguments

ratio("lewenstein", "levenshtein")
Out[3]: 0.8571428571428571

Any idea why setting score_cutoff doesn't work?
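
Where the installed build does not accept the keyword, the cutoff can be applied by hand. The sketch below assumes the documented behaviour is that scores below score_cutoff are reported as 0; score_with_cutoff is a hypothetical wrapper, and difflib's SequenceMatcher stands in as the scorer so the example runs without the package.

```python
from difflib import SequenceMatcher


def score_with_cutoff(scorer, s1, s2, score_cutoff=None):
    """Call scorer(s1, s2) and report 0.0 when the score falls below score_cutoff."""
    score = scorer(s1, s2)
    if score_cutoff is not None and score < score_cutoff:
        return 0.0
    return score


def difflib_ratio(s1, s2):
    # Stand-in scorer from the standard library; swap in Levenshtein.ratio if installed.
    return SequenceMatcher(None, s1, s2).ratio()


print(score_with_cutoff(difflib_ratio, "lewenstein", "levenshtein", score_cutoff=0.9))  # 0.0
print(score_with_cutoff(difflib_ratio, "lewenstein", "levenshtein"))
```

A wrapper like this gives none of the speed benefit an internal cutoff can provide, because the full score is still computed first.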

please provide a source tarball including external dependencies

Hello,
I maintain Levenshtein in Debian, and while working on packaging 0.24.0 I noticed that you're using the extern directory with a git submodule pointing to another project. That setup is also replicated in the source tarball that GitHub generates when you tag a release (the PyPI tarball for 0.23.0 seems to be empty, and 0.24.0 is currently not uploaded there, so I'm basing my work on the GitHub tarballs).

This creates an issue in Debian, since we cannot download anything from the internet while building a package. So I'm asking whether you could release a new artifact that includes the source code for both Levenshtein and any external dependency (they can live in extern, if that's where the build system expects them to be).

thanks for considering!

`Callable` missing type argument, making some methods partially-Unknown

If it can truly be "anything", you can use Callable[..., Any], but I assume it's meant to be Callable[[Sequence[Hashable]], Sequence[Hashable]]

While at it, Callable, Hashable and Sequence should all be imported from collections.abc.
List and Tuple also don't need to be imported from typing in a type stub, the stdlib names (list and tuple) can be used directly.
Union and Optional can use PEP 604 syntax

From 19.0.3 to 20.0.1 matching blocks now return empty list if no match found

Hello there,
First of all, thank you for your work. I currently have an issue that forces me to pin your package version. In earlier versions, matching blocks returned something like the following:

matching_blocks=[(0, 0, 1), (3, 61, 0)]                                                                                                                                   
matching_blocks=[(0, 0, 1), (3, 27, 0)]                                                                                                                         
matching_blocks=[(3, 34, 0)]  #no match found

now:

matching_blocks=[MatchingBlock(a=1, b=12, size=1), MatchingBlock(a=2, b=17, size=1), MatchingBlock(a=6, b=21, size=0)]
matching_blocks=[] #i assume if no matching block at all is found

While one solution on my side would be to just add a tuple (len(a), len(b), 0), I thought I should rather open an issue and ask whether that change was intended.
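
The workaround mentioned above could look like the following. It mirrors the convention of difflib.SequenceMatcher.get_matching_blocks, whose last triple is always a dummy (len(a), len(b), 0). The helper with_sentinel is hypothetical and assumes each block can be unpacked into three integers.

```python
from difflib import SequenceMatcher


def with_sentinel(blocks, len_a, len_b):
    """Return blocks as plain tuples, ending with a (len_a, len_b, 0) terminator."""
    result = [tuple(block) for block in blocks]
    terminator = (len_a, len_b, 0)
    if not result or result[-1] != terminator:
        result.append(terminator)
    return result


print(with_sentinel([], 3, 34))                       # [(3, 34, 0)]
print(with_sentinel([(0, 0, 1), (3, 61, 0)], 3, 61))  # [(0, 0, 1), (3, 61, 0)]

# difflib follows the same convention when nothing matches:
print(SequenceMatcher(None, "abc", "xyz").get_matching_blocks())  # [Match(a=3, b=3, size=0)]
```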

Mypy complains with newest release (20.07), code still works

Hey,
Me again :D
I just pushed some unrelated code changes and the pipeline throws a rather confusing error. My tests still work totally fine.
It seems that, at the type-hint level, editops does not support a list of str but only str itself, even though it still works with a list of str as before.

Here is the error:
error: Argument 1 to "editops" has incompatible type "Union[str, List[str]]"; expected "str" [arg-type]
    squenence_one,
    ^~~~~~~~~~~~~
error: Argument 2 to "editops" has incompatible type "Union[str, List[str]]"; expected "str" [arg-type]
    squenence_two,
    ^~~~~~~~~~~~~

With 20.6 it does not throw that error.
To give more insight into what I'm doing: I use editops to calculate the word matching rate, that is, the matching rate based on words rather than characters. For that I transform the words of each sentence into hex, so your code treats each hex number as one entity and I can calculate the Levenshtein distance on a word level.

So I'm a bit confused whether this is an intended change in behavior.
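
One way to stay within a str-only type hint is to encode every distinct word as a single character before calling any character-level function. This is only a sketch of that idea: encode_tokens is a hypothetical helper, and the use of the Unicode private-use area as the target alphabet is an assumption made for illustration.

```python
def encode_tokens(*token_lists):
    """Map every distinct token to one character and return one string per token list."""
    alphabet = {}
    encoded = []
    for tokens in token_lists:
        chars = []
        for token in tokens:
            if token not in alphabet:
                # Start in the Unicode private-use area to avoid ordinary text characters.
                alphabet[token] = chr(0xE000 + len(alphabet))
            chars.append(alphabet[token])
        encoded.append("".join(chars))
    return encoded


one, two = encode_tokens("the quick brown fox".split(), "the slow brown fox".split())
print(len(one), len(two))                     # 4 4
print(sum(a != b for a, b in zip(one, two)))  # 1
```

The two resulting strings have one character per word, so a character-level edit distance on them is a word-level edit distance on the original sentences.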

how to install it offline

How can I install it offline on CentOS 7 with Python 3.6?

When I used

git clone https://github.com/maxbachmann/Levenshtein.git
cd Levenshtein
pip install .

an error occurred.

All memory allocations should be checked

Currently, unsuccessful memory allocations are not always detected, which can cause the implementation to dereference null pointers, e.g.:

  list = PyList_New(n);
  for (i = 0; i < n; i++, ops++) {
    PyObject *tuple = PyTuple_New(3);
    PyObject *is = opcode_names[ops->type].pystring;
    Py_INCREF(is);
    PyTuple_SET_ITEM(tuple, 0, is);
    PyTuple_SET_ITEM(tuple, 1, PyInt_FromLong((long)ops->spos));
    PyTuple_SET_ITEM(tuple, 2, PyInt_FromLong((long)ops->dpos));
    PyList_SET_ITEM(list, i, tuple);
  }

Both list and tuple should be checked to make sure they are not null pointers.

Module 'Levenshtein' has no attribute 'distance'

Running Levenshtein 0.20.9 (on Windows 11. Python 3.7.0) I get the error:

AttributeError: module 'Levenshtein' has no attribute 'distance'

The associated Python code is:

import Levenshtein

distance1 = Levenshtein.distance(line1, line2)

I have installed the Levenshtein package several times, rebooted my laptop.

Is my version of Python compatible with Levenshtein 0.20.9?

PS: The PyPI page states that the package is compatible with Python 3.6.0 and later versions.

Thanks.
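
Two things worth ruling out for an error like this are an interpreter older than the package supports and a different file or directory named Levenshtein being picked up first on the import path. The sketch below checks both with the standard library only; describe_import is a hypothetical helper, and the (3, 8) default is taken from the Requirements section of the README above, which may not apply to every release.

```python
import importlib.util
import sys


def describe_import(module_name, min_python=(3, 8)):
    """Report the interpreter version and the file a module would be loaded from."""
    spec = importlib.util.find_spec(module_name)
    return {
        "python": tuple(sys.version_info[:3]),
        "python_ok": tuple(sys.version_info[:2]) >= tuple(min_python),
        # None means the module cannot be found in this environment at all.
        "origin": spec.origin if spec is not None else None,
    }


print(describe_import("Levenshtein"))
```

If origin points into the project directory rather than site-packages, a local file is shadowing the installed package.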

Not able to build on debian11 arm Python 3.10

I cloned this repo and tried the following:
pip3 install .

Looking in indexes: https://pypi.org/simple, https://www.piwheels.org/simple
Processing /home/juris/Levenshtein
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done
Collecting rapidfuzz<3.0.0,>=2.3.0
  Using cached https://www.piwheels.org/simple/rapidfuzz/rapidfuzz-2.10.2-py3-none-any.whl (49 kB)
Collecting jarowinkler<2.0.0,>=1.2.2
  Using cached https://www.piwheels.org/simple/jarowinkler/jarowinkler-1.2.3-py3-none-any.whl (6.7 kB)
Building wheels for collected packages: Levenshtein
  Building wheel for Levenshtein (pyproject.toml) ... error
  error: subprocess-exited-with-error
  
  × Building wheel for Levenshtein (pyproject.toml) did not run successfully.
  │ exit code: 1
  ╰─> [92 lines of output]
      Not searching for unused variables given on the command line.
      CMake Error: CMake was unable to find a build program corresponding to "Ninja".  CMAKE_MAKE_PROGRAM is not set.  You probably need to select a different build tool.
      -- Configuring incomplete, errors occurred!
      See also "/home/juris/Levenshtein/_cmake_test_compile/build/CMakeFiles/CMakeOutput.log".
      Not searching for unused variables given on the command line.
      -- The C compiler identification is GNU 10.2.1
      -- Detecting C compiler ABI info
      -- Detecting C compiler ABI info - done
      -- Check for working C compiler: /usr/bin/cc - skipped
      -- Detecting C compile features
      -- Detecting C compile features - done
      -- The CXX compiler identification is GNU 10.2.1
      -- Detecting CXX compiler ABI info
      -- Detecting CXX compiler ABI info - done
      -- Check for working CXX compiler: /usr/bin/c++ - skipped
      -- Detecting CXX compile features
      -- Detecting CXX compile features - done
      -- Configuring done
      -- Generating done
      -- Build files have been written to: /home/juris/Levenshtein/_cmake_test_compile/build
      CMake Error at /usr/share/cmake-3.18/Modules/FindPackageHandleStandardArgs.cmake:165 (message):
        Could NOT find Python (missing: Interpreter Development.Module)
      Call Stack (most recent call first):
        /usr/share/cmake-3.18/Modules/FindPackageHandleStandardArgs.cmake:458 (_FPHSA_FAILURE_MESSAGE)
        /usr/share/cmake-3.18/Modules/FindPython.cmake:436 (find_package_handle_standard_args)
        CMakeLists.txt:25 (find_package)
      
      
      -- Configuring incomplete, errors occurred!
      See also "/home/juris/Levenshtein/CMakeFiles/CMakeOutput.log".
        File "/tmp/pip-build-env-np1mmvek/overlay/lib/python3.10/site-packages/skbuild/setuptools_wrap.py", line 637, in setup
          env = cmkr.configure(
        File "/tmp/pip-build-env-np1mmvek/overlay/lib/python3.10/site-packages/skbuild/cmaker.py", line 328, in configure
          raise SKBuildError(
      
      
      --------------------------------------------------------------------------------
      -- Trying "Ninja" generator
      --------------------------------
      ---------------------------
      ----------------------
      -----------------
      ------------
      -------
      --
      --
      -------
      ------------
      -----------------
      ----------------------
      ---------------------------
      --------------------------------
      -- Trying "Ninja" generator - failure
      --------------------------------------------------------------------------------
      
      
      
      --------------------------------------------------------------------------------
      -- Trying "Unix Makefiles" generator
      --------------------------------
      ---------------------------
      ----------------------
      -----------------
      ------------
      -------
      --
      --
      -------
      ------------
      -----------------
      ----------------------
      ---------------------------
      --------------------------------
      -- Trying "Unix Makefiles" generator - success
      --------------------------------------------------------------------------------
      
      Configuring Project
        Working directory:
          /home/juris/Levenshtein/_skbuild/linux-armv7l-3.10/cmake-build
        Command:
          cmake /home/juris/Levenshtein -G 'Unix Makefiles' -DCMAKE_INSTALL_PREFIX:PATH=/home/juris/Levenshtein/_skbuild/linux-armv7l-3.10/cmake-install -DPYTHON_VERSION_STRING:STRING=3.10.7 -DSKBUILD:INTERNAL=TRUE -DCMAKE_MODULE_PATH:PATH=/tmp/pip-build-env-np1mmvek/overlay/lib/python3.10/site-packages/skbuild/resources/cmake -DPython3_EXECUTABLE:FILEPATH=/usr/local/bin/python3.10 -DPython3_INCLUDE_DIR:PATH=/usr/local/include/python3.10 -DPython3_LIBRARY:PATH=/usr/local/lib/libpython3.10.a -DPython_EXECUTABLE:FILEPATH=/usr/local/bin/python3.10 -DPython_INCLUDE_DIR:PATH=/usr/local/include/python3.10 -DPython_LIBRARY:PATH=/usr/local/lib/libpython3.10.a -DPYTHON_EXECUTABLE:FILEPATH=/usr/local/bin/python3.10 -DPYTHON_INCLUDE_DIR:PATH=/usr/local/include/python3.10 -DPYTHON_LIBRARY:PATH=/usr/local/lib/libpython3.10.a -DCMAKE_BUILD_TYPE:STRING=Release
      
      Traceback (most recent call last):
      
      An error occurred while configuring with CMake.
        Command:
          cmake /home/juris/Levenshtein -G 'Unix Makefiles' -DCMAKE_INSTALL_PREFIX:PATH=/home/juris/Levenshtein/_skbuild/linux-armv7l-3.10/cmake-install -DPYTHON_VERSION_STRING:STRING=3.10.7 -DSKBUILD:INTERNAL=TRUE -DCMAKE_MODULE_PATH:PATH=/tmp/pip-build-env-np1mmvek/overlay/lib/python3.10/site-packages/skbuild/resources/cmake -DPython3_EXECUTABLE:FILEPATH=/usr/local/bin/python3.10 -DPython3_INCLUDE_DIR:PATH=/usr/local/include/python3.10 -DPython3_LIBRARY:PATH=/usr/local/lib/libpython3.10.a -DPython_EXECUTABLE:FILEPATH=/usr/local/bin/python3.10 -DPython_INCLUDE_DIR:PATH=/usr/local/include/python3.10 -DPython_LIBRARY:PATH=/usr/local/lib/libpython3.10.a -DPYTHON_EXECUTABLE:FILEPATH=/usr/local/bin/python3.10 -DPYTHON_INCLUDE_DIR:PATH=/usr/local/include/python3.10 -DPYTHON_LIBRARY:PATH=/usr/local/lib/libpython3.10.a -DCMAKE_BUILD_TYPE:STRING=Release
        Source directory:
          /home/juris/Levenshtein
        Working directory:
          /home/juris/Levenshtein/_skbuild/linux-armv7l-3.10/cmake-build
      Please see CMake's output for more information.
      [end of output]
  
  note: This error originates from a subprocess, and is likely not a problem with pip.
  ERROR: Failed building wheel for Levenshtein
Failed to build Levenshtein
ERROR: Could not build wheels for Levenshtein, which is required to install pyproject.toml-based projects

I am not a CMake user, but from this I understand that CMake could not find the header files for Python. However, I have Python built and installed from source. Additionally, plain CMake does find Python with the following simplification of your CMakeLists.txt:

cmake_minimum_required(VERSION 3.12...3.24)

set(Python_FIND_IMPLEMENTATIONS CPython PyPy)

project(Levenshtein LANGUAGES C CXX)

if(CMAKE_VERSION VERSION_LESS 3.18)
    find_package(Python COMPONENTS Interpreter Development REQUIRED)
else()
    set(Python_ARTIFACTS_INTERACTIVE TRUE)
    find_package(Python COMPONENTS Interpreter Development.Module REQUIRED)
endif()

execute_process(
    COMMAND "${Python_EXECUTABLE}" -c
            "import sysconfig; print(sysconfig.get_config_var('EXT_SUFFIX').split('.')[1])"
    OUTPUT_VARIABLE Python_SOABI
)
message(STATUS "Python was found and returned: ${Python_SOABI}")

When I run cmake . on the above file, I get:

-- The C compiler identification is GNU 10.2.1
-- The CXX compiler identification is GNU 10.2.1
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /usr/bin/cc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/c++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Python was found and returned: cpython-310-arm-linux-gnueabihf

-- Configuring done
-- Generating done
-- Build files have been written to: /home/juris/cmake_test

lsb_release -a

Description:	Raspbian GNU/Linux 11 (bullseye)
Release:	11
Codename:	bullseye

uname -a
Linux 5.10.92-v7l+ #1514 SMP Mon Jan 17 17:38:03 GMT 2022 armv7l GNU/Linux

Mismatch between different implementations of Levenshtein

Hi there,

thank you for this work. I love this library because it is 100x faster than its competitors (e.g., strsimpy).

However, I have noticed that for the same pair of words, your implementation returns a different similarity value.

from strsimpy.normalized_levenshtein import NormalizedLevenshtein
import Levenshtein


a = 'database system'
b = 'database systems'

print("Ration from Strsimpy")
normalized_levenshtein = NormalizedLevenshtein()
print(normalized_levenshtein.similarity(a, b))   

print("Ration from Levenshtein")
print(Levenshtein.seqratio(a,b)) 

as a result I get:

Ration from Strsimpy
0.9375
Ration from Levenshtein
0.967741935483871

I have checked with online tools, and it seems that the similarity between a and b is 0.9375 (check here https://awsm-tools.com/levenshtein-distance?form%5Bsource%5D=database+system&form%5Btarget%5D=database+systems), which is in line with Strsimpy.

Do you know why we get different similarity values?

Thanks a lot
Angelo
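
The two numbers are consistent with two different normalizations of the same single edit: 1 - 1/16 divides the classic Levenshtein distance by the longer length, while 1 - 1/31 divides an insert/delete-only distance by the sum of both lengths. The sketch below reproduces both values in pure Python. It is a reading of the reported output, not a description of how seqratio or strsimpy are implemented, and edit_distance is an illustrative helper.

```python
def edit_distance(s1, s2, substitution_cost):
    """Edit distance with unit insert/delete cost and a configurable substitution cost."""
    previous = list(range(len(s2) + 1))
    for i, ch1 in enumerate(s1, start=1):
        current = [i]
        for j, ch2 in enumerate(s2, start=1):
            substitute = previous[j - 1] + (0 if ch1 == ch2 else substitution_cost)
            current.append(min(previous[j] + 1, current[j - 1] + 1, substitute))
        previous = current
    return previous[-1]


a = "database system"
b = "database systems"

# Classic Levenshtein distance, normalized by the longer length.
by_max_len = 1 - edit_distance(a, b, 1) / max(len(a), len(b))
# Insert/delete-only distance (a substitution costs 2), normalized by the summed lengths.
by_len_sum = 1 - edit_distance(a, b, 2) / (len(a) + len(b))

print(by_max_len)  # 0.9375
print(by_len_sum)  # about 0.9677 (= 30/31)
```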

License

Hi,
Amazing package! Would you consider switching to a more permissive license such as MIT, ISC, or LGPL so that downstream packages don't also have to be GPL-licensed?
Thank you!

0.20.4: error: no matching function for call to ‘remove_common_suffix(rapidfuzz::detail::Range<unsigned int*>&, rapidfuzz::detail::Range<unsigned char*>&)’

[2/5] Building CXX object src/Levenshtein/CMakeFiles/levenshtein_cpp.dir/Levenshtein-c/_levenshtein.cpp.o
FAILED: src/Levenshtein/CMakeFiles/levenshtein_cpp.dir/Levenshtein-c/_levenshtein.cpp.o 
/usr/lib/ccache/bin/x86_64-pc-linux-gnu-g++ -Dlevenshtein_cpp_EXPORTS -I/tmp/portage/dev-python/Levenshtein-0.20.4/work/Levenshtein-0.20.4/src/Levenshtein/Levenshtein-c -isystem /usr/include/python3.8 -march=znver2 --param=l1-cache-size=32 --param=l1-cache-line-size=64 -O2 -pipe -frecord-gcc-switches -O3 -DNDEBUG -flto=auto -fno-fat-lto-objects -fPIC -Wall -Wextra -pedantic -MD -MT src/Levenshtein/CMakeFiles/levenshtein_cpp.dir/Levenshtein-c/_levenshtein.cpp.o -MF src/Levenshtein/CMakeFiles/levenshtein_cpp.dir/Levenshtein-c/_levenshtein.cpp.o.d -o src/Levenshtein/CMakeFiles/levenshtein_cpp.dir/Levenshtein-c/_levenshtein.cpp.o -c /tmp/portage/dev-python/Levenshtein-0.20.4/work/Levenshtein-0.20.4/src/Levenshtein/Levenshtein-c/_levenshtein.cpp
In file included from /tmp/portage/dev-python/Levenshtein-0.20.4/work/Levenshtein-0.20.4/src/Levenshtein/Levenshtein-c/_levenshtein.cpp:31:
/tmp/portage/dev-python/Levenshtein-0.20.4/work/Levenshtein-0.20.4/src/Levenshtein/Levenshtein-c/_levenshtein.hpp: In instantiation of ‘finish_distance_computations(const rapidfuzz::detail::Range<unsigned int*>&, const std::vector<_RF_String>&, const std::vector<double>&, std::vector<std::unique_ptr<long unsigned int []> >&, std::unique_ptr<long unsigned int []>&)::<lambda(auto:8)> [with auto:8 = rapidfuzz::detail::Range<unsigned char*>]’:
/tmp/portage/dev-python/Levenshtein-0.20.4/work/Levenshtein-0.20.4/src/Levenshtein/Levenshtein-c/_levenshtein.hpp:42:9:   required from ‘auto visit(const RF_String&, Func&&, Args&& ...) [with Func = finish_distance_computations(const rapidfuzz::detail::Range<unsigned int*>&, const std::vector<_RF_String>&, const std::vector<double>&, std::vector<std::unique_ptr<long unsigned int []> >&, std::unique_ptr<long unsigned int []>&)::<lambda(auto:8)>; Args = {}; RF_String = _RF_String]’
/tmp/portage/dev-python/Levenshtein-0.20.4/work/Levenshtein-0.20.4/src/Levenshtein/Levenshtein-c/_levenshtein.hpp:275:14:   required from here
/tmp/portage/dev-python/Levenshtein-0.20.4/work/Levenshtein-0.20.4/src/Levenshtein/Levenshtein-c/_levenshtein.hpp:280:52: error: no matching function for call to ‘remove_common_suffix(rapidfuzz::detail::Range<unsigned int*>&, rapidfuzz::detail::Range<unsigned char*>&)’
  280 |             rapidfuzz::detail::remove_common_suffix(s1_temp, s2);
      |             ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~
[...]

(and more like that)

Full log:
dev-python:Levenshtein-0.20.4:20220927-014026.log

ratio method always returns 1.0

I'm using the latest version (0.18.0) of this package. The Levenshtein.ratio method always returns 1.0 as the Levenshtein edit distance ratio. There may be a bug in this method.

The ratio method in version 0.16 is fine.

Levenshtein.matching_blocks raising AttributeError in latest release

Hello, I just tried upgrading to the latest release, v0.20.0, and I noticed the Levenshtein.matching_blocks() function now produces an error when used.

The code with the error message:

>>> import Levenshtein
>>> long_string = "hsfdjuhfkjsnfjknskj"
>>> short_string = "hsfdsjdsai"
>>> ops = Levenshtein.editops(long_string, short_string)
>>> Levenshtein.matching_blocks(ops, long_string, short_string)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\my_name\Documents\Python\my_project\venv\lib\site-packages\Levenshtein\__init__.py", line 164, in matching_blocks
    return _Editops(edit_operations, len1, len2).as_matching_blocks().as_list()
AttributeError: 'list' object has no attribute 'as_list'

When downgrading to version v0.19.3 everything works as intended:

>>> import Levenshtein
>>> long_string = "hsfdjuhfkjsnfjknskj"
>>> short_string = "hsfdsjdsai"
>>> ops = Levenshtein.editops(long_string, short_string)
>>> Levenshtein.matching_blocks(ops, long_string, short_string)
[(0, 0, 4), (10, 4, 1), (13, 5, 1), (16, 7, 1), (19, 10, 0)]

In both examples the rapidfuzz version is 2.3.0.

Thanks in advance for responding!
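
For readers unfamiliar with what this function computes, matching blocks can be derived from a list of edit operations by walking both strings and recording the unchanged runs between operations. The sketch below is a pure-Python illustration of that idea, not the library's implementation. The function name `matching_blocks_from_editops` is hypothetical, the `(tag, source_pos, dest_pos)` tuple layout is assumed from the output shown above, and the operation list in the usage example is written by hand.

```python
def matching_blocks_from_editops(editops, len1, len2):
    """Return (source_pos, dest_pos, length) triples for the unchanged runs.

    `editops` is assumed to be sorted and to contain tuples of the form
    (tag, source_pos, dest_pos) with tag in {"replace", "insert", "delete"}.
    """
    blocks = []
    i = j = 0  # current positions in the source and destination strings
    for tag, spos, dpos in editops:
        run = spos - i  # characters left untouched before this operation
        if run > 0:
            blocks.append((i, j, run))
        if tag == "replace":
            i, j = spos + 1, dpos + 1
        elif tag == "delete":
            i, j = spos + 1, dpos
        elif tag == "insert":
            i, j = spos, dpos + 1
        else:
            raise ValueError(f"unknown edit operation: {tag!r}")
    tail = len1 - i  # unchanged run after the last operation
    if tail > 0:
        blocks.append((i, j, tail))
    blocks.append((len1, len2, 0))  # zero-length sentinel block
    return blocks


# Hand-written operations turning "spam" into "park".
ops = [("delete", 0, 0), ("insert", 3, 2), ("replace", 3, 3)]
print(matching_blocks_from_editops(ops, 4, 4))  # [(1, 0, 2), (4, 4, 0)]
```

The final zero-length block mirrors the last tuple in the v0.19.3 output above, where `(19, 10, 0)` marks the end of both strings.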

Levenshtein realisation counts substitution as 2 edits instead of 1

The distance in your documentation is defined as the minimum number of insertions, deletions, and substitutions.

When I calculate the ratio between the following strings, it gives me the following:

  1. "abcde" vs "abcd" is (9-1)/9 = 0.888888889 (correct)
  2. "abcde 1" vs "abcde 2" is (14-2)/14 = 0.857142857 (incorrect)

I expect a higher similarity in the second case, but the library considers substitution as two edits (deletion and insertion) instead of one.
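
The arithmetic above can be reproduced with a small pure-Python sketch in which the cost of a substitution is a parameter. This is only an illustration of how the two numbers arise, not the library's code; the names `weighted_distance`, `ratio`, and `substitution_cost` are hypothetical.

```python
def weighted_distance(s1: str, s2: str, substitution_cost: int = 1) -> int:
    """Edit distance where insertions and deletions cost 1 and
    substitutions cost `substitution_cost`."""
    previous = list(range(len(s2) + 1))
    for i, c1 in enumerate(s1, start=1):
        current = [i]
        for j, c2 in enumerate(s2, start=1):
            cost = 0 if c1 == c2 else substitution_cost
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution
            ))
        previous = current
    return previous[-1]


def ratio(s1: str, s2: str, substitution_cost: int = 1) -> float:
    """Similarity normalized by the sum of both lengths."""
    total = len(s1) + len(s2)
    if total == 0:
        return 1.0
    return (total - weighted_distance(s1, s2, substitution_cost)) / total


print(ratio("abcde", "abcd", substitution_cost=2))        # (9-1)/9
print(ratio("abcde 1", "abcde 2", substitution_cost=2))   # (14-2)/14
print(ratio("abcde 1", "abcde 2", substitution_cost=1))   # (14-1)/14
```

With a substitution cost of 2 the second pair yields 12/14, the value reported above; with a cost of 1 it yields 13/14, the higher similarity expected in this report.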

(Python 3.11) Building error at installation

Hello!
I have always used your very nice library with Python 3.8 to 3.10, but when I try to install it with Python 3.11rc1 on Windows 11, I get this traceback:

PS C:\Users\thher> py -m pip install levenshtein
Collecting levenshtein
  Using cached Levenshtein-0.20.2.tar.gz (114 kB)
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done
Collecting rapidfuzz<3.0.0,>=2.3.0
  Using cached rapidfuzz-2.6.1-cp311-cp311-win_amd64.whl (1.3 MB)
Collecting jarowinkler<2.0.0,>=1.2.0
  Using cached jarowinkler-1.2.1-cp311-cp311-win_amd64.whl (62 kB)
Building wheels for collected packages: levenshtein
  Building wheel for levenshtein (pyproject.toml) ... error
  error: subprocess-exited-with-error

  × Building wheel for levenshtein (pyproject.toml) did not run successfully.
  │ exit code: 1
  ╰─> [289 lines of output]
      Not searching for unused variables given on the command line.
      -- The C compiler identification is GNU 12.1.0
      -- Detecting C compiler ABI info
      -- Detecting C compiler ABI info - done
      -- Check for working C compiler: C:/Users/thher/gcc/bin/gcc.exe - skipped
      -- Detecting C compile features
      -- Detecting C compile features - done
      -- The CXX compiler identification is GNU 12.1.0
      -- Detecting CXX compiler ABI info
      -- Detecting CXX compiler ABI info - done
      -- Check for working CXX compiler: C:/Users/thher/gcc/bin/c++.exe - skipped
      -- Detecting CXX compile features
      -- Detecting CXX compile features - done
      CMake Error at CMakeLists.txt:9 (message):
        MSVC is required to pass this check.


      -- Configuring incomplete, errors occurred!
      See also "C:/Users/thher/AppData/Local/Temp/pip-install-hdi1djen/levenshtein_8bd27ecdfaa145b1ac36fc5dbf00ff56/_cmake_test_compile/build/CMakeFiles/CMakeOutput.log".
      Not searching for unused variables given on the command line.
      CMake Error at CMakeLists.txt:2 (PROJECT):
        Generator

          Visual Studio 17 2022

        could not find any instance of Visual Studio.



      -- Configuring incomplete, errors occurred!
      See also "C:/Users/thher/AppData/Local/Temp/pip-install-hdi1djen/levenshtein_8bd27ecdfaa145b1ac36fc5dbf00ff56/_cmake_test_compile/build/CMakeFiles/CMakeOutput.log".
      Not searching for unused variables given on the command line.
      -- The C compiler identification is GNU 12.1.0
      -- Detecting C compiler ABI info
      -- Detecting C compiler ABI info - done
      -- Check for working C compiler: C:/Users/thher/gcc/bin/gcc.exe - skipped
      -- Detecting C compile features
      -- Detecting C compile features - done
      -- The CXX compiler identification is GNU 12.1.0
      -- Detecting CXX compiler ABI info
      -- Detecting CXX compiler ABI info - done
      -- Check for working CXX compiler: C:/Users/thher/gcc/bin/c++.exe - skipped
      -- Detecting CXX compile features
      -- Detecting CXX compile features - done
      CMake Error at CMakeLists.txt:9 (message):
        MSVC is required to pass this check.


      -- Configuring incomplete, errors occurred!
      See also "C:/Users/thher/AppData/Local/Temp/pip-install-hdi1djen/levenshtein_8bd27ecdfaa145b1ac36fc5dbf00ff56/_cmake_test_compile/build/CMakeFiles/CMakeOutput.log".
      Not searching for unused variables given on the command line.
      -- Selecting Windows SDK version 10.0.19041.0 to target Windows 10.0.22622.
      -- The C compiler identification is MSVC 19.29.30146.0
      -- Detecting C compiler ABI info
      -- Detecting C compiler ABI info - done
      -- Check for working C compiler: C:/Program Files (x86)/Microsoft Visual Studio/2019/Community/VC/Tools/MSVC/14.29.30133/bin/Hostx64/x64/cl.exe - skipped
      -- Detecting C compile features
      -- Detecting C compile features - done
      -- The CXX compiler identification is MSVC 19.29.30146.0
      CMake Warning (dev) at C:/Program Files/CMake/share/cmake-3.24/Modules/CMakeDetermineCXXCompiler.cmake:162 (if):
        Policy CMP0054 is not set: Only interpret if() arguments as variables or
        keywords when unquoted.  Run "cmake --help-policy CMP0054" for policy
        details.  Use the cmake_policy command to set the policy and suppress this
        warning.

        Quoted variables like "MSVC" will no longer be dereferenced when the policy
        is set to NEW.  Since the policy is not set the OLD behavior will be used.
      Call Stack (most recent call first):
        CMakeLists.txt:4 (ENABLE_LANGUAGE)
      This warning is for project developers.  Use -Wno-dev to suppress it.

      CMake Warning (dev) at C:/Program Files/CMake/share/cmake-3.24/Modules/CMakeDetermineCXXCompiler.cmake:183 (elseif):
        Policy CMP0054 is not set: Only interpret if() arguments as variables or
        keywords when unquoted.  Run "cmake --help-policy CMP0054" for policy
        details.  Use the cmake_policy command to set the policy and suppress this
        warning.

        Quoted variables like "MSVC" will no longer be dereferenced when the policy
        is set to NEW.  Since the policy is not set the OLD behavior will be used.
      Call Stack (most recent call first):
        CMakeLists.txt:4 (ENABLE_LANGUAGE)
      This warning is for project developers.  Use -Wno-dev to suppress it.

      -- Detecting CXX compiler ABI info
      -- Detecting CXX compiler ABI info - done
      -- Check for working CXX compiler: C:/Program Files (x86)/Microsoft Visual Studio/2019/Community/VC/Tools/MSVC/14.29.30133/bin/Hostx64/x64/cl.exe - skipped
      -- Detecting CXX compile features
      -- Detecting CXX compile features - done
      -- Configuring done
      -- Generating done
      -- Build files have been written to: C:/Users/thher/AppData/Local/Temp/pip-install-hdi1djen/levenshtein_8bd27ecdfaa145b1ac36fc5dbf00ff56/_cmake_test_compile/build
      -- Selecting Windows SDK version 10.0.19041.0 to target Windows 10.0.22622.
      -- The C compiler identification is MSVC 19.29.30146.0
      -- The CXX compiler identification is MSVC 19.29.30146.0
      -- Detecting C compiler ABI info
      -- Detecting C compiler ABI info - done
      -- Check for working C compiler: C:/Program Files (x86)/Microsoft Visual Studio/2019/Community/VC/Tools/MSVC/14.29.30133/bin/Hostx64/x64/cl.exe - skipped
      -- Detecting C compile features
      -- Detecting C compile features - done
      -- Detecting CXX compiler ABI info
      -- Detecting CXX compiler ABI info - done
      -- Check for working CXX compiler: C:/Program Files (x86)/Microsoft Visual Studio/2019/Community/VC/Tools/MSVC/14.29.30133/bin/Hostx64/x64/cl.exe - skipped
      -- Detecting CXX compile features
      -- Detecting CXX compile features - done
      -- Found PythonInterp: C:/Users/thher/AppData/Local/Programs/Python/Python311/python.exe (found version "3.11")
      -- Found PythonLibs: C:/Users/thher/AppData/Local/Programs/Python/Python311/libs/python311.lib (found version "3.11.0rc1")
      -- Found Python: C:/Users/thher/AppData/Local/Programs/Python/Python311/python.exe (found version "3.11.0") found components: Interpreter Development Development.Module Development.Embed
      Using packaged version of rapidfuzz-cpp
      -- Performing Test Weak Link MODULE -> SHARED (gnu_ld_ignore) - Failed
      -- Performing Test Weak Link MODULE -> SHARED (osx_dynamic_lookup) - Failed
      -- Performing Test Weak Link MODULE -> SHARED (no_flag) - Failed
      _modinit_prefix:PyInit_
      -- Configuring done
      -- Generating done
      CMake Warning:
        Manually-specified variables were not used by the project:

          Python3_EXECUTABLE
          Python3_INCLUDE_DIR
          Python3_LIBRARY
          SKBUILD


      -- Build files have been written to: C:/Users/thher/AppData/Local/Temp/pip-install-hdi1djen/levenshtein_8bd27ecdfaa145b1ac36fc5dbf00ff56/_skbuild/win-amd64-3.11/cmake-build
      Microsoft (R) Build Engine version 16.11.2+f32259642 pour .NET Framework
      Copyright (C) Microsoft Corporation. Tous droits rservs.

      C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\MSBuild\Microsoft\VC\v160\Microsoft.CppBuild.targets(517,5): warning MSB8029: Le rpertoire intermdiaire ou le rpertoire de sortie ne peut pas se trouver sous le rpertoire temporaire car cela risque de crer des problŠmes avec la gnration incrmentielle. [C:\Users\thher\AppData\Local\Temp\pip-install-hdi1djen\levenshtein_8bd27ecdfaa145b1ac36fc5dbf00ff56\_skbuild\win-amd64-3.11\cmake-build\ZERO_CHECK.vcxproj]
        Checking Build System
      C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\MSBuild\Microsoft\VC\v160\Microsoft.CppBuild.targets(517,5): warning MSB8029: Le rpertoire intermdiaire ou le rpertoire de sortie ne peut pas se trouver sous le rpertoire temporaire car cela risque de crer des problŠmes avec la gnration incrmentielle. [C:\Users\thher\AppData\Local\Temp\pip-install-hdi1djen\levenshtein_8bd27ecdfaa145b1ac36fc5dbf00ff56\_skbuild\win-amd64-3.11\cmake-build\src\Levenshtein\levenshtein_cpp.vcxproj]
        Building Custom Rule C:/Users/thher/AppData/Local/Temp/pip-install-hdi1djen/levenshtein_8bd27ecdfaa145b1ac36fc5dbf00ff56/src/Levenshtein/CMakeLists.txt
        levenshtein_cpp.cxx
      C:\Users\thher\AppData\Local\Temp\pip-install-hdi1djen\levenshtein_8bd27ecdfaa145b1ac36fc5dbf00ff56\src\Levenshtein\Levenshtein-c\_levenshtein.hpp(543,1): warning C4267: '='ÿ: conversion de 'size_t' en 'long', perte possible de donnes [C:\Users\thher\AppData\Local\Temp\pip-install-hdi1djen\levenshtein_8bd27ecdfaa145b1ac36fc5dbf00ff56\_skbuild\win-amd64-3.11\cmake-build\src\Levenshtein\levenshtein_cpp.vcxproj]
      C:\Users\thher\AppData\Local\Temp\pip-install-hdi1djen\levenshtein_8bd27ecdfaa145b1ac36fc5dbf00ff56\src\Levenshtein\Levenshtein-c\_levenshtein.hpp(39): message : voir la rfrencel'instanciation de la fonction modŠle 'auto lev_set_median::<lambda_505ef21b6c5a639ad1822147016390c0>::operator ()<uint8_t*,uint8_t*>(uint8_t *,uint8_t *) const' en cours de compilation [C:\Users\thher\AppData\Local\Temp\pip-install-hdi1djen\levenshtein_8bd27ecdfaa145b1ac36fc5dbf00ff56\_skbuild\win-amd64-3.11\cmake-build\src\Levenshtein\levenshtein_cpp.vcxproj]
      C:\Users\thher\AppData\Local\Temp\pip-install-hdi1djen\levenshtein_8bd27ecdfaa145b1ac36fc5dbf00ff56\src\Levenshtein\Levenshtein-c\_levenshtein.hpp(562): message : voir la rfrencel'instanciation de la fonction modŠle 'auto visit<lev_set_median::<lambda_505ef21b6c5a639ad1822147016390c0>,>(const RF_String &,Func &&)' en cours de compilation [C:\Users\thher\AppData\Local\Temp\pip-install-hdi1djen\levenshtein_8bd27ecdfaa145b1ac36fc5dbf00ff56\_skbuild\win-amd64-3.11\cmake-build\src\Levenshtein\levenshtein_cpp.vcxproj]
                with
                [
                    Func=lev_set_median::<lambda_505ef21b6c5a639ad1822147016390c0>
                ]
      C:\Users\thher\AppData\Local\Temp\pip-install-hdi1djen\levenshtein_8bd27ecdfaa145b1ac36fc5dbf00ff56\src\Levenshtein\Levenshtein-c\_levenshtein.hpp(552,34): warning C4244: '='ÿ: conversion de 'int64_t' en '_Ty', perte possible de donnes [C:\Users\thher\AppData\Local\Temp\pip-install-hdi1djen\levenshtein_8bd27ecdfaa145b1ac36fc5dbf00ff56\_skbuild\win-amd64-3.11\cmake-build\src\Levenshtein\levenshtein_cpp.vcxproj]
                with
                [
                    _Ty=long
                ]
      C:\Users\thher\AppData\Local\Temp\pip-install-hdi1djen\levenshtein_8bd27ecdfaa145b1ac36fc5dbf00ff56\src\Levenshtein\levenshtein_cpp.cxx(2428,121): warning C4127: l'expression conditionnelle est une constante [C:\Users\thher\AppData\Local\Temp\pip-install-hdi1djen\levenshtein_8bd27ecdfaa145b1ac36fc5dbf00ff56\_skbuild\win-amd64-3.11\cmake-build\src\Levenshtein\levenshtein_cpp.vcxproj]
      C:\Users\thher\AppData\Local\Temp\pip-install-hdi1djen\levenshtein_8bd27ecdfaa145b1ac36fc5dbf00ff56\src\Levenshtein\levenshtein_cpp.cxx(2436,122): warning C4127: l'expression conditionnelle est une constante [C:\Users\thher\AppData\Local\Temp\pip-install-hdi1djen\levenshtein_8bd27ecdfaa145b1ac36fc5dbf00ff56\_skbuild\win-amd64-3.11\cmake-build\src\Levenshtein\levenshtein_cpp.vcxproj]
      C:\Users\thher\AppData\Local\Temp\pip-install-hdi1djen\levenshtein_8bd27ecdfaa145b1ac36fc5dbf00ff56\src\Levenshtein\levenshtein_cpp.cxx(2626,119): warning C4127: l'expression conditionnelle est une constante [C:\Users\thher\AppData\Local\Temp\pip-install-hdi1djen\levenshtein_8bd27ecdfaa145b1ac36fc5dbf00ff56\_skbuild\win-amd64-3.11\cmake-build\src\Levenshtein\levenshtein_cpp.vcxproj]
      C:\Users\thher\AppData\Local\Temp\pip-install-hdi1djen\levenshtein_8bd27ecdfaa145b1ac36fc5dbf00ff56\src\Levenshtein\levenshtein_cpp.cxx(2634,120): warning C4127: l'expression conditionnelle est une constante [C:\Users\thher\AppData\Local\Temp\pip-install-hdi1djen\levenshtein_8bd27ecdfaa145b1ac36fc5dbf00ff56\_skbuild\win-amd64-3.11\cmake-build\src\Levenshtein\levenshtein_cpp.vcxproj]
      C:\Users\thher\AppData\Local\Temp\pip-install-hdi1djen\levenshtein_8bd27ecdfaa145b1ac36fc5dbf00ff56\src\Levenshtein\levenshtein_cpp.cxx(2810,90): warning C4100: '__pyx_self'ÿ: paramŠtre formel non rfrenc‚ [C:\Users\thher\AppData\Local\Temp\pip-install-hdi1djen\levenshtein_8bd27ecdfaa145b1ac36fc5dbf00ff56\_skbuild\win-amd64-3.11\cmake-build\src\Levenshtein\levenshtein_cpp.vcxproj]
      C:\Users\thher\AppData\Local\Temp\pip-install-hdi1djen\levenshtein_8bd27ecdfaa145b1ac36fc5dbf00ff56\src\Levenshtein\levenshtein_cpp.cxx(3042,96): warning C4100: '__pyx_self'ÿ: paramŠtre formel non rfrenc‚ [C:\Users\thher\AppData\Local\Temp\pip-install-hdi1djen\levenshtein_8bd27ecdfaa145b1ac36fc5dbf00ff56\_skbuild\win-amd64-3.11\cmake-build\src\Levenshtein\levenshtein_cpp.vcxproj]
      C:\Users\thher\AppData\Local\Temp\pip-install-hdi1djen\levenshtein_8bd27ecdfaa145b1ac36fc5dbf00ff56\src\Levenshtein\levenshtein_cpp.cxx(3286,99): warning C4100: '__pyx_self'ÿ: paramŠtre formel non rfrenc‚ [C:\Users\thher\AppData\Local\Temp\pip-install-hdi1djen\levenshtein_8bd27ecdfaa145b1ac36fc5dbf00ff56\_skbuild\win-amd64-3.11\cmake-build\src\Levenshtein\levenshtein_cpp.vcxproj]
      C:\Users\thher\AppData\Local\Temp\pip-install-hdi1djen\levenshtein_8bd27ecdfaa145b1ac36fc5dbf00ff56\src\Levenshtein\levenshtein_cpp.cxx(3530,94): warning C4100: '__pyx_self'ÿ: paramŠtre formel non rfrenc‚ [C:\Users\thher\AppData\Local\Temp\pip-install-hdi1djen\levenshtein_8bd27ecdfaa145b1ac36fc5dbf00ff56\_skbuild\win-amd64-3.11\cmake-build\src\Levenshtein\levenshtein_cpp.vcxproj]
      C:\Users\thher\AppData\Local\Temp\pip-install-hdi1djen\levenshtein_8bd27ecdfaa145b1ac36fc5dbf00ff56\src\Levenshtein\levenshtein_cpp.cxx(3758,93): warning C4100: '__pyx_self'ÿ: paramŠtre formel non rfrenc‚ [C:\Users\thher\AppData\Local\Temp\pip-install-hdi1djen\levenshtein_8bd27ecdfaa145b1ac36fc5dbf00ff56\_skbuild\win-amd64-3.11\cmake-build\src\Levenshtein\levenshtein_cpp.vcxproj]
      C:\Users\thher\AppData\Local\Temp\pip-install-hdi1djen\levenshtein_8bd27ecdfaa145b1ac36fc5dbf00ff56\src\Levenshtein\levenshtein_cpp.cxx(4043,94): warning C4100: '__pyx_self'ÿ: paramŠtre formel non rfrenc‚ [C:\Users\thher\AppData\Local\Temp\pip-install-hdi1djen\levenshtein_8bd27ecdfaa145b1ac36fc5dbf00ff56\_skbuild\win-amd64-3.11\cmake-build\src\Levenshtein\levenshtein_cpp.vcxproj]
      C:\Users\thher\AppData\Local\Temp\pip-install-hdi1djen\levenshtein_8bd27ecdfaa145b1ac36fc5dbf00ff56\src\Levenshtein\levenshtein_cpp.cxx(5648,41): warning C4996: 'PyBytesObject::ob_shash': deprecated in 3.11 [C:\Users\thher\AppData\Local\Temp\pip-install-hdi1djen\levenshtein_8bd27ecdfaa145b1ac36fc5dbf00ff56\_skbuild\win-amd64-3.11\cmake-build\src\Levenshtein\levenshtein_cpp.vcxproj]
      C:\Users\thher\AppData\Local\Temp\pip-install-hdi1djen\levenshtein_8bd27ecdfaa145b1ac36fc5dbf00ff56\src\Levenshtein\levenshtein_cpp.cxx(5649,41): warning C4996: 'PyBytesObject::ob_shash': deprecated in 3.11 [C:\Users\thher\AppData\Local\Temp\pip-install-hdi1djen\levenshtein_8bd27ecdfaa145b1ac36fc5dbf00ff56\_skbuild\win-amd64-3.11\cmake-build\src\Levenshtein\levenshtein_cpp.vcxproj]
      C:\Users\thher\AppData\Local\Temp\pip-install-hdi1djen\levenshtein_8bd27ecdfaa145b1ac36fc5dbf00ff56\src\Levenshtein\levenshtein_cpp.cxx(7306,5): error C2027: utilisation du type non dfini '_frame' [C:\Users\thher\AppData\Local\Temp\pip-install-hdi1djen\levenshtein_8bd27ecdfaa145b1ac36fc5dbf00ff56\_skbuild\win-amd64-3.11\cmake-build\src\Levenshtein\levenshtein_cpp.vcxproj]
      C:\Users\thher\AppData\Local\Programs\Python\Python311\Include\pytypedefs.h(22): message : voir la dclaration de '_frame' [C:\Users\thher\AppData\Local\Temp\pip-install-hdi1djen\levenshtein_8bd27ecdfaa145b1ac36fc5dbf00ff56\_skbuild\win-amd64-3.11\cmake-build\src\Levenshtein\levenshtein_cpp.vcxproj]
        _levenshtein.cpp
      C:\Users\thher\AppData\Local\Temp\pip-install-hdi1djen\levenshtein_8bd27ecdfaa145b1ac36fc5dbf00ff56\src\Levenshtein\Levenshtein-c\_levenshtein.hpp(543,1): warning C4267: '='ÿ: conversion de 'size_t' en 'long', perte possible de donnes [C:\Users\thher\AppData\Local\Temp\pip-install-hdi1djen\levenshtein_8bd27ecdfaa145b1ac36fc5dbf00ff56\_skbuild\win-amd64-3.11\cmake-build\src\Levenshtein\levenshtein_cpp.vcxproj]
      C:\Users\thher\AppData\Local\Temp\pip-install-hdi1djen\levenshtein_8bd27ecdfaa145b1ac36fc5dbf00ff56\src\Levenshtein\Levenshtein-c\_levenshtein.hpp(39): message : voir la rfrencel'instanciation de la fonction modŠle 'auto lev_set_median::<lambda_505ef21b6c5a639ad1822147016390c0>::operator ()<uint8_t*,uint8_t*>(uint8_t *,uint8_t *) const' en cours de compilation [C:\Users\thher\AppData\Local\Temp\pip-install-hdi1djen\levenshtein_8bd27ecdfaa145b1ac36fc5dbf00ff56\_skbuild\win-amd64-3.11\cmake-build\src\Levenshtein\levenshtein_cpp.vcxproj]
      C:\Users\thher\AppData\Local\Temp\pip-install-hdi1djen\levenshtein_8bd27ecdfaa145b1ac36fc5dbf00ff56\src\Levenshtein\Levenshtein-c\_levenshtein.hpp(562): message : voir la rfrencel'instanciation de la fonction modŠle 'auto visit<lev_set_median::<lambda_505ef21b6c5a639ad1822147016390c0>,>(const RF_String &,Func &&)' en cours de compilation [C:\Users\thher\AppData\Local\Temp\pip-install-hdi1djen\levenshtein_8bd27ecdfaa145b1ac36fc5dbf00ff56\_skbuild\win-amd64-3.11\cmake-build\src\Levenshtein\levenshtein_cpp.vcxproj]
                with
                [
                    Func=lev_set_median::<lambda_505ef21b6c5a639ad1822147016390c0>
                ]
      C:\Users\thher\AppData\Local\Temp\pip-install-hdi1djen\levenshtein_8bd27ecdfaa145b1ac36fc5dbf00ff56\src\Levenshtein\Levenshtein-c\_levenshtein.hpp(552,34): warning C4244: '='ÿ: conversion de 'int64_t' en '_Ty', perte possible de donnes [C:\Users\thher\AppData\Local\Temp\pip-install-hdi1djen\levenshtein_8bd27ecdfaa145b1ac36fc5dbf00ff56\_skbuild\win-amd64-3.11\cmake-build\src\Levenshtein\levenshtein_cpp.vcxproj]
                with
                [
                    _Ty=long
                ]
      C:\Users\thher\AppData\Local\Temp\pip-install-hdi1djen\levenshtein_8bd27ecdfaa145b1ac36fc5dbf00ff56\src\Levenshtein\Levenshtein-c\_levenshtein.cpp(174,23): warning C4389: '!='ÿ: incompatibilitsigned/unsigned [C:\Users\thher\AppData\Local\Temp\pip-install-hdi1djen\levenshtein_8bd27ecdfaa145b1ac36fc5dbf00ff56\_skbuild\win-amd64-3.11\cmake-build\src\Levenshtein\levenshtein_cpp.vcxproj]
      C:\Users\thher\AppData\Local\Temp\pip-install-hdi1djen\levenshtein_8bd27ecdfaa145b1ac36fc5dbf00ff56\src\Levenshtein\Levenshtein-c\_levenshtein.hpp(39): message : voir la rfrencel'instanciation de la fonction modŠle 'auto lev_quick_median::<lambda_3d674032857d8ac4d599d2228b77b1cc>::operator ()<uint8_t*,uint8_t*>(uint8_t *,uint8_t *) const' en cours de compilation [C:\Users\thher\AppData\Local\Temp\pip-install-hdi1djen\levenshtein_8bd27ecdfaa145b1ac36fc5dbf00ff56\_skbuild\win-amd64-3.11\cmake-build\src\Levenshtein\levenshtein_cpp.vcxproj]
      C:\Users\thher\AppData\Local\Temp\pip-install-hdi1djen\levenshtein_8bd27ecdfaa145b1ac36fc5dbf00ff56\src\Levenshtein\Levenshtein-c\_levenshtein.cpp(199): message : voir la rfrencel'instanciation de la fonction modŠle 'auto visit<lev_quick_median::<lambda_3d674032857d8ac4d599d2228b77b1cc>,>(const RF_String &,Func &&)' en cours de compilation [C:\Users\thher\AppData\Local\Temp\pip-install-hdi1djen\levenshtein_8bd27ecdfaa145b1ac36fc5dbf00ff56\_skbuild\win-amd64-3.11\cmake-build\src\Levenshtein\levenshtein_cpp.vcxproj]
                with
                [
                    Func=lev_quick_median::<lambda_3d674032857d8ac4d599d2228b77b1cc>
                ]
      C:\Users\thher\AppData\Local\Temp\pip-install-hdi1djen\levenshtein_8bd27ecdfaa145b1ac36fc5dbf00ff56\src\Levenshtein\Levenshtein-c\_levenshtein.cpp(183,23): warning C4389: '!='ÿ: incompatibilitsigned/unsigned [C:\Users\thher\AppData\Local\Temp\pip-install-hdi1djen\levenshtein_8bd27ecdfaa145b1ac36fc5dbf00ff56\_skbuild\win-amd64-3.11\cmake-build\src\Levenshtein\levenshtein_cpp.vcxproj]
      C:\Users\thher\AppData\Local\Temp\pip-install-hdi1djen\levenshtein_8bd27ecdfaa145b1ac36fc5dbf00ff56\src\Levenshtein\Levenshtein-c\_levenshtein.cpp(195,23): warning C4389: '!='ÿ: incompatibilitsigned/unsigned [C:\Users\thher\AppData\Local\Temp\pip-install-hdi1djen\levenshtein_8bd27ecdfaa145b1ac36fc5dbf00ff56\_skbuild\win-amd64-3.11\cmake-build\src\Levenshtein\levenshtein_cpp.vcxproj]
        Gnration de code en cours...
        File "C:\Users\thher\AppData\Local\Temp\pip-build-env-snm1r45x\overlay\Lib\site-packages\skbuild\setuptools_wrap.py", line 645, in setup
          cmkr.make(make_args, install_target=cmake_install_target, env=env)
        File "C:\Users\thher\AppData\Local\Temp\pip-build-env-snm1r45x\overlay\Lib\site-packages\skbuild\cmaker.py", line 680, in make
          self.make_impl(clargs=clargs, config=config, source_dir=source_dir, install_target=install_target, env=env)
        File "C:\Users\thher\AppData\Local\Temp\pip-build-env-snm1r45x\overlay\Lib\site-packages\skbuild\cmaker.py", line 704, in make_impl
          raise SKBuildError(


      --------------------------------------------------------------------------------
      -- Trying "Ninja (Visual Studio 17 2022 x64 v143)" generator
      --------------------------------
      ---------------------------
      ----------------------
      -----------------
      ------------
      -------
      --
      --
      -------
      ------------
      -----------------
      ----------------------
      ---------------------------
      --------------------------------
      -- Trying "Ninja (Visual Studio 17 2022 x64 v143)" generator - failure
      --------------------------------------------------------------------------------



      --------------------------------------------------------------------------------
      -- Trying "Visual Studio 17 2022 x64 v143" generator
      --------------------------------
      ---------------------------
      ----------------------
      -----------------
      ------------
      -------
      --
      --
      -------
      ------------
      -----------------
      ----------------------
      ---------------------------
      --------------------------------
      -- Trying "Visual Studio 17 2022 x64 v143" generator - failure
      --------------------------------------------------------------------------------



      --------------------------------------------------------------------------------
      -- Trying "Ninja (Visual Studio 16 2019 x64 v142)" generator
      --------------------------------
      ---------------------------
      ----------------------
      -----------------
      ------------
      -------
      --
      --
      -------
      ------------
      -----------------
      ----------------------
      ---------------------------
      --------------------------------
      -- Trying "Ninja (Visual Studio 16 2019 x64 v142)" generator - failure
      --------------------------------------------------------------------------------



      --------------------------------------------------------------------------------
      -- Trying "Visual Studio 16 2019 x64 v142" generator
      --------------------------------
      ---------------------------
      ----------------------
      -----------------
      ------------
      -------
      --
      --
      -------
      ------------
      -----------------
      ----------------------
      ---------------------------
      --------------------------------
      -- Trying "Visual Studio 16 2019 x64 v142" generator - success
      --------------------------------------------------------------------------------

      Configuring Project
        Working directory:
          C:\Users\thher\AppData\Local\Temp\pip-install-hdi1djen\levenshtein_8bd27ecdfaa145b1ac36fc5dbf00ff56\_skbuild\win-amd64-3.11\cmake-build
        Command:
          cmake 'C:\Users\thher\AppData\Local\Temp\pip-install-hdi1djen\levenshtein_8bd27ecdfaa145b1ac36fc5dbf00ff56' -G 'Visual Studio 16 2019' '-DCMAKE_INSTALL_PREFIX:PATH=C:\Users\thher\AppData\Local\Temp\pip-install-hdi1djen\levenshtein_8bd27ecdfaa145b1ac36fc5dbf00ff56\_skbuild\win-amd64-3.11\cmake-install' -DPYTHON_VERSION_STRING:STRING=3.11.0rc1 -DSKBUILD:INTERNAL=TRUE '-DCMAKE_MODULE_PATH:PATH=C:\Users\thher\AppData\Local\Temp\pip-build-env-snm1r45x\overlay\Lib\site-packages\skbuild\resources\cmake' '-DPython3_EXECUTABLE:FILEPATH=C:\Users\thher\AppData\Local\Programs\Python\Python311\python.exe' '-DPython3_INCLUDE_DIR:PATH=C:\Users\thher\AppData\Local\Programs\Python\Python311\Include' '-DPython3_LIBRARY:PATH=C:\Users\thher\AppData\Local\Programs\Python\Python311\libs\python311.lib' '-DPython_EXECUTABLE:FILEPATH=C:\Users\thher\AppData\Local\Programs\Python\Python311\python.exe' '-DPython_INCLUDE_DIR:PATH=C:\Users\thher\AppData\Local\Programs\Python\Python311\Include' '-DPython_LIBRARY:PATH=C:\Users\thher\AppData\Local\Programs\Python\Python311\libs\python311.lib' '-DPYTHON_EXECUTABLE:FILEPATH=C:\Users\thher\AppData\Local\Programs\Python\Python311\python.exe' '-DPYTHON_INCLUDE_DIR:PATH=C:\Users\thher\AppData\Local\Programs\Python\Python311\Include' '-DPYTHON_LIBRARY:PATH=C:\Users\thher\AppData\Local\Programs\Python\Python311\libs\python311.lib' -T v142 -A x64 -DCMAKE_BUILD_TYPE:STRING=Release

      Traceback (most recent call last):

      An error occurred while building with CMake.
        Command:
          cmake --build . --target install --config Release --
        Install target:
          install
        Source directory:
          C:\Users\thher\AppData\Local\Temp\pip-install-hdi1djen\levenshtein_8bd27ecdfaa145b1ac36fc5dbf00ff56
        Working directory:
          C:\Users\thher\AppData\Local\Temp\pip-install-hdi1djen\levenshtein_8bd27ecdfaa145b1ac36fc5dbf00ff56\_skbuild\win-amd64-3.11\cmake-build
      Please check the install target is valid and see CMake's output for more information.
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
  ERROR: Failed building wheel for levenshtein
Failed to build levenshtein
ERROR: Could not build wheels for levenshtein, which is required to install pyproject.toml-based projects

Segmentation fault on import

I have had no problems installing this library on my Windows machine, but when I tried to do it on a Raspberry Pi running Raspbian, all I got was a segmentation fault.

pip3 install Levenshtein

python
import Levenshtein
Segmentation fault

Can't build from source on arm32v7

Hi, I'm trying to install Levenshtein 0.20.4 on Ubuntu Focal arm32v7, which builds from source as there are no wheels available, and getting the following error:
https://hastebin.com/opewiyapoq.sql

I did install cmake (OS package) and the ninja and rapidfuzz pip packages prior to running pip install Levenshtein.

Thanks

ModuleNotFoundError: No module named 'Levenshtein.levenshtein_cpp'

Good morning Everyone!
I ran into this problem a few minutes ago. Why does this error occur?

PythonException: 
  An exception was thrown from the Python worker. Please see the stack trace below.
Traceback (most recent call last):
  File "<frozen zipimport>", line 262, in load_module
  File "C:\Users\Jeremy\AppData\Local\Temp\spark-bc9b7249-7b7c-45f4-bb98-bfb831cd02ea\userFiles-31251d1d-d5c1-43cc-9182-c4f18afaa741\Levenshtein.zip\Levenshtein\__init__.py", line 35, in <module>
    from Levenshtein.levenshtein_cpp import (
ModuleNotFoundError: No module named 'Levenshtein.levenshtein_cpp'

examples in docs do not accurately reflect usage (or are confusing)

For example, https://maxbachmann.github.io/Levenshtein/levenshtein.html#Levenshtein.ratio shows usage of rapidfuzz in the examples, when it should just be Levenshtein.ratio

In the case of https://maxbachmann.github.io/Levenshtein/levenshtein.html#distance the example uses "from rapidfuzz.distance import Levenshtein", which conflicts with importing this project's Levenshtein module.
While rapidfuzz.distance.Levenshtein.distance works, there is no rapidfuzz.distance.Levenshtein.ratio.

These issues seem to stem from auto-generated docs that pull from the rapidfuzz dependency.

Damerau–Levenshtein distance

Hello.

Is it possible to calculate a distance that also counts swaps of adjacent symbols (transpositions) using this library?
https://en.wikipedia.org/wiki/Damerau%E2%80%93Levenshtein_distance

If this is not possible now, would it be difficult to add such a feature?

Here is one implementation of this functionality:

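def damerau_levenshtein_distance(s1, s2):
    """Return the restricted Damerau-Levenshtein (optimal string alignment) distance."""
    # d[(i, j)] is the distance between s1[:i+1] and s2[:j+1];
    # index -1 stands for the empty prefix.
    d = {}
    lenstr1 = len(s1)
    lenstr2 = len(s2)
    for i in range(-1, lenstr1 + 1):
        d[(i, -1)] = i + 1
    for j in range(-1, lenstr2 + 1):
        d[(-1, j)] = j + 1

    for i in range(lenstr1):
        for j in range(lenstr2):
            cost = 0 if s1[i] == s2[j] else 1
            d[(i, j)] = min(
                d[(i - 1, j)] + 1,         # deletion
                d[(i, j - 1)] + 1,         # insertion
                d[(i - 1, j - 1)] + cost,  # substitution
            )
            # Two adjacent characters are swapped: count the swap as one edit.
            if i and j and s1[i] == s2[j - 1] and s1[i - 1] == s2[j]:
                d[(i, j)] = min(d[(i, j)], d[(i - 2, j - 2)] + 1)  # transposition

    return d[(lenstr1 - 1, lenstr2 - 1)]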
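Note that the function above only swaps adjacent characters and never edits the same substring twice, so it computes the restricted variant usually called optimal string alignment (OSA), not the unrestricted Damerau-Levenshtein distance. A pure-Python sketch of the difference follows; the names osa_distance and damerau_levenshtein are hypothetical and are not this library's API:

```python
def osa_distance(s1, s2):
    """Optimal string alignment (restricted Damerau-Levenshtein) distance."""
    rows, cols = len(s1) + 1, len(s2) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i
    for j in range(cols):
        d[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if s1[i - 1] == s2[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
            # Swap of two adjacent characters.
            if i > 1 and j > 1 and s1[i - 1] == s2[j - 2] and s1[i - 2] == s2[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


def damerau_levenshtein(s1, s2):
    """Unrestricted Damerau-Levenshtein distance."""
    max_dist = len(s1) + len(s2)
    # The table is shifted by one so that row/column 0 holds the sentinel
    # max_dist and row/column 1 represents the empty prefix.
    d = [[max_dist] * (len(s2) + 2) for _ in range(len(s1) + 2)]
    for i in range(len(s1) + 1):
        d[i + 1][1] = i
    for j in range(len(s2) + 1):
        d[1][j + 1] = j
    last_row = {}  # last 1-based row in which each character of s1 occurred
    for i in range(1, len(s1) + 1):
        last_col = 0  # last 1-based column in this row where the characters matched
        for j in range(1, len(s2) + 1):
            k = last_row.get(s2[j - 1], 0)
            l = last_col
            if s1[i - 1] == s2[j - 1]:
                cost = 0
                last_col = j
            else:
                cost = 1
            d[i + 1][j + 1] = min(
                d[i][j] + cost,   # substitution or match
                d[i + 1][j] + 1,  # insertion
                d[i][j + 1] + 1,  # deletion
                d[k][l] + (i - k - 1) + 1 + (j - l - 1),  # transposition
            )
        last_row[s1[i - 1]] = i
    return d[len(s1) + 1][len(s2) + 1]


print(osa_distance("ca", "abc"))         # 3
print(damerau_levenshtein("ca", "abc"))  # 2
```

The two agree on simple swaps such as "ab" and "ba" (distance 1) and differ only when characters are inserted between the transposed pair, as in "ca" and "abc".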
