
matchr's People

Contributors

antzucaro, buckhx, choilmto, garymoon, saschagrunert

matchr's Issues

JaroWinkler -- panic: runtime error: index out of range

matchr.JaroWinkler("dr", "driveway", true)
--
panic: runtime error: index out of range
github.com/antzucaro/matchr.jaroWinklerBase(0xc82056cf71, 0x2, 0xc82056cfd1, 0x8, 0x101, 0x3fe8000000000000)
    /home/vagrant/workspace/go/src/github.com/antzucaro/matchr/jarowinkler.go:100 +0x59d
github.com/antzucaro/matchr.JaroWinkler(0xc82056cf71, 0x2, 0xc82056cfd1, 0x8, 0xb69801, 0xc8202a8418)
    /home/vagrant/workspace/go/src/github.com/antzucaro/matchr/jarowinkler.go:134 +0x50

As a hotfix, I changed line 100 to explicitly check that r1 and r2 both have an element at index i, like this:
for i = 0; i < j && len(r1) > i && len(r2) > i && r1[i] == r2[i] && nan(r1[i]); i++ {
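The bounds check in the hotfix above can also be expressed by clamping the loop limit once, before the loop starts. The following is a minimal sketch, not matchr's actual code: `commonPrefixLen` and its `limit` parameter are hypothetical names, and `unicode.IsDigit` is only an assumed stand-in for what the library's `nan` helper checks.

```go
package main

import (
	"fmt"
	"unicode"
)

// commonPrefixLen is a hypothetical helper (not part of matchr). It counts the
// leading runes that r1 and r2 share, up to limit, stopping at the first digit.
// unicode.IsDigit is an assumption about what matchr's nan() helper tests.
func commonPrefixLen(r1, r2 []rune, limit int) int {
	// Clamp the limit to the shorter input so r1[i] and r2[i] are always valid.
	if len(r1) < limit {
		limit = len(r1)
	}
	if len(r2) < limit {
		limit = len(r2)
	}
	i := 0
	for i < limit && r1[i] == r2[i] && !unicode.IsDigit(r1[i]) {
		i++
	}
	return i
}

func main() {
	// The input pair from the report: the shorter string has only two runes.
	fmt.Println(commonPrefixLen([]rune("dr"), []rune("driveway"), 4)) // prints 2
}
```

Clamping up front keeps the loop condition short and makes it obvious that no index can exceed either slice.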

[Question] Double Metaphone Max Length of 4

Hi there! I was wondering why there is a hardcoded max length of 4 for Double Metaphone. (I've also noticed that other implementations limit this to 4.) Is there a particular design decision behind this?

License?

Have you chosen a license for this project? I'd love to use your Smith-Waterman implementation in a GPL-licensed bioinformatics project of mine:

https://github.com/plsql/jh-bio

Let me know.

Thanks!

NYSIIS should handle number and symbol strings without panic

If the NYSIIS function receives a string that contains only numbers, or only numbers and symbols, it panics. The function should probably return an empty string ("") instead. Line 24 triggers the panic because the input is empty once the numbers and symbols have been stripped.

I have a fix with tests ready for this in a local branch if you want it. Alternatively, you can add this code right around line 24.

	// if no characters are left return blank
	if len(input) == 0 {
		return ""
	}

Test cases:

	{"2002", ""},
	{"1/2", ""},
	{"", ""},
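To illustrate where such a guard sits, here is a minimal sketch under stated assumptions. It does not reproduce matchr's NYSIIS code: `lettersOnly` and `encodeGuarded` are hypothetical names, and the `encode` argument is a placeholder for the real algorithm, which is assumed to index into its input without checking the length.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// lettersOnly is a hypothetical stand-in for the cleanup step: it upper-cases
// letters and drops everything else (digits, symbols, whitespace).
func lettersOnly(s string) string {
	return strings.Map(func(r rune) rune {
		if unicode.IsLetter(r) {
			return unicode.ToUpper(r)
		}
		return -1 // a negative value tells strings.Map to drop the rune
	}, s)
}

// encodeGuarded shows only the guard. encode is a placeholder for the real
// phonetic encoding, which is assumed to expect non-empty input.
func encodeGuarded(s string, encode func(string) string) string {
	input := lettersOnly(s)
	// If no characters are left, return blank instead of calling encode.
	if len(input) == 0 {
		return ""
	}
	return encode(input)
}

func main() {
	// firstLetter slices in[:1], which would panic on an empty string.
	firstLetter := func(in string) string { return in[:1] }
	for _, s := range []string{"2002", "1/2", "", "knight"} {
		fmt.Printf("%q -> %q\n", s, encodeGuarded(s, firstLetter))
	}
}
```

The first three inputs match the test cases listed above and yield an empty string without reaching the placeholder encoder.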

Difference between libraries

Hi.

I have tried this library and compared it with https://github.com/adrg

In some cases we experience differences in the results:

package main

import (
	"fmt"

	"github.com/adrg/strutil/metrics"
	"github.com/antzucaro/matchr"
)

func main() {
	r2 := "wilson kjell"
	r1 := "wilson mathias"
	fmt.Printf("matchr long distance:%f\n", matchr.JaroWinkler(r1, r2, true))
	fmt.Printf("matchr short distance:%f\n", matchr.JaroWinkler(r1, r2, false))

	m := metrics.NewJaroWinkler()
	fmt.Printf("adrg:%f\n", m.Compare(r2, r1))
}
// matchr long distance:0.694444
// matchr short distance:0.694444
// adrg:0.816667

https://go.dev/play/p/z2IQsqYjIDQ

What is the correct distance between these strings?

The original implementation (strcmp95), called from Perl, gives us 0.83523.

Thank you.
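One way to investigate differences like these is to compare against a plain textbook Jaro-Winkler that has none of the extra adjustments found in strcmp95 (such as credit for similar characters or the long-string adjustment). The sketch below is written for illustration only; it is neither matchr's nor adrg's code, the function names are hypothetical, and the prefix cap of 4 and scaling factor of 0.1 are the commonly cited defaults rather than values taken from either library.

```go
package main

import "fmt"

// jaro computes the textbook Jaro similarity of two rune slices.
func jaro(a, b []rune) float64 {
	if len(a) == 0 && len(b) == 0 {
		return 1
	}
	if len(a) == 0 || len(b) == 0 {
		return 0
	}
	longest := len(a)
	if len(b) > longest {
		longest = len(b)
	}
	window := longest/2 - 1 // how far apart two matching runes may be
	if window < 0 {
		window = 0
	}
	matchedA := make([]bool, len(a))
	matchedB := make([]bool, len(b))
	matches := 0
	for i := range a {
		lo, hi := i-window, i+window+1
		if lo < 0 {
			lo = 0
		}
		if hi > len(b) {
			hi = len(b)
		}
		for j := lo; j < hi; j++ {
			if !matchedB[j] && a[i] == b[j] {
				matchedA[i], matchedB[j] = true, true
				matches++
				break
			}
		}
	}
	if matches == 0 {
		return 0
	}
	// Count matched runes that appear in a different order in a and b.
	outOfOrder, k := 0, 0
	for i := range a {
		if !matchedA[i] {
			continue
		}
		for !matchedB[k] {
			k++
		}
		if a[i] != b[k] {
			outOfOrder++
		}
		k++
	}
	m := float64(matches)
	t := float64(outOfOrder) / 2 // transpositions
	return (m/float64(len(a)) + m/float64(len(b)) + (m-t)/m) / 3
}

// jaroWinkler boosts the Jaro score for a shared prefix of up to 4 runes.
func jaroWinkler(s1, s2 string) float64 {
	a, b := []rune(s1), []rune(s2)
	j := jaro(a, b)
	prefix := 0
	for prefix < 4 && prefix < len(a) && prefix < len(b) && a[prefix] == b[prefix] {
		prefix++
	}
	return j + float64(prefix)*0.1*(1-j)
}

func main() {
	fmt.Printf("%.4f\n", jaroWinkler("MARTHA", "MARHTA")) // prints 0.9611
	fmt.Printf("%.6f\n", jaroWinkler("wilson mathias", "wilson kjell"))
}
```

If a library's result differs from this baseline, the gap likely comes from adjustments beyond the basic formula or from a different matching window, which narrows down where to look.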

Remove or modify charAt()

In util.go, the function charAt is never used and can be removed completely. If you do intend to use it, however, you may want to change it.

Currently, if the provided index is out of bounds, the function returns 0. However, Go, unlike C, allows null characters (\u0000) inside strings. It is therefore impossible to distinguish between an out-of-bounds index and a null character.

You could fix this by returning -1 instead, since the rune type is just an alias for int32.
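A minimal sketch of that suggestion follows. The signature is an assumption, since the original charAt in util.go is not shown here; the point is only that -1 can never be a valid rune, so it cannot be confused with a null character.

```go
package main

import "fmt"

// charAt returns the rune at the given rune index of s, or -1 when the index
// is out of bounds. The signature is hypothetical, not copied from util.go.
func charAt(s string, index int) rune {
	runes := []rune(s)
	if index < 0 || index >= len(runes) {
		return -1 // sentinel: no valid rune is negative
	}
	return runes[index]
}

func main() {
	s := "a\x00b" // contains a real null character at index 1
	fmt.Println(charAt(s, 1)) // prints 0: an actual null character
	fmt.Println(charAt(s, 5)) // prints -1: index out of bounds
}
```

An alternative design would return a second boolean value, in the style of Go's comma-ok idiom, which avoids a sentinel altogether.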
