arvkevi / kneed

Knee point detection in Python :chart_with_upwards_trend:

Home Page: https://kneed.readthedocs.io

License: BSD 3-Clause "New" or "Revised" License

Python 99.85% Shell 0.15%
systems scientific-computing data-science data-analysis python knee-point elbow-method

kneed's Introduction

Hello, 👋 my name is Kevin Arvai. I'm a data scientist with 10+ years of experience in the genomics field.

Connect 🤝
Since you're here, let me know you stopped by. Share your Python or data science story with me on Twitter or LinkedIn. I love hearing about what people are working on in the open-source community!

Favorite project 🧬
I wrote an app that predicts users' ancestry from their genetic data.

Non-GitHub stuff 💻
I like machine learning, open-source software/data, and genomics.
My Real Python articles, blog posts, and Kaggle profile.

kneed's People

Contributors

arvkevi, big-o, blackrobe, gperakis, janson-l, jscheffner, kev494, m-birke, peterdha, samhanic, shotgunosine, tommilligan

kneed's Issues

typo in example

Thank you for making this available.

FROM
plt.plot(kl_online.x_differencece, kl_online.y_difference);

TO
plt.plot(kl_online.x_difference, kl_online.y_difference);

Not finding correct elbow

Not entirely sure what the issue is on this one. The dataset has a very clear elbow. It is a decreasing, convex curve.

The calculated elbow is way off, but I can get the desired index by just grabbing the index of the maximum distance. Will this not always be the case?
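
The "maximum distance" idea mentioned above can be sketched in plain Python. This is only an illustration, not kneed's implementation: the function name max_distance_index and the sample data are hypothetical. It normalizes both axes to [0, 1] and returns the index of the point farthest from the straight line joining the first and last points.

```python
def max_distance_index(x, y):
    """Index of the point farthest from the chord joining the endpoints."""
    # Min-max normalize so both axes contribute on the same scale.
    x_min, x_max = min(x), max(x)
    y_min, y_max = min(y), max(y)
    xn = [(v - x_min) / (x_max - x_min) for v in x]
    yn = [(v - y_min) / (y_max - y_min) for v in y]

    # Chord from the first to the last point.
    x0, y0, x1, y1 = xn[0], yn[0], xn[-1], yn[-1]
    dx, dy = x1 - x0, y1 - y0
    length = (dx * dx + dy * dy) ** 0.5

    # Perpendicular distance of each point from the chord.
    distances = [abs(dy * (px - x0) - dx * (py - y0)) / length
                 for px, py in zip(xn, yn)]
    return max(range(len(distances)), key=distances.__getitem__)


# Hypothetical decreasing, convex curve with a visible elbow.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [100, 60, 35, 20, 15, 12, 10, 9]
elbow_index = max_distance_index(x, y)
print(x[elbow_index], y[elbow_index])
```

On clean data with a single elbow, the global maximum is often the answer one expects. The Kneedle paper additionally works with local maxima and a sensitivity threshold, which may matter for noisy curves or curves with several candidate knees.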

find_knee() picks kneepoint from the wrong array

find_knee() finds the index of the knee point correctly, and it also assigns the correct value from the normalized array: norm_knee_ = self.xsn[self.xmx_idx[mxmx_i]]
However, it selects the actual knee from the raw input data, whereas it should use the values from the evenly spaced array.
currently: knee_ = self.x[self.xmx_idx[mxmx_i]]
should be: knee_ = self.Ds_x[self.xmx_idx[mxmx_i]]

Sensitivity Parameter (S) does not work in polynomial fit?

Starting from your README.md example, I was experimenting with your excellent kneed algorithm. However, in the polynomial fit example the S parameter seems to have no effect. Specifically, there is no difference between

kneedle = KneeLocator(x, y, S=1, curve='convex', direction='decreasing', interp_method='polynomial')

and

kneedle = KneeLocator(x, y, S=28.0, curve='convex', direction='decreasing', interp_method='polynomial')

Will you please have a look?
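
For context, the Kneedle paper defines a threshold for each local maximum of the difference curve: the value at that maximum minus S times the mean spacing of the normalized x values. A knee is declared when the difference curve falls below that threshold before the next local maximum. The sketch below only illustrates that formula; kneedle_threshold is a hypothetical helper and not part of kneed's API.

```python
def kneedle_threshold(x_normalized, local_max_value, S=1.0):
    """Threshold below which the difference curve must fall to confirm a knee."""
    n = len(x_normalized)
    # Mean spacing between consecutive normalized x values.
    mean_spacing = sum(
        x_normalized[i + 1] - x_normalized[i] for i in range(n - 1)
    ) / (n - 1)
    return local_max_value - S * mean_spacing


# Eleven evenly spaced points on [0, 1] have a mean spacing of 0.1.
xs = [i / 10 for i in range(11)]
low_s = kneedle_threshold(xs, local_max_value=0.5, S=1.0)
high_s = kneedle_threshold(xs, local_max_value=0.5, S=3.0)
print(low_s, high_s)  # a larger S gives a lower threshold
```

One possible explanation for the observation above, offered as a guess only: a smooth polynomial fit may produce a difference curve with a single local maximum, in which case the same maximum could be reported for a wide range of S.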

kneed fails to find elbow point

Hi,

I can not calculate the knee point of the following dataset:

x =[7.58664459e-05, 1.11978460e-04, 2.38899123e-04, 4.90114492e-04, 7.82853340e-04, 1.02036258e-03, 1.31875021e-03, 2.30043176e-03, 4.40276718e-03, 7.70743315e-03, 1.60477105e-02, 3.83262376e-02]
y =[0.27655736, 0.27249274, 0.2646699 , 0.25457005, 0.248055 , 0.2457939 , 0.24485397, 0.24355771, 0.24200204, 0.24073301, 0.23906376, 0.23606901]
Figure_1

The kneed algorithm always chooses the first datapoint, which is obviously not the elbow point.

Thanks

Minimum number of samples

The following

from kneed import KneeLocator
import numpy as np
import matplotlib.pyplot as plt

x = [ 1, 2, 3, 4, 5, 6 ]
y = [ -481., 783., 1019., 1158., 1224., 1293 ]

kneedle = KneeLocator(x, y, S=1.0, curve='concave', direction='increasing')
kneedle.plot_knee_normalized()
plt.show()

gives the right answer:
knee

However, collecting samples is time-consuming, so I would like to reach the same conclusion with fewer samples. With x[:5] and y[:5], the answer is still correct, but with x[:4] and y[:4], no elbow can be found.
So my question is: What is the minimum number of samples required for the detection? Is there a way to make it work with only 4 samples?

Thanks !

IndexError: Line 271

Hi,
I believe we need a safeguard against going out of bounds in the find_knee function, Line 271. Since j=i+1, it can run past the end of the array in certain cases!

to reproduce the error:
x = [0.34, 0.32, 0.30]
y = [1, 2, 6]
kn = KneeLocator(x, y, curve="convex", direction="decreasing")

Write more tests

write tests for:

  1. smaller intervals between values in x.
  2. x arrays < len(10).

Work from #15

Request: Documentation

Your kneedle algorithm implementation works pretty well for me, thank you very much!

Just one minor issue: the package itself lacks some documentation. What I have been looking for:

  • General documentation for the "kneed" module. Unfortunately, the "kneed" package has no docstring once imported:
    Docstring: <no docstring>
    I would like to be able to read full documentation for the provided functions and methods from the module's offline documentation alone.

  • Finding the "best" sensitivity parameter is documented here, but I think the statement "S is a measure of how many “flat” points we expect to see in the unmodified data curve before declaring a knee." needs to be represented in the docstring for kneed.KneeLocator() as well. Currently it's just
    :ivar S: Sensitivity, original paper suggests default of 1.0

In my opinion, the parameter information in the docstring is currently a little confusing. I think it would help if parameter names, descriptions, and accepted data types were cleaned up into an easier-to-read format, like the docstrings in numpy or pandas. For example:

:param S: Sensitivity, original paper suggests default of 1.0
:type S: float

to

Parameters
----------
S : float
    Sensitivity, original paper suggests default of 1.0

Kind regards,
DK

standardize the plots

and make them more visually appealing

  • add a legend
  • add a title
  • add labels to the axes

Addition of figsize param in plots ?

Hello Kevin,

Great work with the algo and the plots. One thing that I would personally prefer, is to let the user define the figure size of the plots.
What do you think?

Best,
George

Change print output to warnings

Would it be possible to change the two prints in your code to warnings?
I am using your handy tool to calculate the optimal k for a recursive k-means clustering that builds a top-down hierarchical clustering. Naturally, at some point there will be no optimal k, and then I end the recursion. The repeated prints slow down the application quite a bit. With the warnings module I could ignore them.
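
A minimal sketch of what this request would enable, using only the standard library warnings module. The function find_optimal_k is a hypothetical stand-in for any routine that may fail to find a knee; it is not kneed code.

```python
import warnings


def find_optimal_k(scores):
    """Hypothetical stand-in for a routine that may not find a knee."""
    if len(set(scores)) <= 1:
        # A warning, unlike print(), can be filtered by the caller.
        warnings.warn("No knee/elbow found", UserWarning)
        return None
    return scores.index(max(scores))


# Caller side: silence the warning inside a recursive clustering loop.
with warnings.catch_warnings():
    warnings.simplefilter("ignore", UserWarning)
    result = find_optimal_k([1.0, 1.0, 1.0])

print(result)  # None, and no warning is shown
```

The catch_warnings context manager restores the previous filter state on exit, so the suppression stays local to the recursion.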

Rename `invert` to `curvature`

A more informative variable name would be curvature.

  1. Change invert to curvature
  2. Change conditionals to implement 'convex' and 'concave' instead of True and False
  3. Raise exception when no slope.

y_transform doesn't work with unevenly spaced x values

def transform_y(y: Iterable[float], direction: str, curve: str) -> float:

The transformation done to turn the curve into concave/increasing from a concave/decreasing curve doesn't work when you have unevenly spaced x values. When the np.flip(y) function is called it will match up the y values with the x values according to index, but not according to the actual original values. This means, if you have x values which get increasingly close together in the original curve, the transformed curve will actually become a convex/increasing curve.

Unfortunately I can't include a reproducible example right now, but hopefully this will demonstrate what I mean.

output

As you can see, in the original curve, the x values start quite far apart, then get increasingly close together. The y_transform flips the y values, but does not preserve their proper spacing relative to the curve so the shape doesn't change as expected. We can also see that the original curve would have a knee at around 1.5, but the transformed curve would have an elbow at around 3.85.

I don't really know of a way to fix this, but I find it a bit strange to transform the curve this way. I haven't dug through all of the source code, but if I understand the original paper correctly, the difference curve is calculated according to the difference from a straight line. Why not just change what straight line is being compared against for the difference calculation for each of the four convex/concave and increasing/decreasing combinations?

In the case of this example (i.e. concave/decreasing), the normalized straight line would be y = -x + 1 and it looks like that would result in a reasonable difference curve:

# Calculate the difference curve
y_line = -x_normalized + 1

y_difference = y_normalized - y_line
x_difference = x_normalized.copy()

plt.title("Normalized spline & difference curve")
plt.plot(x_normalized, y_normalized, '.', label='original curve');
plt.plot(x_normalized, y_line, '--', label='straight line comparison');
plt.plot(x_difference, y_difference, label='difference curve');
plt.legend()

output2

With that difference curve, it then appears that the knee could be correctly identified at around 0.5 (normalized x values). Then, for an increasing curve, the line to compare against is y = x, and for the convex curves, we take the absolute value of the difference.
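
The proposal above can be sketched as a single function covering all four combinations. This follows the reporter's description only (line y = x for increasing, y = -x + 1 for decreasing, absolute value for convex); it is not how kneed is implemented, and difference_curve and the sample data are hypothetical.

```python
def difference_curve(x_normalized, y_normalized, direction, curve):
    """Difference from the reference line, per the proposal above."""
    if direction == "increasing":
        line = list(x_normalized)                # y = x
    elif direction == "decreasing":
        line = [1.0 - v for v in x_normalized]   # y = -x + 1
    else:
        raise ValueError("direction must be 'increasing' or 'decreasing'")

    diff = [yv - lv for yv, lv in zip(y_normalized, line)]
    if curve == "convex":
        # Convex curves lie below the line, so take the magnitude.
        diff = [abs(d) for d in diff]
    elif curve != "concave":
        raise ValueError("curve must be 'concave' or 'convex'")
    return diff


# Unevenly spaced, already-normalized x values (hypothetical data).
xn = [0.0, 0.4, 0.7, 0.9, 1.0]
yn = [1.0, 0.95, 0.8, 0.5, 0.0]   # concave and decreasing
diff = difference_curve(xn, yn, "decreasing", "concave")
knee_index = max(range(len(diff)), key=diff.__getitem__)
print(xn[knee_index])
```

Because no array is flipped, each y value stays paired with its own x value, so uneven spacing does not distort the shape.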

update notebooks

The jupyter notebooks need to be updated to reflect changes in the source code.

Fix direction='increasing'

Kneedle isn't working on this curve

x = [-10, -9, -8, -7, -6, -5, -4, -3, -2 , -1, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
y = [0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 107, 122, 145, 176, 215, 262, 317, 380, 451, 530, 617]

Create the difference curve using yd = abs(ysn - xsn).

Problem with invert=True

Thanks for writing this library, it looks very useful.

I'm having trouble with the invert=True parameter in Python 3.5. It throws an error on line 22:

        if self.invert:
            self.original_x = self.x, self.original_y = self.y

so i changed it to:

        if self.invert:
            self.original_x, self.original_y = self.x, self.y

That works, but then it throws an error on line 27:

        if not np.array_equal(np.array(self.x), np.sort(self.x)):
            raise ValueError('x values must be sorted')

My code is using walthrough.ipynb with data from the DataGenerator, and the only thing I changed is this line in [5]:
kneedle = KneeLocator(x, y, S=1.0, invert=True)

Do you have any suggestions for how to get it working?
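
For anyone wondering why the original line fails: Python parses a = b, c = d as a chained assignment, a = (b, c) = d, so the right-hand value is unpacked into two targets. The sketch below reproduces that with a hypothetical Holder class standing in for the locator object; the attribute names mirror the snippet above.

```python
class Holder:
    def __init__(self, x, y):
        self.x = x
        self.y = y


h = Holder([1, 2, 3], [10, 20, 30])

try:
    # Python reads this as: h.original_x = (h.x, h.original_y) = h.y
    h.original_x = h.x, h.original_y = h.y
    outcome = "no error"
except ValueError as exc:
    outcome = f"ValueError: {exc}"
print(outcome)

# Tuple unpacking, as in the reporter's fix, assigns each name once.
h.original_x, h.original_y = h.x, h.y
```

With a y of any length other than two, the chained form raises a ValueError while unpacking, which would be consistent with the error reported on line 22.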

Implementation Detail

Hi
I was looking at the implementation of the Kneedle class's method of differences to find the optima for elbow and knee location.

Could one not use the max of the second derivatives(curvature) to find this? In which case I don't think you have to worry about convexity and direction.
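
A rough sketch of the suggestion, using a central second finite difference as a proxy for curvature. It assumes evenly spaced x and noise-free data, both of which are assumptions made here for illustration; max_second_difference_index and the sample data are hypothetical.

```python
def max_second_difference_index(x, y):
    """Index with the largest absolute second finite difference.

    Assumes evenly spaced x; no smoothing is applied.
    """
    h = x[1] - x[0]
    best_index, best_value = None, -1.0
    for i in range(1, len(y) - 1):
        second = abs(y[i - 1] - 2 * y[i] + y[i + 1]) / (h * h)
        if second > best_value:
            best_index, best_value = i, second
    return best_index


x = [0, 1, 2, 3, 4, 5, 6]
y = [0, 4, 7.5, 9, 9.5, 9.8, 10]   # hypothetical concave, increasing data
index = max_second_difference_index(x, y)
print(x[index])
```

Taking the absolute value does remove the need to specify convexity and direction, but second differences amplify noise and depend on the scale of the axes, so real data would likely need smoothing and normalization first.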

KneeLocator fails if there are flat extrema

This simple example fails:

from kneed import KneeLocator
x = [0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0, 11.0, 12.0, 13.0, 14.0, 15.0, 16.0, 17.0]
y = [1, 0.787701317715959, 0.7437774524158126, 0.6559297218155198, 0.5065885797950219, 0.36749633967789164, 0.2547584187408492, 0.16251830161054173, 0.10395314787701318, 0.06734992679355783, 0.043923865300146414, 0.027818448023426062, 0.01903367496339678, 0.013177159590043924, 0.010248901903367497, 0.007320644216691069, 0.005856515373352855, 0.004392386530014641]
k = KneeLocator(x, y, curve='convex', direction='decreasing')

Output:
UserWarning: No knee/elbow found

However, if we obtain the normalized knee plot, it is clear that there is a "flat optimum".

knee

It seems that the algorithm should be able to find that point between 0.4 and 0.5.

I've been able to work around this issue by modifying knee_locator.py in the calculation of self.maxima_indices and self.minima_indices, using np.greater_equal and np.less_equal rather than np.greater and np.less, but I'm not sure if this is a proper solution.

Thanks!
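
The difference between the strict and non-strict comparisons can be shown without any dependencies. This is a plain-Python illustration of the idea behind np.greater versus np.greater_equal, not the code in knee_locator.py; local_maxima and the sample curve are hypothetical.

```python
def local_maxima(values, strict=True):
    """Interior indices that are local maxima relative to both neighbours."""
    indices = []
    for i in range(1, len(values) - 1):
        left, mid, right = values[i - 1], values[i], values[i + 1]
        if strict:
            is_max = mid > left and mid > right
        else:
            is_max = mid >= left and mid >= right
        if is_max:
            indices.append(i)
    return indices


# Hypothetical difference curve whose peak is a two-point plateau.
difference = [0.0, 0.5, 0.9, 0.9, 0.4, 0.1]
print(local_maxima(difference, strict=True))    # []
print(local_maxima(difference, strict=False))   # [2, 3]
```

One caveat with the non-strict version is that every point inside a flat stretch qualifies, including flat tails, so downstream logic has to cope with several adjacent candidates.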

missing comma

File "C:\ProgramData\Anaconda2\lib\site-packages\kneed\data_generator.py", line 11
mu: float = 50, sigma: float = 10, N: int = 100, seed=42
^
SyntaxError: invalid syntax

Knee point changes depending on input length

Hi!

I'm looking for your insight because I have been facing a situation I can't explain.

Let's suppose we have this original array:

areas = [5.0328, 6.9083555555555565, 7.562222222222222, 8.041955555555557, 8.338044444444442, 8.576977777777776,
              8.737955555555555, 8.87591111111111, 8.967111111111112]

Trying to find the knee point this way:

KneeLocator(range(len(areas)), areas, S=1.0, curve='concave', direction='increasing')

If I pick the first 6 items in areas array, this is the result I get

image

These plots were made with matplotlib, not the plot_knee method, just in case.

But, the weird thing I found comes when I add the next element into the array. So, for the first 7 items of the same array, the results are

image

There you can see the knee point changing from 1 to 2. But checked visually, the optimal point stays at 1, where at least the blue plot shows a knee.

My first approach was changing the online/offline parameter. But the behavior remains the same.

I really appreciate your work!

Thanks in advance

PS, I wonder if the normalization applied to the orange curve is changing the result in these cases. I'm going to check that.
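
The normalization question in the PS can be checked with a short sketch. It min-max normalizes each prefix of areas, with x = 0, 1, 2, ... as in the call above, and takes the position of the largest value of normalized y minus normalized x. This uses only the global maximum of the raw difference curve and leaves out kneed's interpolation and sensitivity handling, so it is an approximation; difference_argmax is a hypothetical helper.

```python
areas = [5.0328, 6.9083555555555565, 7.562222222222222, 8.041955555555557,
         8.338044444444442, 8.576977777777776, 8.737955555555555,
         8.87591111111111, 8.967111111111112]


def difference_argmax(values):
    """Argmax of (normalized y - normalized x) for x = 0, 1, 2, ..."""
    n = len(values)
    lo, hi = min(values), max(values)
    diffs = [(v - lo) / (hi - lo) - i / (n - 1) for i, v in enumerate(values)]
    return max(range(n), key=diffs.__getitem__)


first_six = difference_argmax(areas[:6])
first_seven = difference_argmax(areas[:7])
print(first_six, first_seven)
```

Each prefix is rescaled against its own minimum and maximum, so appending a point changes every normalized value and can move the maximum of the difference curve, which appears consistent with the shift described above.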

Knee point in logistic-like curve

Hi, I have a curve that looks like a logistic function (data array below):

Figure_1

I need to find the knee where it starts to grow, which would be around ~75, but I can't seem to be able to do it. I've tried all the combinations of the curve, direction parameters to no avail.

Is kneed capable of doing this?

Cheers

[array([ 1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10., 11., 12., 13.,
       14., 15., 16., 17., 18., 19., 20., 21., 22., 23., 24., 25., 26.,
       27., 28., 29., 30., 31., 32., 33., 34., 35., 36., 37., 38., 39.,
       40., 41., 42., 43., 44., 45., 46., 47., 48., 49., 50., 51., 52.,
       53., 54., 55., 56., 57., 58., 59., 60., 61., 62., 63., 64., 65.,
       66., 67., 68., 69., 70., 71., 72., 73., 74., 75., 76., 77., 78.,
       79., 80., 81., 82., 83., 84., 85., 86., 87., 88., 89., 90., 91.,
       92., 93., 94., 95., 96., 97., 98.]), array([2.00855493e-45, 1.10299045e-43, 4.48168384e-42, 1.22376580e-41,
       5.10688883e-40, 1.18778110e-38, 5.88777891e-35, 4.25317895e-34,
       4.06507035e-33, 6.88084518e-32, 2.99321831e-31, 1.13291723e-30,
       1.05244482e-28, 2.67578448e-27, 1.22522190e-26, 2.36517846e-26,
       8.30369408e-26, 1.24303033e-25, 2.27726918e-25, 1.06330422e-24,
       5.55017673e-24, 1.92068553e-23, 3.31361011e-23, 1.13575247e-22,
       1.75386416e-22, 6.52680518e-22, 2.05106011e-21, 6.37285545e-21,
       4.16125535e-20, 1.12709507e-19, 5.75853420e-19, 1.73333796e-18,
       2.70099890e-18, 7.53254646e-18, 1.38139433e-17, 3.60081965e-17,
       8.08419977e-17, 1.86378584e-16, 5.36224556e-16, 8.89404640e-16,
       2.34045104e-15, 4.72168880e-15, 6.84378992e-15, 2.26898430e-14,
       3.10087652e-14, 2.78081199e-13, 1.06479577e-12, 2.81002203e-12,
       4.22067092e-12, 9.27095863e-12, 1.54519738e-11, 4.53347819e-11,
       1.35564441e-10, 2.35242087e-10, 4.45253545e-10, 9.78613696e-10,
       1.53140922e-09, 2.81648560e-09, 6.70890436e-09, 1.49724785e-08,
       5.59553565e-08, 1.39510811e-07, 7.64761811e-07, 1.40723957e-06,
       4.97638863e-06, 2.12817943e-05, 3.26471410e-05, 1.02599591e-04,
       3.18774179e-04, 5.67297630e-04, 9.22732716e-04, 1.17445643e-03,
       3.59279384e-03, 3.61936491e-02, 6.39493416e-02, 1.29304829e-01,
       1.72272215e-01, 3.46945901e-01, 5.02826602e-01, 6.24800042e-01,
       7.38412957e-01, 7.59931663e-01, 7.73374421e-01, 7.91421897e-01,
       8.29325597e-01, 8.57718637e-01, 8.73286061e-01, 8.77056835e-01,
       8.93173768e-01, 9.05435646e-01, 9.17217910e-01, 9.19119179e-01,
       9.24810910e-01, 9.26306908e-01, 9.28621233e-01, 9.33855835e-01,
       9.37263027e-01, 9.41651642e-01])]

Non Negative Values in Y

Is there a reason why an error is raised if Y has no negative values?

>>> x = range(2,20)
>>> y=[]
>>> for i in x:
...     y.append(i*100)
>>> x
[2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
>>> y
[200, 300, 400, 500, 600, 700, 800, 900, 1000, 1100, 1200, 1300, 1400, 1500, 1600, 1700, 1800, 1900]
>>> 
>>> kneed.KneeLocator(x,y)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/kneed/knee_locator.py", line 61, in __init__
    self.knee, self.norm_knee, self.knee_x = self.find_knee()
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/kneed/knee_locator.py", line 74, in find_knee
    mxmx_iter = np.arange(self.xmx_idx[0], len(self.xsn))
IndexError: index 0 is out of bounds for axis 0 with size 0

>>> y[0]=-1
>>> kneed.KneeLocator(x,y)
<kneed.knee_locator.KneeLocator object at 0x106a56d10>
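
A likely reading of this traceback is that the sign of y is not the cause. The data above is perfectly linear, so after min-max normalization the curve coincides with the diagonal and the difference curve is zero everywhere, leaving no local maximum to pick. The sketch below checks that with plain Python; the normalize helper is hypothetical.

```python
x = list(range(2, 20))
y = [i * 100 for i in x]


def normalize(values):
    """Min-max scale a sequence to the range [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


xn, yn = normalize(x), normalize(y)
difference = [yv - xv for xv, yv in zip(xn, yn)]
largest = max(abs(d) for d in difference)
print(largest)
```

Setting y[0] = -1 breaks the linearity, which would give the difference curve a nonzero shape and may be why the second call succeeds.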

How to use it with Multivariate X, Throwing interpolation axis Error

Hi Team, I am trying to use the KneeLocator package to get the optimal number of features in my dataset. I have an X data frame with shape (100, 10) and a Y data frame with shape (100, 1). Screenshots attached.

When I pass these to the knee locator algorithm, it throws the error "ValueError: x and y arrays must be equal in length along interpolation axis."

I also tried converting them to NumPy arrays first, but got the same error.

image
image

Unable to detect knee and elbow

import numpy as np
import pandas as pd
from kneed import KneeLocator 

import matplotlib.pyplot as plt

Having the following dummy data:

np.random.seed(42)
y1 = np.random.randint(5,10,5)
y2 = np.random.randint(15,20,5)
y3 = np.random.randint(5,10,5)

y = np.append(y1 , y2)
y = np.append(y , y3)
y_cum = y.cumsum()

df_test = pd.DataFrame({'x': range(len(y)), 'y_cum': y_cum, 'y': y})

If we plot these data we can see an "elbow" (at 4 in the x-axis) and a "knee" (at 9 in the x-axis)
plt.plot(df_test['x'], df_test['y_cum'])
image

But if I run

kneedle = KneeLocator(df_test['x'], df_test['y'], curve='convex', direction='increasing', online=False, S=1)
elbow_point = kneedle.elbow
elbow_point

I get 13 as elbow point

and if i run

kneedle = KneeLocator(df_test['x'], df_test['y'], curve='concave', direction='increasing', online=False, S=1)
elbow_point = kneedle.elbow
elbow_point

I get 1 as elbow point

And if I run

kneedle = KneeLocator(df_test['x'], df_test['y'], curve='convex', direction='increasing', online=True, S=1)
elbow_point = kneedle.elbow
elbow_point

and

kneedle = KneeLocator(df_test['x'], df_test['y'], curve='concave', direction='increasing', online=True, S=1)
elbow_point = kneedle.elbow
elbow_point

so the same as before but with online=True

I get 13 & 1 as elbow points respectively

For completeness these are the randomly generated data:

x : [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
y_cum : [8, 17, 24, 33, 42, 58, 75, 92, 109, 128, 136, 143, 152, 158, 166]

IndexError in KneeLocator after upgrade to 0.5.3

After upgrading to 0.5.3 from 0.5.1, I get an IndexError on an existing case. I didn't expect there to be a change in behaviour, as it was only a patch version bump.

>       self.knee_y = self.y[self.x == self.knee][0]
E       IndexError: index 0 is out of bounds for axis 0 with size 0

The line throwing the error is here
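
The report does not show the root cause, but one way an exact-equality lookup like the one above can come back empty is a knee value that differs from every x value by floating-point rounding. The sketch below shows that failure mode and a nearest-value lookup as one possible workaround; value_at and the data are hypothetical and not taken from kneed.

```python
def value_at(x, y, target):
    """Return the y value whose x is closest to target.

    Unlike an exact equality lookup, this never yields an empty selection.
    """
    nearest = min(range(len(x)), key=lambda i: abs(x[i] - target))
    return y[nearest]


x = [0.1, 0.2, 0.3, 0.4]
y = [10.0, 20.0, 30.0, 40.0]

knee = 0.1 + 0.2   # 0.30000000000000004, not exactly 0.3
exact_matches = [yv for xv, yv in zip(x, y) if xv == knee]
print(exact_matches)          # [] so indexing [0] would raise IndexError
print(value_at(x, y, knee))   # 30.0
```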

Example graphs dont make sense

Hi there, I may be wrong, but it seems that the example graphs at the TOP of the readme ('convex', 'increasing', etc.) don't match the examples below. It's quite confusing... :D

Thanks, please fix. Love the repo.

Problem with curves with multiple slopes

Hi.

First of all, thanks for developing this software. I find it quite interesting and useful for multiple purposes.

I am trying to use it on some data that I have obtained after processing a surface in which I detect local optima (in the z axis). Due to the acquisition process, there are many noise points at the floor level which I want to remove and I thought that a good approach would be to plot a distance plot and cut at the elbow. This manually works, but I wanted to automate the process and that was when I found kneed.

However, in my plot there are changes in the slope of the curve, as shown below. What I want it to do is to cut at the clearly pronounced elbow (around x=500) to remove all the points with an almost equal height. However, the knee point returned by kneed is one at the very beginning of the plot (x=26).

Any suggestions on how to tackle this?

I attach two pictures: one of the elbow plot and the one obtained with the plot_knee_normalized method.

Thanks in advance for your help.

Regards.

index index2

PyPI Release Current Update

Hi there,

For some reason, the current package on GitHub is not the same as the one currently released on PyPI. Could you please update it so that I can add it to my install requirements? Thanks!

Awesome package.

TypeError: transform_y() takes 3 positional arguments but 4 were given

Hi There I have installed kneed 0.7.0

pip install kneed==0.7.0

I run the example in the documentation:

from kneed import DataGenerator, KneeLocator

x, y = DataGenerator.figure2()

print([round(i, 3) for i in x])
print([round(i, 3) for i in y])

kneedle = KneeLocator(x, y, S=1.0, curve="concave", direction="increasing")

print(round(kneedle.knee, 3))

I get this output:

[0.0, 0.111, 0.222, 0.333, 0.444, 0.556, 0.667, 0.778, 0.889, 1.0]
[-5.0, 0.263, 1.897, 2.692, 3.163, 3.475, 3.696, 3.861, 3.989, 4.091]
Traceback (most recent call last):

  File "/Users/aas358/Development/Trento/july2020/pubmed-symptoms-diseases-associations/evaluation/rank_elbow_method.py", line 54, in <module>
    kneedle = KneeLocator(x, y, S=1.0, curve="concave", direction="increasing")

  File "/opt/anaconda3/envs/dev/lib/python3.7/site-packages/kneed/knee_locator.py", line 182, in __init__
    self.y_normalized, self.direction, self.curve

TypeError: transform_y() takes 3 positional arguments but 4 were given

My python version is 3.7.5

is kneed picking the right knee point?

Hi Kevin,

Let me first say thanks for your package!

I am however wondering whether it picks the proper knee point.
kneed was installed using conda, showing below version
kneed 0.2.4 py_0 conda-forge

I'm running the following code:

y = [7304.988411743468, 6978.98441315824, 6666.605130591402, 6463.195596457663, 6326.525947962969, 
     6048.793513285322, 6032.793220988797, 5762.013547650833, 5742.773572835342, 5398.219072974353, 
     5256.840796466522, 5226.976690998346, 5001.718982839869, 4941.984506219156, 4854.238126768628, 
     4734.606344213364, 4558.749543289275, 4491.096245408976, 4411.612468308233, 4333.007985566374, 
     4234.633219330786, 4139.10266640919, 4056.8041571434956, 4022.4882313410208, 3867.9649688469103, 
     3808.266172761056, 3745.267272596804, 3692.343525689731, 3645.5533571744386, 3618.2781512814176, 
     3574.26074141688, 3504.3061262646543, 3452.444673732173, 3401.19897729189, 3382.3740348889764, 
     3340.6702550205196, 3301.0814510684318, 3247.5885929108044, 3190.270755323219, 3179.9905812848137, 
     3154.2367011478286, 3089.5396073964585, 3045.61707926626, 2988.993953177785, 2993.614218064459, 
     2941.346229778838, 2875.5955684762366, 2866.3253584487247, 2834.117714931289, 2785.1456843776896, 
     2759.6514682361576, 2763.2024159338034, 2720.1356002598905, 2660.140799793623, 2690.2175242045923, 
     2635.7118932237527, 2632.9222293329244, 2574.6268292686323, 2555.965416634073, 2545.7190837261787, 
     2513.381871491499, 2491.5685394975612, 2496.0498636812163, 2466.450734239057, 2442.7208947876484, 
     2420.5347333116115, 2381.537840563225, 2388.0917624044428, 2340.6133232506804, 2335.0286400024625, 
     2318.927311833302, 2319.0470174461234, 2308.234710680705, 2262.22706400162, 2235.7819112049838, 
     2259.270039523929, 2221.0453886756854, 2202.6929974205277, 2184.288565745161, 2170.0699701041776, 
     2160.0469316534904, 2127.6818252628573, 2134.731718758979, 2101.962979423362, 2101.441656971703, 
     2066.4026551229113, 2074.2546618976407, 2063.6767099234867, 2048.1153581337485, 2031.8747775400475]
x = list(range(90))

kneedle = kneed.KneeLocator(x,y,S=1,curve='convex', direction='decreasing')
print(kneedle.knee)
kneedle.plot_knee_normalized()
kneedle.plot_knee()

If you look at the resulting plots below, I would expect the knee point to be more to the left, basically where you see the max of the difference curve (around 0.3 on the normalized plot in my case).
At least that's my understanding of the kneedle algorithm...
I tried with different values of S (smaller and bigger), but can't see any changes here.
BTW, Sensitivity is a bit counter intuitive, the smaller S the more coarse the algorithm, maybe it's an idea to put it in the denominator of the formula and/or rename it.
Haven't looked into the code yet, but still planning to do so...

--Peter
image
image

can not detect knee/elbow point in python 3.9

Hi guys,
Judging by the output shown below, I am not sure whether it supports 3.9.x.

image

If it is supposed to support 3.9.x, then the behavior below is most probably an issue.
Thanks!

===================================
Python 3.9.7
Name: kneed
Version: 0.7.0

import scipy.stats as st
import plotly.express as px
import numpy as np
import kneed as kd

x2 = np.linspace(-3,3,500)
y2 = np.apply_along_axis(st.norm.cdf,0,x2)

fig = px.line(x=x2,y=y2)
fig.update_layout(height=600,
                  width=900,
                  title="Cumulative probability plot of normally distributed data from -3 to +3"
                 )
fig.show()

newplot

def find_elbows(c,d,fc=0):
    ew = kd.KneeLocator(x=x2,y=y2,S=fc,
                     curve=c,
                     direction=d,
                     online=False
                    )
    return ew

ews = [ find_elbows(x,y) for x in ['concave','convex'] for y in ['increasing','decreasing']]

for x in ews:
    x.plot_knee()

plot2--

Knee is outside the x vector

Hi,

I run the following code:

from kneed import KneeLocator

kneedle = KneeLocator([3,4,5,6,7,8], [1.4699,1.4149,1.3868,1.3594,1.3411,1.3265], direction='decreasing',curve='convex')

print(kneedle.knee)

And the output of kneedle.knee is 0.0, with no error raised or printed, even though 0.0 is not included in x. Is this expected?

Bye
