
statisticianinstilettos / recmetrics


A library of metrics for evaluating recommender systems

License: MIT License

Jupyter Notebook 98.56% Python 1.39% Makefile 0.03% Dockerfile 0.02%

recmetrics's People

Contributors

alineberry, altaha, chrisjkuch, declow, dependabot[bot], diogoflorencio, gregwchase, ibuda, itsoum, izenish, kenho211, kshitijkarthick, lukassto, martinthoma, mrkaye97, n00b001, sab-6, statisticianinstilettos, ytang07


recmetrics's Issues

Coverage over 100%

In the example below, the measured coverage exceeds 100%, which does not make sense.

This happens when items that are not listed on the catalog are recommended.

> from recmetrics import prediction_coverage
> prediction_coverage([['x', 'y'], ['w', 'z']], catalog=['w', 'x', 'y'])
133.33
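
The figure above follows from dividing the number of unique recommended items (4) by the catalog size (3). A minimal sketch of a bounded variant, assuming that recommended items outside the catalog should simply be ignored; `bounded_prediction_coverage` is a hypothetical name and not part of recmetrics:

```python
from typing import Hashable, Iterable, List


def bounded_prediction_coverage(
    predicted: List[List[Hashable]], catalog: Iterable[Hashable]
) -> float:
    """Percentage of catalog items that appear in at least one recommendation list."""
    catalog_set = set(catalog)
    if not catalog_set:
        raise ValueError("catalog must not be empty")
    # Unique items recommended to any user.
    recommended = {item for user_list in predicted for item in user_list}
    # Only count recommended items that actually belong to the catalog.
    covered = recommended & catalog_set
    return round(100.0 * len(covered) / len(catalog_set), 2)


print(bounded_prediction_coverage([["x", "y"], ["w", "z"]], catalog=["w", "x", "y"]))
```

Because the numerator is an intersection with the catalog, the result cannot exceed 100. Whether out-of-catalog items should be ignored or rejected with an error is a design choice for the maintainers.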

Integration with Deep Learning Based Frameworks

Is there any way to integrate this with deep-learning-based recommender system frameworks, such as those built on PyTorch? scikit-learn with Surprise doesn't really support such algorithms.

Is surprise really required?

First of all: this package looks great! It's exactly what I need for some small projects, so thanks for putting it out there!

I'm looking at the setup.py, and it lists surprise as a requirement. I don't see it imported anywhere in the package though, so I'm wondering if it can be removed? I get that it's useful for the example notebook, but that wouldn't be included in the pip install anyway. (I might suggest making surprise an extras_require if you want to keep it in there for demo purposes.)

If you're open to some packaging changes along these lines, I'd be happy to send a PR your way.
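
A rough sketch of the extras_require idea, written as plain data so it can be read without running a build. The dependency names in `INSTALL_REQUIRES` and the extra name `examples` are assumptions for illustration, not the package's actual metadata; `scikit-surprise` is the PyPI name of Surprise.

```python
# Hypothetical dependency lists for a setup.py; the names are illustrative only.
INSTALL_REQUIRES = ["numpy", "pandas", "scikit-learn", "matplotlib"]

# Optional dependencies that only the example notebook would need.
EXTRAS_REQUIRE = {"examples": ["scikit-surprise", "jupyter"]}

# In setup.py these would be passed to setuptools.setup, for example:
# setup(
#     name="recmetrics",
#     install_requires=INSTALL_REQUIRES,
#     extras_require=EXTRAS_REQUIRE,
# )
print(sorted(EXTRAS_REQUIRE))
```

With a layout like this, a plain `pip install recmetrics` would skip Surprise, while `pip install "recmetrics[examples]"` would pull it in for the demo.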

Installation issues

Hi! I have been trying to install recmetrics with "pip install recmetrics", but I keep getting the error "ERROR: Could not build wheels for scikit-learn, which is required to install pyproject.toml-based projects". I'm using Windows with Python 3.9.7 and a fully upgraded pip. pip freeze shows that scikit-learn is already installed: "scikit-learn==0.24.2". I've also tried installing with pip from git, with the same result. Any ideas what else I could try?

mapk shouldn't require actual and predicted have the same length

This assertion check is incorrect. The actual parameter as used in _apk is expecting a list of true items and the predicted parameter is expecting a list of predicted items that can be true or false. See an example below where only A-C are true items and the prediction can be longer than the true list because it can contain false items.

if len(actual) != len(predicted):
    raise AssertionError("Length mismatched")

true_items = ["A","B","C"]
prediction = ["A","Z","B","X"]
metrics.mapk(actual=true_items, predicted=prediction, k = 3)
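
To illustrate why the two lists need not match in length, here is a minimal sketch of average precision at k for a single user. It follows the commonly used definition rather than the library's exact code, and `apk_sketch` is a hypothetical name:

```python
from typing import Hashable, List


def apk_sketch(actual: List[Hashable], predicted: List[Hashable], k: int = 10) -> float:
    """Average precision at k for one user; list lengths may differ."""
    if not actual:
        return 0.0
    top_k = predicted[:k]
    score = 0.0
    num_hits = 0
    for i, item in enumerate(top_k):
        # Count a hit only once, even if the item is recommended twice.
        if item in actual and item not in top_k[:i]:
            num_hits += 1
            score += num_hits / (i + 1)
    return score / min(len(actual), k)


true_items = ["A", "B", "C"]
prediction = ["A", "Z", "B", "X"]
print(apk_sketch(true_items, prediction, k=3))
```

Here the top three predictions are A, Z, B, giving precisions of 1/1 and 2/3 at the two hits, so the score is (1 + 2/3) / 3, roughly 0.556. Nothing in the computation depends on the lists having equal length.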

License

This is missing a license. You can use https://tldrlegal.com/ for an overview. The top-3 are MIT, BSD and GPL (see my analysis).

The simplest way to add it is in the setup.py as license='MIT' or similar.

Unused Requirement

Surprise is listed as a module dependency but is not used in metrics or plots. It might be worth removing the dependency, especially since it requires additional build tools (Visual C++) and thus may throw unnecessary errors.

TypeError on class_separation_plot of example notebook

I attached the error below

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-30-05160122655c> in <module>
----> 1 recmetrics.class_separation_plot(pred_df, n_bins=45, class0_label="True class 0", class1_label="True class 1")

TypeError: class_separation_plot() got an unexpected keyword argument 'class0_label'

ImportError: cannot import name 'signature'

Importing the package does not work. I am getting the following error: ImportError: cannot import name 'signature'

pip install recmetrics
import recmetrics as re

The problem is this import: from sklearn.utils.fixes import signature.
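
One likely workaround, assuming plots.py only uses `signature` to inspect a function's parameters: the standard library has provided `inspect.signature` since Python 3.3, so the scikit-learn shim is not needed. The `fill_between` function below is a hypothetical stand-in for a plotting call, not matplotlib's real function.

```python
# Standard-library replacement for the import that fails above.
from inspect import signature


def fill_between(x, y1, y2=0, step=None):
    """Hypothetical stand-in for a plotting function that accepts `step`."""
    return None


# Pass the keyword only if the target function supports it.
step_kwargs = {"step": "post"} if "step" in signature(fill_between).parameters else {}
print(step_kwargs)
```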

Slows down even if x_labels=False

First, thank you for providing this great library! I ran into a slowdown on rather large data sets, even when the option x_labels is set to False. I suggest inserting an
if x_labels == True: check before the call below, similar to what is done at the bottom of the function. If I don't want to plot labels, why should plt.xticks(x) be executed?

plt.xticks(x)

module 'recmetrics' has no attribute 'prediction_coverage'

Hi there,
I am trying to run the example notebook, but I am getting "module 'recmetrics' has no attribute 'prediction_coverage'" and "AttributeError: module 'recmetrics' has no attribute 'catalog_coverage'".

Any pointers or suggestions?

Thanks in advance

personalization() has explosive memory requirements due to pairwise comparison

On my system (16 GB RAM), a list of 10k recommendations will run. A list of 50k will crash. I'd like to understand the personalization score across my entire hypothetical customer base of 250k+.

Is there a way to chunk the scipy.sparse.csr_matrix and iteratively calculate the cosine similarity to avoid holding the whole thing in memory?
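
Besides chunking, one alternative is an algebraic shortcut that never builds the pairwise matrix. This is a different technique from the one asked about, sketched under the assumption that each user is represented by a binary item-indicator vector: the sum of all pairwise cosine similarities equals the squared norm of the sum of the L2-normalized user vectors. The name `personalization_low_memory` is hypothetical and not part of recmetrics.

```python
from collections import defaultdict
from math import sqrt
from typing import Hashable, List


def personalization_low_memory(predicted: List[List[Hashable]]) -> float:
    """1 - mean pairwise cosine similarity, without an n-by-n matrix."""
    n_users = len(predicted)
    if n_users < 2:
        raise ValueError("need at least two recommendation lists")

    # Sum of the L2-normalized binary user vectors, accumulated per item.
    column_sums = defaultdict(float)
    for user_list in predicted:
        items = set(user_list)  # binary indicators: duplicates count once
        if not items:
            raise ValueError("every recommendation list must be non-empty")
        weight = 1.0 / sqrt(len(items))
        for item in items:
            column_sums[item] += weight

    # Squared norm of that sum = sum over all n*n ordered pairs of cosine similarity.
    total_similarity = sum(value * value for value in column_sums.values())
    # Drop the n self-similarities (each 1) and average over the remaining pairs.
    mean_similarity = (total_similarity - n_users) / (n_users * (n_users - 1))
    return 1.0 - mean_similarity


print(personalization_low_memory([["A", "B"], ["A", "C"]]))
```

Memory use grows with the number of distinct items rather than with the square of the number of users, so a 250k-user list should be feasible as long as the item set is of moderate size.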

Personalization metric calculation optimization

Hi @statisticianinstilettos,

kudos for a great tool!
I would like to propose an optimization for calculating Personalization Metric here:

#get indicies for upper right triangle w/o diagonal
upper_right = np.triu_indices(similarity.shape[0], k=1)

#calculate average similarity
personalization = np.mean(similarity[upper_right])
return 1-personalization

There is no need to get the upper triangle indices, as the cosine similarity is a symmetric distance.
I will follow up with a pull request for this.
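
One way to read this proposal (the pull request itself is not shown here): because the similarity matrix is symmetric, the mean over the upper triangle equals the mean over all off-diagonal entries, which can be computed from the total sum and the trace. A small sketch with NumPy, where `mean_offdiagonal` is a hypothetical helper name:

```python
import numpy as np


def mean_offdiagonal(similarity: np.ndarray) -> float:
    """Mean of all off-diagonal entries of a square matrix."""
    n = similarity.shape[0]
    return float((similarity.sum() - np.trace(similarity)) / (n * (n - 1)))


# Build a small symmetric cosine-similarity matrix from random vectors.
rng = np.random.default_rng(0)
vectors = rng.random((5, 4))
unit = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
similarity = unit @ unit.T

# For a symmetric matrix, both averages agree.
upper_right = np.triu_indices(similarity.shape[0], k=1)
print(np.isclose(mean_offdiagonal(similarity), np.mean(similarity[upper_right])))
```

This avoids allocating the index arrays returned by `np.triu_indices`, although the full similarity matrix itself still has to fit in memory.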

dev dependencies breaking installation

» poetry add recmetrics   
Using version ^0.1.5 for recmetrics

Updating dependencies
Resolving dependencies... (0.2s)

Because no versions of recmetrics match >0.1.5,<0.2.0
 and recmetrics (0.1.5) depends on pytest-cov (>=2.10.1,<3.0.0), recmetrics (>=0.1.5,<0.2.0) requires pytest-cov (>=2.10.1,<3.0.0).
So, because jewel-ml-models depends on both recmetrics (^0.1.5) and pytest-cov (^4.0.0), version solving failed.

pytest-cov is a development dependency, so it shouldn't break installation like this. You can easily solve this by declaring pytest-cov as a development dependency. There are others as well: ipython maybe? Jupyter and twine too.

Unable to import recmetrics

I am working on a recommendation engine using collaborative filtering and wanted to try the metrics provided by recmetrics. Here is the error I get when trying to import the package (version 0.0.12).

---------------------------------------------------------------------------
ImportError                               Traceback (most recent call last)
<ipython-input-309-301854677c00> in <module>
----> 1 import recmetrics
      2 
      3 recmetrics.long_tail_plot()

~/.virtualenvs/py3/lib/python3.6/site-packages/recmetrics/__init__.py in <module>
----> 1 from .plots import long_tail_plot, mark_plot, mapk_plot, coverage_plot, class_separation_plot, roc_plot, precision_recall_plot
      2 from .metrics import mark, coverage, personalization, intra_list_similarity, rmse, mse, make_confusion_matrix, recommender_precision, recommender_recall

~/.virtualenvs/py3/lib/python3.6/site-packages/recmetrics/plots.py in <module>
      5 from matplotlib.lines import Line2D
      6 from sklearn.metrics import roc_curve, auc, precision_recall_curve, average_precision_score
----> 7 from sklearn.utils.fixes import signature
      8 
      9 

ImportError: cannot import name 'signature'

Update PyPI package

The current version needs to be updated, as the package depends on deprecated or removed functionality from several of its dependencies.
