isg-siegen / auto-surprise

An AutoRecSys library for Surprise. Automate algorithm selection and hyperparameter tuning :rocket:

Home Page: https://auto-surprise.readthedocs.io/en/stable/

License: MIT License

Python 100.00%
hyperopt hyperparameter-tuning automl recommender-system surprise machine-learning hyperparameter-search automated-machine-learning tpe

auto-surprise's People

Contributors

dependabot[bot], joeran, thededlier

auto-surprise's Issues

available parameters ?

When it comes to the KNN-based algorithms, which similarity options does it use? Also, does it have a user_based parameter, as the usual surprise library does in sim_options?

custom dataset / simple product recsys

Hi @thededlier , all,

Hope you are all well !

I was wondering if you could quickly explain how to use a simple CSV file with columns like productId, customerId and ratings with auto-surprise?

Also, would it be complicated to create a FastAPI server?

Thanks for any insights or inputs on these questions.

Cheers,
Luc Michalski
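
For the CSV part of the question, here is a minimal sketch rather than an official recipe. It assumes the file has a header row with exactly the column names mentioned above (productId, customerId, ratings). The helper names read_ratings and to_surprise_dataset are made up for illustration. The second helper relies on Surprise's Reader and Dataset.load_from_df and needs pandas and scikit-surprise installed; it infers the rating scale from the data, which may not be what you want if your scale is fixed.

```python
import csv
import io


def read_ratings(csv_text):
    """Parse CSV text into (user, item, rating) triples.

    Surprise expects the columns in user, item, rating order, so the
    customerId column comes first here even if the file lists productId first.
    """
    rows = csv.DictReader(io.StringIO(csv_text))
    return [(r["customerId"], r["productId"], float(r["ratings"])) for r in rows]


def to_surprise_dataset(triples):
    """Wrap the triples in a Surprise Dataset (hypothetical helper)."""
    # Imported lazily so read_ratings works without these packages installed.
    import pandas as pd
    from surprise import Dataset, Reader

    df = pd.DataFrame(triples, columns=["user", "item", "rating"])
    # Assumption: the rating scale is taken from the observed min and max.
    ratings = [t[2] for t in triples]
    reader = Reader(rating_scale=(min(ratings), max(ratings)))
    return Dataset.load_from_df(df, reader)


# Tiny in-memory sample standing in for a real file.
sample = "productId,customerId,ratings\nP1,C1,4\nP2,C1,2.5\nP1,C2,5\n"
triples = read_ratings(sample)
print(triples)
```

The object returned by to_surprise_dataset should then be usable as the data argument of engine.train, in the same place where the usage example further down this page passes the built-in ml-100k dataset.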

Quick Example errors

Hello,
I was trying to run the Quick Example (https://auto-surprise.readthedocs.io/en/stable/usage/quick_start.html#quick-example) but I got this output:

auto_surprise 0.1.8
Available CPUs: 16
Evaluating RMSE, MAE, MSE of algorithm NormalPredictor on 5 split(s).

                  Fold 1  Fold 2  Fold 3  Fold 4  Fold 5  Mean    Std     
RMSE (testset)    1.5145  1.5229  1.5190  1.5152  1.5235  1.5190  0.0037  
MAE (testset)     1.2179  1.2228  1.2170  1.2117  1.2180  1.2175  0.0035  
MSE (testset)     2.2937  2.3193  2.3074  2.2960  2.3210  2.3075  0.0114  
Fit time          0.06    0.06    0.07    0.07    0.07    0.07    0.01    
Test time         0.08    0.04    0.07    0.08    0.08    0.07    0.01    
Baseline loss: 1.5190366726386277
Starting process with svd algorithm
Starting process with svdpp algorithm
Starting process with nmf algorithm
Starting process with knn_basic algorithm
Starting process with knn_baseline algorithm
  0%|                                                                                                                                                              | 0/100 [00:00<?, ?trial/s, best loss=?]
Exception for algo svd
Starting process with knn_with_means algorithm
Traceback (most recent call last):
  File "/home/mbaleato/AutoSurpriseExample/venv/lib/python3.8/site-packages/auto_surprise/trainer.py", line 85, in start_with_limits
    _, best_trial = self.algo_base.best_hyperparams(max_evals)
  File "/home/mbaleato/AutoSurpriseExample/venv/lib/python3.8/site-packages/auto_surprise/algorithms/base.py", line 99, in best_hyperparams
    best = fmin(**fmin_args)
  File "/home/mbaleato/AutoSurpriseExample/venv/lib/python3.8/site-packages/hyperopt/fmin.py", line 540, in fmin
    return trials.fmin(
  File "/home/mbaleato/AutoSurpriseExample/venv/lib/python3.8/site-packages/hyperopt/base.py", line 671, in fmin
    return fmin(
  File "/home/mbaleato/AutoSurpriseExample/venv/lib/python3.8/site-packages/hyperopt/fmin.py", line 586, in fmin
    rval.exhaust()
  File "/home/mbaleato/AutoSurpriseExample/venv/lib/python3.8/site-packages/hyperopt/fmin.py", line 364, in exhaust
    self.run(self.max_evals - n_done, block_until_done=self.asynchronous)
  File "/home/mbaleato/AutoSurpriseExample/venv/lib/python3.8/site-packages/hyperopt/fmin.py", line 279, in run
    new_ids, self.domain, trials, self.rstate.integers(2 ** 31 - 1)
AttributeError: module 'numpy.random' has no attribute 'integers'

  0%|                                                                                                                                                              | 0/100 [00:00<?, ?trial/s, best loss=?]
  0%|                                                                                                                                                              | 0/100 [00:00<?, ?trial/s, best loss=?]Exception for algo svdpp
  0%|                                                                                                                                                              | 0/100 [00:00<?, ?trial/s, best loss=?]
Starting process with knn_with_z_score algorithm
Exception for algo nmf
Traceback (most recent call last):
  File "/home/mbaleato/AutoSurpriseExample/venv/lib/python3.8/site-packages/auto_surprise/trainer.py", line 85, in start_with_limits
    _, best_trial = self.algo_base.best_hyperparams(max_evals)
  File "/home/mbaleato/AutoSurpriseExample/venv/lib/python3.8/site-packages/auto_surprise/algorithms/base.py", line 99, in best_hyperparams
    best = fmin(**fmin_args)
  File "/home/mbaleato/AutoSurpriseExample/venv/lib/python3.8/site-packages/hyperopt/fmin.py", line 540, in fmin
    return trials.fmin(
  File "/home/mbaleato/AutoSurpriseExample/venv/lib/python3.8/site-packages/hyperopt/base.py", line 671, in fmin
    return fmin(
  File "/home/mbaleato/AutoSurpriseExample/venv/lib/python3.8/site-packages/hyperopt/fmin.py", line 586, in fmin
    rval.exhaust()
  File "/home/mbaleato/AutoSurpriseExample/venv/lib/python3.8/site-packages/hyperopt/fmin.py", line 364, in exhaust
    self.run(self.max_evals - n_done, block_until_done=self.asynchronous)
  File "/home/mbaleato/AutoSurpriseExample/venv/lib/python3.8/site-packages/hyperopt/fmin.py", line 279, in run
    new_ids, self.domain, trials, self.rstate.integers(2 ** 31 - 1)
AttributeError: module 'numpy.random' has no attribute 'integers'

Traceback (most recent call last):
  File "/home/mbaleato/AutoSurpriseExample/venv/lib/python3.8/site-packages/auto_surprise/trainer.py", line 85, in start_with_limits
    _, best_trial = self.algo_base.best_hyperparams(max_evals)
  File "/home/mbaleato/AutoSurpriseExample/venv/lib/python3.8/site-packages/auto_surprise/algorithms/base.py", line 99, in best_hyperparams
    best = fmin(**fmin_args)
  File "/home/mbaleato/AutoSurpriseExample/venv/lib/python3.8/site-packages/hyperopt/fmin.py", line 540, in fmin
    return trials.fmin(
  File "/home/mbaleato/AutoSurpriseExample/venv/lib/python3.8/site-packages/hyperopt/base.py", line 671, in fmin
    return fmin(
  File "/home/mbaleato/AutoSurpriseExample/venv/lib/python3.8/site-packages/hyperopt/fmin.py", line 586, in fmin
    rval.exhaust()
  File "/home/mbaleato/AutoSurpriseExample/venv/lib/python3.8/site-packages/hyperopt/fmin.py", line 364, in exhaust
    self.run(self.max_evals - n_done, block_until_done=self.asynchronous)
  File "/home/mbaleato/AutoSurpriseExample/venv/lib/python3.8/site-packages/hyperopt/fmin.py", line 279, in run
    new_ids, self.domain, trials, self.rstate.integers(2 ** 31 - 1)
AttributeError: module 'numpy.random' has no attribute 'integers'

Starting process with co_clustering algorithm
  0%|                                                                                                                                                              | 0/100 [00:00<?, ?trial/s, best loss=?]
Exception for algo knn_basic
Starting process with slope_one algorithm
Traceback (most recent call last):
  File "/home/mbaleato/AutoSurpriseExample/venv/lib/python3.8/site-packages/auto_surprise/trainer.py", line 85, in start_with_limits
    _, best_trial = self.algo_base.best_hyperparams(max_evals)
  File "/home/mbaleato/AutoSurpriseExample/venv/lib/python3.8/site-packages/auto_surprise/algorithms/base.py", line 99, in best_hyperparams
    best = fmin(**fmin_args)
  File "/home/mbaleato/AutoSurpriseExample/venv/lib/python3.8/site-packages/hyperopt/fmin.py", line 540, in fmin
    return trials.fmin(
  File "/home/mbaleato/AutoSurpriseExample/venv/lib/python3.8/site-packages/hyperopt/base.py", line 671, in fmin
    return fmin(
  File "/home/mbaleato/AutoSurpriseExample/venv/lib/python3.8/site-packages/hyperopt/fmin.py", line 586, in fmin
    rval.exhaust()
  File "/home/mbaleato/AutoSurpriseExample/venv/lib/python3.8/site-packages/hyperopt/fmin.py", line 364, in exhaust
    self.run(self.max_evals - n_done, block_until_done=self.asynchronous)
  File "/home/mbaleato/AutoSurpriseExample/venv/lib/python3.8/site-packages/hyperopt/fmin.py", line 279, in run
    new_ids, self.domain, trials, self.rstate.integers(2 ** 31 - 1)
AttributeError: module 'numpy.random' has no attribute 'integers'

Starting process with baseline_only algorithm
  0%|                                                                                                                                                              | 0/100 [00:00<?, ?trial/s, best loss=?]
  0%|                                                                                                                                                              | 0/100 [00:00<?, ?trial/s, best loss=?]Exception for algo knn_baseline
  0%|                                                                                                                                                              | 0/100 [00:00<?, ?trial/s, best loss=?]
Exception for algo knn_with_means
Traceback (most recent call last):
  File "/home/mbaleato/AutoSurpriseExample/venv/lib/python3.8/site-packages/auto_surprise/trainer.py", line 85, in start_with_limits
    _, best_trial = self.algo_base.best_hyperparams(max_evals)
  File "/home/mbaleato/AutoSurpriseExample/venv/lib/python3.8/site-packages/auto_surprise/algorithms/base.py", line 99, in best_hyperparams
    best = fmin(**fmin_args)
  File "/home/mbaleato/AutoSurpriseExample/venv/lib/python3.8/site-packages/hyperopt/fmin.py", line 540, in fmin
    return trials.fmin(
  File "/home/mbaleato/AutoSurpriseExample/venv/lib/python3.8/site-packages/hyperopt/base.py", line 671, in fmin
    return fmin(
  File "/home/mbaleato/AutoSurpriseExample/venv/lib/python3.8/site-packages/hyperopt/fmin.py", line 586, in fmin
    rval.exhaust()
  File "/home/mbaleato/AutoSurpriseExample/venv/lib/python3.8/site-packages/hyperopt/fmin.py", line 364, in exhaust
    self.run(self.max_evals - n_done, block_until_done=self.asynchronous)
  File "/home/mbaleato/AutoSurpriseExample/venv/lib/python3.8/site-packages/hyperopt/fmin.py", line 279, in run
    new_ids, self.domain, trials, self.rstate.integers(2 ** 31 - 1)
AttributeError: module 'numpy.random' has no attribute 'integers'

  0%|                                                                                                                                                              | 0/100 [00:00<?, ?trial/s, best loss=?]
Exception for algo knn_with_z_score
Traceback (most recent call last):
  File "/home/mbaleato/AutoSurpriseExample/venv/lib/python3.8/site-packages/auto_surprise/trainer.py", line 85, in start_with_limits
    _, best_trial = self.algo_base.best_hyperparams(max_evals)
  File "/home/mbaleato/AutoSurpriseExample/venv/lib/python3.8/site-packages/auto_surprise/algorithms/base.py", line 99, in best_hyperparams
    best = fmin(**fmin_args)
  File "/home/mbaleato/AutoSurpriseExample/venv/lib/python3.8/site-packages/hyperopt/fmin.py", line 540, in fmin
    return trials.fmin(
  File "/home/mbaleato/AutoSurpriseExample/venv/lib/python3.8/site-packages/hyperopt/base.py", line 671, in fmin
    return fmin(
  File "/home/mbaleato/AutoSurpriseExample/venv/lib/python3.8/site-packages/hyperopt/fmin.py", line 586, in fmin
    rval.exhaust()
  File "/home/mbaleato/AutoSurpriseExample/venv/lib/python3.8/site-packages/hyperopt/fmin.py", line 364, in exhaust
    self.run(self.max_evals - n_done, block_until_done=self.asynchronous)
  File "/home/mbaleato/AutoSurpriseExample/venv/lib/python3.8/site-packages/hyperopt/fmin.py", line 279, in run
    new_ids, self.domain, trials, self.rstate.integers(2 ** 31 - 1)
AttributeError: module 'numpy.random' has no attribute 'integers'

Traceback (most recent call last):
  File "/home/mbaleato/AutoSurpriseExample/venv/lib/python3.8/site-packages/auto_surprise/trainer.py", line 85, in start_with_limits
    _, best_trial = self.algo_base.best_hyperparams(max_evals)
  File "/home/mbaleato/AutoSurpriseExample/venv/lib/python3.8/site-packages/auto_surprise/algorithms/base.py", line 99, in best_hyperparams
    best = fmin(**fmin_args)
  File "/home/mbaleato/AutoSurpriseExample/venv/lib/python3.8/site-packages/hyperopt/fmin.py", line 540, in fmin
    return trials.fmin(
  File "/home/mbaleato/AutoSurpriseExample/venv/lib/python3.8/site-packages/hyperopt/base.py", line 671, in fmin
    return fmin(
  File "/home/mbaleato/AutoSurpriseExample/venv/lib/python3.8/site-packages/hyperopt/fmin.py", line 586, in fmin
    rval.exhaust()
  File "/home/mbaleato/AutoSurpriseExample/venv/lib/python3.8/site-packages/hyperopt/fmin.py", line 364, in exhaust
    self.run(self.max_evals - n_done, block_until_done=self.asynchronous)
  File "/home/mbaleato/AutoSurpriseExample/venv/lib/python3.8/site-packages/hyperopt/fmin.py", line 279, in run
    new_ids, self.domain, trials, self.rstate.integers(2 ** 31 - 1)
AttributeError: module 'numpy.random' has no attribute 'integers'

  0%|                                                                                                                                                              | 0/100 [00:00<?, ?trial/s, best loss=?]
Exception for algo co_clustering
Traceback (most recent call last):
  File "/home/mbaleato/AutoSurpriseExample/venv/lib/python3.8/site-packages/auto_surprise/trainer.py", line 85, in start_with_limits
    _, best_trial = self.algo_base.best_hyperparams(max_evals)
  File "/home/mbaleato/AutoSurpriseExample/venv/lib/python3.8/site-packages/auto_surprise/algorithms/base.py", line 99, in best_hyperparams
    best = fmin(**fmin_args)
  File "/home/mbaleato/AutoSurpriseExample/venv/lib/python3.8/site-packages/hyperopt/fmin.py", line 540, in fmin
    return trials.fmin(
  File "/home/mbaleato/AutoSurpriseExample/venv/lib/python3.8/site-packages/hyperopt/base.py", line 671, in fmin
    return fmin(
  File "/home/mbaleato/AutoSurpriseExample/venv/lib/python3.8/site-packages/hyperopt/fmin.py", line 586, in fmin
    rval.exhaust()
  File "/home/mbaleato/AutoSurpriseExample/venv/lib/python3.8/site-packages/hyperopt/fmin.py", line 364, in exhaust
    self.run(self.max_evals - n_done, block_until_done=self.asynchronous)
  File "/home/mbaleato/AutoSurpriseExample/venv/lib/python3.8/site-packages/hyperopt/fmin.py", line 279, in run
    new_ids, self.domain, trials, self.rstate.integers(2 ** 31 - 1)
AttributeError: module 'numpy.random' has no attribute 'integers'

  0%|                                                                                                                                                              | 0/100 [00:00<?, ?trial/s, best loss=?]
Exception for algo baseline_only
Traceback (most recent call last):
  File "/home/mbaleato/AutoSurpriseExample/venv/lib/python3.8/site-packages/auto_surprise/trainer.py", line 85, in start_with_limits
    _, best_trial = self.algo_base.best_hyperparams(max_evals)
  File "/home/mbaleato/AutoSurpriseExample/venv/lib/python3.8/site-packages/auto_surprise/algorithms/base.py", line 99, in best_hyperparams
    best = fmin(**fmin_args)
  File "/home/mbaleato/AutoSurpriseExample/venv/lib/python3.8/site-packages/hyperopt/fmin.py", line 540, in fmin
    return trials.fmin(
  File "/home/mbaleato/AutoSurpriseExample/venv/lib/python3.8/site-packages/hyperopt/base.py", line 671, in fmin
    return fmin(
  File "/home/mbaleato/AutoSurpriseExample/venv/lib/python3.8/site-packages/hyperopt/fmin.py", line 586, in fmin
    rval.exhaust()
  File "/home/mbaleato/AutoSurpriseExample/venv/lib/python3.8/site-packages/hyperopt/fmin.py", line 364, in exhaust
    self.run(self.max_evals - n_done, block_until_done=self.asynchronous)
  File "/home/mbaleato/AutoSurpriseExample/venv/lib/python3.8/site-packages/hyperopt/fmin.py", line 279, in run
    new_ids, self.domain, trials, self.rstate.integers(2 ** 31 - 1)
AttributeError: module 'numpy.random' has no attribute 'integers'

Evaluating RMSE, MAE, MSE of algorithm SlopeOne on 5 split(s).

                  Fold 1  Fold 2  Fold 3  Fold 4  Fold 5  Mean    Std     
RMSE (testset)    0.9453  0.9533  0.9335  0.9422  0.9494  0.9447  0.0068  
MAE (testset)     0.7435  0.7469  0.7337  0.7397  0.7495  0.7426  0.0056  
MSE (testset)     0.8936  0.9089  0.8713  0.8878  0.9014  0.8926  0.0128  
Fit time          0.39    0.36    0.34    0.40    0.38    0.38    0.02    
Test time         1.59    1.20    1.30    1.50    1.20    1.36    0.16    
----Done!----
Best algorithm: slope_one
Best hyperparameters: None
┏━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━┓
┃ Algorithm        ┃ Hyperparameters ┃               Loss ┃
┡━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━┩
│ svd              │ None            │               None │
│ svdpp            │ None            │               None │
│ nmf              │ None            │               None │
│ knn_basic        │ None            │               None │
│ knn_baseline     │ None            │               None │
│ knn_with_z_score │ None            │               None │
│ knn_with_means   │ None            │               None │
│ co_clustering    │ None            │               None │
│ baseline_only    │ None            │               None │
│ slope_one        │ None            │ 0.9447460038754011 │
└──────────────────┴─────────────────┴────────────────────┘

I have tried other datasets and always get the same AttributeError:
AttributeError: module 'numpy.random' has no attribute 'integers'

Is there another version of auto_surprise that works? Or should I use a different version of numpy?
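
A small check that may help narrow this down. The traceback shows hyperopt calling self.rstate.integers(...), and the error message says that rstate is the numpy.random module itself. In NumPy, integers() is a method of the Generator object returned by numpy.random.default_rng(), not a function of the module, so changing the numpy version alone probably will not help. The sketch below only demonstrates that difference; it does not touch auto_surprise or hyperopt.

```python
import numpy as np

# The numpy.random module exposes legacy functions such as randint(),
# but it has no module-level integers() function.
module_has_integers = hasattr(np.random, "integers")

# A Generator (available since NumPy 1.17) does provide integers().
rng = np.random.default_rng(0)
draw = int(rng.integers(2 ** 31 - 1))

print("numpy.random has integers:", module_has_integers)
print("Generator draw in range:", 0 <= draw < 2 ** 31 - 1)
```

If that is what is happening here, the mismatch is between the installed hyperopt release and the random state that auto_surprise 0.1.8 hands to it. Trying a different hyperopt version, or a newer auto_surprise release if one is available, may be worth a shot, but that is a guess based on the traceback and not something I have verified.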

Windows/Mac OS functionality?

Hello! I'm very interested in using this package for surprise algorithm selection, but I'm not sure it's working properly for me. I see in Setup that the package requires installation on Linux. Does that mean the package is not functional on Windows or Mac?

I tried running the ml-100k usage example on my Windows 10 machine but I don't think the output I got was correct (best_model: svd, best_params: NoneType object of builtins module, best_score: 100). Or if this is the intended output could you advise on how to interpret it?

Thank you

Can't run the basic usage

Hello, I was trying to use auto-surprise and run the Basic Usage example. The baseline loss part runs properly, but as soon as it starts with the svd algorithm, I get the following error:

Starting process with svd algorithm
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "C:\Users\jmetchebarne\anaconda3\envs\Proyecto Recomendaciones\lib\multiprocessing\spawn.py", line 105, in spawn_main
    exitcode = _main(fd)
  File "C:\Users\jmetchebarne\anaconda3\envs\Proyecto Recomendaciones\lib\multiprocessing\spawn.py", line 115, in _main
    self = reduction.pickle.load(from_parent)
EOFError: Ran out of input
Traceback (most recent call last):
  File "C:\Users\jmetchebarne\anaconda3\envs\Proyecto Recomendaciones\lib\site-packages\IPython\core\interactiveshell.py", line 3418, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-2-56575903e07a>", line 19, in <module>
    max_evals=10
  File "C:\Users\jmetchebarne\anaconda3\envs\Proyecto Recomendaciones\lib\site-packages\auto_surprise\engine.py", line 96, in train
    best_algo, best_params, best_score, tasks = strategy.evaluate()
  File "C:\Users\jmetchebarne\anaconda3\envs\Proyecto Recomendaciones\lib\site-packages\auto_surprise\strategies\continuous_parallel.py", line 50, in evaluate
    p.start()
  File "C:\Users\jmetchebarne\anaconda3\envs\Proyecto Recomendaciones\lib\multiprocessing\process.py", line 115, in start
    self._popen = self._Popen(self)
  File "C:\Users\jmetchebarne\anaconda3\envs\Proyecto Recomendaciones\lib\multiprocessing\context.py", line 223, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "C:\Users\jmetchebarne\anaconda3\envs\Proyecto Recomendaciones\lib\multiprocessing\context.py", line 322, in _Popen
    return Popen(process_obj)
  File "C:\Users\jmetchebarne\anaconda3\envs\Proyecto Recomendaciones\lib\multiprocessing\popen_spawn_win32.py", line 89, in __init__
    reduction.dump(process_obj, to_child)
  File "C:\Users\jmetchebarne\anaconda3\envs\Proyecto Recomendaciones\lib\multiprocessing\reduction.py", line 60, in dump
    ForkingPickler(file, protocol).dump(obj)
TypeError: can't pickle module objects

I'm not sure if I'm doing something wrong here; to be honest, I just executed the code as posted:

from surprise import Dataset
from auto_surprise.engine import Engine

# Load the dataset
data = Dataset.load_builtin('ml-100k')

# Initialize auto surprise engine
engine = Engine(verbose=True)

# Start the trainer
best_algo, best_params, best_score, tasks = engine.train(
    data=data, 
    target_metric='test_rmse', 
    cpu_time_limit=60 * 60, 
    max_evals=100
)

Any help is really appreciated.
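
A note on what the traceback suggests, offered as a sketch and not a confirmed diagnosis. On Windows, multiprocessing starts child processes with the spawn method (visible in the popen_spawn_win32.py frames), which pickles the Process object and everything it references before sending it to the child. Module objects cannot be pickled, so if the process target or its arguments hold a reference to a module, p.start() fails with exactly this TypeError. The helper name can_pickle below is made up for illustration; it only shows which kinds of objects survive pickling.

```python
import math
import pickle


def can_pickle(obj):
    """Return True if obj can be serialized with pickle, False otherwise."""
    try:
        pickle.dumps(obj)
        return True
    except (TypeError, pickle.PicklingError):
        return False


# Plain data such as a hyperparameter dict pickles without trouble.
params_ok = can_pickle({"n_factors": 50, "lr_all": 0.005})

# A module object does not, which is what spawn would run into.
module_ok = can_pickle(math)

print("dict picklable:", params_ok)
print("module picklable:", module_ok)
```

On Linux the default start method is fork, which does not pickle the Process object, so the same code can run there without hitting this error. That would be consistent with the Setup notes saying the package requires Linux.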
