Hi there, Contextual bandits problem is interesting and useful for m

mtrand.pyx in mtrand.RandomState.beta() a<=0 error in Adaptive Greedy Algorithm (contextualbandits issue, CLOSED)


Comments (5)

david-cortes commented on August 19, 2024

Thanks for the bug report!

However, I tried different initializations and Python versions and was unable to reproduce the error. Some questions:

Are you using the latest version of this package? (that is: 1.7.2)
Does it happen at the beginning (right after initialization) or after having called fit/partial_fit multiple times?
Did you modify anything when running the notebook?
Are you running this on a regular computer/server/laptop, or is it some special architecture or cluster?
Which joblib, NumPy and SciPy versions are you using? In case they are old, have you tried updating them? (I tried NumPy 1.15 and SciPy 1.2)

It's weird that it would only happen with one class, as all classes in online set those numbers through the same piece of code in utils.


oibook13 commented on August 19, 2024

Thanks for your time to check the problem.

I have figured it out, but I will still answer your questions before posting my solution.

Are you using the latest version of this package? (that is: 1.7.2)
Yes, it is the latest version.

Does it happen at the beginning (right after initialization) or after having called fit/partial_fit multiple times?
I used the code in online_contextual_bandits.ipynb directly. The error occurred in this call:

lst_actions[model] = simulate_rounds(models[model], lst_rewards[model], lst_actions[model], X, y, batch_st, batch_end)

Did you modify anything when running the notebook?
I didn't.

Are you running this on a regular computer/server/laptop, or is it some special architecture or cluster?
I ran it on a workstation; the architecture should be x86_64.

Which joblib, NumPy and SciPy versions are you using? In case they are old, have you tried updating them? (I tried NumPy 1.15 and SciPy 1.2)
I used joblib 0.12.5, numpy 1.15.4, scipy 1.1.0.

After debugging, I traced it to this function (shown here with my fix applied):

def _calculate_beta_prior(nchoices): 
    return (3.0/nchoices, 4)

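in contextualbandits/utils.py, line 62.
The original version computed 3/nchoices. Under Python 2 integer division this evaluates to 0 when nchoices=159, so beta() receives a=0 and raises the a<=0 error. I changed it to 3.0/nchoices (about 0.019 for nchoices=159), and now it works like a charm.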
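A minimal sketch of the failure described above, not the library's own code. Floor division (`//`) stands in for what Python 2 computes for `int / int`, and the names `beta_prior_floor` and `beta_prior_true` are hypothetical, chosen only for this illustration.

```python
import numpy as np

def beta_prior_floor(nchoices):
    # `//` reproduces what Python 2 computes for int / int
    return (3 // nchoices, 4)

def beta_prior_true(nchoices):
    # A float numerator forces true division on both Python 2 and 3
    return (3.0 / nchoices, 4)

rng = np.random.RandomState(0)

a_bad, b = beta_prior_floor(159)  # a_bad is exactly 0
a_ok, _ = beta_prior_true(159)    # a_ok is a small positive float

try:
    rng.beta(a_bad, b)
    floor_failed = False
except ValueError:
    # NumPy rejects a <= 0 for the Beta distribution
    floor_failed = True

sample = rng.beta(a_ok, b)  # a valid draw in [0, 1]
```

Any nchoices greater than 3 hits the same problem under floor division, which would explain why the error only shows up on datasets with enough arms.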

Another thing I would like to report: when active_choice='weighted' is set for AdaptiveGreedy, it calls ActiveExplorer._crit_active(), which raises another exception:

TypeError: unbound method _crit_active() must be called with ActiveExplorer instance as first argument (got AdaptiveGreedy instance instead)

I copied ActiveExplorer._crit_active() into class AdaptiveGreedy and call it as

pred[set_greedy] = np.argmax(
    self._crit_active(
        X[set_greedy],
        pred_all[set_greedy],
        self.active_choice),
    axis=1)

in contextualbandits/online.py, line 959. With that change everything works, but I don't know whether it is the correct fix.

I probably should open a new issue for this.

Thanks.


david-cortes commented on August 19, 2024

Thanks for the detailed report. Moving the method like that is correct. I'll make some slight adjustments so that it works under Python 2.7. These are small details in which Python 2.7 and 3.4+ differ: int / int yields a float in 3, and calling an unbound method with an instance of a different class is valid in 3. Oddly enough, it ran without errors for me in 2.7.15.
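The unbound-method difference can be sidestepped on both Python versions by keeping the shared logic in a plain function that each class calls through its own bound method. The sketch below is an illustration under that assumption only: `crit_active`, `ExplorerSketch`, `GreedySketch` and the `weight` parameter are hypothetical stand-ins, not the package's classes or signatures.

```python
import numpy as np

def crit_active(pred_all, weight):
    # Hypothetical shared scoring helper: a plain function has no
    # instance-type check, so any class can call it on Python 2 or 3
    return pred_all * weight

class ExplorerSketch(object):
    def _crit_active(self, pred_all, weight):
        return crit_active(pred_all, weight)

class GreedySketch(object):
    def _crit_active(self, pred_all, weight):
        return crit_active(pred_all, weight)

    def predict(self, pred_all, weight=2.0):
        # Uses its own bound method rather than
        # ExplorerSketch._crit_active(self, ...), which Python 2 rejects
        return np.argmax(self._crit_active(pred_all, weight), axis=1)

pred_all = np.array([[0.1, 0.7, 0.2],
                     [0.6, 0.3, 0.1]])
choices = GreedySketch().predict(pred_all)
```

Python 2 checks that the first argument of `SomeClass.method(obj, ...)` is an instance of `SomeClass`, while Python 3 treats `SomeClass.method` as an ordinary function, which is why the original call only fails on 2.7.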


oibook13 commented on August 19, 2024

As Python 2.7 reaches end of life in January 2020 (if I am not mistaken), it would be fine to support Python 3.x only.

Your work "Adapting multi-armed bandits policies to contextual bandits scenarios" is interesting to me. Do you plan to extend it to work with PyTorch or TensorFlow to increase its visibility?


david-cortes commented on August 19, 2024

I've updated the potential integer divisions and unbound method calls (in online at least), so it should work fine under Python 2.7 now. I don't plan to extend it to work with frameworks like TensorFlow or PyTorch, but if you want to use it with them, you'd just need to wrap them in a scikit-learn-like class with fit / predict (save for a few methods that use e.g. gradients). I've tested it with xgboost, for example, and it works fine too.
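A minimal sketch of what such a scikit-learn-like wrapper might look like. Everything here is an assumption for illustration: `FitPredictWrapper` is a hypothetical name, the NumPy gradient-descent loop merely stands in for a TensorFlow or PyTorch model, and `predict_proba` / `decision_function` are included only because scikit-learn classifiers usually expose them, not because the thread states they are required.

```python
import numpy as np

class FitPredictWrapper(object):
    """Hypothetical adapter exposing fit / predict around a trainable model."""

    def __init__(self, lr=0.5, epochs=200):
        self.lr = lr
        self.epochs = epochs

    @staticmethod
    def _sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def fit(self, X, y):
        X = np.asarray(X, dtype=float)
        y = np.asarray(y, dtype=float)
        self.w_ = np.zeros(X.shape[1])
        self.b_ = 0.0
        # Stand-in training loop: a framework model's optimizer would go here
        for _ in range(self.epochs):
            grad = self._sigmoid(X.dot(self.w_) + self.b_) - y
            self.w_ -= self.lr * X.T.dot(grad) / X.shape[0]
            self.b_ -= self.lr * grad.mean()
        return self

    def decision_function(self, X):
        return np.asarray(X, dtype=float).dot(self.w_) + self.b_

    def predict_proba(self, X):
        p1 = self._sigmoid(self.decision_function(X))
        return np.column_stack([1.0 - p1, p1])

    def predict(self, X):
        return (self.decision_function(X) > 0).astype(int)

X = np.array([[-2.0], [-1.0], [1.0], [2.0]])
y = np.array([0, 1, 1, 1]) * np.array([0, 0, 1, 1])  # labels: 0, 0, 1, 1
model = FitPredictWrapper().fit(X, y)
preds = model.predict(X)
proba = model.predict_proba(X)
```

An instance of such a class could then be passed wherever the package expects a base classifier, subject to whatever additional methods a given policy actually calls.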
