george-jiexiong / multi-armed-bandit

A basic implementation of techniques for solving the Multi-Armed Bandit (MAB) problem in the context of a marketing strategy. Four techniques, namely the Epsilon-Greedy approach, Upper Confidence Bound (UCB), Gradient Ascent, and Thompson Sampling, are used to choose the best website in terms of receiving clicks.

Jupyter Notebook 100.00%

multi-armed-bandit's Introduction

Multi-Armed Bandit project for the Research Project in Data Science course (2018) at Aalto University

Introduction

Multi-Armed Bandit (MAB) problems are sequential resource allocation tasks, in which one or more resources must be allocated wisely and efficiently among competing projects, typically so as to maximize the overall expected gain. The main dilemma in these problems is whether to choose the option that currently yields the highest immediate gain (exploitation) or to sacrifice some current gain in the hope of better future gains (exploration). Since strategies for these problems form a subset of reinforcement learning methods, the ultimate objective is to strike an appropriate balance between exploration and exploitation and thereby maximize the overall reward.
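The exploration/exploitation trade-off can be made concrete with the Epsilon-Greedy approach. The following is a minimal sketch, not the notebook's actual code: it assumes each option ("arm") pays a reward of 1 with a fixed probability, and the names `run_epsilon_greedy`, `click_rates`, `epsilon`, and `seed` are hypothetical. With probability `epsilon` a random arm is tried (exploration); otherwise the arm with the best observed reward rate so far is chosen (exploitation).

```python
import random


def run_epsilon_greedy(click_rates, n_trials, epsilon, seed=0):
    """Simulate epsilon-greedy on Bernoulli arms.

    click_rates: assumed true reward probability of each arm (illustrative only).
    Returns (total_reward, pulls), where pulls[i] counts how often arm i was chosen.
    """
    rng = random.Random(seed)
    n_arms = len(click_rates)
    pulls = [0] * n_arms    # number of times each arm was chosen
    clicks = [0] * n_arms   # total reward observed for each arm

    for _ in range(n_trials):
        if rng.random() < epsilon:
            # Exploration: pick an arm uniformly at random.
            arm = rng.randrange(n_arms)
        else:
            # Exploitation: pick the arm with the highest observed reward rate.
            # Untried arms count as 0.0; ties go to the lowest index.
            arm = max(
                range(n_arms),
                key=lambda i: clicks[i] / pulls[i] if pulls[i] else 0.0,
            )
        reward = 1 if rng.random() < click_rates[arm] else 0
        pulls[arm] += 1
        clicks[arm] += reward

    return sum(clicks), pulls


if __name__ == "__main__":
    total, pulls = run_epsilon_greedy([0.05, 0.10, 0.20], n_trials=5000, epsilon=0.1)
    print("total reward:", total, "pulls per arm:", pulls)
```

A larger `epsilon` spends more trials exploring, which helps find the best arm sooner but wastes trials on poor arms later; a smaller `epsilon` does the opposite.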

Objective

In this project, the goal is to implement two basic MAB techniques, namely the Epsilon-Greedy approach and Thompson Sampling, and to compare how they perform across multiple experiments. The problem is to choose, among many websites, the one that yields the best overall reward. Here, the reward is the total number of clicks the website gains across all trials.
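As an illustration of the website-selection setting, Thompson Sampling with click/no-click rewards might be sketched as follows. This is an assumption-laden sketch rather than the project's implementation: it models each website's click rate with a Beta posterior starting from a uniform Beta(1, 1) prior, and the names `run_thompson_sampling`, `click_rates`, and `seed` are hypothetical. The simulated click rates are made up for demonstration.

```python
import random


def run_thompson_sampling(click_rates, n_trials, seed=0):
    """Simulate Thompson Sampling over websites with Bernoulli click rewards.

    click_rates: assumed true click probability of each website (illustrative only).
    Returns (total_clicks, shown), where shown[i] counts how often site i was shown.
    """
    rng = random.Random(seed)
    n_sites = len(click_rates)
    clicks = [0] * n_sites   # observed clicks per site
    misses = [0] * n_sites   # observed non-clicks per site

    for _ in range(n_trials):
        # Draw one plausible click rate per site from its Beta posterior,
        # Beta(1 + clicks, 1 + misses), i.e. a uniform prior updated by the data.
        samples = [
            rng.betavariate(1 + clicks[i], 1 + misses[i]) for i in range(n_sites)
        ]
        # Show the site whose sampled click rate is highest.
        site = max(range(n_sites), key=lambda i: samples[i])
        if rng.random() < click_rates[site]:
            clicks[site] += 1
        else:
            misses[site] += 1

    shown = [c + m for c, m in zip(clicks, misses)]
    return sum(clicks), shown


if __name__ == "__main__":
    total, shown = run_thompson_sampling([0.05, 0.10, 0.20], n_trials=5000)
    print("total clicks:", total, "times each site was shown:", shown)
```

Unlike a fixed exploration rate, this approach explores less as the posteriors sharpen: sites with few observations have wide posteriors and are still sampled occasionally, while clearly worse sites are shown less and less often. Repeating the run over several seeds and comparing total clicks is one way to carry out the kind of comparative analysis described above.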

Author(s)

Muhammad Abdullah Khan
