This project forked from andreiapostoae/dota2-predictor

Tool that predicts the outcome of a Dota 2 game using Machine Learning

License: MIT License

Jupyter Notebook 15.61% Python 84.15% Shell 0.25%

dota2-predictor

Overview

dota2-predictor is a tool that uses Machine Learning on a dataset of more than 500,000 past matches to predict the outcome of a game. It achieves a ROC AUC score of roughly 0.63 with both logistic regression and neural networks.

Requirements

The project requires a handful of Python packages. Install them using:

pip install -r requirements.txt

Basic usage

dota2-predictor has two main use cases: predicting the outcome of a game when all ten heroes are known, and predicting the best last pick given the other nine heroes.

You can customise your preferred hero names in the preprocessing/heroes.json file.

Predicting the outcome of the game

python query.py 3520 Dire Luna SD WK TA PA AM Kunkka Tide Phoenix Zeus

The first argument is the average MMR of your game and the second is the team whose chance to win you want to see (Radiant or Dire). They are followed by the list of 10 heroes: the first 5 must be the Radiant team and the last 5 the Dire team. The program picks the pretrained model closest to the average MMR of the game and outputs that team's chance to win.

Using closest model available: 3500 MMR
Dire chance: 47.217%
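
The "closest model" lookup shown in the output above can be pictured with a minimal sketch. This is not the code from query.py: the function name and the list of available MMR values are hypothetical, and Python 3 is assumed.

```python
def closest_model_mmr(query_mmr, available_mmrs):
    """Return the pretrained-model MMR nearest to query_mmr.

    Ties are resolved in favour of the lower MMR (an arbitrary choice
    made for this sketch).
    """
    if not available_mmrs:
        raise ValueError("no pretrained models available")
    return min(available_mmrs, key=lambda m: (abs(m - query_mmr), m))


# Hypothetical set of pretrained models, one per MMR value.
available = [2500, 3000, 3500, 4000, 4500]
print("Using closest model available: %d MMR" % closest_model_mmr(3520, available))
```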

Predicting the best last pick

python query.py 3520 Radiant Luna SD WK TA PA AM Kunkka Tide Phoenix

The main difference from the previous example is that you now input only 9 heroes, in their respective order. You will get a list of the possible last picks and the chance to win with each of them, sorted from worst to best.

Wisp                      40.068%
Alchemist                 40.801%
Oracle                    41.269%
[...]
Sven                      60.3%
Ursa                      60.561%
Abyssal Underlord         63.929%
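
A ranking like the one above could be produced roughly as follows. This is a sketch under stated assumptions, not the project's implementation: `rank_last_picks` and the `win_chance` callable are hypothetical names, and the toy probabilities are made up for illustration.

```python
def rank_last_picks(partial_draft, hero_pool, win_chance):
    """Score every undrafted hero as the missing pick.

    partial_draft: the 9 heroes already picked
    hero_pool:     all hero names
    win_chance:    callable(partial_draft, candidate) -> probability in [0, 1]
                   (stands in for a trained model; hypothetical interface)
    Returns (hero, probability) pairs sorted from worst to best.
    """
    taken = set(partial_draft)
    scored = [
        (hero, win_chance(partial_draft, hero))
        for hero in hero_pool
        if hero not in taken
    ]
    scored.sort(key=lambda pair: pair[1])
    return scored


# Toy stand-in for a model: fixed, made-up probabilities per candidate.
toy_scores = {"Sven": 0.60, "Wisp": 0.40, "Ursa": 0.61, "Luna": 0.55}
ranking = rank_last_picks(
    ["Luna"], list(toy_scores), lambda draft, hero: toy_scores[hero]
)
for hero, chance in ranking:
    print("{:<25} {:.3f}%".format(hero, chance * 100))
```

Heroes that are already drafted are skipped, since a hero can only be picked once per game.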

Advanced usage

Downloading new data

Patches are released almost monthly, so models trained on data from different periods can differ significantly. Because the Steam API does not (easily) provide access to a player's MMR and the skill level query is broken, mining new data takes two steps:

  • download lists of games, starting from a given sequence number, directly from Steam and filter out irrelevant games
cd mining
python steam_miner.py list.csv NUM_GAMES
  • take each game from the list and find the hero configuration, the winner, and the MMR of the players who made it public using the OpenDota API (limited to 1 request per second)
python opendota_miner.py list.csv output.csv NUM_GAMES
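
The 1 request per second limit mentioned in the second step can be respected with a small throttle. The sketch below is an illustration only: `Throttle`, `mine_matches` and the `fetch_match` callable are hypothetical names, not part of opendota_miner.py, and the actual HTTP request is deliberately left to the caller.

```python
import time


class Throttle:
    """Block so that successive calls are at least min_interval seconds apart."""

    def __init__(self, min_interval=1.0, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock    # injectable so the class can be tested without waiting
        self._sleep = sleep
        self._last = None      # time of the previous call, None before the first one

    def wait(self):
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


def mine_matches(match_ids, fetch_match, throttle):
    """Call fetch_match(match_id) for every id, pausing between requests."""
    results = []
    for match_id in match_ids:
        throttle.wait()
        results.append(fetch_match(match_id))
    return results
```

With `throttle = Throttle(1.0)`, the loop never issues two requests less than one second apart, whatever `fetch_match` does internally.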

Training a model

The raw input CSV is filtered using a DataPreprocess object, and the remaining games are given as input to the logistic regression.

# filter the games in the [mmr - offset, mmr + offset] interval
data_preprocess = DataPreprocess(full_list, mmr, offset)
filtered_list = data_preprocess.run()

# instantiate a LogReg object using the filtered games
logreg = LogReg(filtered_list, mmr, offset, output_model="my_model")

# set evaluate to 1 to display information about the dataset and the training accuracy
logreg.run(evaluate=1)

This will save your model and synergy dictionaries in pickle format, which you can load and query later using the functions in evaluate.py.
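
The interval filter named in the first comment of the snippet above amounts to a simple range check. The following sketch assumes each game is a dict with an "avg_mmr" key, which is a hypothetical layout and not necessarily how DataPreprocess represents games.

```python
def filter_by_mmr(games, mmr, offset):
    """Keep the games whose average MMR lies in [mmr - offset, mmr + offset]."""
    low, high = mmr - offset, mmr + offset
    return [game for game in games if low <= game["avg_mmr"] <= high]


# Made-up games for illustration only.
games = [
    {"match_id": 1, "avg_mmr": 3290},
    {"match_id": 2, "avg_mmr": 3300},
    {"match_id": 3, "avg_mmr": 3700},
    {"match_id": 4, "avg_mmr": 3701},
]
kept = filter_by_mmr(games, mmr=3500, offset=200)
print([game["match_id"] for game in kept])
```

Both ends of the interval are inclusive here, matching the square brackets in the comment.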

Plotting the learning curve

You can plot the learning curve of your model using the learning_curve flag.

logreg.run(evaluate=1, learning_curve=1)

Plotting the heatmap of synergies and counter synergies

While training, statistics about hero synergies and counter synergies are stored in dictionaries that are saved in the pickle format, similar to the model. You can visualize those graphs using the heat_map flag.

Keep in mind that on both axes, the numbers represent hero indices (e.g. 0 is Anti-Mage, 80 is Chaos Knight etc).

logreg.run(evaluate=1, heat_map=1)
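
One way such synergy statistics could be gathered is to count, for every pair of heroes on the same team, how often the pair played and how often it won. This is a sketch under assumptions (Python 3, games given as tuples of hero indices plus a winner flag), not the project's actual bookkeeping; counter synergies would follow the same pattern with pairs taken across the two teams.

```python
from collections import defaultdict
from itertools import combinations


def synergy_winrates(games):
    """Win rate of every pair of heroes that played on the same team.

    games: iterable of (radiant_heroes, dire_heroes, radiant_win) tuples,
           with heroes given as integer indices (hypothetical layout).
    Returns {(hero_a, hero_b): win_rate} with hero_a < hero_b.
    """
    played = defaultdict(int)
    won = defaultdict(int)
    for radiant, dire, radiant_win in games:
        for team, team_won in ((radiant, radiant_win), (dire, not radiant_win)):
            for pair in combinations(sorted(team), 2):
                played[pair] += 1
                if team_won:
                    won[pair] += 1
    return {pair: won[pair] / played[pair] for pair in played}


# Made-up two-hero "teams" to keep the example small.
toy_games = [
    ([0, 1], [2, 3], True),
    ([0, 1], [2, 3], False),
    ([0, 2], [1, 3], True),
]
rates = synergy_winrates(toy_games)
print(rates[(0, 1)], rates[(0, 2)], rates[(1, 3)])
```

A dictionary keyed by index pairs maps directly onto the two axes of a heatmap.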

Plotting hero winrates

Data about hero winrates during the training phase can be plotted using the winrates flag. As there are 113 heroes currently, it is hard to fit the plot in this README, but you can find it here.

logreg.run(evaluate=1, winrates=1)

Pretraining models in an MMR interval

Alternatively, instead of training a single model at a time, you can train multiple models across an MMR interval with your desired offset. The first argument should be the CSV containing the data, and the second should be the MMR offset.

python pretrain.py 706d.csv 200

MIN_MMR and MAX_MMR are set within the script. The models and dictionaries will be saved in the pretrained folder, along with a CSV file containing the statistics for every model.

Author's note

This is a hobby project started with the goal of achieving as high an accuracy as possible given only the picks from a game. Of course, one could argue that other statistics such as GPM, XPM or itemization influence the outcome of a game, but the purpose of this tool is to suggest the best possible pick before the game starts. Those statistics change throughout the game, so they are not available for a pre-game prediction.

Even though there are papers where people claim to have achieved a much higher accuracy (>70% in some cases), the result depends heavily on the data they used. We need to keep in mind that Dota 2 is a constantly evolving game that gets more complex as new heroes are released and balance changes are made.

There is also the human factor that strongly influences the outcome of a game, so there is no way of predicting each game with close-to-perfect accuracy. This tool, however, is up-to-date with the current patch and does a decent job predicting your best possible last pick given a situation in order to give you that little extra chance that turns the tides in your favor.

Good luck in your matches and game on!

Contributors

andreiapostoae