Gobigger-Explore

🔮 GoBigger Challenge 2021 Baseline

🤖 Introduction

This is the baseline for the 2021 GoBigger Multi-Agent Decision Intelligence Challenge. It is built on OpenDILab and aims to provide a simple, entry-level method that participants can extend to build their own agents. OpenDILab's modular structure makes it easy to get started and offers a wide range of reinforcement learning algorithms to choose from. The baseline is a good starting point, especially for entry-level researchers who are familiar with multi-agent decision AI problems.

🚀 Release Version

The latest version is 0.3.0.

  1. What needs to be optimized in the future
    • Application of advanced algorithms.
    • Design and study of advanced actions.
  2. Supervised Learning
    • Using bots to generate data for supervised learning.
    • The supervised learning model can be used as a competition model or as a pre-train for reinforcement learning.
    • See SL for details.
  3. Gobigger with Go-Explore
    • Training Gobigger with Go-Explore algorithm.
    • Speed up network training by loading endgame matches.
    • See go-explore for details.
  4. Version-0.3.0
    • Adopt an in-place algorithm and a gradient accumulation strategy to save GPU memory.
    • Encode the relation features from Version-0.2.0 more efficiently.
    • Simplify the network model and design a more efficient training process.
  5. Version-0.2.0
    • Version-0.2.0 link
    • Fix the ckpt bug to improve the accuracy of the evaluator.
    • Fix the replay_buffer bug.
    • replay_buffer now stores variable-length features to improve data utilization and training speed.
  6. Version-0.1.0
  7. Feature Engineering
    • Brand new feature engineering to improve convergence speed.
      • Scalar Encoder

        • By default, the upper-left corner is the origin of the coordinates.
        • In the figure, the red rectangle is the global field of view and the green rectangle is the local field of view.
      • Food Encoder

        • For convenience of calculation, the area of a ball is taken as the square of its radius, omitting the constant factor.
        • The food map divides the local field of view into h*w small grids, each of size 16*16.
        • food map[0,:,:] represents the sum of the area of all food in each small grid.
        • food map[1,:,:] represents the sum of the area of the clone balls of a given id in each small grid.
        • The food grid divides the local field of view into h*w small grids, each of size 8*8.
        • The food grid represents the offset of the food in each small grid relative to the upper-left corner of the grid, together with the radius of the food.
        • The dimension of the food relation is [len(clone),7*7+1,3]. Here [:,7*7,3] represents the food information in the 7*7 grid neighborhood of each clone ball, including the offset and the sum of the food areas (squared radii) in the grid. Because food coverage is very low, an approximation is made: when a grid holds more than one food ball, the position of the last one is used. [len(clone):,1,3] represents the offset and area of each clone ball.
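The food map channel described above can be sketched in a few lines. This is a minimal, framework-free illustration and not the baseline's actual code: the function name `build_food_map`, the `(x, y, radius)` tuple layout, and the `origin` and `view_size` parameters are assumptions made for this example. It bins every food ball into a 16*16 cell of the local field of view and accumulates the squared radius as its area.

```python
import math


def build_food_map(food, origin, view_size, cell=16):
    """Sum squared radii of food balls per grid cell of the local view.

    food:      iterable of (x, y, radius) in global coordinates (assumed layout)
    origin:    (x0, y0), upper-left corner of the local view
    view_size: (width, height) of the local view
    Returns an h x w nested list, indexed as food_map[row][col].
    """
    x0, y0 = origin
    width, height = view_size
    w = math.ceil(width / cell)
    h = math.ceil(height / cell)
    food_map = [[0.0] * w for _ in range(h)]
    for x, y, r in food:
        col = int((x - x0) // cell)
        row = int((y - y0) // cell)
        # Food outside the local view is ignored.
        if 0 <= row < h and 0 <= col < w:
            food_map[row][col] += r * r  # area = radius squared, constant omitted
    return food_map


food = [(10, 10, 2), (12, 5, 1), (40, 20, 3), (500, 500, 2)]
food_map = build_food_map(food, origin=(0, 0), view_size=(64, 32))
```

The second channel (clone balls of a given id) would follow the same binning with a different input list.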
      • Clone Encoder

        • Encode the clone ball, including the position, radius, one-hot encoding of the player name, and the speed encoding of the clone ball.
      • Relation Encoder

        • The relative position of ball_1 and ball_2, (x1-x2, y1-y2).
        • The distance between ball_1 and ball_2.
        • Whether ball_1 and ball_2 collide, defined as the center of one ball lying inside the other ball.
        • How close ball_1 and ball_2 are to colliding, measured as the distance between the arc of one ball and the center of the other ball.
        • How close ball_1 and ball_2 are to colliding after splitting, measured as the distance between the arc of the farthest split ball and the center of the other ball.
        • The eat-or-be-eaten relationship, expressed as the relationship between the radii of the two balls.
        • The eat-or-be-eaten relationship after splitting, expressed as the relationship between the radii of the two balls after the split.
        • The relationship between the radii of the two balls. ball_1 and ball_2 are mapped to an m*n r1 and an m*n r2 respectively, where m is the number of ball_1's clone balls and n is the number of ball_2's clone balls.
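A few of the pairwise quantities listed above can be sketched as follows. This is an illustrative assumption of how such features might be computed, not the baseline's encoder: the names `pairwise_relation` and `relation_matrix`, the `(x, y, radius)` tuple layout, and the dictionary keys are hypothetical, and the split-related features are left out because they depend on game rules not stated here.

```python
import math


def pairwise_relation(ball_1, ball_2):
    """Relation features between two balls given as (x, y, radius)."""
    x1, y1, r1 = ball_1
    x2, y2, r2 = ball_2
    dx, dy = x1 - x2, y1 - y2
    dist = math.hypot(dx, dy)
    return {
        "rel_pos": (dx, dy),                  # (x1-x2, y1-y2)
        "dist": dist,                         # center-to-center distance
        "center_inside": dist < max(r1, r2),  # one center lies inside the other ball
        "gap_1_to_2": dist - r1,              # arc of ball_1 to center of ball_2
        "gap_2_to_1": dist - r2,              # arc of ball_2 to center of ball_1
        "radius_diff": r1 - r2,               # basis for the eat / be-eaten relation
    }


def relation_matrix(clones_1, clones_2):
    """m x n grid of relations: m clones of one player vs n clones of another."""
    return [[pairwise_relation(a, b) for b in clones_2] for a in clones_1]


rel = pairwise_relation((0.0, 0.0, 5.0), (3.0, 4.0, 2.0))
grid = relation_matrix([(0.0, 0.0, 5.0), (10.0, 0.0, 1.0)],
                       [(3.0, 4.0, 6.0), (0.0, 0.0, 1.0), (1.0, 1.0, 1.0)])
```

Building the full m*n grid up front mirrors the m*n mapping of r1 and r2 described in the last bullet, so each clone pair gets one feature entry.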
      • Model

        • The mask records which entries are valid after padding. It is easier to understand when read alongside the code.
        • The model design in the baseline is not optimal; participants are welcome to experiment with it.
  8. Win Rate vs. Bot
    • Version-0.3.0 vs. the rule-based bot in GoBigger.
  9. Version comparison
    • Version-0.3.0 vs. Version-0.2.0
      • v0.3.0 is more lightweight, and its network design and feature encoding are easier to use.
      • v0.3.0 reward and Q-value curves.
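The Version-0.3.0 notes above mention a gradient accumulation strategy for saving GPU memory. The idea can be illustrated without any framework: process a large batch as several small micro-batches, add up their gradients weighted by each micro-batch's share of the batch, and apply a single update. The sketch below uses a one-parameter linear model, and the names `grad_mse` and `accumulated_step` are hypothetical; it is not the baseline's training loop.

```python
def grad_mse(w, batch):
    """Gradient w.r.t. w of the mean squared error of y = w * x over (x, y) pairs."""
    return sum(2 * (w * x - y) * x for x, y in batch) / len(batch)


def accumulated_step(w, batch, micro_size, lr):
    """One SGD update whose gradient is accumulated over micro-batches.

    Only one micro-batch is processed at a time, which is what keeps peak
    memory low; the weighted sum equals the full-batch mean gradient.
    """
    grad = 0.0
    for start in range(0, len(batch), micro_size):
        micro = batch[start:start + micro_size]
        # Weight by the micro-batch's share so uneven last chunks stay exact.
        grad += grad_mse(w, micro) * len(micro) / len(batch)
    return w - lr * grad


batch = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
full_batch_w = 0.0 - 0.01 * grad_mse(0.0, batch)
accumulated_w = accumulated_step(0.0, batch, micro_size=2, lr=0.01)
```

In PyTorch the same pattern is usually written by calling `backward()` on each scaled micro-batch loss and calling the optimizer's `step()` and `zero_grad()` once per accumulated batch.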

👇 Getting Started

  1. System environment

    • CPU: 8 cores
    • GPU: 1080Ti (11G) or 1060 (6G)
    • Memory: 40G
  2. Baseline Config

    • The default config is the one used in this experiment. Participants can modify it to suit their system environment.
    • Set replay_buffer_size according to the amount of available RAM.
    • Set batch_size according to the amount of available GPU memory.
  3. Install the necessary packages

    # Install DI-engine
    git clone https://github.com/opendilab/DI-engine.git
    cd YOUR_PATH/DI-engine/
    pip install -e . --user

    # Install Env Gobigger
    git clone https://github.com/opendilab/GoBigger.git
    cd YOUR_PATH/GoBigger/
    pip install -e . --user
  4. Start training
    # Download baseline
    git clone https://github.com/opendilab/Gobigger-Explore.git
    cd Gobigger-Explore/my_submission/entry/
    python gobigger_vsbot_baseline_simple_main.py
  5. Evaluate and save game videos
    cd my_submission/entry/
    python gobigger_vsbot_baseline_simple_eval.py --ckpt YOUR_CKPT_PATH
    # If you do not need to save the video, uncomment line 258 of gobigger_env.py
    python gobigger_vsbot_baseline_simple_quick_eval.py --ckpt YOUR_CKPT_PATH
  6. SL training
    cd my_submission/sl/
    python generate_data_opensource.py  # generate data for training
    python train.py -c ./exp/sample/config.yaml  # change the data directory in the config first
  7. Go-Explore
    cd my_submission/go-explore/
    python gobigger_vsbot_explore_main.py

🎯 Result

We have released the training logs, checkpoints, and evaluation videos. The download links are below.

  • Version 0.3.0
    • Baidu Netdisk Link
      • Extraction code: 95el
  • Version 0.2.0
    • Baidu Netdisk Link
      • Extraction code: u4i6

๐Ÿ˜ Resources

Contributors

jayyoung0802, mingzhang96, paparazz1
