luxai-s2-baseline's Introduction

Update for gymnasium!!!

We have updated the baseline for the gymnasium version of the Lux AI Challenge. You can run the baseline with both the old gym and the new gymnasium version. The new gymnasium version is available here: Luxai-s2-Baseline4Gymnasium

The old gym version is available here: Luxai-s2-Baseline4Gym

Luxai-s2-Baseline

Welcome to the Lux AI Challenge Season 2! This repository serves as a baseline for the Lux AI Challenge Season 2, designed to provide participants with a strong starting point for the competition. Our goal is to provide you with a clear, understandable, and modifiable codebase so you can quickly start developing your own AI strategies.

This baseline includes an implementation of the PPO reinforcement learning algorithm, which you can use to train your own agent from scratch. The codebase is designed to be easy to modify, allowing you to experiment with different strategies, reward functions, and other parameters.

In addition to the main training script, we also provide additional tools and resources, including scripts for evaluating your AI strategy, as well as useful debugging and visualization tools. We hope these tools and resources will help you develop and improve your AI strategy more effectively.

More information about the Lux AI Challenge can be found on the competition page: https://www.kaggle.com/competitions/lux-ai-season-2

We look forward to seeing how your AI strategy performs in the competition!

luxai-s2-baseline's People

Contributors

schopenhauer-loves-hegel, yk7333

luxai-s2-baseline's Issues

requirements.txt

Hey!

Could you include a requirements.txt so that we know which versions of gym / torch this is compatible with?

I see it's using gym rather than gymnasium, and that env.seed(x) is used, so it must target a gym version before v0.26.

Thanks!
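
Until a requirements.txt is available, one way to record a working environment is to print the versions that are actually installed. The sketch below uses only the standard library; the package list (gym, gymnasium, torch, numpy) is an assumption made for illustration, not something the repository specifies.

```python
from importlib import metadata


def installed_versions(packages):
    """Return {package: version string, or None if it is not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Assumed package list; adjust it to whatever train.py imports.
    for name, version in installed_versions(["gym", "gymnasium", "torch", "numpy"]).items():
        if version is not None:
            print(f"{name}=={version}")
        else:
            print(f"# {name} is not installed")
```

The printed `name==version` lines can be pasted into a requirements.txt as a starting point.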

Memory leak in train.py 🐛

We logged GPU and RAM usage and found that RAM usage quickly grew into swap. We tested with 30 GB of RAM and with 15 GB of RAM, capped at 1/2 RAM. We think it's a memory leak.

CLI command: python train.py --total-timesteps 1050 --num-envs 1 --save-interval=999 --train-num-collect=1024

Running in WSL, configured according to the documentation in this repo.
Here are the logs:
log.json

This is how we logged it:

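import os

import psutil  # third-party package: pip install psutil

# Assumed: the original snippet does not show how `process` was created.
# psutil.Process exposes memory_info().rss (resident set size, in bytes).
process = psutil.Process(os.getpid())

# Accumulated RSS growth in bytes, keyed by checkpoint name.
memory_tracker = {}
last_check = process.memory_info().rss


def log_memory(name):
    """Attribute the RSS change since the previous checkpoint to `name`."""
    global last_check
    current = process.memory_info().rss
    memory_tracker[name] = memory_tracker.get(name, 0) + (current - last_check)
    last_check = current


def print_memory():
    """Print accumulated growth per checkpoint and the current RSS, in GB."""
    memory_tracker_formatted = {}
    for k in memory_tracker:
        memory_tracker_formatted[k] = "{:.2f}".format(memory_tracker[k] / 10**9)
    memory_tracker_formatted["current"] = "{:.2f}".format(process.memory_info().rss / 10**9)
    print("current:", memory_tracker_formatted)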

This code is allocating the memory. We think it has something to do with torch models not being detached.

            # beginning of code block
            for player_id, player in enumerate(['player_0', 'player_1']):
                obs[player] += envs.split(next_obs[player])
                dones[train_step] = next_done
                log_memory("-1")

                # ALGO LOGIC: action logic
                # the no_grad() context disables gradient calculation: https://pytorch.org/docs/stable/generated/torch.no_grad.html
                with torch.no_grad():
                    log_memory("0")
                    # under no_grad, any tensors created as a result
                    # of a computation will have their internal requires_grad
                    # state set to False. This means their gradient will not be
                    # calculated by torch. This avoids memory consumption for stuff
                    # that doesn't need gradient calculations.
                    valid_action = envs.get_valid_actions(player_id)
                    # np2torch = lambda x, dtype: torch.tensor(x).type(dtype).to(device).detach()
                    log_memory("1")
                    # calling agent() like a function actually calls agent.forward() under the hood;
                    # it's defined in net.py. Here the observation is being passed in.
                    # An action space which is not completely abstract BUT not completely Lux either is
                    # returned. Let's call it the intermediate action space.
                    global_feature = np2torch(next_obs[player]['global_feature'], torch.float32)
                    map_feature = np2torch(next_obs[player]['map_feature'], torch.float32)

                    log_memory("1.25")
                    # Note: np2torch is a helper declared elsewhere in the repo (see the commented-out lambda above)
                    action_feature = tree.map_structure(lambda x: np2torch(x, torch.int16), next_obs[player]['action_feature'])
                    log_memory("1.5")
                    va = tree.map_structure(lambda x: np2torch(x, torch.bool), valid_action)
                    log_memory("1.75")

                    logprob, value, raw_action, _ = agent(
                        global_feature,
                        map_feature,
                        action_feature,
                        va
                    )
                    values[player][train_step] = value
                    log_memory("2")
                    # action space arrays are partitioned like so
                    # {"transfer_power": [10, 11, 12]} where 10, 11 and 12 represent the 
                    # actions to perform on environments 0, 1 and 2 (vectorized environments)
                    # This function splits the actions into independent trees per environment:
                    # [{'transfer_power': 10}, {'transfer_power': 11}, {'transfer_power': 12}]
                    valid_actions[player] += envs.split(valid_action)
                    log_memory("3")
                # see above comment, split raw_actions into independent trees per vectorized environment
                actions[player] += envs.split(raw_action)
                action[player_id] = raw_action
                logprobs[player][train_step] = logprob
                log_memory("4")
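
The RSS checkpoints above show where the process grows, but not which objects are being retained. One complementary approach is to compare two tracemalloc snapshots taken a number of training steps apart. The sketch below is a generic illustration, not code from train.py: `top_growth` is a hypothetical helper, and the loop that appends to `retained` is only a stand-in for per-step rollout data kept in a list. tracemalloc only sees allocations made through Python's allocator, so memory allocated natively by extensions may not show up.

```python
import tracemalloc


def top_growth(before, after, limit=5):
    """Return up to `limit` source lines whose allocated size grew the most."""
    stats = after.compare_to(before, "lineno")  # sorted by absolute size change
    return [s for s in stats if s.size_diff > 0][:limit]


if __name__ == "__main__":
    tracemalloc.start()
    before = tracemalloc.take_snapshot()

    retained = []
    for step in range(1000):
        # Stand-in for data that is appended every step and never released.
        retained.append(bytearray(1024))

    after = tracemalloc.take_snapshot()
    for stat in top_growth(before, after):
        print(stat)
    tracemalloc.stop()
```

In a training loop, the two snapshots would be taken at the start and end of a collection phase; lines that keep growing across phases are candidates for the leak.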

Are any modifications needed for neurips stage?

I followed the README instructions, but I got a lot of assertion errors when running python train.py.

I changed map_size from 48 to 64, which resolved the assertion errors, but then I hit a ValueError:

File "/sources/Luxai-s2-Baseline/luxenv.py", line 329, in reset
obs, rewards, dones, infos = self.proxy.step(actions)
ValueError: too many values to unpack (expected 4)

Changing line 329 of luxenv.py to
obs, rewards, dones, _, infos = self.proxy.step(actions)
resolves that error, but then another error is raised:

File "/opt/conda/envs/luxai_s2/lib/python3.8/site-packages/numpy/core/shape_base.py", line 471, in stack
return _nx.concatenate(expanded_arrays, axis=axis, out=out,
File "<array_function internals>", line 200, in concatenate
ValueError: Output array is the wrong shape
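
The first ValueError is consistent with step() returning five values (obs, reward, terminated, truncated, info), as in gym 0.26+ and gymnasium, where older gym versions returned four. A minimal sketch of an adapter that accepts either shape is shown below. `unpack_step` is a hypothetical helper, not part of this repository, and reporting truncated as False for the old API is an assumption. It does not address the later array shape error.

```python
def unpack_step(result):
    """Normalize a step() result to (obs, rewards, terminated, truncated, infos)."""
    if len(result) == 5:
        # New-style API: already in the five-value form.
        return tuple(result)
    if len(result) == 4:
        # Old-style API: a single done flag and no truncation signal.
        obs, rewards, dones, infos = result
        return obs, rewards, dones, False, infos  # assumption: never truncated
    raise ValueError(f"unexpected step() result of length {len(result)}")


if __name__ == "__main__":
    old_style = ({"player_0": 0}, {"player_0": 0.0}, {"player_0": False}, {})
    obs, rewards, terminated, truncated, infos = unpack_step(old_style)
    print(terminated, truncated)
```

Wrapping the call as `unpack_step(self.proxy.step(actions))` would let the same code run against both API versions, at the cost of hiding which one is installed.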

Submit to kaggle

Do I need to write a separate agent.py file to submit to Kaggle?
