
stefanbschneider / mobile-env


An open, minimalist Gymnasium environment for autonomous coordination in wireless mobile networks.

Home Page: https://mobile-env.readthedocs.io

License: MIT License

Python 100.00%
reinforcement-learning environment gym mobile-networks cellular coordination management optimization python python3

mobile-env's Introduction

CI PyPI Documentation Code Style: Black Open in Colab

mobile-env: An Open Environment for Autonomous Coordination in Mobile Networks

mobile-env is an open, minimalist environment for training and evaluating coordination algorithms in wireless mobile networks. The environment models users moving around an area who can connect to one or multiple base stations. Using the Gymnasium (previously Gym) interface, the environment can be used with any reinforcement learning framework (e.g., stable-baselines or Ray RLlib) or any custom (even non-RL) coordination approach. The environment is highly configurable and can be easily extended (e.g., regarding users, movement patterns, channel models, etc.).

mobile-env supports multi-agent and centralized reinforcement learning policies. It provides various choices for rewards and observations. mobile-env is also easily extensible, so that anyone may add other channel models (e.g., path loss), movement patterns, utility functions, etc.

As an example, mobile-env can be used to study multi-cell selection in coordinated multipoint. Here, it must be decided which connections should be established between user equipments (UEs) and base stations (BSs) in order to maximize Quality of Experience (QoE) globally. To maximize its own QoE, each UE intends to connect to as many BSs as possible, which yields higher (macro) data rates. However, BSs multiplex resources among connected UEs (e.g., schedule physical resource blocks) and, therefore, UEs compete for limited resources (conflicting goals). To maximize QoE globally, the policy must recognize that (1) the data rate of any connection is governed by the channel (e.g., SNR) between UE and BS and (2) the QoE of a single UE does not necessarily grow linearly with increasing data rate.


Base station icon by Clea Doltz from the Noun Project
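
As an illustration of point (2) above, QoE is typically modeled as a concave function of the data rate, so that additional rate yields diminishing returns. The following is only an illustrative sketch (a generic logarithmic utility, not necessarily the exact utility function shipped with mobile-env):

import numpy as np

def log_utility(datarate, max_rate=100.0):
    """Illustrative QoE curve: grows logarithmically (not linearly) with data rate."""
    # clip to [1, max_rate] to avoid log(0); the result then lies in [0, 1]
    rate = np.clip(datarate, 1.0, max_rate)
    return np.log10(rate) / np.log10(max_rate)

With such a curve, serving two UEs at moderate rates yields a higher summed QoE than serving one UE at a very high rate, which is exactly the trade-off a global coordination policy has to learn.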

Try mobile-env:

  • Part I: Customizing mobile-env and single-agent RL with stable-baselines3: Open in Colab
  • Part II: Multi-agent RL on mobile-env with Ray RLlib: Open in Colab

Documentation and API: ReadTheDocs

Citation

If you use mobile-env in your work, please cite our paper (author PDF):

@inproceedings{schneider2022mobileenv,
  author = {Schneider, Stefan and Werner, Stefan and Khalili, Ramin and Hecker, Artur and Karl, Holger},
  title = {mobile-env: An Open Platform for Reinforcement Learning in Wireless Mobile Networks},
  booktitle = {Network Operations and Management Symposium (NOMS)},
  year = {2022},
  publisher = {IEEE/IFIP},
}

mobile-env is based on the underlying environment used in DeepCoMP, which combines reinforcement learning approaches for dynamic multi-cell selection. mobile-env provides this underlying environment as an open, stand-alone environment.

Installation

From PyPI (Recommended)

The simplest option is to install the latest release of mobile-env from PyPI using pip:

pip install mobile-env

This is recommended for most users. mobile-env is tested on Ubuntu, Windows, and macOS.

From Source (Development)

Alternatively, for development, you can clone mobile-env from GitHub and install it from source. After cloning, install in "editable" mode (-e):

pip install -e .

This is equivalent to running pip install -r requirements.txt.

If you want to run tests or examples, also install the requirements listed in tests. For building the documentation, install the requirements listed in docs.

Example Usage

import gymnasium
import mobile_env

env = gymnasium.make("mobile-medium-central-v0")
obs, info = env.reset()
done = False

while not done:
    action = ... # Your agent code here
    obs, reward, terminated, truncated, info = env.step(action)
    done = terminated or truncated
    env.render()
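
To quickly verify the installation without writing an agent, a random policy can stand in for the agent code above. This is a minimal sketch using the standard Gymnasium API (env.action_space.sample()):

import gymnasium
import mobile_env  # importing mobile_env registers the bundled scenarios

env = gymnasium.make("mobile-medium-central-v0")
obs, info = env.reset()
done = False

while not done:
    # sample a random action instead of querying a trained agent
    action = env.action_space.sample()
    obs, reward, terminated, truncated, info = env.step(action)
    done = terminated or truncated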

Customization

mobile-env supports custom channel models, movement patterns, arrival & departure models, resource multiplexing schemes, and utility functions. For example, replacing the default Okumura–Hata channel model with a (simplified) path loss model can be as easy as this:

import gymnasium
import numpy as np
from mobile_env.core.base import MComCore
from mobile_env.core.channel import Channel


class PathLoss(Channel):
    def __init__(self, gamma, **kwargs):
        super().__init__(**kwargs)
        # path loss exponent
        self.gamma = gamma

    def power_loss(self, bs, ue):
        """Computes power loss between BS and UE."""
        dist = bs.point.distance(ue.point)
        loss = 10 * self.gamma * np.log10(4 * np.pi * dist * bs.frequency)
        return loss


# replace default channel model in configuration
config = MComCore.default_config()
config['channel'] = PathLoss

# pass init parameters to custom channel class!
config['channel_params'].update({'gamma': 2.0})

# create environment with custom channel model
env = gymnasium.make('mobile-small-central-v0', config=config)
# ...

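The same config dict can also be used to adjust scalar settings such as the map size or episode length without subclassing anything. Below is a small sketch; the key names are taken from the custom-environment code in the issues further below, so treat them as assumptions to verify against MComCore.default_config():

import gymnasium
from mobile_env.core.base import MComCore  # importing mobile_env registers the scenarios

config = MComCore.default_config()
# shrink the playing area and shorten episodes (key names assumed; check default_config())
config.update({"width": 200, "height": 200, "EP_MAX_TIME": 100, "seed": 68})

env = gymnasium.make("mobile-small-central-v0", config=config)
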
Projects Using mobile-env

If you are using mobile-env, please let us know and we are happy to link to your project from the readme. You can also open a pull request yourself.

Contributing

Development: @stefanbschneider and @stwerner97

We are happy if you find mobile-env useful. If you have feedback or want to report bugs, feel free to open an issue. Also, we are happy to link to your projects if you use mobile-env.

We also welcome contributions: Whether you implement a new channel model, fix a bug, or just make a minor addition elsewhere, feel free to open a pull request!

mobile-env's People

Contributors

mnchr, stefanbschneider, stwerner97


mobile-env's Issues

How to specify custom SINR calculation

Also, I wrote a short piece of code for the SINR, as shown below:

def sinr(self, bs, ue, all_bs):
    """Custom SINR calculation."""
    signal = np.clip(10 ** ((bs.tx_power - self.alpha * self.power_loss(bs, ue)) / 10), -1e+30, 1e+30)
    interference = sum(
        np.clip(10 ** ((other_bs.tx_power - self.alpha * self.power_loss(other_bs, ue)) / 10), -1e+30, 1e+30)
        for other_bs in all_bs
        if other_bs != bs
    )
    return signal / (ue.noise + interference)
Just wondering, is there any way I can override the old snr with the new sinr? I tried to override the old snr, but since I added an extra argument, "all_bs", it does not seem to work.

Thanks for replying.

Originally posted by @R0B1NNN1 in #38 (comment)

Update readme from Gym to Gymnasium

Hey,

Would you be willing to update your readme to say Gymnasium instead of Gym? You switched your repository over from Gym to Gymnasium, and Gymnasium isn't the same thing as Gym (I used to maintain Gym) :)

Collect and write metrics for evaluation

mobile-env needs a generic way to define and collect metrics during testing/evaluation, but not during training (too much overhead).
These metrics (e.g., utility per UE and time step) should be written to a CSV file after the simulation. This should make it possible to quantify and compare the performance of different RL approaches.

In DeepCoMP, all metrics are kept in the info dict: https://github.com/CN-UPB/DeepCoMP/blob/master/deepcomp/env/single_ue/base.py#L383
This is later logged through a callback in TensorBoard and automatically written to CSV by RLlib.

To keep this framework-independent in mobile-env, I suggest also collecting the metrics in the info dict and then writing them directly to CSV in Python.
This should be disabled by default and only enabled for testing.

@stwerner97 What do you think? Do you have any suggestions on how to integrate this in an easy, yet generic way?
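
A minimal, framework-independent sketch of such CSV logging (a hypothetical helper using only the standard library, not existing mobile-env code; it assumes the per-step info dicts have been collected in a list during evaluation):

import csv

def write_metrics_csv(info_history, path="metrics.csv"):
    """Write a list of per-step info dicts (collected during evaluation) to a CSV file."""
    # take the union of all keys so that steps with missing metrics still fit the header
    fieldnames = sorted({key for info in info_history for key in info})
    with open(path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(info_history)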

Limit mobile-env to single-cell selection

Hello @stefanbschneider, I have a few questions regarding mobile-env.

  1. For mobile-env, is it possible to set the number of users per base station instead of having random users?
  2. In this environment, we assume that each UE can connect to multiple BSs, right? Is it possible to change the settings so that each UE can only connect to 1 BS?

Feature Request: migrate from gym to gymnasium

Hi, this repository is currently listed in the gymnasium third party environments but we are cleaning the list up to only include maintained gymnasium-compatible repositories.

Would it be possible for it to be upgraded from gym to gymnasium? Gymnasium is the maintained version of openai gym and is compatible with current RL training libraries (rllib and tianshou have already migrated, and stable-baselines3 will soon).

For information about upgrading and compatibility, see migration guide and gym compatibility. The main difference is the API has switched to returning truncated and terminated, rather than done, in order to give more information and mitigate edge case issues.
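
For reference, the change mentioned above looks roughly like this when applied to one of mobile-env's registered scenarios (a minimal sketch; the old-API lines are shown as comments only):

import gymnasium
import mobile_env  # registers the mobile-env scenarios

env = gymnasium.make("mobile-small-central-v0")

# gym (old):       obs = env.reset()
# gymnasium (new): reset() additionally returns an info dict
obs, info = env.reset()

# gym (old):       obs, reward, done, info = env.step(action)
# gymnasium (new): done is split into terminated (episode ended) and truncated (e.g., time limit)
obs, reward, terminated, truncated, info = env.step(env.action_space.sample())
done = terminated or truncated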

RLlib resource error in demo notebook

Dear Stefan Schneider:

I have a question regarding the Colab notebook you uploaded on GitHub, specifically the last section, "Multi-agent RL with Ray RLlib." When I ran "analysis = ray.tune.run("PPO", config=config, local_dir="results_rllib", stop={"timesteps_total": 30000}, checkpoint_at_end=True)", I got the following warning:

"2023-04-28 16:05:59,264 WARNING insufficient_resources_manager.py:128 -- Ignore this message if the cluster is autoscaling. You asked for 3.0 cpu and 0 gpu per trial, but the cluster only has 1.0 cpu and 0 gpu. Stop the tuning job and adjust the resources requested per trial (possibly via resources_per_trial or via num_workers for rllib) and/or add more resources to your Ray runtime. "

I am wondering why. I think it is because of the default settings for CPUs and GPUs. I tried to fix it by adding num_workers=0 inside the config, but then I got an error that the env mobile-small-ma-v0 cannot be found.

Thank you & Using NS3 or OMNeT++?

Such a great study.
Thank you so much for your contribution to this field.
Is it possible to use this work with other network simulators such as NS3 or OMNeT++?
If it is possible, can you give us advice on how to start?

Regards,
Serap

For custom env

Hello, @stefanbschneider @stwerner97

I ran into a problem when defining a new custom environment. Here is my code:

class Env1(MComCore):
    # overwrite some of the default settings
    @classmethod
    def default_config(cls):
        config = super().default_config()
        # update seed 
        config.update({
            "width": 200,
            "height": 200,
            "EP_MAX_TIME": 100,
            "seed": 68,
            "reset_rng_episode": True  
        })      
        return config
    
    # configure users and cells in the constructor
    def __init__(self, config={}, render_mode=None):
        # load default config defined above; overwrite with custom params
        env_config = self.default_config()
        env_config.update(config)

        # two cells next to each other; unpack config defaults for other params
        stations = [
            BaseStation(bs_id=0, pos=(110, 130), **env_config["bs"]),
            BaseStation(bs_id=1, pos=(65, 80), **env_config["bs"]),
            BaseStation(bs_id=2, pos=(120, 30), **env_config["bs"]),

        ]

        # users
        users = [
            # two fast moving users with config defaults
            UserEquipment(ue_id=1, **env_config["ue"]),
            UserEquipment(ue_id=2, **env_config["ue"]),
            UserEquipment(ue_id=3, **env_config["ue"]),
            UserEquipment(ue_id=4, **env_config["ue"]),
            UserEquipment(ue_id=5, **env_config["ue"]),
        ]

        super().__init__(stations, users, config, render_mode)       

import gymnasium
from ray.tune.registry import register_env

# use the mobile-env RLlib wrapper for RLlib
def register(config):
    # importing mobile_env registers the included environments
    from mobile_env.wrappers.multi_agent import RLlibMAWrapper

    env = Env1(config={"seed": 68}, render_mode="rgb_array")
    return RLlibMAWrapper(env)

# register the predefined scenario with RLlib
register_env("mobile-small-ma-v0", register)

import ray

# init ray with available CPUs (and GPUs)
ray.init(
  num_cpus=5,   # change to your available number of CPUs
  include_dashboard=False,
  ignore_reinit_error=True,
  log_to_driver=False,
)

import ray.air
from ray.rllib.algorithms.ppo import PPOConfig

from ray.rllib.policy.policy import PolicySpec
from ray.tune.stopper import MaximumIterationStopper

# Create an RLlib config using multi-agent PPO on mobile-env's small scenario.
config = (
    PPOConfig()
    .environment(env="mobile-small-ma-v0")
    # Here, we configure all agents to share the same policy.
    .multi_agent(
        policies={"shared_policy": PolicySpec()},
        policy_mapping_fn=lambda agent_id, episode, worker, **kwargs: "shared_policy",
    )
    # RLlib needs 1 more CPU than configured below (for the driver/trainer?)
    .resources(num_cpus_per_worker=4)
    .rollouts(num_rollout_workers=1)
)

# Create the Trainer/Tuner and define how long to train
tuner = ray.tune.Tuner(
    "PPO",
    run_config=ray.air.RunConfig(
        # Save the training progress and checkpoints locally under the specified subfolder.
        storage_path="./results",
        # Control training length by setting the number of iterations. 1 iter = 4000 time steps by default.
        stop=MaximumIterationStopper(max_iter=5),
        checkpoint_config=ray.air.CheckpointConfig(checkpoint_at_end=True),
    ),
    param_space=config,
)

# Run training and save the result
result_grid = tuner.fit()

So when I try to override the original env MComCore, I get the following error:

2023-08-07 21:58:28,868	ERROR tune_controller.py:873 -- Trial task failed for trial PPO_mobile-small-ma-v0_1d01d_00000
Traceback (most recent call last):
  File "c:\Users\18406\anaconda3\envs\rayenv\lib\site-packages\ray\air\execution\_internal\event_manager.py", line 110, in resolve_future
    result = ray.get(future)
  File "c:\Users\18406\anaconda3\envs\rayenv\lib\site-packages\ray\_private\auto_init_hook.py", line 18, in auto_init_wrapper
    return fn(*args, **kwargs)
  File "c:\Users\18406\anaconda3\envs\rayenv\lib\site-packages\ray\_private\client_mode_hook.py", line 103, in wrapper
    return func(*args, **kwargs)
  File "c:\Users\18406\anaconda3\envs\rayenv\lib\site-packages\ray\_private\worker.py", line 2540, in get
    raise value.as_instanceof_cause()
ray.exceptions.RayTaskError(AttributeError): ray::PPO.train() (pid=32848, ip=127.0.0.1, actor_id=de081499b53ab9f1b929d70b01000000, repr=PPO)
  File "python\ray\_raylet.pyx", line 1434, in ray._raylet.execute_task
  File "python\ray\_raylet.pyx", line 1438, in ray._raylet.execute_task
  File "python\ray\_raylet.pyx", line 1378, in ray._raylet.execute_task.function_executor
  File "c:\Users\18406\anaconda3\envs\rayenv\lib\site-packages\ray\_private\function_manager.py", line 724, in actor_method_executor
    return method(__ray_actor, *args, **kwargs)
  File "c:\Users\18406\anaconda3\envs\rayenv\lib\site-packages\ray\util\tracing\tracing_helper.py", line 464, in _resume_span
    return method(self, *_args, **_kwargs)
  File "c:\Users\18406\anaconda3\envs\rayenv\lib\site-packages\ray\tune\trainable\trainable.py", line 389, in train
    raise skipped from exception_cause(skipped)
  File "c:\Users\18406\anaconda3\envs\rayenv\lib\site-packages\ray\tune\trainable\trainable.py", line 386, in train
    result = self.step()
  File "c:\Users\18406\anaconda3\envs\rayenv\lib\site-packages\ray\util\tracing\tracing_helper.py", line 464, in _resume_span
    return method(self, *_args, **_kwargs)
  File "c:\Users\18406\anaconda3\envs\rayenv\lib\site-packages\ray\rllib\algorithms\algorithm.py", line 803, in step
    results, train_iter_ctx = self._run_one_training_iteration()
  File "c:\Users\18406\anaconda3\envs\rayenv\lib\site-packages\ray\util\tracing\tracing_helper.py", line 464, in _resume_span
    return method(self, *_args, **_kwargs)
  File "c:\Users\18406\anaconda3\envs\rayenv\lib\site-packages\ray\rllib\algorithms\algorithm.py", line 2853, in _run_one_training_iteration
    results = self.training_step()
  File "c:\Users\18406\anaconda3\envs\rayenv\lib\site-packages\ray\util\tracing\tracing_helper.py", line 464, in _resume_span
    return method(self, *_args, **_kwargs)
  File "c:\Users\18406\anaconda3\envs\rayenv\lib\site-packages\ray\rllib\algorithms\ppo\ppo.py", line 403, in training_step
    train_batch = synchronous_parallel_sample(
  File "c:\Users\18406\anaconda3\envs\rayenv\lib\site-packages\ray\rllib\execution\rollout_ops.py", line 85, in synchronous_parallel_sample
    sample_batches = worker_set.foreach_worker(
  File "c:\Users\18406\anaconda3\envs\rayenv\lib\site-packages\ray\rllib\evaluation\worker_set.py", line 722, in foreach_worker
    handle_remote_call_result_errors(remote_results, self._ignore_worker_failures)
  File "c:\Users\18406\anaconda3\envs\rayenv\lib\site-packages\ray\rllib\evaluation\worker_set.py", line 75, in handle_remote_call_result_errors
    raise r.get()
ray.exceptions.RayTaskError(AttributeError): ray::RolloutWorker.apply() (pid=34104, ip=127.0.0.1, actor_id=87a496b9b1fd3ab0e62cb5aa01000000, repr=<ray.rllib.evaluation.rollout_worker.RolloutWorker object at 0x000001AA097D0100>)
  File "python\ray\_raylet.pyx", line 1434, in ray._raylet.execute_task
  File "python\ray\_raylet.pyx", line 1438, in ray._raylet.execute_task
  File "python\ray\_raylet.pyx", line 1378, in ray._raylet.execute_task.function_executor
  File "c:\Users\18406\anaconda3\envs\rayenv\lib\site-packages\ray\_private\function_manager.py", line 724, in actor_method_executor
    return method(__ray_actor, *args, **kwargs)
  File "c:\Users\18406\anaconda3\envs\rayenv\lib\site-packages\ray\util\tracing\tracing_helper.py", line 464, in _resume_span
    return method(self, *_args, **_kwargs)
  File "c:\Users\18406\anaconda3\envs\rayenv\lib\site-packages\ray\rllib\utils\actor_manager.py", line 185, in apply
    raise e
  File "c:\Users\18406\anaconda3\envs\rayenv\lib\site-packages\ray\rllib\utils\actor_manager.py", line 176, in apply
    return func(self, *args, **kwargs)
  File "c:\Users\18406\anaconda3\envs\rayenv\lib\site-packages\ray\rllib\execution\rollout_ops.py", line 86, in <lambda>
    lambda w: w.sample(), local_worker=False, healthy_only=True
  File "c:\Users\18406\anaconda3\envs\rayenv\lib\site-packages\ray\util\tracing\tracing_helper.py", line 464, in _resume_span
    return method(self, *_args, **_kwargs)
  File "c:\Users\18406\anaconda3\envs\rayenv\lib\site-packages\ray\rllib\evaluation\rollout_worker.py", line 915, in sample
    batches = [self.input_reader.next()]
  File "c:\Users\18406\anaconda3\envs\rayenv\lib\site-packages\ray\rllib\evaluation\sampler.py", line 92, in next
    batches = [self.get_data()]
  File "c:\Users\18406\anaconda3\envs\rayenv\lib\site-packages\ray\rllib\evaluation\sampler.py", line 277, in get_data
    item = next(self._env_runner)
  File "c:\Users\18406\anaconda3\envs\rayenv\lib\site-packages\ray\rllib\evaluation\env_runner_v2.py", line 323, in run
    outputs = self.step()
  File "c:\Users\18406\anaconda3\envs\rayenv\lib\site-packages\ray\rllib\evaluation\env_runner_v2.py", line 342, in step
    ) = self._base_env.poll()
  File "c:\Users\18406\anaconda3\envs\rayenv\lib\site-packages\ray\rllib\env\multi_agent_env.py", line 633, in poll
    ) = env_state.poll()
  File "c:\Users\18406\anaconda3\envs\rayenv\lib\site-packages\ray\rllib\env\multi_agent_env.py", line 828, in poll
    self.reset()
  File "c:\Users\18406\anaconda3\envs\rayenv\lib\site-packages\ray\rllib\env\multi_agent_env.py", line 912, in reset
    raise e
  File "c:\Users\18406\anaconda3\envs\rayenv\lib\site-packages\ray\rllib\env\multi_agent_env.py", line 906, in reset
    obs_and_infos = self.env.reset(seed=seed, options=options)
  File "c:\Users\18406\anaconda3\envs\rayenv\lib\site-packages\mobile_env\wrappers\multi_agent.py", line 34, in reset
    self.prev_step_ues = set(obs.keys())
AttributeError: 'numpy.ndarray' object has no attribute 'keys'

But if I do not override it, in other words, if I use the default multi-agent env mobile-small-ma-v0, this bug does not happen. I am wondering why?

Thanks for replying in advance.

Action space definition

Hi,
I am reading your project and want to ask about the action definitions of the agents and the action space size. I found that you are using MultiDiscrete for the action space in your environment. Can I customize it to another type such as Discrete, and what does each dimension in the MultiDiscrete action vector stand for?

BR
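
For context, this is how the two Gymnasium space types differ in general; the sizes below are made up purely for illustration, and the actual dimensions of mobile-env's action space depend on the configured scenario:

from gymnasium import spaces

# MultiDiscrete: a vector of independent categorical sub-actions,
# e.g., three sub-actions with four choices each (sizes are hypothetical)
multi = spaces.MultiDiscrete([4, 4, 4])
print(multi.sample())   # e.g., array([2, 0, 3])

# Discrete: a single categorical action with n choices
single = spaces.Discrete(4)
print(single.sample())  # e.g., 1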

env.load() halts

I downloaded canonical Wiki-How, but the load() function never returns after creating a virtual device. Also, ResNet() is not found, so I had to comment it out.

import android_env
from android_env.components.tools.easyocr_wrapper import EasyOCRWrapper
from android_env.components.coordinator import EventCheckControl

import os
os.environ["TOKENIZERS_PARALLELISM"] = "false"

#paths and parameters:
task_path = (..)
avd_name = 'Pixel 2 API 30'
android_avd_home = (...)
android_sdk_root = (...)
emulator_path = (...)
adb_path = (...)

env = android_env.load( task_path=task_path
, avd_name=avd_name
, android_avd_home=android_avd_home
, android_sdk_root=android_sdk_root
, emulator_path=emulator_path
, adb_path=adb_path
, run_headless=True
, mitm_config=None
, start_token_mark=""
, non_start_token_mark="##"
, special_token_pattern = r"[\w+]"
#, unify_vocabulary="vocab.txt"
, text_model=EasyOCRWrapper()
#, icon_model=ResNet()
#, icon_model=EasyOCRWrapper()
, with_view_hierarchy=False
, coordinator_args={ "vh_check_control_method": EventCheckControl.LIFT
, "vh_check_control_value": 3.
, "screen_check_control_method": EventCheckControl.LIFT
, "screen_check_control_value": 1.
}
)

Allocate Physical Cell Identifier (PCI) to BS

Hello,
I need to allocate PCIs (using RL algorithms) to BSs. I saw the example implementation of multi_agent.py in handlers. Is it possible to have more BSs than the number of PCIs and use RL algorithms to allocate PCIs to BSs, considering a collision- and confusion-free network?
