lijunsun90 / pursuitfsc2

POMG algorithm for large-scale pursuit game with partial observation and no communication.

License: MIT License

Python 100.00%

large-scale multi-agent-cooperation partially-observable-markov-model predator-prey pursuit pursuit-evasion self-organization no-communication

pursuitfsc2's Introduction

Fuzzy self-organizing cooperative coevolution (FSC2) for multi-target self-organizing pursuit

Official code for the paper "Toward multi-target self-organizing pursuit in a partially observable Markov game", which has been submitted to arXiv and to a journal for peer review.

Please cite it as follows:

Sun, L., Chang, Y.C., Lyu, C., Shi, Y., Shi, Y. and Lin, C.T., 2022. Toward multi-target self-organizing pursuit in a partially observable Markov game. arXiv preprint arXiv:2206.12330.

Description

In the proposed FSC2, the multi-target self-organizing pursuit (MTSOP or SOP) problem is decomposed into three subtasks: fuzzy-based distributed task allocation (DTA), self-organizing search (SOS), and single-target pursuit (STP).

The MTSOP, i.e., the proposed FSC2 algorithm, is in the folder multi_target_self_organizing_pursuit.
The SOS task is trained and tested in the folder multi_target_self_organizing_search.
The proposed global distributed consistency (DC) metric in task allocation is in the folder fuzzy_clustering_metric.

Dependencies tested on

Python 3.7.11

numpy 1.19.1

torch 1.10.2

mpi4py 3.1.3

To run the comparison code of ApeX-DQN, additional dependencies are:

tensorflow 1.15.0

ray 1.10.0

To run the comparison code of MADDPG, additional dependencies are:

https://github.com/openai/multiagent-particle-envs

All the dependencies are listed in the file environment_for_mtsop_fsc2.yml.
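
A quick way to compare an existing environment against the versions listed above is sketched below. The helper report_versions is hypothetical (it is not part of the repository), and it assumes each package exposes a __version__ attribute, which holds for numpy, torch, and mpi4py.

```python
# Versions listed above as "tested on".
TESTED = {"numpy": "1.19.1", "torch": "1.10.2", "mpi4py": "3.1.3"}


def report_versions(tested=TESTED):
    """Return {package: (tested_version, installed_version or None)}.

    Hypothetical helper: a missing package is reported as None instead of
    raising, so the whole list can be checked in one pass.
    """
    report = {}
    for name, wanted in tested.items():
        try:
            module = __import__(name)
            installed = getattr(module, "__version__", None)
        except ImportError:
            installed = None
        report[name] = (wanted, installed)
    return report


if __name__ == "__main__":
    for name, (wanted, installed) in report_versions().items():
        status = "ok" if installed == wanted else "differs"
        print(f"{name}: tested {wanted}, installed {installed} ({status})")
```

A version that differs from the tested one is not necessarily a problem; the report only shows where an environment departs from the configuration the code was tested on.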

Acknowledgements:

The actor-critic code is mostly taken from, and modified based on,

https://github.com/openai/spinningup/tree/master/spinup/algos/pytorch/vpg

The ApeX-DQN codes for comparison are from

The MADDPG codes for comparison are from

pursuitfsc2's Issues

I wonder whether this collision detection is equivalent to setting all values of all_agents_positions_matrix to 0

collisions = [all_agents_positions_matrix[self.pursuer_layer.get_position(agent_id)[0],
                                          self.pursuer_layer.get_position(agent_id)[1]] - 1
              for agent_id in range(self.n_pursuers)]

def get_collision_reward(self):
    """
    Added in Version-SOP.
    """
    pursuer_position_matrix = self.pursuer_layer.get_state_matrix()

    if self.surround:
        # No collisions are allowed
        evader_position_matrix = self.evader_layer.get_state_matrix()
        all_agents_positions_matrix = pursuer_position_matrix + evader_position_matrix
    else:
        # Collisions with evaders are allowed, while that with other pursuers are prevented.
        all_agents_positions_matrix = pursuer_position_matrix

    collisions = [all_agents_positions_matrix[self.pursuer_layer.get_position(agent_id)[0],
                                              self.pursuer_layer.get_position(agent_id)[1]] - 1
                  for agent_id in range(self.n_pursuers)]

    self.n_agents_collide_with_others_per_multiagent_step = np.count_nonzero(collisions)
    self.n_collision_events_per_multiagent_step = np.count_nonzero(np.maximum(all_agents_positions_matrix - 1, 0))

    collision_rewards = self.collision_reward * np.asarray(collisions)

    self.n_collision_with_obstacles = np.sum([pursuer.collide_with_obstacle for pursuer in self.pursuers])

    return collision_rewards
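
To see what the indexing expression does, here is a toy sketch on a 4x4 grid. It assumes that get_state_matrix() returns a per-cell count of agents (an assumption, not confirmed by the code shown here); the grid size and positions are made up for illustration.

```python
import numpy as np

# Hypothetical pursuer positions; two pursuers share cell (2, 1).
positions = [(0, 0), (2, 1), (2, 1), (3, 3)]

# Assumed form of the state matrix: number of agents in each cell.
occupancy = np.zeros((4, 4), dtype=int)
for row, col in positions:
    occupancy[row, col] += 1

# Same pattern as in get_collision_reward: read the count at each pursuer's
# own cell and subtract 1 for the pursuer itself. The matrix is only read.
collisions = [occupancy[row, col] - 1 for row, col in positions]

n_agents_colliding = np.count_nonzero(collisions)
n_collision_events = np.count_nonzero(np.maximum(occupancy - 1, 0))

print(collisions, n_agents_colliding, n_collision_events)
```

Under that assumption, the expression does not write to the matrix. It yields one value per pursuer, which is zero only when the pursuer is alone in its cell.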

Multi-agent search stops at the edge of the map

When my agent searches or runs away, it hits the edge of the map and cannot move any further.

Questions about experiment results

Hi!
I noticed that you have an experiment log file: /data/stp_rl_pursuer_6_6_4_1_.txt.

This is the single-target pursuit setup: world_size = 6, n_pursuers = 4, and n_evaders = 1. May I ask:

How many epochs do you recommend for training? The default, 1e5?
What is the fov_scope in this scenario?
What is the difference between stp_6_6_4_1_.txt and sop_40_40_4_1_.txt, other than the larger map?

Thank you for your help!

train_sos_apexdqn.py is always stuck in the PENDING state

The status output always looks like this:
== Status ==
Current time: 2024-05-27 21:24:51 (running for 00:00:10.13)
Memory usage on this node: 11.9/15.7 GiB
Using FIFO scheduling algorithm.
Resources requested: 0/5 CPUs, 0/4 GPUs, 0.0/2.56 GiB heap, 0.0/1.28 GiB objects
Result logdir: C:\Users\Jarvis\Desktop\pursuitFSC2-main\multi_target_self_organizing_search\results\sos_apexdqn\ADQN
Number of trials: 1/1 (1 PENDING)
+--------------------------+----------+-------+
| Trial name | status | loc |
|--------------------------+----------+-------|
| APEX_pursuit_79642_00000 | PENDING | |
+--------------------------+----------+-------+

Cannot find performance_logger.py in the multi_target_self_organizing_pursuit folder

Hi, thank you for this amazing work.

In train_sop_mappo.py and train_sop_mappo_global_state.py, I failed to import PerformanceLogger.

Thank you for your help!

About SOS MADDPG training time

When running train_sos_maddpg.py, the training time after 500 episodes becomes 3 to 10 times longer than before. Have you seen this problem?

train_sop_actor_critic.py and train_sop_mappo.py do not converge

Hi! Sorry for bothering you again.

I have run the experiment several times with the following settings:

world_size=6
fov=11
n_pursuers=4
n_evaders=1
The evader is "DoNothingPrey".
n_epochs=1000
steps_per_epoch=8000
max_episode_length=100
Other RL hyperparameters remain unchanged.
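
To make the scale of this run concrete, the settings above can be collected in one place. This is only a sketch: RunSettings and its field names are hypothetical and may not match the repository's actual argument names, and it assumes each counted step is a single environment step.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunSettings:
    # Hypothetical container; values are copied from the list above.
    world_size: int = 6
    fov: int = 11
    n_pursuers: int = 4
    n_evaders: int = 1
    n_epochs: int = 1000
    steps_per_epoch: int = 8000
    max_episode_length: int = 100

    @property
    def total_steps(self) -> int:
        # Total number of steps collected over the whole run.
        return self.n_epochs * self.steps_per_epoch

    @property
    def min_episodes_per_epoch(self) -> int:
        # Episodes are capped at max_episode_length steps, so an epoch holds
        # at least this many episodes (more if captures end episodes early).
        return self.steps_per_epoch // self.max_episode_length


settings = RunSettings()
print(settings.total_steps, settings.min_episodes_per_epoch)
```

With these values, the run collects 8,000,000 steps in total and at least 80 episodes per epoch.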

Even in this simplest case, the convergence of actor_critic and mappo does not look good, and I cannot tell whether the captures are intentional or just random successes. The statistical results are not good either.

Could you provide the hyperparameters you used for your actor_critic and mappo demos? To share more information, I would appreciate it if we could exchange contact details and discuss further. Here is my email address: [email protected].

Again, thank you and best regards.