mpschrader / gym-sokoban

Sokoban environment for OpenAI Gym

License: MIT License

Python 99.77% Shell 0.23%

environment gym openai python reinforcement-learning sokoban

gym-sokoban's Issues

pip installation not working

Python can't import the pip installation

ModuleNotFoundError: No module named 'gym_sokoban'

maybe a typo?

Feature Request: upgrade from gym to gymnasium

Hi, this repository is currently listed in the gymnasium third party environments but we are cleaning the list up to only include maintained gymnasium-compatible repositories.

Would it be possible to upgrade it from gym to gymnasium? Gymnasium is the maintained version of OpenAI Gym and is compatible with current RL training libraries (RLlib and Tianshou have already migrated, and Stable-Baselines3 will soon).

For information about upgrading and compatibility, see migration guide and gym compatibility. The main difference is the API has switched to returning truncated and terminated, rather than done, in order to give more information and mitigate edge case issues. The documentation explains how to easily convert your code.
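
As a rough illustration of the API difference described above, an old four-value step result can be mapped onto the five-value form. This is a minimal sketch, not the official migration path: the helper name `to_five_tuple` is hypothetical, and it assumes a time-limit cutoff is flagged by the `TimeLimit.truncated` key that Gym's `TimeLimit` wrapper writes into `info`.

```python
def to_five_tuple(step_result):
    """Convert an old-style (obs, reward, done, info) result to the
    (obs, reward, terminated, truncated, info) form."""
    obs, reward, done, info = step_result
    # Assumption: a time-limit cutoff is marked in info by the TimeLimit wrapper.
    truncated = bool(info.get("TimeLimit.truncated", False))
    # An episode that ended without a cutoff counts as a real termination.
    terminated = bool(done) and not truncated
    return obs, reward, terminated, truncated, info
```

An environment that reports a step cap only through `done`, without that `info` key, would show up as `terminated` here, which is one of the edge cases the new API is meant to remove.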

How do I forgo procedural level generation and instead overwrite .reset() to pick from a directory of levels?
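
One possible starting point for this, sketched under stated assumptions: read a randomly chosen level file from a directory and parse it into a grid. The characters are the common Sokoban text symbols; the integer codes, the `.txt` extension, and the helper names `parse_level` and `pick_level` are hypothetical and may not match gym-sokoban's internal encoding. An overridden `.reset()` would still have to copy the grid into the environment's own state, which is not shown.

```python
import os
import random

# Common Sokoban text symbols. The integer codes are an assumption made
# for illustration, not necessarily gym-sokoban's internal encoding.
SYMBOLS = {"#": 0, " ": 1, ".": 2, "*": 3, "$": 4, "@": 5, "+": 6}

def parse_level(text):
    """Turn an ASCII level into a rectangular grid of integer codes."""
    rows = [line for line in text.splitlines() if line.strip()]
    width = max(len(row) for row in rows)
    # Pad short rows with floor so every row has the same length.
    return [[SYMBOLS[ch] for ch in row.ljust(width)] for row in rows]

def pick_level(directory, rng=random):
    """Read one randomly chosen .txt level file from a directory."""
    names = sorted(n for n in os.listdir(directory) if n.endswith(".txt"))
    with open(os.path.join(directory, rng.choice(names))) as handle:
        return parse_level(handle.read())
```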

Integrate boxoban level

DeepMind recently published 1,000,000 pre-generated levels (Repo). We would love to integrate these levels as a new variation.

Can I always assume there will be a frame?

Hey,
Can I assume that every board has a frame, namely that the leftmost and rightmost columns and the top and bottom rows are all walls?
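
Rather than relying on the assumption, the property can be checked directly on a room layout. The following is a minimal sketch: it assumes the layout is available as a rectangular 2D list and that walls are encoded as `0`; both the encoding and the helper name `has_wall_frame` are illustrative assumptions.

```python
def has_wall_frame(grid, wall=0):
    """Return True if every border cell of a rectangular grid is a wall."""
    if not grid or not grid[0]:
        return False
    # First and last rows must consist of walls only.
    top_bottom = all(c == wall for c in grid[0]) and all(c == wall for c in grid[-1])
    # First and last cell of every row must be walls.
    sides = all(row[0] == wall and row[-1] == wall for row in grid)
    return top_bottom and sides
```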

Refactoring: TinyWorld as rendering mode, not extra environments

It would be great to refactor the TinyWorld environments to not be their own classes, but to be new rendering modes.

This means two new rendering modes should be introduced:

tiny_rgb_array
tiny_human

env.step(action) returns done=True even when the target is not reached

Hi, I tried running Sokoban with a maximum of 150 steps, but the env files default max_steps to 120, and done is set to True even when the target has not been reached.

Is there a way to set max_steps to a higher value in the env? Is there a function to call or a value to set?
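
The behaviour described is what a termination rule with a step cap would produce: the episode ends either when the puzzle is solved or when the step budget runs out. A minimal sketch of such a rule follows; it is an assumption about the logic, not the library's actual code, and the function name `episode_done` is hypothetical.

```python
def episode_done(boxes_on_target, num_boxes, steps_taken, max_steps=120):
    """Sketch of a termination rule with a step cap (default 120, as reported)."""
    solved = boxes_on_target == num_boxes
    out_of_steps = steps_taken >= max_steps
    # Either condition ends the episode, so done=True does not imply success.
    return solved or out_of_steps
```

The constructor signature quoted in a traceback further down this page, `SokobanEnv.__init__(self, dim_room, max_steps, num_boxes, num_gen_steps, reset)`, suggests that a larger cap can be supplied as `max_steps` when the environment is constructed. Whether `gym.make` forwards that argument depends on the installed version.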

max time steps

Hi!
I would like to know if there is any way of modifying the max time steps so that the environment doesn't disappear early.

Add new variations of Sokoban

Possible variations:

Two Player
Push and Pull
Box Specific Targets

Add NOOP action

As for many gym environments, action 0 should map to NOOP (no operation = no movement for a step of the environment).
Some common environment wrappers and agents rely on this convention, notably in the paper for which DeepMind released the boxoban-levels:
https://deepmind.com/research/publications/investigation-model-free-planning/
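
A minimal sketch of the index shift this request implies, assuming the existing actions keep their order and simply move up by one to make room for NOOP at 0. The helper name `shift_for_noop` is hypothetical.

```python
NOOP = 0

def shift_for_noop(action):
    """Map an action from a NOOP-first action space onto the original indices.

    Returns None for NOOP, otherwise the original action index (action - 1).
    """
    if action == NOOP:
        return None
    return action - 1
```

A wrapper built on this would skip the movement logic whenever the result is `None`. Whether the step penalty should still apply on a NOOP is a separate design decision.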

Add Social Preview Image

Cannot run the example

Context:

Install gym-sokoban from the source:
pip install -e .
Running the following code in a Jupyter notebook:

import gym 
import gym_sokoban

env = gym.make('Sokoban-v0')
#env.reset()

env.render(mode='human')

action = env.action_space.sample()
observation, reward, done, info = env.step(action)

Error:

---------------------------------------------------------------------------
FileNotFoundError                         Traceback (most recent call last)
Cell In[9], line 1
----> 1 env = gym.make('Sokoban-v0')
      2 #env.reset()
      4 env.render(mode='human')

File ~/miniconda3/envs/llama3/lib/python3.9/site-packages/gym/envs/registration.py:640, in make(id, max_episode_steps, autoreset, apply_api_compatibility, disable_env_checker, **kwargs)
    637     render_mode = None
    639 try:
--> 640     env = env_creator(**_kwargs)
    641 except TypeError as e:
    642     if (
    643         str(e).find("got an unexpected keyword argument 'render_mode'") >= 0
    644         and apply_human_rendering
    645     ):

File ~/miniconda3/envs/llama3/lib/python3.9/site-packages/gym_sokoban/envs/sokoban_env_variations.py:14, in __init__(self)

File ~/miniconda3/envs/llama3/lib/python3.9/site-packages/gym_sokoban/envs/sokoban_env.py:48, in SokobanEnv.__init__(self, dim_room, max_steps, num_boxes, num_gen_steps, reset)
     44 self.observation_space = Box(low=0, high=255, shape=(screen_height, screen_width, 3), dtype=np.uint8)
     46 if reset:
     47     # Initialize Room
---> 48     _ = self.reset()
...
    408 else:
    409     # Writing: check that the directory to write to does exist
    410     dn = os.path.dirname(fn)

FileNotFoundError: No such file: '/home/xy/miniconda3/envs/llama3/lib/python3.9/site-packages/gym_sokoban/envs/surface/box.png'

After checking the site-packages folder, I found that there is no gym_sokoban folder, only an egg-link file:

gym-sokoban.egg-link

I am not sure how I can fix the above issue and get it running.

cannot re-register id

If I try to run a script twice which imports "gym_sokoban", I get the error "cannot re-register id: Sokoban-v0"

I have to restart my kernel to be able to run it again.

Any suggestions?
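
One workaround sometimes used for this kind of error is to remove the already-registered ids before the registration code runs again. The sketch below works on any dict-like registry. Where Gym keeps that mapping depends on the installed version (older releases are assumed to expose it as `gym.envs.registry.env_specs`), so treat the location as an assumption to verify. The helper name `drop_registered` is hypothetical.

```python
def drop_registered(registry, needle):
    """Remove every registry entry whose id contains `needle`.

    `registry` is any dict-like mapping from environment id to spec.
    Returns the list of removed ids.
    """
    removed = [env_id for env_id in list(registry) if needle in env_id]
    for env_id in removed:
        del registry[env_id]
    return removed
```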

Publish to PyPI

Possible Instructions:
https://medium.com/@joel.barmettler/how-to-upload-your-python-package-to-pypi-65edc5fe9c56

Issue running env.render(mode='human')

Hello! Firstly, thanks for creating this repo and the Sokoban package!

I'm running into some initial issues while trying to sanity-check the install and environment. When I try to run Random_Sampling.py, I get the error
"ImportError: cannot import name 'rendering' from 'gym.envs.classic_control'", raised on the line env.render(mode='human').

Any thoughts on what might be causing the issue or how I could fix it?

Thanks!

Long import / build time

How can we make the package load and the environment start up more quickly?

There are some nested for loops; could we replace or parallelize them?

Readme file needs updating

The README for the Push and Pull variant of the game lists the Fixed Target room IDs instead of the Push and Pull room IDs:
https://github.com/mpSchrader/gym-sokoban/blob/master/docs/variations/PushAndPull.md

environment reward for each step

I read your code and found that reward = -0.1 + sum_component_rewards. So instead of returning 1 for pushing a box onto a target, the env returns 0.9, and instead of returning 10 for winning, it returns -0.1 + 1 + 10 = 10.9.
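
The arithmetic in this report can be reproduced with a few lines. This is a sketch built only from the three values mentioned above (step penalty -0.1, box-on-target 1, win 10); any other reward components the environment may have are left out, and the names are hypothetical.

```python
STEP_PENALTY = -0.1      # applied on every step
BOX_ON_TARGET = 1.0      # pushing a box onto a target
ALL_BOXES_DONE = 10.0    # finishing the level

def step_reward(pushed_on_target=False, solved=False):
    """Sum the per-step penalty with whichever bonuses apply."""
    reward = STEP_PENALTY
    if pushed_on_target:
        reward += BOX_ON_TARGET
    if solved:
        reward += ALL_BOXES_DONE
    # Round to one decimal to hide floating-point noise in the sum.
    return round(reward, 1)
```

Under these assumptions the final push of a solved level yields the penalty plus both bonuses, which is the 10.9 reported above.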

What is the success rate for generating an environment in Sokoban-v1?

Sometimes the warning "[SOKOBAN] Runtime Error/Warning: Generated Model with score == 0
[SOKOBAN] Retry . . ." appears. This affects the speed of the game.

Starting Rendering always returns not tiny

Update deprecated dependency

scipy.misc.imread has been removed in the latest SciPy.
You can refactor to imageio, as mentioned in the reference.

Runtime issues

Hi,

Thank you for open-sourcing this awesome project!

Is there a mechanism for avoiding runtime issues like:

env = gym.make('TinyWorld-Sokoban-small-v0')  # Error
# RuntimeError: Not enough free spots (#3) to place 1 player and 2 boxes.

A similar issue can occur when we reset the environment:

env = gym.make('TinyWorld-Sokoban-small-v0')
for _ in range(int(1e3)):
    env.reset()
# RuntimeError: Not enough free spots (#3) to place 1 player and 2 boxes.
# or
# RuntimeWarning: Generated Model with score == 0

A simple workaround for the issue with reset could be:

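def callback_sokoban_reset(f, max_retries=100):
    # Retry reset in a loop (not recursively) so repeated failures cannot overflow the stack.
    def callback(*args, **kwargs):
        for _ in range(max_retries):
            try:
                return f(*args, **kwargs)
            except (RuntimeWarning, RuntimeError):
                # RuntimeWarning is caught only when raised as an exception, not when issued via warnings.warn.
                print("[SOKOBAN] Runtime error retry . . .")
        raise RuntimeError("reset still failing after %d retries" % max_retries)
    return callback

env = gym.make('TinyWorld-Sokoban-small-v0')
env.reset = callback_sokoban_reset(env.reset)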

There may be a more efficient way of dealing with these issues, though.

Number of states of the env

How can I get the number of states of the env? I am trying to use policy iteration and value iteration to solve it!
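
The issue does not say whether the environment exposes such a count. For a single fixed room, a rough upper bound can be computed combinatorially: choose which free cells hold the (interchangeable) boxes, then place the player on any remaining free cell. This is a sketch under those assumptions; it ignores reachability, so the true number of reachable states is usually smaller. The function name `state_upper_bound` is hypothetical.

```python
from math import comb

def state_upper_bound(free_cells, num_boxes):
    """Upper bound on distinct (boxes, player) placements in one room.

    free_cells counts every non-wall cell, including targets.
    """
    if num_boxes >= free_cells:
        return 0  # no cell left for the player
    box_layouts = comb(free_cells, num_boxes)
    player_spots = free_cells - num_boxes
    return box_layouts * player_spots
```

Because rooms are generated procedurally, the state space across all possible rooms is far larger than this per-room bound, which matters for tabular methods such as policy iteration and value iteration.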

check, if game is done

Hi,
Could it be that the function "_check_if_done()" in sokoban_env.py does not consider self.max_steps when evaluating whether a game is done?

New example with image storage and gif generation

Create an example that allows recording a game.

Baseline agent for Sokoban

Hello,

Is there any RL baseline configuration for the Sokoban gym environment? I want to compare against a working learning agent on this game to see whether my approach is doing well enough.

Thank you.

Reward calculation

Hi,
the function _calc_reward() in sokoban_env.py is only called when the action is < 4. So if I start training and the actions are >= 4, the penalty for steps is 0 until the chosen action is < 4 for the first time. I don't know if this has any effect in the long run (e.g. whether rewards are calculated correctly in general).

Pip version doesn't support keyword argument (any kwargs)

The pip version of gym-sokoban currently doesn't support kwargs. It took me an hour to debug this by looking at the newest version of the code on GitHub.
TypeError: __init__() got an unexpected keyword argument 'dim_room'

Success Rate?

Hello

I am amazed by your work. Have you tested the Sokoban game with standard RL methods (Q-learning, A2C, etc.), and do you have a success rate for this kind of game?

Can the optimal policy for Sokoban be retrieved from the reverse playing?

I am not really familiar with the code in this Github repository, in particular, with the code that generates the rooms. However, I was told that the optimal policy for each room can be generated from the "reverse playing" algorithm that is used to ensure that the rooms are solvable. So, is this true? If yes, what's the easiest way to do it? (Of course, if I get familiar with the code, I will be able to answer this question, but if you can immediately answer it, that would save me some time: I am looking for some environments where the optimal policy is known, so I was wondering if I could use your code to compute the optimal policy).
