How do I record observations after a reset in Stable_Baselines3?


While training with SB3, I want to record every observation returned by a reset.

Based on this issue comment, I decided to use a Monitor wrapper instead of a callback.

However, the Monitor wrapper gives me an error. Here is my code:

import gym
from stable_baselines3 import PPO
from stable_baselines3.common.callbacks import BaseCallback

from stable_baselines3.common.monitor import Monitor

class CustomMonitor(Monitor):
    def __init__(self, env, filename=None, allow_early_resets=True, reset_keywords=(), info_keywords=()):
        super(CustomMonitor, self).__init__(env)
        self.reset_observations = []

    def reset(self, **kwargs):
        observation = super(CustomMonitor, self).reset(**kwargs)
        self.reset_observations.append(observation)
        return observation

env = gym.make('LunarLander-v2')
env = CustomMonitor(env)

model = PPO('MlpPolicy', env, verbose=1)
# Train the model
model.learn(total_timesteps=1000000)

# Save the model
model.save("ppo_lunarlander_mutant")


However, when I run it, I get the following error:

Traceback (most recent call last):
  File "minimal_example.py", line 21, in <module>
    model = PPO('MlpPolicy', env, verbose=1)
  File "/home/thoma/anaconda3/envs/wp/lib/python3.8/site-packages/stable_baselines3/ppo/ppo.py", line 109, in __init__
    super().__init__(
  File "/home/thoma/anaconda3/envs/wp/lib/python3.8/site-packages/stable_baselines3/common/on_policy_algorithm.py", line 85, in __init__
    super().__init__(
  File "/home/thoma/anaconda3/envs/wp/lib/python3.8/site-packages/stable_baselines3/common/base_class.py", line 180, in __init__
    assert isinstance(self.action_space, supported_action_spaces), (
AssertionError: The algorithm only supports (<class 'gymnasium.spaces.box.Box'>, <class 'gymnasium.spaces.discrete.Discrete'>, <class 'gymnasium.spaces.multi_discrete.MultiDiscrete'>, <class 'gymnasium.spaces.multi_binary.MultiBinary'>) as action spaces but Discrete(4) was provided

python reinforcement-learning openai-gym stable-baselines
1 Answer

I should have used gymnasium instead of gym. This should have been evident from the following error:

AssertionError: The algorithm only supports (<class 'gymnasium.spaces.box.Box'>, <class 'gymnasium.spaces.discrete.Discrete'>, <class 'gymnasium.spaces.multi_discrete.MultiDiscrete'>, <class 'gymnasium.spaces.multi_binary.MultiBinary'>) as action spaces but Discrete(4) was provided 
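The message looks contradictory at first, because it lists a Discrete class as supported and then rejects Discrete(4). The supported classes all live under gymnasium.spaces, while an environment created with gym carries a space from gym.spaces, which is a different class with the same name. A minimal sketch of that situation, using hypothetical stand-in classes rather than the real library ones:

```python
def make_space_class(module_name):
    """Build a stand-in 'Discrete' class that reports the given module.

    These are hypothetical stand-ins for illustration only; they are not
    the real gym or gymnasium classes.
    """
    class Discrete:
        def __init__(self, n):
            self.n = n

        def __repr__(self):
            return f"Discrete({self.n})"

    Discrete.__module__ = module_name
    return Discrete


# Two unrelated classes that happen to share a name.
GymDiscrete = make_space_class("gym.spaces.discrete")
GymnasiumDiscrete = make_space_class("gymnasium.spaces.discrete")

action_space = GymDiscrete(4)      # what an env built with `gym` would carry
supported = (GymnasiumDiscrete,)   # what the assertion checks against

# Same name and same repr, but isinstance compares class identity.
is_supported = isinstance(action_space, supported)
print(repr(action_space), is_supported)
```

The repr only shows the class name, so the module mismatch is visible solely in the list of supported classes printed by the assertion.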

Older versions of stable_baselines3 may work with gym, but that needs further investigation.
