Using transformations and custom environment with own rand_action? #2828

Answered by vmoens
TogetherLiving asked this question in Q&A

I have a custom environment based on torchrl.envs.EnvBase. In the corresponding subclass of EnvBase I have defined a specific rand_action method, which does some masking in the action_space.

Now I want to use torchrl.envs.ObservationNorm which supports the initialization of stats (loc, scale) based on a roll-out with a given number of steps. The roll-out in turn calls rand_action.

I now observe that when I apply the transformation to my environment, the resulting environment always has the type torchrl.envs.TransformedEnv, and thus the init_stats method calls the generic rand_action method. This is not what I want.

Subclassing TransformedEnv does not seem to be a feasible path, because when a transformation is applied, the parent environment always ends up being a vanilla TransformedEnv (_EnvPostInit).

Question: how can I make sure that the specific rand_action method of my custom environment is called when running init_stats for an ObservationNorm attached to my base environment?
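To illustrate why the choice of rand_action matters here, the following is a dependency-free toy sketch, not TorchRL code. ToyEnv, MaskedEnv, and the free function init_stats are hypothetical stand-ins, and it assumes that the statistics are simply the mean and standard deviation of the observations gathered during a random rollout.

```python
import random
import statistics


class ToyEnv:
    """Hypothetical stand-in for an env: the observation is the last action."""

    def __init__(self, actions):
        self.actions = list(actions)

    def rand_action(self):
        # Generic sampling: every action is equally likely.
        return random.choice(self.actions)

    def rollout(self, max_steps):
        # A random rollout calls rand_action once per step.
        return [float(self.rand_action()) for _ in range(max_steps)]


class MaskedEnv(ToyEnv):
    """Custom env whose rand_action only samples unmasked actions."""

    def __init__(self, actions, mask):
        super().__init__(actions)
        self.mask = list(mask)

    def rand_action(self):
        allowed = [a for a, ok in zip(self.actions, self.mask) if ok]
        return random.choice(allowed)


def init_stats(env, num_iter):
    """Estimate (loc, scale) from a random rollout (assumed: mean and std)."""
    obs = env.rollout(max_steps=num_iter)
    return statistics.fmean(obs), statistics.pstdev(obs)


generic = ToyEnv([0, 1, 10])
masked = MaskedEnv([0, 1, 10], mask=[True, True, False])
print("generic:", init_stats(generic, 1000))
print("masked: ", init_stats(masked, 1000))
```

In this toy setting, the generic sampler visits the masked-out action 10 and shifts both statistics, which is the effect the custom rand_action is meant to avoid.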

Thanks for raising this!

> this is not what I want

I guess this is the line where the dummy rollout is gathered:

tensordict = parent.rollout(max_steps=num_iter)

In theory the rand_action of your env should be called unless you have a transform that affects the action:

    def rand_action(self, tensordict: TensorDictBase | None = None) -> TensorDict:
        if type(self.base_env).rand_action is not EnvBase.rand_action:
            # TODO: this will fail if the transform modifies the input.
            # For instance, if an env overrides rand_action and we build a
            # env = PendulumEnv().append_transform(ActionDiscretizer(num_intervals=4))
            # env.rand_action will NOT have a discrete action!
            # Getting a discrete action would require coding the inverse transform of an action within
            # ActionDiscretizer (ie, float->int, not int->float).
            # We can loosely check that the action_spec isn't altered - that doesn't mean the action is
            # intact but it covers part of these alterations.
            #
            # The following check may be expensive to run and could be cached.
            if self.full_action_spec != self.base_env.full_action_spec:
                raise RuntimeError(
                    f"The rand_action method from the base env {self.base_env.__class__.__name__} "
                    "has been overwritten, but the transforms appended to the environment modify "
                    "the action. To call the base env rand_action method, we should then invert the "
                    "action transform, which is (in general) not doable. "
                    f"The full action spec of the base env is: {self.base_env.full_action_spec}, \n"
                    f"the full action spec of the transformed env is {self.full_action_spec}."
                )
            return self.base_env.rand_action(tensordict)
        return super().rand_action(tensordict)

Do you have such a transform?
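For readers who want to play with this dispatch in isolation, here is a dependency-free toy model of it. ToyEnv, MaskedEnv, and ToyTransformedEnv are hypothetical stand-ins rather than TorchRL classes, and a plain list stands in for full_action_spec; only the control flow mirrors the method quoted above.

```python
import random


class ToyEnv:
    """Stand-in for EnvBase: samples uniformly from its action spec."""

    def __init__(self, action_spec):
        self.action_spec = list(action_spec)

    def rand_action(self):
        return random.choice(self.action_spec)


class MaskedEnv(ToyEnv):
    """Custom env that overrides rand_action to respect a mask."""

    def __init__(self, action_spec, mask):
        super().__init__(action_spec)
        self.mask = list(mask)

    def rand_action(self):
        allowed = [a for a, ok in zip(self.action_spec, self.mask) if ok]
        return random.choice(allowed)


class ToyTransformedEnv(ToyEnv):
    """Stand-in for TransformedEnv wrapping a base env."""

    def __init__(self, base_env, action_spec=None):
        self.base_env = base_env
        # A transform that alters the action would pass a different spec here.
        spec = base_env.action_spec if action_spec is None else action_spec
        super().__init__(spec)

    def rand_action(self):
        # Did the base env class override the generic rand_action?
        if type(self.base_env).rand_action is not ToyEnv.rand_action:
            # Delegating is only safe if the action spec is unchanged.
            if self.action_spec != self.base_env.action_spec:
                raise RuntimeError("transforms modify the action spec")
            return self.base_env.rand_action()
        return super().rand_action()


env = ToyTransformedEnv(MaskedEnv([0, 1, 2, 3], mask=[True, False, True, False]))
print(sorted({env.rand_action() for _ in range(100)}))
```

Under these assumptions the wrapped env only ever samples the unmasked actions, and passing a different action_spec to ToyTransformedEnv triggers the RuntimeError branch instead.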

> Subclassing TransformedEnv does not seem to be a feasible path, because when a transformation is applied, the parent environment always ends up being a vanilla TransformedEnv (_EnvPostInit).

Ah, I never thought about that. This kind of thing is a bit hard to handle, because you could have a TransformedEnv subclass that is pre-built with a defined transform (e.g. ObsNormEnv(TransformedEnv), which would always have an ObservationNorm transform). Then the parent env of your transform should not be an ObsNormEnv instance!
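A small toy model may help show the problem. It assumes, for illustration only, that the parent of a transform is rebuilt from the base env plus the transforms that come before it; ToyTransformedEnv, ToyObsNormEnv, and parent_of are hypothetical names and not TorchRL API.

```python
class ToyTransformedEnv:
    """Stand-in for TransformedEnv: a base env plus an ordered list of transforms."""

    def __init__(self, base_env, transforms=()):
        self.base_env = base_env
        self.transforms = list(transforms)

    def parent_of(self, index):
        """Env as seen by transform `index`: base env plus earlier transforms only."""
        # Deliberately a vanilla ToyTransformedEnv, never type(self).
        return ToyTransformedEnv(self.base_env, self.transforms[:index])


class ToyObsNormEnv(ToyTransformedEnv):
    """Subclass that is pre-built with an 'obs_norm' transform."""

    def __init__(self, base_env, transforms=()):
        super().__init__(base_env, ["obs_norm", *transforms])


env = ToyObsNormEnv("base", ["reward_scale"])

# Parent of 'obs_norm' (index 0): no transforms at all.
print(env.parent_of(0).transforms)

# Rebuilding the parent with the subclass would re-insert 'obs_norm',
# i.e. the transform would end up inside its own parent.
naive_parent = type(env)(env.base_env, env.transforms[:0])
print(naive_parent.transforms)
```

In this sketch, keeping the parent a vanilla class is what prevents the pre-built transform from being applied twice, at the cost of losing any method overrides defined on the subclass.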

Answer selected by TogetherLiving