motornet.environment#
- class motornet.environment.Environment(effector, q_init=None, name: str = 'Env', differentiable: bool = True, max_ep_duration: float = 1.0, action_noise: float = 0.0, obs_noise: float | list = 0.0, action_frame_stacking: int = 0, proprioception_delay: float | None = None, vision_delay: float | None = None, proprioception_noise: float = 0.0, vision_noise: float = 0.0, **kwargs)#
Bases: Env, Module
Base class for environments.
- Parameters:
effector – motornet.effector.Effector object class or subclass. This is the effector that will evolve in the environment.
q_init – Tensor or numpy.ndarray, the desired initial joint states for the environment, if a single set of pre-defined initial joint states is desired. If None, the initial joint states will be drawn from the motornet.nets.layers.Network.get_initial_state method at each call of generate(). This parameter is ignored on generate() calls where a joint_state is provided as an input argument.
name – String, the name of the environment object instance.
differentiable – Boolean, whether the environment will be differentiable. Setting this to False is typically useful for reinforcement learning, where differentiability is not needed.
max_ep_duration – Float, the maximum duration of an episode, in seconds.
action_noise – Float, the standard deviation of the Gaussian noise added to the action input at each step of the simulation.
obs_noise – Float or list, the standard deviation of the Gaussian noise added to the observation vector at each step of the simulation. If this is a list, it should have as many elements as the observation vector, indicating the standard deviation of each observation element independently.
action_frame_stacking – Integer, the number of past action steps to add to the observation vector.
proprioception_delay – Float, the delay in seconds before the proprioceptive feedback is added to the observation vector. If None, no delay is applied.
vision_delay – Float, the delay in seconds before the visual feedback is added to the observation vector. If None, no delay is applied.
proprioception_noise – Float, the standard deviation of the Gaussian noise added to the proprioceptive feedback at each step of the simulation.
vision_noise – Float, the standard deviation of the Gaussian noise added to the visual feedback at each step of the simulation.
**kwargs – This is passed as-is to the torch.nn.Module parent class.
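A minimal instantiation sketch is shown below. The RigidTendonArm26 effector and RigidTendonHillMuscle muscle class names are assumptions here; any motornet.effector.Effector subclass can be passed in the same way.

```python
# Hedged sketch: wrap an effector in a base Environment instance.
# RigidTendonArm26 and RigidTendonHillMuscle are assumed class names; substitute
# whichever Effector and Muscle subclasses your project uses.
import motornet as mn

effector = mn.effector.RigidTendonArm26(muscle=mn.muscle.RigidTendonHillMuscle())
env = mn.environment.Environment(
    effector=effector,
    max_ep_duration=1.0,        # seconds
    obs_noise=1e-3,             # same standard deviation for every observation element
    proprioception_delay=0.02,  # 20 ms proprioceptive feedback delay
    vision_delay=0.05,          # 50 ms visual feedback delay
)
```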
- apply_noise(loc, noise: float | list) → Tensor #
Applies element-wise Gaussian noise to the input loc.
- Parameters:
loc – input on which the Gaussian noise is applied, which in probabilistic terms makes it the mean of the Gaussian distribution.
noise – Float or list, the standard deviation (spread or “width”) of the distribution. Must be non-negative. If this is a list, it must contain as many elements as the second axis of loc, and the Gaussian distribution for each column of loc will have a different standard deviation. Note that the elements within each column of loc will still be independent and identically distributed (i.i.d.).
- Returns:
A noisy version of loc as a tensor.
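For instance, a per-column noise specification could look like the following sketch; the shapes are illustrative and env is the instance built in the example above.

```python
import torch

loc = torch.zeros(8, 3)  # (batch_size, n_features)
# One standard deviation per column of loc; elements within a column stay i.i.d.
noisy = env.apply_noise(loc, noise=[0.01, 0.1, 1.0])
```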
- detach(x)#
- get_attributes()#
Gets all non-callable attributes declared in the object instance, excluding gym.spaces.Space attributes, the effector, muscle, and skeleton attributes.
- Returns:
A list of attribute names as string elements.
A list of attribute values.
- get_obs(action=None, deterministic: bool = False) → Tensor | ndarray #
Returns a (batch_size, n_features) tensor containing the (potentially time-delayed) observations. By default, this is the task goal, followed by the output of the get_proprioception() method, the output of the get_vision() method, and finally the last action_frame_stacking action sets, if a non-zero action_frame_stacking keyword argument was passed at initialization of this class instance. i.i.d. Gaussian noise is added to each element in the tensor, using the obs_noise attribute.
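The default concatenation order described above can be illustrated with the following standalone sketch; it mirrors the documented layout rather than the library's internal code.

```python
import torch

def build_obs_sketch(goal, proprioception, vision, past_actions=()):
    # goal, proprioception, and vision are (batch_size, n) tensors; past_actions is
    # a sequence of the last `action_frame_stacking` action tensors (possibly empty).
    parts = [goal, proprioception, vision, *past_actions]
    return torch.cat(parts, dim=-1)  # (batch_size, n_features)
```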
- get_proprioception() → Tensor #
Returns a (batch_size, n_features) tensor containing the instantaneous (non-delayed) proprioceptive feedback. By default, this is the normalized muscle length for each muscle, followed by the normalized muscle velocity for each muscle. i.i.d. Gaussian noise is added to each element in the tensor, using the proprioception_noise attribute.
- get_save_config()#
Gets the environment object’s configuration as a dictionary.
- Returns:
A dictionary containing the parameters of the environment’s configuration. All parameters held as non-callable attributes by the object instance will be included in the dictionary, excluding gym.spaces.Space attributes, the effector, muscle, and skeleton attributes.
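One possible use is persisting the configuration alongside simulation results, assuming the returned values are JSON-serializable; the output path below is hypothetical.

```python
import json

config = env.get_save_config()           # env from the instantiation sketch above
with open("env_config.json", "w") as f:  # hypothetical output path
    json.dump(config, f, indent=2)
```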
- get_vision() → Tensor #
Returns a (batch_size, n_features) tensor containing the instantaneous (non-delayed) visual feedback. By default, this is the cartesian position of the end-point of the effector, that is, the fingertip. i.i.d. Gaussian noise is added to each element in the tensor, using the vision_noise attribute.
- joint2cartesian(joint_states)#
Shortcut to the motornet.effector.Effector.joint2cartesian() method.
- print_attributes()#
Prints all non-callable attributes declared in the object instance, excluding gym.spaces.Space attributes, the effector, muscle, and skeleton attributes.
- reset(*, seed: int | None = None, options: dict[str, Any] | None = None)#
Initialize the task goal and effector states for a (batch of) simulation episode(s). The effector states (joint, cartesian, muscle, geometry) are initialized to be biomechanically compatible with each other. This method is likely to be overridden by a subclass to implement user-defined computations, such as defining a custom initial goal or initial states.
- Parameters:
seed – Integer, the seed that is used to initialize the environment’s PRNG (np_random). If the environment does not already have a PRNG and seed=None (the default option) is passed, a seed will be chosen from some source of entropy (e.g., timestamp or /dev/urandom). However, if the environment already has a PRNG and seed=None is passed, the PRNG will not be reset. If you pass an integer, the PRNG will be reset even if it already exists. Usually, you want to pass an integer right after the environment has been initialized and then never again.
options – Dictionary, optional kwargs specific to motornet environments. This is mainly useful to pass the batch_size, joint_state, and deterministic kwargs if desired, as described below.
- Options:
batch_size: Integer, the desired batch size. Default: 1.
joint_state: The joint state from which the other state values are inferred. If None, the q_init value declared during the class instantiation will be used. If q_init is also None, random initial joint states are drawn, from which the other state values are inferred. Default: None.
deterministic: Boolean, if True, no observation, proprioception, or vision noise is applied. Default: False.
- Returns:
The observation vector, as a tensor if the Environment is set as differentiable, or a numpy.ndarray if not. It has dimensionality (batch_size, n_features).
A dictionary containing the initial step’s information.
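A hedged usage sketch, using the env instance from the earlier example and the option keys listed above:

```python
# Seed the PRNG once, right after creating the environment.
obs, info = env.reset(seed=42)

# Later resets: draw a batch of 32 episodes without re-seeding, with noise disabled.
obs, info = env.reset(options={"batch_size": 32, "deterministic": True})
print(obs.shape)  # (32, n_features)
```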
- step(action: Tensor | ndarray, deterministic: bool = False, **kwargs) → tuple[Tensor | ndarray, bool, bool, dict[str, Any]] #
Perform one simulation step. This method is likely to be overridden by a subclass to implement user-defined computations, such as reward value calculation for reinforcement learning, custom truncation or termination conditions, or time-varying goals.
- Parameters:
action – Tensor or numpy.ndarray, the input drive to the actuators.
deterministic – Boolean, if True, no observation, action, proprioception, or vision noise is applied.
**kwargs – This is passed as-is to the motornet.effector.Effector.step() call. This is mainly useful to pass endpoint_load or joint_load kwargs.
- Returns:
The observation vector, as a tensor if the Environment is set as differentiable, or a numpy.ndarray if not. It has dimensionality (batch_size, n_features).
A numpy.ndarray with the reward information for the step, with dimensionality (batch_size, 1). This is None if the Environment is set as differentiable. By default this always returns 0. in the base Environment class.
A boolean indicating if the simulation has been terminated or truncated. If the Environment is set as differentiable, this returns True when the simulation time reaches the max_ep_duration provided at initialization.
A boolean indicating if the simulation has been truncated early or not. This always returns False if the Environment is set as differentiable.
A dictionary containing this step’s information.
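A minimal rollout loop under the assumptions above (env and effector from the instantiation sketch; the action size is assumed to equal the effector's number of muscles):

```python
import torch

obs, info = env.reset(seed=0, options={"batch_size": 1})
terminated = False

while not terminated:
    # Zero drive for illustration; a policy network would produce `action` instead.
    action = torch.zeros((1, effector.n_muscles))  # n_muscles attribute is assumed
    obs, reward, terminated, truncated, info = env.step(action)
```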
- update_obs_buffer(action=None)#
- class motornet.environment.RandomTargetReach(*args, **kwargs)#
Bases: Environment
A reach to a random target from a random starting position.
- Parameters:
network – motornet.nets.layers.Network object class or subclass. This is the network that will perform the task.
name – String, the name of the task object instance.
deriv_weight – Float, the weight of the muscle activation’s derivative contribution to the default muscle L2 loss.
**kwargs – This is passed as-is to the parent Task class.
- reset(*, seed: int | None = None, options: dict[str, Any] | None = None) → tuple[Any, dict[str, Any]] #
Uses the Environment.reset() method of the parent Environment class, which can be overridden to change the returned data. Here the goals (i.e., the targets) are drawn from a random uniform distribution across the full joint space.
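Putting it together, a hedged sketch using RandomTargetReach; the effector and muscle class names are the same assumptions as in the base-class example.

```python
import motornet as mn

effector = mn.effector.RigidTendonArm26(muscle=mn.muscle.RigidTendonHillMuscle())
env = mn.environment.RandomTargetReach(effector=effector, max_ep_duration=1.0)

# Each reset draws a new random target and a new random starting posture.
obs, info = env.reset(seed=1, options={"batch_size": 4})
```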