integrator

ND Integrator system.

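An integrator system is a simple dynamical system model, typically used to model a single variable and its higher-order derivatives. The input is applied to the highest derivative and is “integrated” upward through the chain. The first equation below gives the continuous-time dynamics; the second gives the discrete-time dynamics of an N-dimensional system with sampling time T.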

\[\begin{split}\dot{x} = \begin{bmatrix} 0 & 1 & 0 & \cdots & 0 & 0 \\ 0 & 0 & 1 & \cdots & 0 & 0 \\ 0 & 0 & 0 & \cdots & 0 & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & 0 & \cdots & 0 & 1 \\ 0 & 0 & 0 & \cdots & 0 & 0 \end{bmatrix} x + \begin{bmatrix} 0 \\ 0 \\ 0 \\ \vdots \\ 0 \\ 1 \end{bmatrix} u + w\end{split}\]
\[\begin{split}x_{t+1} = \begin{bmatrix} 1 & T & \cdots & \frac{T^{N-2}}{(N-2)!} & \frac{T^{N-1}}{(N-1)!} \\ 0 & 1 & \cdots & \frac{T^{N-3}}{(N-3)!} & \frac{T^{N-2}}{(N-2)!} \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & \cdots & 1 & T \\ 0 & 0 & \cdots & 0 & 1 \end{bmatrix} x_{t} + \begin{bmatrix} \frac{T^{N}}{N!} \\ \frac{T^{N-1}}{(N-1)!} \\ \vdots \\ \frac{T^{2}}{2} \\ T \end{bmatrix} u_{t} + w_{t}\end{split}\]
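
The entries of the discrete-time matrices above follow a simple pattern: entry (i, j) of the state matrix is T^(j-i)/(j-i)! on and above the diagonal, and the input vector holds T^N/N! down to T. A minimal NumPy sketch of that pattern follows; the helper name `integrator_matrices` is hypothetical and is not part of gym_socks.

```python
from math import factorial

import numpy as np


def integrator_matrices(dim, sampling_time):
    """Return (A, B) for the discrete-time integrator chain of size `dim`.

    Hypothetical helper for illustration only; not a gym_socks function.
    """
    T = sampling_time
    A = np.zeros((dim, dim))
    for i in range(dim):
        for j in range(i, dim):
            k = j - i  # distance above the diagonal
            A[i, j] = T**k / factorial(k)
    # Input column: T^N / N! in the first row down to T in the last row.
    B = np.array([[T ** (dim - i) / factorial(dim - i)] for i in range(dim)])
    return A, B


A, B = integrator_matrices(dim=3, sampling_time=0.5)
```

For `dim=2` this reduces to the familiar position/velocity matrices, with `A = [[1, T], [0, 1]]` and `B = [[T**2 / 2], [T]]`.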

A 2D integrator system, for example, models the position and velocity of a single variable: the input drives the velocity (acting as an acceleration), and the velocity then integrates upward to the position.
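
To make the position/velocity interpretation concrete, the sketch below rolls the discrete-time 2D model forward under a constant input with no disturbance. It uses plain NumPy rather than the environment class, and the values of `T` and `u` are arbitrary choices for illustration.

```python
import numpy as np

T = 0.25  # sampling time (arbitrary value for this sketch)
A = np.array([[1.0, T], [0.0, 1.0]])
B = np.array([T**2 / 2, T])

x = np.zeros(2)  # state is [position, velocity], starting at rest
u = 2.0          # constant input, held over every step

for _ in range(8):  # 8 steps of 0.25 s, i.e. 2 s in total
    x = A @ x + B * u  # disturbance w_t omitted

position, velocity = x
```

After 2 s the velocity equals `u * t = 4` and the position equals `0.5 * u * t**2 = 4`, matching constant-acceleration kinematics.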

Tip

Stacking two 2D integrator systems block-diagonally models a system with x/y position and velocity, with an independent input for each axis.

\[\begin{split}\dot{x} = \begin{bmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \end{bmatrix} x + \begin{bmatrix} 0 & 0 \\ 1 & 0 \\ 0 & 0 \\ 0 & 1 \end{bmatrix} u + w\end{split}\]
\[\begin{split}x_{t+1} = \begin{bmatrix} 1 & T & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & T \\ 0 & 0 & 0 & 1 \end{bmatrix} x_{t} + \begin{bmatrix} \frac{T^{2}}{2} & 0 \\ T & 0 \\ 0 & \frac{T^{2}}{2} \\ 0 & T \end{bmatrix} u_{t} + w_{t}\end{split}\]
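
The 4D matrices above are block-diagonal copies of the 2D ones. One way to build them, sketched here with NumPy's Kronecker product and an arbitrary sampling time, assumes the state ordering [x, vx, y, vy] used in the equations above.

```python
import numpy as np

T = 0.1  # sampling time (arbitrary value for this sketch)
A2 = np.array([[1.0, T], [0.0, 1.0]])
B2 = np.array([[T**2 / 2], [T]])

# kron(I, M) places one copy of M per axis along the block diagonal.
A4 = np.kron(np.eye(2), A2)
B4 = np.kron(np.eye(2), B2)
```

Here `A4` is 4x4 and `B4` is 4x2, so each input column affects only its own axis.
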
class gym_socks.envs.integrator.NDIntegratorEnv(*args, **kwargs)

ND integrator system.

Bases: gym_socks.envs.dynamical_system.DynamicalSystem

Parameters

dim (int) – The dimension of the integrator system.

Example

>>> from gym_socks.envs import NDIntegratorEnv
>>> env = NDIntegratorEnv(dim=2)
>>> env.reset()
>>> for t in range(10):
...     action = env.action_space.sample()
...     # Pass arguments by keyword: the first positional parameter of step() is time.
...     obs, cost, done, _ = env.step(time=t, action=action)

action_space = None

The action (input) space of the system.

close()

Override close in your subclass to perform any necessary cleanup.

Environments will automatically close() themselves when garbage collected or when the program exits.

cost(time, state, action)

Cost function for the system.

Warning

This function is typically not used in SOCKS, but is included here for compatibility with OpenAI gym, which returns the cost from the step() function.

dynamics(time, state, action, disturbance)

Dynamics for the system.

\[\dot{x} = f(x, u, w)\]
Parameters
  • time – The time variable.

  • state – The state of the system at the current time step.

  • action – The control action applied at the current time step.

  • disturbance – A realization of a random variable representing process noise.

Returns

The state of the system at the next time step.

generate_disturbance(time, state, action)

Generate a disturbance.

Note

Override generate_disturbance() in subclasses to modify the disturbance properties, such as the scale or distribution.

generate_observation(time, state, action)

Generate an observation from the system.

\[y = h(x, u, v)\]
Parameters
  • time – The time variable.

  • state – The state of the system at the current time step.

  • action – The control action applied at the current time step.

Returns

An observation of the system at the current time step.

Note

Override generate_observation() in subclasses if the system is partially observable. By default, the function returns the system state directly, meaning it is fully observable.
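
As an illustration of a partially observable model y = h(x, u, v), the standalone sketch below observes only the position of a 2D integrator state and adds Gaussian measurement noise. The function `observe_position` and its `noise_scale` parameter are hypothetical; this is not the gym_socks implementation, and in this sketch the action u does not affect the observation.

```python
import numpy as np


def observe_position(state, noise_scale=0.0, rng=None):
    """Hypothetical h(x, u, v): return the position plus Gaussian noise v."""
    rng = np.random.default_rng() if rng is None else rng
    C = np.array([[1.0, 0.0]])  # selects the position component only
    v = noise_scale * rng.standard_normal(1)
    return C @ state + v


state = np.array([0.3, -1.2])  # [position, velocity]
y = observe_position(state, noise_scale=0.05, rng=np.random.default_rng(0))
```

With `noise_scale=0.0` the function returns the position exactly, so the velocity stays hidden from the observer either way.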

property np_random

Random number generator.

observation_space = None

The space of system observations.

Note

The observation space typically only differs from the state space if the system is partially observable. If this is the case, generate_observation() should be defined to return an element of the observation space.

render(mode='human')

Renders the environment.

This method must be overridden in subclasses in order to enable rendering. Not all environments support rendering.

Parameters

mode – The mode to render with.

reset(state=None)

Reset the system to a random initial condition.

property sampling_time

Sampling time, in seconds.

seed(seed=None)

Sets the seed of the random number generator.

Parameters

seed – Integer value representing the random seed.

Returns

The seed of the RNG.

state_space = None

The state space of the system.

Note

By convention, in control theory the state space of a system is the set of all possible states that the system can have. OpenAI Gym’s convention is to ignore the underlying state space, opting to use only the observation_space.

step(time=0, action=None)

Advances the system forward one time step.

Parameters
  • time – Time of the simulation. Used primarily for time-varying systems.

  • action – Action (input) applied to the system at the current time step.

Returns

A tuple (obs, cost, done, info), where:

  • obs – The observation vector. Generally, it is the state of the system corrupted by some measurement noise. If the system is fully observable, this is the actual state of the system at the next time step.

  • cost – The cost (reward) obtained by the system for taking action u in state x and transitioning to state y. This is not typically used with DynamicalSystem models.

  • done – A flag indicating that the simulation has terminated. Usually toggled by guard conditions, which terminate the simulation if the system violates certain operating constraints.

  • info – A dictionary containing extra information.

Return type

tuple