Open an interactive version of this example on Binder: Binder badge

Stochastic Reachability#

This example shows the stochastic reachability algorithm.

By default, the system is a double integrator (2D stochastic chain of integrators).

To run the example, use the following command:

python examples/reach/stoch_reach.py
[1]:
import gym

import numpy as np

from functools import partial

from gym.envs.registration import make
from sklearn.metrics.pairwise import rbf_kernel

from gym_socks.algorithms.reach.kernel_sr import kernel_sr

from gym_socks.sampling import sample
from gym_socks.sampling import default_sampler
from gym_socks.sampling import grid_sampler
from gym_socks.sampling import repeat

from gym_socks.utils.grid import make_grid_from_space
from gym_socks.utils.grid import make_grid_from_ranges

Configuration variables.

[2]:
system_id = "2DIntegratorEnv-v0"
time_horizon = 5
sigma = 0.1
regularization_param = 1

Generate The Sample#

We demonstrate the algorithm on a simple 2-D integrator system, and sample on a grid within the region of interest. Note that this is a simplification in order to make the result look more “uniform”, but is not specifically required for the algorithm to work.

[3]:
env = make(system_id)

sample_size = 3125  # This number is chosen based on the values below.

sample_space = gym.spaces.Box(
    low=-1.1, high=1.1, shape=env.state_space.shape, dtype=env.state_space.dtype
)

state_sampler = repeat(
    grid_sampler(make_grid_from_space(sample_space=sample_space, resolution=25)), num=5
)

action_sampler = grid_sampler(make_grid_from_ranges([np.linspace(-1, 1, 5)]))

S = sample(
    sampler=default_sampler(
        state_sampler=state_sampler, action_sampler=action_sampler, env=env
    ),
    sample_size=sample_size,
)

We define the target tube and the constraint (safety) tube, which are sequences of bounded sets indexed by time. Here, we define the target tube such that it is between -0.5 and 0.5, and specify the constraint tube to be between -1 and 1.

[4]:
target_tube = [
    gym.spaces.Box(low=-0.5, high=0.5, shape=(2,), dtype=np.float32)
] * time_horizon

constraint_tube = [
    gym.spaces.Box(low=-1, high=1, shape=(2,), dtype=np.float32)
] * time_horizon

# Generate test points.
x1 = np.linspace(-1, 1, 50)
x2 = np.linspace(-1, 1, 50)
T = make_grid_from_ranges([x1, x2])

Algorithm#

We then run the algorithm to compute the safety probabilities at each of the test points for the first-hitting time stochastic reachability problem. We can easily change to the terminal-hitting time problem by changing “FHT” below to “THT”.

[5]:
safety_probabilities = kernel_sr(
    S=S,
    T=T,
    time_horizon=time_horizon,
    constraint_tube=constraint_tube,
    target_tube=target_tube,
    problem="FHT",
    regularization_param=regularization_param,
    kernel_fn=partial(rbf_kernel, gamma=1 / (2 * (sigma ** 2))),
    verbose=False,
)

Results#

We then plot the results for all of the test points. The warmer colors indicate higher safety probabilities.

[6]:
import matplotlib
import matplotlib.pyplot as plt

# Reshape data.
XX, YY = np.meshgrid(x1, x2, indexing="ij")
Z = safety_probabilities[0].reshape(XX.shape)

# Plot flat color map.
fig = plt.figure()
ax = plt.axes()

plt.pcolor(XX, YY, Z, cmap="viridis", shading="auto")
plt.colorbar()

plt.show()
../../_images/examples_reach_stoch_reach_11_0.png