monte_carlo

class gym_socks.algorithms.reach.monte_carlo.MonteCarloSR(num_iterations=None, time_horizon=None, constraint_tube=None, target_tube=None, problem='THT', verbose=False, *args, **kwargs)

Stochastic reachability using Monte-Carlo.

Computes an approximation of the safety probabilities of the stochastic reachability problem using Monte-Carlo methods.

Parameters
  • num_iterations (int) – Number of Monte-Carlo iterations.

  • time_horizon (int) – Time horizon of the algorithm.

  • constraint_tube (list) – List of spaces or constraint functions. Must have length num_steps, one entry per time step.

  • target_tube (list) – List of spaces or target functions. Must have length num_steps, one entry per time step.

  • problem (str) – One of {“THT”, “FHT”}. “THT” specifies the terminal-hitting time problem and “FHT” specifies the first-hitting time problem.

  • verbose (bool) – Boolean flag to indicate verbose output.
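
The two problem types differ in when the target must be reached. The sketch below assumes the usual definitions: for the terminal-hitting time problem, the trajectory stays in the constraint tube at every step before the last and lies in the target tube at the last step; for the first-hitting time problem, the trajectory reaches the target tube at some step while staying in the constraint tube at all earlier steps. The function names and the representation of tubes as lists of callables are illustrative assumptions, not part of the gym_socks API.

```python
def tht_indicator(traj, constraint_tube, target_tube):
    """Terminal-hitting time: safe before the last step, in the target at the last step."""
    n = len(traj)
    safe = all(constraint_tube[t](traj[t]) for t in range(n - 1))
    return int(safe and target_tube[n - 1](traj[n - 1]))


def fht_indicator(traj, constraint_tube, target_tube):
    """First-hitting time: reach the target at some step, safe at all earlier steps."""
    for t, x in enumerate(traj):
        if target_tube[t](x):
            return 1  # target reached before any constraint violation
        if not constraint_tube[t](x):
            return 0  # left the constraint set before reaching the target
    return 0  # target never reached within the horizon


# Hypothetical scalar example: constraint set [-1, 1], target set [0.8, 1].
steps = 4
constraint_tube = [lambda x: -1.0 <= x <= 1.0] * steps
target_tube = [lambda x: 0.8 <= x <= 1.0] * steps

early_hit = [0.0, 0.9, 2.0, 0.9]  # hits the target at step 1, then leaves the constraint set
late_hit = [0.0, 0.5, 0.5, 0.9]   # stays safe and ends in the target

tht_early = tht_indicator(early_hit, constraint_tube, target_tube)
fht_early = fht_indicator(early_hit, constraint_tube, target_tube)
tht_late = tht_indicator(late_hit, constraint_tube, target_tube)
fht_late = fht_indicator(late_hit, constraint_tube, target_tube)
```

Under these definitions the first trajectory counts as a success only for the first-hitting time problem, because what happens after the target is reached is ignored; the second trajectory counts as a success for both.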

fit()
fit_predict(env, policy, T)

Run the algorithm.

Computes the safety probabilities for the points provided. For each point in T, the algorithm simulates a collection of trajectories using that point as the initial condition. The indicator function is then evaluated on each generated trajectory, and the estimated safety probability is the sum of the indicators divided by the number of trajectories.

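Parameters
  • env (gym_socks.envs.dynamical_system.DynamicalSystem) – The dynamical system model.

  • policy (gym_socks.policies.policy.BasePolicy) – The policy applied to the system during sampling.

  • T (numpy.ndarray) – Points at which to estimate the safety probabilities, given as a 2D-array in which each row is a point.

Returns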

The safety probabilities corresponding to each point, as a 2D-array in which each row corresponds to a point in T and each column corresponds to a time step.
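
The estimation procedure described for fit_predict can be sketched in plain NumPy. This is a minimal illustration under stated assumptions, not the gym_socks implementation: the toy linear dynamics, the sets, and the names estimate_safety, step, and tht_indicator are all hypothetical, and the sketch returns one probability per point for the full horizon rather than the 2D-array described above.

```python
import numpy as np


def step(x, rng):
    """Hypothetical toy dynamics: x' = 0.9 * x + Gaussian noise."""
    return 0.9 * x + 0.1 * rng.standard_normal(x.shape)


def tht_indicator(traj):
    """1 if |x| <= 1 at every step before the last and |x| <= 0.5 at the last step."""
    safe = np.all(np.abs(traj[:-1]) <= 1.0)
    return int(safe and np.all(np.abs(traj[-1]) <= 0.5))


def estimate_safety(T, num_iterations, time_horizon, rng):
    """Monte-Carlo estimate: fraction of simulated trajectories whose indicator is 1."""
    T = np.atleast_2d(np.asarray(T, dtype=float))
    probs = np.zeros(len(T))
    for i, x0 in enumerate(T):
        successes = 0
        for _ in range(num_iterations):
            traj = [x0]  # each trajectory starts at the test point
            for _ in range(time_horizon):
                traj.append(step(traj[-1], rng))
            successes += tht_indicator(np.array(traj))
        probs[i] = successes / num_iterations
    return probs


rng = np.random.default_rng(0)
T = np.array([[0.0], [0.9], [5.0]])  # one test point per row
probs = estimate_safety(T, num_iterations=500, time_horizon=5, rng=rng)
print(probs)
```

The last point starts outside the constraint set, so every one of its trajectories fails at the first step and its estimate is exactly zero.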

predict()
gym_socks.algorithms.reach.monte_carlo.monte_carlo_sr(env, policy, T, num_iterations=None, time_horizon=None, constraint_tube=None, target_tube=None, problem='THT', verbose=False)

Stochastic reachability using Monte-Carlo.

Computes an approximation of the safety probabilities of the stochastic reachability problem using Monte-Carlo methods.

Parameters
  • env (gym_socks.envs.dynamical_system.DynamicalSystem) – The dynamical system model.

  • policy (gym_socks.policies.policy.BasePolicy) – The policy applied to the system during sampling.

  • T (numpy.ndarray) – Points at which to estimate the safety probabilities, given as a 2D-array in which each row is a point.

  • num_iterations (Optional[int]) – Number of Monte-Carlo iterations.

  • constraint_tube (Optional[list]) – List of spaces or constraint functions. Must have length num_steps, one entry per time step.

  • target_tube (Optional[list]) – List of spaces or target functions. Must have length num_steps, one entry per time step.

  • problem (str) – One of {“THT”, “FHT”}. “THT” specifies the terminal-hitting time problem and “FHT” specifies the first-hitting time problem.

  • verbose (bool) – Boolean flag to indicate verbose output.

  • time_horizon (Optional[int]) – Time horizon of the algorithm.
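
When choosing num_iterations, it may help to recall that each estimate is an average of 0/1 indicators. Assuming the simulated trajectories are independent, the standard error of the estimate for a point with true probability p is sqrt(p * (1 - p) / num_iterations), which is at most 1 / (2 * sqrt(num_iterations)). The helper names below are hypothetical and not part of gym_socks.

```python
import math


def standard_error(p, num_iterations):
    """Standard error of a mean of independent 0/1 indicators with success probability p."""
    return math.sqrt(p * (1.0 - p) / num_iterations)


def worst_case_standard_error(num_iterations):
    """Upper bound over all p, attained at p = 0.5."""
    return 0.5 / math.sqrt(num_iterations)


se_100 = standard_error(0.5, 100)
se_10000 = worst_case_standard_error(10_000)
```

Because the error shrinks with the square root of the iteration count, halving it requires roughly four times as many iterations.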