pyrca.simulation package

Simulated Data Generation

class pyrca.simulation.data_gen.DAGGenConfig(num_node=20, num_edge=30, rng=None)

Bases: object

The configuration class for DAG generation.

Parameters
  • num_node (int) – Number of nodes in DAG.

  • num_edge (int) – Number of edges in DAG.

  • rng (Optional[Generator]) – A random generator.

num_node: int = 20
num_edge: int = 30
rng: Generator = None

class pyrca.simulation.data_gen.DAGGen(config)

Bases: object

The DAG generator.

config_class

alias of DAGGenConfig

gen()

Generate a directed acyclic graph with a single end node. The first node (index 0) is the only end node, i.e., the only node that has no effects (no outgoing edges).

Returns

A matrix, where matrix[i, j] != 0 means i is the cause of j.
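The matrix convention above can be illustrated with a minimal pure-Python sketch. It is not PyRCA's implementation: the helper names `random_single_sink_dag` and `sinks` are hypothetical, and the construction (edges only run from a higher index to a lower one) is just one simple way to obtain an acyclic graph whose only end node is node 0.

```python
import random


def random_single_sink_dag(num_node, num_edge, seed=None):
    """Return matrix where matrix[i][j] != 0 means i is a cause of j.

    Hypothetical helper, not PyRCA's algorithm. Edges only go from a
    higher index to a lower index, so the graph cannot contain a cycle.
    """
    max_edge = num_node * (num_node - 1) // 2
    if not num_node - 1 <= num_edge <= max_edge:
        raise ValueError("num_edge must be in [num_node - 1, num_node * (num_node - 1) / 2]")
    rng = random.Random(seed)
    matrix = [[0] * num_node for _ in range(num_node)]
    # Every node except node 0 gets one effect with a smaller index,
    # which leaves node 0 as the only node without effects.
    for i in range(1, num_node):
        matrix[i][rng.randrange(i)] = 1
    # Fill up to num_edge with further random "high index -> low index" edges.
    free = [(i, j) for i in range(1, num_node) for j in range(i) if matrix[i][j] == 0]
    for i, j in rng.sample(free, num_edge - (num_node - 1)):
        matrix[i][j] = 1
    return matrix


def sinks(matrix):
    """Nodes whose row is all zeros, i.e. nodes that cause nothing."""
    return [i for i, row in enumerate(matrix) if not any(row)]


dag = random_single_sink_dag(num_node=20, num_edge=30, seed=0)
print(sinks(dag))  # [0]
```

Reading the result follows the documented convention: row `i` lists the effects of node `i`, and column `j` lists the causes of node `j`.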

class pyrca.simulation.data_gen.DataGenConfig(dag, noise_type=None, func_type=None, num_samples=5000, weight_generator='normal')

Bases: object

The configuration class for generating RCA data.

Parameters
  • dag (ndarray) – The dependency graph.

  • noise_type (Optional[str]) – Probability distribution of each node’s noise; it must be in the valid_noise type list.

  • func_type (Optional[str]) – Causal function form of each node; it must be one of the supported function types.

  • num_samples (int) – Number of samples.

  • weight_generator (str) – random generator for model weights.

dag: ndarray
noise_type: str = None
func_type: str = None
num_samples: int = 5000
weight_generator: str = 'normal'

class pyrca.simulation.data_gen.DataGen(config)

Bases: object

Normal data generation.

Generates data (n_samples, n_nodes) according to DAG matrix.

config_class

alias of DataGenConfig

gen()

For each node

x_i = \sum_{x_j \in Pa(x_i)} A_{ij} f_i(x_j) + \beta_i \cdot noise_i

where Pa(x_i) denotes the parents of x_i, f_i is an element-wise transformation chosen from identity, square, sin, and tanh, and noise_i is drawn from Exponential(1), Normal(0, 1), Uniform(-0.5, 0.5), or Laplace(0, 1). The weights A_{ij} and beta_i are both generated by _uniform_weight or _normal_weight.

Returns

  • data: (num_samples, num_node) data of each variable x_i.

  • parent_weights: (num_node, num_node) Combination weights of each variable A_{ij}.

  • noise_weights: (num_node,) noise weight of each variable beta_i.
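The generation rule described for gen() can be sketched in pure Python as follows. This is an illustration under stated assumptions, not PyRCA's code: the function names, the string keys such as "identity" or "laplace", and the storage layout `parent_weights[j][i]` (weight of parent j on node i, matching the convention that `dag[j][i] != 0` means j causes i) are all hypothetical, and the weights are simply drawn from Normal(0, 1).

```python
import math
import random

# Element-wise transformations named in the text (dictionary keys are hypothetical).
FUNCS = {"identity": lambda v: v, "square": lambda v: v * v,
         "sin": math.sin, "tanh": math.tanh}


def make_noise(rng):
    """Noise samplers for the four distributions named in the text."""
    return {
        "exponential": lambda: rng.expovariate(1.0),
        "normal": lambda: rng.gauss(0.0, 1.0),
        "uniform": lambda: rng.uniform(-0.5, 0.5),
        # The difference of two independent Exponential(1) draws is Laplace(0, 1).
        "laplace": lambda: rng.expovariate(1.0) - rng.expovariate(1.0),
    }


def topological_order(dag):
    """Order nodes so that every cause comes before its effects."""
    n = len(dag)
    indegree = [sum(1 for j in range(n) if dag[j][i] != 0) for i in range(n)]
    ready = [i for i in range(n) if indegree[i] == 0]
    order = []
    while ready:
        j = ready.pop()
        order.append(j)
        for i in range(n):
            if dag[j][i] != 0:
                indegree[i] -= 1
                if indegree[i] == 0:
                    ready.append(i)
    if len(order) != n:
        raise ValueError("graph contains a cycle")
    return order


def generate(dag, func="identity", noise="normal", num_samples=5, seed=0):
    """Sample x_i = sum over parents j of A_ij * f(x_j) + beta_i * noise_i."""
    rng = random.Random(seed)
    n = len(dag)
    f = FUNCS[func]
    draw = make_noise(rng)[noise]
    # Assumption: weights are drawn from Normal(0, 1).
    parent_weights = [[rng.gauss(0, 1) if dag[j][i] != 0 else 0.0
                       for i in range(n)] for j in range(n)]
    noise_weights = [rng.gauss(0, 1) for _ in range(n)]
    order = topological_order(dag)
    data = []
    for _ in range(num_samples):
        x = [0.0] * n
        for i in order:  # parents are always computed before their children
            from_parents = sum(parent_weights[j][i] * f(x[j])
                               for j in range(n) if dag[j][i] != 0)
            x[i] = from_parents + noise_weights[i] * draw()
        data.append(x)
    return data, parent_weights, noise_weights


# Three nodes: 2 -> 1, 2 -> 0, 1 -> 0 (node 0 is the single end node).
dag = [[0, 0, 0], [1, 0, 0], [1, 1, 0]]
data, A, beta = generate(dag, func="tanh", noise="uniform", num_samples=4)
```

The returned triple mirrors the documented shapes: `data` has num_samples rows of num_node values, `A` is num_node by num_node, and `beta` has one entry per node.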

class pyrca.simulation.data_gen.AnomalyDataGenConfig(parent_weights, noise_weights, noise_type, func_type, baseline, threshold, num_samples=5000, anomaly_type=0, weight_generator='normal')

Bases: object

The configuration class for generating anomaly data.

Parameters
  • parent_weights (array) – The weights of parents of each node.

  • noise_weights (array) – The noise weights of each node.

  • noise_type (str) – Probability distribution of each node’s noise; it must be in the valid_noise type list.

  • func_type (str) – Causal function form of each node; it must be one of the supported function types.

  • baseline (float) – baseline of normal data.

  • threshold (float) – threshold used to differentiate anomaly data from normal data, as a stats-based anomaly detector would.

  • num_samples (int) – Number of anomaly samples.

  • anomaly_type (str) – Type of anomaly: 0 changes the weight of the noise term, 1 adds a constant shift to the noise term, 2 changes the weights of the parent nodes.

  • weight_generator (str) – random generator for model weights.
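The three anomaly_type values can be illustrated on a single node with a short sketch. It assumes an identity causal function, and the function `node_value` and its `change` parameter are hypothetical; how PyRCA actually picks the new weights or the size of the shift is not stated here.

```python
def node_value(parent_values, parent_weights, noise, noise_weight,
               anomaly_type=None, change=4.0):
    """Value of one node under x = sum(w * parent) + noise_weight * noise.

    Hypothetical illustration of the three anomaly types; `change` is an
    assumed magnitude, not a PyRCA parameter.
    """
    weights = list(parent_weights)
    if anomaly_type == 0:
        noise_weight = change               # change the weight of the noise term
    elif anomaly_type == 1:
        noise = noise + change              # add a constant shift to the noise term
    elif anomaly_type == 2:
        weights = [change] * len(weights)   # change the weights of the parent nodes
    elif anomaly_type is not None:
        raise ValueError("anomaly_type must be 0, 1, 2 or None")
    return sum(w * v for w, v in zip(weights, parent_values)) + noise_weight * noise


parents, weights = [1.0, 2.0], [0.5, -1.0]
normal = node_value(parents, weights, noise=0.25, noise_weight=2.0)
anomalies = [node_value(parents, weights, noise=0.25, noise_weight=2.0, anomaly_type=t)
             for t in (0, 1, 2)]
print(normal, anomalies)  # -1.0 [-0.5, 7.0, 12.5]
```

In each case only one ingredient of the node's structural equation is perturbed, which is what lets the affected node act as a root cause for the nodes downstream of it.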

parent_weights: array
noise_weights: array
noise_type: str
func_type: str
baseline: float
threshold: float
num_samples: int = 5000
anomaly_type: str = 0
weight_generator: str = 'normal'

class pyrca.simulation.data_gen.AnomalyDataGen(config)

Bases: object

Anomaly data generation.

Generates anomaly data (n_samples, n_nodes).

config_class

alias of AnomalyDataGenConfig

gen()

Generate anomaly data by randomly choosing the root cause nodes.

Returns

  • data: (num_samples, num_node) data of each variable x_i in anomaly phase.

  • root causes: (num_node) root causes of anomaly data.