Base Tabular Benchmarking module

causalai.benchmark.tabular.base

class causalai.benchmark.tabular.base.BenchmarkTabularBase(algo_dict: Dict | None = None, kargs_dict: Dict | None = None, num_exp: int = 20, custom_metric_dict: Dict = {}, **kargs)

Base class for the tabular data benchmarking module for both continuous and discrete cases. This class defines methods for aggregating and plotting results, as well as a method for benchmarking on a user-provided list of datasets.

__init__(algo_dict: Dict | None = None, kargs_dict: Dict | None = None, num_exp: int = 20, custom_metric_dict: Dict = {}, **kargs)

Base tabular data benchmarking module

Parameters:
  • algo_dict (Dict) --

    A Python dictionary whose keys are names of causal discovery algorithms, and whose values are the uninstantiated class objects for the corresponding algorithms. Each such class must inherit from the BaseTabularAlgoFull class found in causalai.models.tabular.base. Crucially, its constructor must take a TabularData object (found in causalai.data.tabular) as input, and it must have a run method that performs causal discovery and returns a Python dictionary of the form (a fuller sketch of this interface follows the parameter list below):

    {
        var_name1: {'parents': [par(var_name1)]},
        var_name2: {'parents': [par(var_name2)]}
    }

    where par(.) denotes the parent variable names of the argument variable.

  • kargs_dict (Dict) -- A Python dictionary whose keys are names of causal discovery algorithms (the same keys as algo_dict), and whose values contain any arguments to be passed to the run method of the corresponding class specified in algo_dict.

  • num_exp (int) -- The number of independent runs to perform per experiment, each with a different random seed. A different random seed generates a different synthetic graph and dataset for any given configuration. Note that for user-provided data, num_exp is not used.

  • custom_metric_dict (Dict) -- A Python dictionary for specifying custom metrics in addition to the default evaluation metrics computed for each experiment (precision, recall, F1 score, and time taken). The keys of this dictionary are the names of the user-specified metrics, and the corresponding values are callable functions that take (graph_est, graph_gt) as input, where graph_est and graph_gt are the estimated and ground truth causal graphs. These graphs are specified as Python dictionaries whose keys are child variable names and whose values are lists of parent variable names.
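A minimal sketch of these parameter contracts is given below. MyAlgoFull, its pvalue_thres argument, and the exact_parent_match metric are hypothetical names used for illustration; only the constructor/run interface and the dictionary formats come from the descriptions above.

    from causalai.data.tabular import TabularData

    # Hypothetical algorithm wrapper illustrating the required interface; a real
    # one must inherit from causalai.models.tabular.base.BaseTabularAlgoFull.
    class MyAlgoFull:
        def __init__(self, data: TabularData):
            self.data = data  # the constructor receives a TabularData object

        def run(self, pvalue_thres: float = 0.05):
            # 'pvalue_thres' is a hypothetical keyword argument, supplied via kargs_dict.
            # ... perform causal discovery on self.data here ...
            # Return the parents of each variable in the required format
            # (assuming TabularData exposes the variable names as .var_names):
            return {name: {'parents': []} for name in self.data.var_names}

    # Hypothetical custom metric: the fraction of variables whose parent set is
    # recovered exactly. Both graphs map child name -> list of parent names.
    def exact_parent_match(graph_est, graph_gt):
        hits = sum(set(graph_est.get(v, [])) == set(parents)
                   for v, parents in graph_gt.items())
        return hits / len(graph_gt)

    algo_dict = {'MyAlgo': MyAlgoFull}               # uninstantiated class objects
    kargs_dict = {'MyAlgo': {'pvalue_thres': 0.05}}  # kwargs for MyAlgoFull.run
    custom_metric_dict = {'exact_parent_match': exact_parent_match}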

aggregate_results(metric_name)

This method aggregates the causal discovery results generated by one of the benchmarking methods (which must be run first), and produces a mean array and a standard deviation array of the results. Both arrays have shape (num_algorithms x num_variants), where num_algorithms is the number of causal discovery algorithms specified in the benchmarking module, and num_variants is the number of configurations of the argument being varied (e.g., in benchmark_variable_complexity, the number of variables specified). Note that for bechmark_custom_dataset, num_variants=1.

Parameters:

metric_name (str) -- String specifying which metric (e.g., precision) to aggregate from the generated results.
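A usage sketch follows; benchmark denotes an instantiated benchmarking object on which a benchmarking method has already been run. The description above does not state whether the two arrays are returned or stored on the object, so the snippet assumes they are returned as a tuple.

    # Assumption: the (mean, std) arrays are returned; each has shape
    # (num_algorithms x num_variants).
    result_mean, result_std = benchmark.aggregate_results('f1_score')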

bechmark_custom_dataset(dataset_list, discrete=False)

This method evaluates the performance of one or more causal discovery algorithms on user-provided data.

Parameters:
  • dataset_list (List[Tuple]) -- A list of tuples, where each tuple contains the triplet (data_array, var_names, graph_gt): data_array is a 2D NumPy array of shape (samples x variables), var_names is a list of variable names, and graph_gt is the ground truth causal graph in the form of a Python dictionary whose keys are the variable names and whose values are lists of parent names. See the example after this list.

  • discrete -- Specifies whether the datasets contain discrete or continuous variables. This information is only used to decide whether to standardize the data arrays: if discrete is False, all the data arrays are standardized.
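A minimal sketch of assembling dataset_list and running this method; the data, variable names, and ground truth graph below are fabricated for illustration, and benchmark denotes an instantiated subclass such as BenchmarkContinuousTabularBase.

    import numpy as np

    data_array = np.random.randn(500, 3)               # (samples x variables)
    var_names = ['A', 'B', 'C']
    graph_gt = {'A': [], 'B': ['A'], 'C': ['A', 'B']}  # child -> list of parents

    dataset_list = [(data_array, var_names, graph_gt)]
    benchmark.bechmark_custom_dataset(dataset_list, discrete=False)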

plot(metric_name='f1_score', xaxis_mode=1)

This method plots the aggregated results for metric_name. The y-axis is metric_name, and the x-axis is either the algorithm names or the variant values, depending on the specified value of xaxis_mode.

Parameters:
  • metric_name (str) -- String specifying which metric (e.g., precision) to plot from the aggregated results.

  • xaxis_mode (int) -- Integer (0 or 1) specifying what to plot on the x-axis. When 0, the x-axis shows algorithm names; when 1, it shows the values of the variant, i.e., the configurations of the argument being varied (e.g., in benchmark_variable_complexity, the number of variables).
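For example, assuming 'precision' is among the computed metric names, the two x-axis modes would be invoked as:

    benchmark.plot(metric_name='precision', xaxis_mode=0)  # x-axis: algorithm names
    benchmark.plot(metric_name='f1_score', xaxis_mode=1)   # x-axis: variant values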

class causalai.benchmark.tabular.base.BenchmarkContinuousTabularBase(algo_dict: Dict | None = None, kargs_dict: Dict | None = None, num_exp: int = 20, custom_metric_dict: Dict | None = {}, **kargs)

Base class for the tabular data benchmarking module for the continuous case. This class inherits the methods and variables of BenchmarkTabularBase, and defines the dictionaries of default causal discovery algorithms and their respective default arguments.

__init__(algo_dict: Dict | None = None, kargs_dict: Dict | None = None, num_exp: int = 20, custom_metric_dict: Dict | None = {}, **kargs)

Benchmarking module for continuous tabular data.

Parameters:
  • algo_dict (Dict) --

    A Python dictionary whose keys are names of causal discovery algorithms, and whose values are the uninstantiated class objects for the corresponding algorithms. Each such class must inherit from the BaseTabularAlgoFull class found in causalai.models.tabular.base. Crucially, its constructor must take a TabularData object (found in causalai.data.tabular) as input, and it must have a run method that performs causal discovery and returns a Python dictionary of the form:

    {
        var_name1: {'parents': [par(var_name1)]},
        var_name2: {'parents': [par(var_name2)]}
    }

    where par(.) denotes the parent variable names of the argument variable.

  • kargs_dict (Dict) -- A Python dictionary whose keys are names of causal discovery algorithms (the same keys as algo_dict), and whose values contain any arguments to be passed to the run method of the corresponding class specified in algo_dict.

  • num_exp (int) -- The number of independent runs to perform per experiment, each with a different random seed. A different random seed generates a different synthetic graph and dataset for any given configuration. Note that for user-provided data, num_exp is not used.

  • custom_metric_dict (Dict) -- A Python dictionary for specifying custom metrics in addition to the default evaluation metrics computed for each experiment (precision, recall, F1 score, and time taken). The keys of this dictionary are the names of the user-specified metrics, and the corresponding values are callable functions that take (graph_est, graph_gt) as input, where graph_est and graph_gt are the estimated and ground truth causal graphs. These graphs are specified as Python dictionaries whose keys are child variable names and whose values are lists of parent variable names.
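Since this class supplies default algorithm and argument dictionaries, it can presumably be instantiated without specifying algo_dict or kargs_dict; a sketch:

    from causalai.benchmark.tabular.base import BenchmarkContinuousTabularBase

    # Falls back to the class's default causal discovery algorithms and arguments.
    benchmark = BenchmarkContinuousTabularBase(num_exp=10)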

class causalai.benchmark.tabular.base.BenchmarkDiscreteTabularBase(algo_dict: Dict | None = None, kargs_dict: Dict | None = None, num_exp: int = 20, custom_metric_dict: Dict | None = {}, **kargs)

Base class for the tabular data benchmarking module for the discrete case. This class inherits the methods and variables of BenchmarkTabularBase, and defines the dictionaries of default causal discovery algorithms and their respective default arguments.

__init__(algo_dict: Dict | None = None, kargs_dict: Dict | None = None, num_exp: int = 20, custom_metric_dict: Dict | None = {}, **kargs)

Benchmarking module for discrete tabular data.

Parameters:
  • algo_dict (Dict) --

    A Python dictionary whose keys are names of causal discovery algorithms, and whose values are the uninstantiated class objects for the corresponding algorithms. Each such class must inherit from the BaseTabularAlgoFull class found in causalai.models.tabular.base. Crucially, its constructor must take a TabularData object (found in causalai.data.tabular) as input, and it must have a run method that performs causal discovery and returns a Python dictionary of the form:

    {
        var_name1: {'parents': [par(var_name1)]},
        var_name2: {'parents': [par(var_name2)]}
    }

    where par(.) denotes the parent variable names of the argument variable.

  • kargs_dict (Dict) -- A Python dictionary whose keys are names of causal discovery algorithms (the same keys as algo_dict), and whose values contain any arguments to be passed to the run method of the corresponding class specified in algo_dict.

  • num_exp (int) -- The number of independent runs to perform per experiment, each with a different random seed. A different random seed generates a different synthetic graph and dataset for any given configuration. Note that for user-provided data, num_exp is not used.

  • custom_metric_dict (Dict) -- A Python dictionary for specifying custom metrics in addition to the default evaluation metrics computed for each experiment (precision, recall, F1 score, and time taken). The keys of this dictionary are the names of the user-specified metrics, and the corresponding values are callable functions that take (graph_est, graph_gt) as input, where graph_est and graph_gt are the estimated and ground truth causal graphs. These graphs are specified as Python dictionaries whose keys are child variable names and whose values are lists of parent variable names.
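As with the continuous case, the defaults can presumably be used directly; the sketch below also attaches the hypothetical custom metric from the earlier example.

    from causalai.benchmark.tabular.base import BenchmarkDiscreteTabularBase

    # Uses the class's default algorithms, plus one user-specified metric
    # (exact_parent_match is the hypothetical function defined earlier).
    benchmark = BenchmarkDiscreteTabularBase(
        num_exp=10,
        custom_metric_dict={'exact_parent_match': exact_parent_match},
    )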