Tabular Causal Inference module

causalai.models.tabular.causal_inference

class causalai.models.tabular.causal_inference.CausalInference(data: ndarray, var_names: List[str | int], causal_graph: Dict[int | str, Tuple[int | str]], prediction_model=None, use_multiprocessing: bool = False, discrete: bool = False, method: str = 'causal_path')

This class implements causal inference for tabular data, supporting both continuous and discrete data. Specifically, it supports average treatment effect (ATE) and conditional ATE (CATE) estimation. To perform causal inference, this class requires the observational data, the causal graph for the data, a prediction model of choice (used to learn the mappings between variables in the causal graph), and a flag specifying whether the data is discrete or continuous. The class also supports multi-processing to speed up computation. Typically, multi-processing is only helpful when the relevant graph (determined by the treatment and target variables) is large (more than 10 nodes) or when the prediction model is heavy (e.g. MLP). A brief usage sketch is given after the constructor parameters below.

__init__(data: ndarray, var_names: List[str | int], causal_graph: Dict[int | str, Tuple[int | str]], prediction_model=None, use_multiprocessing: bool = False, discrete: bool = False, method: str = 'causal_path')
Parameters:
  • data (ndarray) -- The observational data of size (N,D) where N is the number of observations and D is the number of variables.

  • var_names (list) -- List of variable names. The number of variables must be the same as the number of columns in data.

  • causal_graph (dict) -- The underlying causal graph for the given data array. causal_graph is a dictionary with variable names as keys and the list of parent nodes of each key as the corresponding values.

  • prediction_model (model class) -- A model class (e.g. Sklearn's LinearRegression) that has fit and predict methods. Do not pass an instantiated class object; rather, pass an uninstantiated one. None may be specified when discrete=True, in which case our default prediction model for discrete data is used. Otherwise, for data with linear dependence between variables, Sklearn's LinearRegression typically works, and for non-linear dependence, Sklearn's MLPRegressor works.

  • use_multiprocessing (bool) -- If true, multi-processing is used to speed up computation.

  • discrete (bool) -- Set to true if the intervention variables are discrete. Non-intervention variables are expected to be continuous. Note that the states of a discrete variable must take values in [0,1,...,K-1], where K is the number of states for that variable. Discrete variables may have different numbers of states.

  • method (str) -- The method used to estimate the causal effect of interventions. The supported options are 'causal_path' and 'backdoor'. See the functions ate_causal_path and ate_backdoor for details.
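
As a brief illustration of the constructor described above, the following sketch builds a CausalInference object on a small synthetic dataset. The variable names, graph, and data-generating equations here are assumptions made purely for this example.

   import numpy as np
   from sklearn.linear_model import LinearRegression
   from causalai.models.tabular.causal_inference import CausalInference

   # Assumed toy setup: three continuous variables with causal graph A -> B -> C.
   var_names = ['A', 'B', 'C']
   N = 1000
   A = np.random.randn(N)
   B = 0.8 * A + 0.1 * np.random.randn(N)
   C = 1.5 * B + 0.1 * np.random.randn(N)
   data = np.stack([A, B, C], axis=1)  # shape (N, D) = (1000, 3)

   # causal_graph maps each variable name to the tuple of its parents.
   causal_graph = {'A': (), 'B': ('A',), 'C': ('B',)}

   model = CausalInference(
       data=data,
       var_names=var_names,
       causal_graph=causal_graph,
       prediction_model=LinearRegression,  # pass the class, not an instance
       discrete=False,
       method='causal_path',
   )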

ate(target_var: int | str, treatments: TreatmentInfo | List[TreatmentInfo]) Tuple[float, ndarray, ndarray]

Mathematically, the Average Treatment Effect (ATE) is expressed as,

𝙰𝚃𝙴 = 𝔼[𝑌|𝚍𝚘(𝑋=𝑥𝑡)]−𝔼[𝑌|𝚍𝚘(𝑋=𝑥𝑐)]

where 𝚍𝚘 denotes the intervention operation. In words, ATE aims to determine the relative expected difference in the value of 𝑌 when we intervene 𝑋 to be 𝑥𝑡 compared to when we intervene 𝑋 to be 𝑥𝑐. Here 𝑥𝑡 and 𝑥𝑐 are respectively the treatment value and control value.

Parameters:
  • target_var (int or str) -- Specify the name of the target variable of interest on which the effect of the treatment is to be estimated.

  • treatments (dict or list of dict) -- Each treatment is specified as a dictionary in which the keys are var_name, treatment_value, control_value. The value of var_name is a str or int depending on var_names specified during class object creation, and treatment_value and control_value are 1D arrays of length equal to the number of observations in data (specified during class object creation).

Returns:

Returns a tuple of 3 items:

  • ate: The average treatment effect on target_var.

  • y_treat: The individual effect of treatment value for each observation.

  • y_control: The individual effect of control value for each observation.

Return type:

float, ndarray, ndarray
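
Continuing the assumed A -> B -> C toy setup from the constructor example above, a call to ate might look as follows; the treatment and control values are arbitrary choices for illustration.

   # Each treatment dict uses the keys var_name, treatment_value, control_value.
   treatment = {
       'var_name': 'A',
       'treatment_value': np.ones(N),   # intervene A := 1 for every observation
       'control_value': np.zeros(N),    # intervene A := 0 for every observation
   }

   ate_estimate, y_treat, y_control = model.ate(target_var='C', treatments=treatment)
   print(f'Estimated ATE of A on C: {ate_estimate:.3f}')  # close to 0.8 * 1.5 = 1.2 in this toy model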

ate_backdoor(target_var: int | str, treatments: TreatmentInfo | List[TreatmentInfo]) Tuple[float, ndarray, ndarray]

This method finds the backdoor adjustment sets for the given interventions and target variable. We then learn a single conditional model P(y | X, Z), where y is the target variable, X is the set of intervention variables, and Z is the backdoor adjustment set. We then estimate the causal effect of the intervention on X using the following result (Theorem 1) from Pearl 1995 (Causal diagrams for empirical research): P(y | do(X)) = Σ_z P(y | X, Z=z) P(Z=z) ≈ (1/N) Σ_i P(y | X, Z=z_i), where N is the number of samples in the observational data, z_i are the observed values of Z, and X takes the intervention values (i.e., the treatment or control values).

Backdoor criterion: Given a DAG and an ordered variable pair (X,Y), a subset Z of variables in the DAG satisfies the backdoor criterion w.r.t. (X,Y) if Z does not contain a descendant of X, Z does not contain any colliders, and Z blocks all paths between X and Y that contain an arrow pointing into X. If Z contains colliders, then for each such path we additionally condition on at least one of the collider's parents, or one of their descendants, in that path; this is because conditioning on a collider opens up paths rather than blocking them. For multiple intervention variables, see the function find_approx_minimal_backdoor_set below.

We find this estimator to have a higher variance and lower performance compared to the causal_path method. However, this method is more efficient in general, since we only need to fit a single model.
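
The following standalone sketch illustrates the estimation idea described above; it is a simplified illustration under assumed variable shapes, not the library's internal code. A single regression model approximating E[y | X, Z] is fit on the observed data, and the do-expectation is approximated by averaging its predictions with X clamped to the intervention value while Z keeps its observed samples.

   import numpy as np
   from sklearn.linear_model import LinearRegression

   def backdoor_expectation(y, X_obs, Z_obs, x_intervention):
       # y:              (N,)    target variable samples
       # X_obs:          (N, dx) observed intervention-variable samples
       # Z_obs:          (N, dz) observed backdoor adjustment set samples
       # x_intervention: (dx,)   value X is clamped to under the intervention
       model = LinearRegression().fit(np.hstack([X_obs, Z_obs]), y)
       # Clamp X to the intervention value, keep the observed Z samples,
       # and average the model's predictions over the N samples of Z.
       X_int = np.tile(x_intervention, (Z_obs.shape[0], 1))
       return model.predict(np.hstack([X_int, Z_obs])).mean()

The ATE under this scheme is then the difference between two such expectations, one computed with the treatment value and one with the control value.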

Parameters:
  • target_var (int or str) -- Specify the name of the target variable of interest on which the effect of the treatment is to be estimated.

  • treatments (dict or list of dict) -- Each treatment is specified as a dictionary in which the keys are var_name, treatment_value, control_value. The value of var_name is a str or int depending on var_names specified during class object creation, and treatment_value and control_value are 1D arrays of length equal to the number of observations in data (specified during class object creation).

Returns:

Returns a tuple of 4 items:

  • ate: The average treatment effect on target_var.

  • y_treat: The individual effect of treatment value for each observation.

  • y_control: The individual effect of control value for each observation.

  • valid_backdoor: Boolean value specifying if a valid backdoor set was found or not.

Return type:

float, ndarray, ndarray, bool

ate_causal_path(target_var: int | str, treatments: TreatmentInfo | List[TreatmentInfo]) Tuple[float, ndarray, ndarray]

In this implementation, we learn a set of relevant conditional models that together simulate the data generating process, and we then estimate the ATE by performing interventions explicitly within this simulated process, using the learned models. For instance, consider a simple causal graph with three variables A, B, C: A->B->C. If we want to estimate the causal effect of an intervention on A on the target variable C, then in this estimator we first fit two conditional models, P(B|A) and P(C|B), using the given observational data. We then replace the intervention variable (A) in the observational data with the given intervention values (treatment and control values), forming two different datasets this way. We then perform inference with the learned models on this intervention data along the causal path to estimate the effect of the interventions on A on C. Specifically, we first estimate B_treat using P(B|A=A_treat). We then use this B_treat to estimate C_treat using P(C|B=B_treat). We similarly compute B_control and C_control using A_control. Finally, we estimate the ATE as the mean of (C_treat - C_control).
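
A minimal standalone sketch of this A -> B -> C procedure on assumed toy data (an illustration of the idea, not the library's internal code):

   import numpy as np
   from sklearn.linear_model import LinearRegression

   # Observational data for A -> B -> C (assumed linear toy model).
   N = 1000
   A = np.random.randn(N)
   B = 0.8 * A + 0.1 * np.random.randn(N)
   C = 1.5 * B + 0.1 * np.random.randn(N)

   # Fit the conditional models P(B|A) and P(C|B) on the observational data.
   model_B = LinearRegression().fit(A.reshape(-1, 1), B)
   model_C = LinearRegression().fit(B.reshape(-1, 1), C)

   # Propagate the treatment and control interventions on A along the causal path.
   A_treat, A_control = np.ones(N), np.zeros(N)
   B_treat = model_B.predict(A_treat.reshape(-1, 1))
   B_control = model_B.predict(A_control.reshape(-1, 1))
   C_treat = model_C.predict(B_treat.reshape(-1, 1))
   C_control = model_C.predict(B_control.reshape(-1, 1))

   ate = (C_treat - C_control).mean()  # close to 0.8 * 1.5 = 1.2 in this toy model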

We find this estimator to have a lower variance and better performance compared to the backdoor adjustment set method. However, this method may be more computationally expensive in general, due to the need to learn multiple models, depending on the given causal graph.

Parameters:
  • target_var (int or str) -- Specify the name of the target variable of interest on which the effect of the treatment is to be estimated.

  • treatments (dict or list of dict) -- Each treatment is specified as a dictionary in which the keys are var_name, treatment_value, control_value. The value of var_name is a str or int depending on var_names specified during class object creation, and treatment_value and control_value are 1D arrays of length equal to the number of observations in data (specified during class object creation).

Returns:

Returns a tuple of 3 items:

  • ate: The average treatment effect on target_var.

  • y_treat: The individual effect of treatment value for each observation.

  • y_control: The individual effect of control value for each observation.

Return type:

float, ndarray, ndarray

cate(target_var: int | str, treatments: TreatmentInfo | List[TreatmentInfo], conditions: Tuple[ConditionalInfo] | ConditionalInfo, condition_prediction_model=None) float

Mathematically, the Conditional Average Treatment Effect (CATE) is expressed as,

𝙲𝙰𝚃𝙴 = 𝔼[𝑌|𝚍𝚘(𝑋=𝑥𝑡),𝐶=𝑐]−𝔼[𝑌|𝚍𝚘(𝑋=𝑥𝑐),𝐶=𝑐]

where 𝚍𝚘 denotes the intervention operation. In words, CATE aims to determine the relative expected difference in the value of 𝑌 when we intervene 𝑋 to be 𝑥𝑡 compared to when we intervene 𝑋 to be 𝑥𝑐, where we condition on some set of variables 𝐶 taking value 𝑐. Notice here that 𝑋 is intervened but 𝐶 is not. Here 𝑥𝑡 and 𝑥𝑐 are respectively the treatment value and control value.

Parameters:
  • target_var (int or str) -- Specify the name of the target variable of interest on which the effect of the treatment is to be estimated.

  • treatments (dict or list of dict) -- Each treatment is specified as a dictionary in which the keys are var_name, treatment_value, control_value. The value of var_name is a str or int depending on var_names specified during class object creation, and treatment_value and control_value are 1D arrays of length equal to the number of observations in data (specified during class object creation).

  • conditions (dict or list of dict) -- Each condition is specified as a dictionary in which the keys are var_name and condition_value. The value of var_name is a str or int depending on var_names specified during class object creation, and condition_value is a scalar value (float for continuous data and integer for discrete data).

  • condition_prediction_model (model class) -- A model class (e.g. Sklearn's LinearRegression) that has fit and predict methods. Do not pass an instantiated class object; rather, pass an uninstantiated one. None may be specified when discrete=True, in which case our default prediction model for discrete data is used. Otherwise, for data with linear dependence between variables, Sklearn's LinearRegression typically works, and for non-linear dependence, Sklearn's MLPRegressor works.

Returns:

cate: The conditional average treatment effect on target_var.

Return type:

float
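
In the assumed toy setup from the constructor example above, a conditional query might look as follows; the choice of condition variable and values is purely to illustrate the call signature.

   from sklearn.neural_network import MLPRegressor

   treatment = {
       'var_name': 'A',
       'treatment_value': np.ones(N),
       'control_value': np.zeros(N),
   }
   # Each condition dict uses the keys var_name and condition_value (a scalar).
   condition = {'var_name': 'B', 'condition_value': 0.5}

   cate_estimate = model.cate(
       target_var='C',
       treatments=treatment,
       conditions=condition,
       condition_prediction_model=MLPRegressor,  # pass the class, not an instance
   )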

counterfactual(sample: ndarray, target_var: int | str, interventions: Dict, counterfactual_prediction_model=None)

Counterfactuals aim at estimating the effect of an intervention on a specific instance or sample. Suppose we have a specific instance of a system of random variables (𝑋1,𝑋2,...,𝑋𝑁) given by (𝑋1=𝑥1,𝑋2=𝑥2,...,𝑋𝑁=𝑥𝑁) , then in a counterfactual, we want to know the effect an intervention (say) 𝑋1=𝑘 would have had on some other variable(s) (say 𝑋2), holding all the remaining variables fixed. Mathematically, this can be expressed as,

𝙲𝚘𝚞𝚗𝚝𝚎𝚛𝚏𝚊𝚌𝚝𝚞𝚊𝚕 = 𝑋2|𝚍𝚘(𝑋1=𝑘),𝑋3=𝑥3,𝑋4=𝑥4,⋯,𝑋𝑁=𝑥𝑁

Parameters:
  • sample (ndarray) -- A 1D array of data sample where the ith index corresponds to the ith variable name in var_names (specified in the causal inference object constructor).

  • target_var (int or str) -- Specify the name of the target variable of interest on which the effect of the treatment is to be estimated.

  • interventions (dict) -- A dictionary in which keys are var_names, and the corresponding values are the scalar interventions.

  • counterfactual_prediction_model (model class) -- A model class (e.g. Sklearn's LinearRegression) that has fit and predict methods. Do not pass an instantiated class object; rather, pass an uninstantiated one.

Returns:

Returns the counterfactual on the given sample for the specified interventions.

Return type:

float
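
In the assumed toy setup from the constructor example above, a counterfactual query might look as follows; the sample values and the intervention are arbitrary choices for illustration.

   from sklearn.linear_model import LinearRegression

   # A single observed instance, ordered as var_names = ['A', 'B', 'C'].
   sample = np.array([0.2, 0.1, 0.3])

   # Counterfactual value of C for this instance, had A been set to 1.0.
   cf_C = model.counterfactual(
       sample=sample,
       target_var='C',
       interventions={'A': 1.0},
       counterfactual_prediction_model=LinearRegression,  # class, not an instance
   )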