Tabular Base module

causalai.models.tabular.base

class causalai.models.tabular.base.BaseTabularAlgo(data: TabularData, prior_knowledge: PriorKnowledge | None = None, **kargs)
__init__(data: TabularData, prior_knowledge: PriorKnowledge | None = None, **kargs)
Parameters:
  • data (TabularData object) -- It contains data.values, a numpy array of shape (observations N, variables D).

  • prior_knowledge (PriorKnowledge object) -- Specify prior knoweledge to the causal discovery process by either forbidding links/co-parents that are known to not exist, or adding back links/co-parents that do exist based on expert knowledge. See the PriorKnowledge class for more details.

get_all_parents(target_var: int | str) List

Populates the list using all nodes

get_candidate_mb(target_var: int | str) List

Populates the list using all the nodes that prior_knowledge allows

get_candidate_parents(target_var: int | str) List

Populates the list using all the nodes that prior_knowledge allows

get_parents(pvalue_thres: float = 0.05) Tuple[int | str]

Assuming run() function has been called for a target_var, get_parents function returns the list of parent names that cause the target_var under the given pvalue_thres.

Parameters:

pvalue_thres (float) -- Significance level used for hypothesis testing (default: 0.05). Candidate parents with pvalues above pvalue_thres are ignored, and the rest are returned as the cause of the target_var.

Returns:

List of estimated parents.

Return type:

list

abstract run(target_var: int | str, pvalue_thres: float = 0.05) ResultInfoTabularSingle | ResultInfoTabularMB

Run causal discovery using the algorithm implemented here

Parameters:
  • target_var (int) -- Target variable index or name for which parents need to be estimated.

  • pvalue_thres (float) -- Significance level used for hypothesis testing (default: 0.05). Candidate parents with pvalues above pvalue_thres are ignored, and the rest are returned as the cause of the target_var.

Returns:

Dictionary has three keys:

  • parents or markov_blanket : List of estimated parents or markov blanket.

  • value_dict : Dictionary of form {var3_name:float, ...} containing the test statistic of a link.

  • pvalue_dict : Dictionary of form {var3_name:float, ...} containing the p-value corresponding to the above test statistic.

Return type:

dict

sort_parents(parents_vals: Dict) Tuple[int | str]

Sort (in descending order) parents according to test statistic values.

Parameters:

parents_vals (dict) -- Dictionary of form {<var_name>:float, ...} containing the test statistic value of each causal link.

Returns:

List of form [<var_i_name>, <var_k_name>, ...] containing sorted parents.

Return type:

list

class causalai.models.tabular.base.BaseTabularAlgoFull(**kargs)
__init__(**kargs)
get_parents(pvalue_thres: float = 0.05, target_var: int | str | None = None) Dict[int | str, Tuple[int | str]]

Assuming run() function has been called, get_parents function returns a dictionary. The keys of this dictionary are the variable names, and the corresponding values are the list of parent names that cause the target variable under the given pvalue_thres.

Parameters:
  • pvalue_thres (float) -- This pvalue_thres is the significance level used for hypothesis testing (default: 0.05).

  • target_var (str or float, optional) -- If specified (must be one of the data variable names), the parents of only this variable are returned as a list, otherwise a dictionary is returned where each key is a target variable name, and the corresponding values is the list of its parents.

Returns:

Dictionary has D keys, where D is the number of variables. The value corresponding each key is the list of lagged parent names that cause the target variable under the given pvalue_thres.

Return type:

dict

run(pvalue_thres: float = 0.05) Dict[int | str, ResultInfoTabularFull]

Run causal discovery using the algorithm implemented here for estimating the causal stength of all potential parents of all the variables.

Parameters:

pvalue_thres (float) -- Significance level used for hypothesis testing (default: 0.05). Candidate parents with pvalues above pvalue_thres are ignored, and the rest are returned as the cause of the target_var.

Returns:

Dictionay has D keys, where D is the number of variables. The value corresponding each key is the dictionary output of BaseTabularAlgo.run.

Return type:

dict