Tabular Base module
causalai.models.tabular.base
- class causalai.models.tabular.base.BaseTabularAlgo(data: TabularData, prior_knowledge: PriorKnowledge | None = None, **kargs)
- __init__(data: TabularData, prior_knowledge: PriorKnowledge | None = None, **kargs)
- Parameters:
data (TabularData object) -- It contains data.values, a numpy array of shape (observations N, variables D).
prior_knowledge (PriorKnowledge object) -- Specify prior knoweledge to the causal discovery process by either forbidding links/co-parents that are known to not exist, or adding back links/co-parents that do exist based on expert knowledge. See the PriorKnowledge class for more details.
- get_all_parents(target_var: int | str) List
Populates the list using all nodes
- get_candidate_mb(target_var: int | str) List
Populates the list using all the nodes that prior_knowledge allows
- get_candidate_parents(target_var: int | str) List
Populates the list using all the nodes that prior_knowledge allows
- get_parents(pvalue_thres: float = 0.05) Tuple[int | str]
Assuming run() function has been called for a target_var, get_parents function returns the list of parent names that cause the target_var under the given pvalue_thres.
- Parameters:
pvalue_thres (float) -- Significance level used for hypothesis testing (default: 0.05). Candidate parents with pvalues above pvalue_thres are ignored, and the rest are returned as the cause of the target_var.
- Returns:
List of estimated parents.
- Return type:
list
- abstract run(target_var: int | str, pvalue_thres: float = 0.05) ResultInfoTabularSingle | ResultInfoTabularMB
Run causal discovery using the algorithm implemented here
- Parameters:
target_var (int) -- Target variable index or name for which parents need to be estimated.
pvalue_thres (float) -- Significance level used for hypothesis testing (default: 0.05). Candidate parents with pvalues above pvalue_thres are ignored, and the rest are returned as the cause of the target_var.
- Returns:
Dictionary has three keys:
parents or markov_blanket : List of estimated parents or markov blanket.
value_dict : Dictionary of form {var3_name:float, ...} containing the test statistic of a link.
pvalue_dict : Dictionary of form {var3_name:float, ...} containing the p-value corresponding to the above test statistic.
- Return type:
dict
- sort_parents(parents_vals: Dict) Tuple[int | str]
Sort (in descending order) parents according to test statistic values.
- Parameters:
parents_vals (dict) -- Dictionary of form {<var_name>:float, ...} containing the test statistic value of each causal link.
- Returns:
List of form [<var_i_name>, <var_k_name>, ...] containing sorted parents.
- Return type:
list
- class causalai.models.tabular.base.BaseTabularAlgoFull(**kargs)
- __init__(**kargs)
- get_parents(pvalue_thres: float = 0.05, target_var: int | str | None = None) Dict[int | str, Tuple[int | str]]
Assuming run() function has been called, get_parents function returns a dictionary. The keys of this dictionary are the variable names, and the corresponding values are the list of parent names that cause the target variable under the given pvalue_thres.
- Parameters:
pvalue_thres (float) -- This pvalue_thres is the significance level used for hypothesis testing (default: 0.05).
target_var (str or float, optional) -- If specified (must be one of the data variable names), the parents of only this variable are returned as a list, otherwise a dictionary is returned where each key is a target variable name, and the corresponding values is the list of its parents.
- Returns:
Dictionary has D keys, where D is the number of variables. The value corresponding each key is the list of lagged parent names that cause the target variable under the given pvalue_thres.
- Return type:
dict
- run(pvalue_thres: float = 0.05) Dict[int | str, ResultInfoTabularFull]
Run causal discovery using the algorithm implemented here for estimating the causal stength of all potential parents of all the variables.
- Parameters:
pvalue_thres (float) -- Significance level used for hypothesis testing (default: 0.05). Candidate parents with pvalues above pvalue_thres are ignored, and the rest are returned as the cause of the target_var.
- Returns:
Dictionay has D keys, where D is the number of variables. The value corresponding each key is the dictionary output of BaseTabularAlgo.run.
- Return type:
dict