Evaluation and Visualization
Advice for developers if needed: Evaluation and visualization
Algorithms in CADIMULC represent the causation among variables, for both the ground-truth and the learning results, as directed pairs in an adjacency matrix whose entries are only 0 and 1.
If you prefer this representation in your own work or research, the Evaluator in CADIMULC offers a convenient way to evaluate a causal graph directly.
Class: Evaluator
cadimulc.utils.evaluation.Evaluator
For a given causal discovery instance, the Evaluator defines the classification errors between an actual graph and a predicted graph. These errors correspond to the four categories of a confusion matrix in machine learning.
- TP (True Positives): The number of estimated directed pairs that are consistent with the true causal pairs. In other words, TP quantifies the correctly estimated causal relations.
- FP (False Positives): The number of estimated directed pairs that are not present among the true causal pairs.
- TN (True Negatives): The number of unestimated directed pairs that are consistent with the true causal pairs. TN reflects the correct prediction of absent causal relations.
- FN (False Negatives): The number of unestimated directed pairs that are present among the true causal pairs.
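The four counts can be illustrated with a small, self-contained sketch. It is not the Evaluator's implementation: the function name confusion_counts and the example matrices are hypothetical, plain nested lists stand in for an ndarray (which can be indexed the same way), and diagonal entries are skipped on the assumption that a variable is never its own cause.

```python
# Illustrative sketch only; this is not CADIMULC's Evaluator code.

def confusion_counts(true_graph, est_graph):
    """Count TP, FP, TN, FN over all ordered pairs of distinct variables."""
    d = len(true_graph)
    tp = fp = tn = fn = 0
    for i in range(d):
        for j in range(d):
            if i == j:
                continue  # assumption: self-loops are not causal pairs
            in_truth = true_graph[i][j] == 1
            in_estimate = est_graph[i][j] == 1
            if in_estimate and in_truth:
                tp += 1
            elif in_estimate:
                fp += 1
            elif in_truth:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


# Hypothetical 4-variable example; both matrices use the same orientation.
true_graph = [
    [0, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [0, 1, 0, 0],
]
est_graph = [
    [0, 0, 1, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 1, 1, 0],
]

tp, fp, tn, fn = confusion_counts(true_graph, est_graph)
print(tp, fp, tn, fn)  # 2 2 7 1
```

The counts do not depend on whether rows or columns hold the parents, as long as both matrices follow the same convention.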
Assessment of directed causal relations only
The Evaluator focuses on the estimated directed pairs extracted from an adjacency matrix: pairs that match the ground-truth count as TP, and the remaining estimated pairs are treated as not present in the ground-truth (FP).
In other words, the Evaluator in CADIMULC does not explicitly consider bi-directed pairs or undirected pairs.
---
Primary Method: precision_pairwise
Causal pair precision refers to the proportion of correctly estimated directed pairs among all estimated directed pairs (EDP):
Precision = TP / EDP = TP / (TP + FP).
The higher the precision, the larger the share of EDP that are true causal pairs, regardless of how many true pairs remain unestimated.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
true_graph | ndarray | True causal graph, namely the ground-truth. | required |
est_graph | ndarray | Estimated causal graph, namely the empirical causal graph. | required |
Returns:
Name | Type | Description |
---|---|---|
precision | float | Precision of the "causal discovery task". |
Source code in cadimulc\utils\evaluation.py
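A minimal sketch of the precision definition above, under stated assumptions: the helper names directed_pair_set and precision_sketch are hypothetical, the orientation graph[child][parent] == 1 is an assumption (this page does not state which index holds the parent), and returning 0 for an empty estimate is a choice made here, not documented behavior.

```python
# Illustrative sketch only; this is not CADIMULC's implementation.
# Assumed convention: graph[child][parent] == 1 encodes parent -> child.

def directed_pair_set(graph):
    """Collect (parent, child) tuples from a 0/1 adjacency matrix."""
    d = len(graph)
    return {
        (parent, child)
        for child in range(d)
        for parent in range(d)
        if graph[child][parent] == 1
    }


def precision_sketch(true_graph, est_graph):
    """Precision = TP / EDP, with EDP the number of estimated directed pairs."""
    true_pairs = directed_pair_set(true_graph)
    est_pairs = directed_pair_set(est_graph)
    if not est_pairs:
        return 0.0  # assumption: report 0 when nothing was estimated
    return len(true_pairs & est_pairs) / len(est_pairs)


# Truth: 0 -> 1 and 1 -> 2.
true_graph = [
    [0, 0, 0],
    [1, 0, 0],
    [0, 1, 0],
]
# Estimate: 0 -> 1 (correct) and 2 -> 1 (the edge 1 -> 2 reversed).
est_graph = [
    [0, 0, 0],
    [1, 0, 1],
    [0, 0, 0],
]

print(precision_sketch(true_graph, est_graph))  # 0.5
```

In this sketch a reversed edge counts as a false positive, because only the exact directed pair is matched.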
Primary Method: recall_pairwise
Causal pair recall refers to the proportion of correctly estimated directed pairs among all true causal pairs (TCP):
Recall = TP / TCP = TP / (TP + FN).
The higher the recall, the larger the share of TCP that are identified, regardless of how many pairs are estimated incorrectly.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
true_graph | ndarray | True causal graph, namely the ground-truth. | required |
est_graph | ndarray | Estimated causal graph, namely the empirical causal graph. | required |
Returns:
Name | Type | Description |
---|---|---|
recall | float | Recall of the "causal discovery task". |
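The following sketch illustrates why recall ignores incorrectly estimated pairs. It is not the library's code: recall_sketch and the example matrices are hypothetical, nested lists stand in for an ndarray, and returning 0 for an empty ground-truth is an assumption.

```python
# Illustrative sketch only; this is not CADIMULC's implementation.

def recall_sketch(true_graph, est_graph):
    """Recall = TP / TCP, with TCP the number of true causal pairs."""
    d = len(true_graph)
    n_true = sum(true_graph[i][j] == 1 for i in range(d) for j in range(d))
    n_hit = sum(
        true_graph[i][j] == 1 and est_graph[i][j] == 1
        for i in range(d)
        for j in range(d)
    )
    if n_true == 0:
        return 0.0  # assumption: report 0 when the truth has no pairs
    return n_hit / n_true


# Ground-truth with two directed pairs.
true_graph = [
    [0, 0, 0],
    [1, 0, 0],
    [0, 1, 0],
]
# A dense estimate that asserts every possible directed pair.
dense_est = [
    [0, 1, 1],
    [1, 0, 1],
    [1, 1, 0],
]
# A sparse estimate that finds only one of the true pairs.
sparse_est = [
    [0, 0, 0],
    [1, 0, 0],
    [0, 0, 0],
]

print(recall_sketch(true_graph, dense_est))   # 1.0
print(recall_sketch(true_graph, sparse_est))  # 0.5
```

The dense estimate reaches perfect recall despite four wrong pairs, which is why recall is usually read together with precision.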
Primary Method: f1_score_pairwise
Causal pair F1-score, the harmonic mean of precision and recall, serves as a global measure of causal discovery performance that combines the strengths of both precision and recall.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
true_graph | ndarray | True causal graph, namely the ground-truth. | required |
est_graph | ndarray | Estimated causal graph, namely the empirical causal graph. | required |
Returns:
Name | Type | Description |
---|---|---|
f1_score | float | F1-score of the "causal discovery task". |
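As a worked example of the harmonic mean, the sketch below computes F1 in two equivalent ways: from precision and recall, and directly from the counts via 2 * TP / (2 * TP + FP + FN). The function names and the counts are hypothetical, and returning 0 for a zero denominator is an assumption rather than documented behavior.

```python
# Illustrative sketch only; this is not CADIMULC's implementation.

def f1_from_scores(precision, recall):
    """Harmonic mean of precision and recall."""
    total = precision + recall
    if total == 0:
        return 0.0  # assumption: report 0 when both scores are 0
    return 2 * precision * recall / total


def f1_from_counts(tp, fp, fn):
    """Equivalent closed form in terms of the confusion counts."""
    denom = 2 * tp + fp + fn
    if denom == 0:
        return 0.0  # assumption: report 0 when there are no pairs at all
    return 2 * tp / denom


# Hypothetical counts: 2 correct pairs, 2 wrong pairs, 1 missed pair.
tp, fp, fn = 2, 2, 1
precision = tp / (tp + fp)  # 0.5
recall = tp / (tp + fn)     # 2/3

# Both values equal 4/7, roughly 0.571.
print(f1_from_scores(precision, recall))
print(f1_from_counts(tp, fp, fn))
```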
Primary Method: evaluate_skeleton
Note
The construction of a network skeleton is a fundamental part of hybrid-based approaches. CADIMULC also provides a simple way to evaluate the causal skeleton. Note that the performance of a hybrid-based approach largely depends on the initial performance of causal skeleton learning.
The evaluate_skeleton method evaluates a network skeleton based on an assigned metric. To this end, the available metrics, which mirror the causal pair evaluation, are listed as follows:
- Skeleton Precision = TP (of the estimated skeleton) / all estimated edges.
- Skeleton Recall = TP (of the estimated skeleton) / all true edges.
- Skeleton F1-score = (2 * Precision * Recall) / (Precision + Recall).
Parameters:
Name | Type | Description | Default |
---|---|---|---|
true_skeleton | ndarray | True causal skeleton, namely the ground-truth. | required |
est_skeleton | ndarray | Estimated causal skeleton, namely the empirical causal skeleton. | required |
metric | str | selective metrics from | required |
Returns:
Type | Description |
---|---|
float | The evaluating value of the causal skeleton in light of the assigned metric. |
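The three skeleton metrics can be sketched as follows. This is an illustration under assumptions, not the evaluate_skeleton implementation: the helper names are hypothetical, the metric strings "precision", "recall" and "f1" are placeholders (the accepted values are not listed on this page), each undirected edge is counted once, and zero denominators yield 0.

```python
# Illustrative sketch only; this is not CADIMULC's implementation.

def skeleton_edges(skeleton):
    """Undirected edges as (i, j) with i < j, each counted once."""
    d = len(skeleton)
    return {
        (i, j)
        for i in range(d)
        for j in range(i + 1, d)
        if skeleton[i][j] == 1 or skeleton[j][i] == 1
    }


def evaluate_skeleton_sketch(true_skeleton, est_skeleton, metric):
    """Score an estimated skeleton; the metric names are placeholders."""
    true_edges = skeleton_edges(true_skeleton)
    est_edges = skeleton_edges(est_skeleton)
    tp = len(true_edges & est_edges)
    precision = tp / len(est_edges) if est_edges else 0.0
    recall = tp / len(true_edges) if true_edges else 0.0
    if metric == "precision":
        return precision
    if metric == "recall":
        return recall
    if metric == "f1":
        total = precision + recall
        return 2 * precision * recall / total if total else 0.0
    raise ValueError(f"unknown metric: {metric!r}")


# True edges: 0-1, 0-2, 1-3.
true_skeleton = [
    [0, 1, 1, 0],
    [1, 0, 0, 1],
    [1, 0, 0, 0],
    [0, 1, 0, 0],
]
# Estimated edges: 0-1, 0-2, 2-3 (two correct, one wrong, one missed).
est_skeleton = [
    [0, 1, 1, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 1],
    [0, 0, 1, 0],
]

for name in ("precision", "recall", "f1"):
    print(name, evaluate_skeleton_sketch(true_skeleton, est_skeleton, name))
```

Here all three scores equal 2/3, since two of three estimated edges are correct and two of three true edges are recovered.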
Secondary Method: get_directed_pairs
Extract directed pairs from a graph.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
graph | ndarray | An adjacency bool matrix representing the causation among variables. | required |
Returns:
Name | Type | Description |
---|---|---|
direct_pairs | list[list] | A list whose elements are in the form [parent, child], referring to the causation parent -> child. |
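A minimal sketch of extracting [parent, child] pairs from an adjacency matrix. It is not the library's code: directed_pairs_sketch is a hypothetical name, and the orientation graph[child][parent] == 1 is an assumption, since this page does not state which index holds the parent. Swap the two indices if your matrices use the opposite convention.

```python
# Illustrative sketch only; this is not CADIMULC's implementation.
# Assumed convention: graph[child][parent] == 1 encodes parent -> child.

def directed_pairs_sketch(graph):
    """Return a list of [parent, child] pairs, ordered by parent index."""
    d = len(graph)
    pairs = []
    for parent in range(d):
        for child in range(d):
            if graph[child][parent] == 1:
                pairs.append([parent, child])
    return pairs


# Hypothetical graph with edges 0 -> 1, 0 -> 2 and 1 -> 3.
graph = [
    [0, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [0, 1, 0, 0],
]

print(directed_pairs_sketch(graph))  # [[0, 1], [0, 2], [1, 3]]
```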
Secondary Method: get_pairwise_info
Obtain information about a given directed graph: (1) the number of directed pairs; (2) the parents-child pairing relationships.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
graph | ndarray | An adjacency bool matrix representing the causation among variables. | required |
Returns:
Type | Description |
---|---|
(int, dict) | |
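One way the (int, dict) result could look is sketched below. The layout of the dictionary is an assumption (child index mapped to a list of its parent indices), as are the name pairwise_info_sketch and the orientation graph[child][parent] == 1; the actual return format of get_pairwise_info is not described on this page.

```python
# Illustrative sketch only; this is not CADIMULC's implementation.
# Assumed convention: graph[child][parent] == 1 encodes parent -> child.

def pairwise_info_sketch(graph):
    """Return (number of directed pairs, {child: [parents]})."""
    d = len(graph)
    parents_of = {}
    for child in range(d):
        parents = [parent for parent in range(d) if graph[child][parent] == 1]
        if parents:
            parents_of[child] = parents  # children without parents are omitted
    n_pairs = sum(len(parents) for parents in parents_of.values())
    return n_pairs, parents_of


# Hypothetical graph with a collider: 0 -> 2, 1 -> 2, then 2 -> 3.
graph = [
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 0],
]

print(pairwise_info_sketch(graph))  # (3, {2: [0, 1], 3: [2]})
```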
Function: draw_graph_from_ndarray
cadimulc.utils.visualization.draw_graph_from_ndarray(array, graph_type='auto', rename_nodes=None, testing_text=None, save_fig=False, saving_path=None)
Draw the directed or undirected (causal) graph given in the form of an adjacency matrix (implementation based on NetworkX).
Parameters:
Name | Type | Description | Default |
---|---|---|---|
array | ndarray | The causal graph (directed) or causal skeleton (undirected) in the form of an adjacency matrix. | required |
graph_type | str | use | 'auto' |
rename_nodes | list \| None | Rename the nodes consistently with the columns of the dataset (n * d). | None |
testing_text | str \| None | Add simple text to the figure. | None |
save_fig | bool | Specify whether to save the figure. Make sure to enter the saving path if you specify | False |
saving_path | str \| None | The image saving path along with your image file name, e.g. ../file_location/image_file_name. | None |
Source code in cadimulc\utils\visualization.py
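To illustrate how node names that follow the dataset columns relate to an adjacency matrix, the sketch below turns a matrix into a labeled edge list. It does not reproduce draw_graph_from_ndarray: labeled_edges is a hypothetical helper, the orientation array[child][parent] == 1 is an assumption, and no drawing is performed. A list of (source, target) tuples like this is the form accepted by NetworkX's DiGraph.add_edges_from.

```python
# Illustrative sketch only; this is not CADIMULC's implementation.
# Assumed convention: array[child][parent] == 1 encodes parent -> child.

def labeled_edges(array, rename_nodes=None):
    """Return (parent_name, child_name) tuples for every entry equal to 1."""
    d = len(array)
    # Fall back to integer indices when no names are supplied.
    names = rename_nodes if rename_nodes is not None else list(range(d))
    return [
        (names[parent], names[child])
        for parent in range(d)
        for child in range(d)
        if array[child][parent] == 1
    ]


# Hypothetical chain over three dataset columns: X1 -> X2 -> X3.
array = [
    [0, 0, 0],
    [1, 0, 0],
    [0, 1, 0],
]

edges = labeled_edges(array, rename_nodes=["X1", "X2", "X3"])
print(edges)  # [('X1', 'X2'), ('X2', 'X3')]
```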
Running examples
CADIMULC is a lightweight Python repository without a sophisticated library API design. The documentation on this page is meant to provide introductory material on this practical tool for causal discovery. For running examples, please check out the Quick Tutorials, which show straightforward usage in the "micro" workflow of causal discovery.