leaspy.variables.dag

This module defines the VariablesDAG class used to represent the relationships between the variables of a model.

Classes

VariablesDAG

Directed acyclic graph of symbolic variables used in a leaspy model with efficient topologically sorted bidirectional access.

Module Contents

class VariablesDAG[source]

Bases: collections.abc.Mapping

Directed acyclic graph of symbolic variables used in a leaspy model with efficient topologically sorted bidirectional access.

Parameters:
variablesMapping [ VariableName, VariableInterface]

A dictionary mapping variable names to their specification objects (e.g., IndepVariable, LinkedVariable, etc.).

direct_ancestorsVariablesToFrozenSet

A dictionary mapping each variable name to the set of variable names it directly depends on (i.e., its parents). Use from_dict() class method to reach the natural dependencies of linked variables.

Notes

Internally, this class precomputes:

  • direct_children: inverse of direct_ancestors, mapping each node to its immediate dependents.

  • sorted_variables_names: a topological ordering of variable names (roots to leaves).

  • sorted_children: all transitive children of a node in topological order.

  • sorted_ancestors: all transitive ancestors of a node in topological order.

  • sorted_variables_by_type: variables grouped and ordered by their Python class/type.

In general the VariablesDAG is not a tree because the graph may not be totally connected and may have multiple roots. It is not a multi-tree either since there may be multiple directed paths between two nodes - e.g. logistic_model = f[g, b(g), …]. However, we do assume that there is no cycle in the graph (not checked currently), which is equivalent to be topologically sortable.

In order to improve the efficiency of the algorithms, a node-wise sorted children and ancestors mappings are computed. These mappings are useful to:

  • perform computations and caching of intermediate variable dependencies

  • quickly reset all dependent nodes upon a modification

Finally, we do not store children nor ancestors in a specific node class to avoid cross-references in such nodes.

References

https://en.wikipedia.org/wiki/Directed_acyclic_graph#Computational_problems

Examples

>>> from leaspy.variables import VariablesDAG
>>> from leaspy.variables.specs import IndepVariable, LinkedVariable
>>> d_vars = {
    "x": IndepVariable(),
    "y": LinkedVariable(lambda *, x: -x),
}
>>> dag = VariablesDAG.from_dict(d_vars)
variables: Mapping[VariableName, VariableInterface]

Dictionary of variable names to their corresponding variable specification objects.

direct_ancestors: VariablesToFrozenSet

Mapping of variable names to their direct ancestors.

direct_children: VariablesToFrozenSet

Mapping of variable names to their direct children.

sorted_variables_names: tuple[VariableName, Ellipsis]

Tuple of variable names sorted in topological order (from roots to leaves).

sorted_children: Mapping[VariableName, tuple[VariableName, Ellipsis]]

Mapping of all children of a given variable in a topological order.

sorted_ancestors: Mapping[VariableName, tuple[VariableName, Ellipsis]]

Mapping of all ancestor of a given variable in a topological order.

sorted_variables_by_type: Mapping[Type[VariableInterface], Mapping[VariableName, VariableInterface]]

The sorted variables, but stratified per variable type, to easily access them.

classmethod from_dict(input_dictionary)[source]

Instantiate a new DAG of variables from a dictionary of variables.

This method is using LinkedVariable dependencies as direct ancestors.

Parameters:
input_dictionaryMapping [VariableName , VariableInterface]

The dictionary to use to create the DAG.

Returns:
VariablesDAG

A new instance of VariablesDAG with the computed direct ancestor relationships.

Parameters:

input_dictionary (Mapping[VariableName, VariableInterface])

static compute_topological_order_and_path_matrix(direct_children, direct_ancestors)[source]

Produce a topological sorting of the DAG.

This relies on a modified Kahn’s algorithm to produce a topological sorting of DAG, and the corresponding path matrix as a by-product.

Parameters:
direct_childrenVariablesToFrozenSet

Mapping from each variable to its direct children.

direct_ancestorsVariablesToFrozenSet

Mapping from each variable to its direct ancestors.

Returns:
sorted_nodestuple [VariableName, …]

Nodes in a topological order.

path_matrixtorch.Tensor [bool]

Boolean triangle superior (strict) matrix indicating whether there is a (directed) path between nodes.

Parameters:
Return type:

tuple[tuple[VariableName, Ellipsis], Tensor]

Notes

The algorithm has linear complexity with the O(number of edges + number of nodes). Input nodes are sorted by name in order to have fully reproducible output of the initial order of nodes and edges. (Thus renaming nodes may change the output, due to non-uniqueness of topological order)

static compute_sorted_children_and_ancestors(sorted_nodes, path_matrix)[source]

Produce node-wise topologically sorted children and ancestors from provided nodes.

Parameters:
sorted_nodestuple of VariableName

The sorted nodes.

path_matrixtorch.Tensor

A binary 2D tensor where path_matrix[i, j] == 1 indicates a path from node i to node j. Assumes the row/column indices correspond to the order of sorted_nodes.

Returns:
sorted_childrendict [VariableName, tuple [VariableName, …]]

The sorted children.

sorted_ancestorsdict [VariableName, tuple [VariableName, …]]

The sorted ancestors.

Parameters:
Return type:

tuple[dict[VariableName, tuple[VariableName, Ellipsis]], dict[VariableName, tuple[VariableName, Ellipsis]]]

property individual_variable_names: tuple[VariableName, Ellipsis]

Returns a tuple of variable names corresponding to the individual variables.

Returns:
tuple of VariableName

The individual variable names.

Return type:

tuple[VariableName, Ellipsis]