Graph Components
This module defines the modular building-block objects that can be composed to create graph representations of pipelines.
- class easylink.graph_components.InputSlot(name, env_var, validator)
Bases: object
A single input slot to a specific node.
InputSlots represent distinct semantic categories of input files, between which a node must be able to differentiate. In order to pass data between nodes, an InputSlot of one node can be connected to an OutputSlot of another node via an EdgeParams instance.
Notes
Nodes can be either Steps or Implementations.
- env_var: str | None
The environment variable that is used to pass a list of data filepaths to an Implementation.
- validator: Callable[[str], None] | None
A function that validates the input data being passed into the pipeline via this InputSlot. If the data is invalid, the function should raise an exception with a descriptive error message which will then be reported to the user. Note that the function *must* be defined in the easylink.utilities.validation_utils module!
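The validator contract described above (a callable that takes a filepath and raises an exception on invalid data) can be sketched as follows. This is a minimal illustration rather than EasyLink's implementation: `InputSlotSketch` is a hypothetical stand-in that only mirrors the documented `(name, env_var, validator)` signature, and `validate_is_csv`, `check_inputs`, and the environment variable name are invented for the example.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class InputSlotSketch:
    """Hypothetical stand-in mirroring the documented InputSlot signature."""

    name: str
    env_var: Optional[str]
    validator: Optional[Callable[[str], None]]


def validate_is_csv(filepath: str) -> None:
    """Hypothetical validator: raise a descriptive error for non-CSV paths."""
    if not filepath.endswith(".csv"):
        raise ValueError(f"Expected a .csv file, got '{filepath}'")


slot = InputSlotSketch(
    name="input_data",
    env_var="INPUT_DATA_FILE_PATHS",  # assumed name, for illustration only
    validator=validate_is_csv,
)


def check_inputs(slot: InputSlotSketch, filepaths: list[str]) -> list[str]:
    """Run the slot's validator on each filepath and collect error messages."""
    errors = []
    for filepath in filepaths:
        if slot.validator is None:
            continue
        try:
            slot.validator(filepath)
        except Exception as exc:  # any exception signals invalid data
            errors.append(str(exc))
    return errors
```

In real use the validator would need to live in the easylink.utilities.validation_utils module, as noted above; defining it inline here only keeps the sketch self-contained.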
- class easylink.graph_components.OutputSlot(name)
Bases: object
A single output slot from a specific node.
OutputSlots represent distinct semantic categories of output files, between which a node must be able to differentiate. In order to pass data between nodes, an OutputSlot of one node can be connected to an InputSlot of another node via an EdgeParams instance.
Notes
Nodes can be either Steps or Implementations.
Input data is validated via the required validator attribute of the InputSlot. In order to prevent multiple validations of the same files (since outputs of one node can be inputs to another), no such validator is stored here on the OutputSlot.
- Parameters:
name (str)
- class easylink.graph_components.EdgeParams(source_node, target_node, output_slot, input_slot, filepaths=None)
Bases: object
The details of an edge between two nodes in a graph.
EdgeParams connect the OutputSlot of a source node to the InputSlot of a target node.
Notes
Nodes can be either Steps or Implementations.
- Parameters:
- filepaths: tuple[str] | None = None
The filepaths that are passed from the source node to the target node.
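As a rough illustration of how an edge ties one node's output slot to another node's input slot, the sketch below uses a hypothetical `EdgeParamsSketch` dataclass with the five documented fields. The field types (slots referenced by name as strings) and all node, slot, and file names are assumptions made for the example, not taken from EasyLink.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class EdgeParamsSketch:
    """Hypothetical stand-in with the documented EdgeParams fields."""

    source_node: str
    target_node: str
    output_slot: str  # assumption: slots are referenced by name
    input_slot: str
    filepaths: Optional[tuple[str, ...]] = None


# Connect the "clusters" output slot of one node to the
# "known_clusters" input slot of another (all names are invented).
edge = EdgeParamsSketch(
    source_node="clustering",
    target_node="updating_clusters",
    output_slot="clusters",
    input_slot="known_clusters",
)

# filepaths defaults to None; a copy with concrete files attached
# can be made without mutating the original edge.
edge_with_files = replace(edge, filepaths=("results/clusters.parquet",))
```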
- class easylink.graph_components.StepGraph(*args, backend=None, **kwargs)
Bases: MultiDiGraph
A directed acyclic graph (DAG) of Steps.
StepGraphs are DAGs with Step names for nodes and their corresponding Step instances as attributes on those nodes. The file dependencies between nodes are the graph edges; multiple edges between nodes are permitted.
Notes
These are high-level abstractions; they represent a conceptual pipeline graph with no detail as to how each Step is implemented.
The largest StepGraph is that of the entire PipelineSchema.
- add_edge_from_params(edge_params)
Adds a new edge to the StepGraph.
- Return type:
- Parameters:
edge_params (EdgeParams) – The details of the new edge to be added to the graph.
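To make the "multiple edges between nodes are permitted" point concrete, here is a small pure-Python sketch. It deliberately does not use the real MultiDiGraph base class: `MultiEdgeGraphSketch` and `EdgeParamsSketch` are hypothetical stand-ins, and which attributes are stored per edge is an assumption for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class EdgeParamsSketch:
    """Hypothetical stand-in for the documented EdgeParams fields."""

    source_node: str
    target_node: str
    output_slot: str
    input_slot: str


class MultiEdgeGraphSketch:
    """Hypothetical stand-in for a directed graph that allows parallel edges."""

    def __init__(self) -> None:
        # (source, target) -> list of per-edge attribute dicts
        self.edges = defaultdict(list)

    def add_edge_from_params(self, edge_params: EdgeParamsSketch) -> None:
        """Record one edge; parallel edges between the same nodes accumulate."""
        key = (edge_params.source_node, edge_params.target_node)
        self.edges[key].append(
            {
                "output_slot": edge_params.output_slot,
                "input_slot": edge_params.input_slot,
            }
        )


graph = MultiEdgeGraphSketch()
# Two distinct file dependencies between the same pair of steps.
graph.add_edge_from_params(
    EdgeParamsSketch("step_1", "step_2", "main_output", "main_input")
)
graph.add_edge_from_params(
    EdgeParamsSketch("step_1", "step_2", "aux_output", "aux_input")
)
```

Keeping one entry per slot pair, rather than one per node pair, is what lets a target step tell its different input categories apart.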
- class easylink.graph_components.ImplementationGraph(*args, backend=None, **kwargs)
Bases: MultiDiGraph
A directed acyclic graph (DAG) of Implementations.
ImplementationGraphs are DAGs with Implementations for nodes and the file dependencies between them for edges. Self-edges as well as multiple edges between nodes are permitted.
Notes
An ImplementationGraph is a low-level abstraction; it represents the actual implementations of each Step in a pipeline. This is in contrast to a StepGraph, which can be an intricate nested structure due to the various complex and self-similar Step instances (which represent abstract operations such as “loop this step N times”). An ImplementationGraph is the flattened and concrete graph of Implementations in a given pipeline.
The largest ImplementationGraph (that is, the specific ImplementationGraph representing the entire pipeline to be run) is that of the PipelineGraph.
- property splitter_nodes: list[str]
The topologically sorted list of splitter nodes (which have no implementations).
- property aggregator_nodes: list[str]
The topologically sorted list of aggregator nodes (which have no implementations).
- property implementations: list[Implementation]
The topologically sorted list of all Implementations in the graph.
- add_node_from_implementation(node_name, implementation)
Adds a new node to the ImplementationGraph.
- Return type:
- Parameters:
node_name – The name of the new node.
implementation (Implementation) – The Implementation to add to the graph as a new node.
- add_edge_from_params(edge_params)
Adds a new edge to the ImplementationGraph.
- Return type:
- Parameters:
edge_params (EdgeParams) – The details of the new edge to be added to the graph.
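Several ImplementationGraph properties return nodes in topological order, meaning each node appears only after every node it depends on. The sketch below shows that ordering with the standard-library `graphlib` module; it is not how EasyLink computes it, the node names are invented, and self-edges are not modeled.

```python
from graphlib import TopologicalSorter

# Hypothetical node names; each key maps to the set of nodes it depends on.
dependencies = {
    "step_1_python_pandas": set(),
    "step_2_r": {"step_1_python_pandas"},
    "step_3_python_pandas": {"step_1_python_pandas", "step_2_r"},
}

# static_order() yields every node after all of its predecessors.
ordered_nodes = list(TopologicalSorter(dependencies).static_order())
```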
- class easylink.graph_components.SlotMapping(parent_slot, child_node, child_slot)
Bases: ABC
A mapping between a slot on a parent node and a slot on one of its child nodes.
SlotMapping is an interface intended to be used by concrete InputSlotMapping and OutputSlotMapping classes to represent a mapping between parent and child nodes at different levels of a potentially-nested graph. Specifically, they are used to (1) remap edges between parent and child nodes in a PipelineSchema and (2) map a leaf Step's slots to the corresponding Implementation slots when building the ImplementationGraph.
Notes
Nodes can be either Steps or Implementations.
- _abc_impl = <_abc._abc_data object>
- abstract remap_edge(edge)
Remaps an edge to connect the parent and child nodes.
- Return type:
- Parameters:
edge (EdgeParams)
- class easylink.graph_components.InputSlotMapping(parent_slot, child_node, child_slot)
Bases: SlotMapping
A mapping between InputSlots of a parent node and a child node.
- _abc_impl = <_abc._abc_data object>
- remap_edge(edge)
Remaps an edge’s InputSlot.
- Return type:
- Parameters:
edge (EdgeParams) – The edge to remap.
- Returns:
The details of the remapped edge.
- Raises:
ValueError – If the parent slot does not match the input slot of the edge.
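One plausible reading of the remapping behavior is sketched below: an edge that targets a parent node's input slot is redirected to the mapped child node and child slot, and a ValueError is raised when the edge's input slot is not the mapped parent slot. All class names ending in `Sketch` are hypothetical stand-ins, and the choice of which edge fields are rewritten is an assumption rather than EasyLink's confirmed implementation.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass(frozen=True)
class EdgeParamsSketch:
    """Hypothetical stand-in for the documented EdgeParams fields."""

    source_node: str
    target_node: str
    output_slot: str
    input_slot: str


class SlotMappingSketch(ABC):
    """Hypothetical interface mirroring the documented constructor."""

    def __init__(self, parent_slot: str, child_node: str, child_slot: str) -> None:
        self.parent_slot = parent_slot
        self.child_node = child_node
        self.child_slot = child_slot

    @abstractmethod
    def remap_edge(self, edge: EdgeParamsSketch) -> EdgeParamsSketch:
        """Return a new edge that connects to the child node."""


class InputSlotMappingSketch(SlotMappingSketch):
    def remap_edge(self, edge: EdgeParamsSketch) -> EdgeParamsSketch:
        if edge.input_slot != self.parent_slot:
            raise ValueError(
                f"Parent slot '{self.parent_slot}' does not match "
                f"edge input slot '{edge.input_slot}'"
            )
        # Assumed behavior: only the target side of the edge changes.
        return EdgeParamsSketch(
            source_node=edge.source_node,
            target_node=self.child_node,
            output_slot=edge.output_slot,
            input_slot=self.child_slot,
        )


mapping = InputSlotMappingSketch("parent_input", "child_step", "child_input")
edge = EdgeParamsSketch("upstream", "parent_step", "upstream_output", "parent_input")
remapped = mapping.remap_edge(edge)
```

Under the same assumption, an output-side mapping would be the mirror image: it would check the edge's output slot and rewrite the source node and output slot instead.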
- class easylink.graph_components.OutputSlotMapping(parent_slot, child_node, child_slot)
Bases: SlotMapping
A mapping between OutputSlots of a parent node and a child node.
- _abc_impl = <_abc._abc_data object>
- remap_edge(edge)
Remaps an edge’s OutputSlot.
- Return type:
- Parameters:
edge (EdgeParams) – The edge to remap.
- Returns:
The details of the remapped edge.
- Raises:
ValueError – If the parent slot does not match the output slot of the edge.