vt::term::TerminationDetector struct
#include <src/vt/termination/termination.h>
Detect global termination and termination of subsets of work.
Implements distributed algorithms for termination detection across the entire VT runtime and for subsets of work, each encapsulated in an epoch. Ships with two algorithms: 4-counter wave-based termination for large collective epochs, and Dijkstra-Scholten parental responsibility termination for rooted epochs. Epochs may have other epochs nested within them, forming a graph.
The termination detector detects termination of the transitive closure of a piece of work—either starting collectively with all nodes or starting on a particular node (rooted).
In order to track work on the distributed system, work is "produced" and "consumed". Produce and consume are separate counters that are tracked on each node for each epoch. When the global produce and consume counts (sum across all nodes) are equal, termination is reached.
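The counting rule above can be illustrated with a toy model. This is a minimal sketch, not VT's implementation: `NodeCounters`, `globallyTerminated`, and `exampleRun` are hypothetical names, and the model assumes every node's counters can be read at one instant, whereas a real detector has to gather the sums with distributed waves.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical per-node counters for a single epoch (not a VT type).
struct NodeCounters {
  std::int64_t produced = 0;
  std::int64_t consumed = 0;
};

// Sum both counters across all nodes and compare the totals.
// Assumption: all counters are read at one instant, which a real
// distributed detector cannot do.
inline bool globallyTerminated(std::vector<NodeCounters> const& nodes) {
  std::int64_t produced = 0;
  std::int64_t consumed = 0;
  for (auto const& n : nodes) {
    produced += n.produced;
    consumed += n.consumed;
  }
  return produced == consumed;
}

// Node 0 sends one unit of work to node 1, which later processes it.
inline bool exampleRun() {
  std::vector<NodeCounters> nodes(2);
  nodes[0].produced += 1;                             // message leaves node 0
  bool const in_flight = !globallyTerminated(nodes);  // totals differ: 1 vs 0
  nodes[1].consumed += 1;                             // handler ran on node 1
  return in_flight && globallyTerminated(nodes);      // totals match: 1 vs 1
}
```

Note that the counters only balance globally: node 0 never consumes and node 1 never produces, so no single node can decide termination from its own counts.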
Base classes
- template<typename T> struct vt::runtime::component::Component<TerminationDetector>
- Component class for a generic VT runtime module, CRTP'ed over the component's actual type
- struct TermAction
- struct vt::collective::tree::Tree
- General interface for storing a spanning tree.
- struct StateDS
- struct TermInterface
Public types
- template<typename T> using EpochContainerType = std::unordered_map<EpochType, T>
- using TermStateType = TermState
- using TermStateDSType = term::ds::StateDS::TerminatorType
- using SuccessorBagType = EpochDependency::SuccessorBagType
- using EpochGraph = termination::graph::EpochGraph
- using EpochGraphMsg = termination::graph::EpochGraphMsg<EpochGraph>
- using EpochStackType = EpochStack
Constructors, destructors, conversion operators
- TerminationDetector()
- Construct a termination detector.
- ~TerminationDetector() virtual
Public functions
- auto name() -> std::string override
- Get the name of the component.
- void produce(EpochType epoch = any_epoch_sentinel, TermCounterType num_units = 1, NodeType node = uninitialized_destination)
- Produce on an epoch—increase the produce counter.
- void consume(EpochType epoch = any_epoch_sentinel, TermCounterType num_units = 1, NodeType node = uninitialized_destination)
- Consume on an epoch—increase the consume counter.
- void hangDetectSend()
- Special produce for hang detection.
- void hangDetectRecv()
- Special consume for hang detection.
- auto isRooted(EpochType epoch) -> bool
- Check if an epoch is rooted.
- auto isDS(EpochType epoch) -> bool
- Check if the algorithm behind an epoch is Dijkstra-Scholten parental responsibility.
- auto getDSTerm(EpochType epoch, bool is_root = false) -> TermStateDSType*
- Get or create the DS terminator for an epoch.
- void resetGlobalTerm()
- Reset global termination to start producing/consuming again.
- void freeEpoch(EpochType const& epoch)
- Free an epoch after termination.
- auto makeEpochRooted(UseDS use_ds = UseDS{true}, ParentEpochCapture parent = ParentEpochCapture{}) -> EpochType
- Create a new rooted epoch.
- auto makeEpochCollective(ParentEpochCapture parent = ParentEpochCapture{}) -> EpochType
- Create a new collective epoch.
- auto makeEpochRooted(std::string const& label, UseDS use_ds = UseDS{true}, ParentEpochCapture parent = ParentEpochCapture{}) -> EpochType
- Create a new rooted epoch with a label.
- auto makeEpochCollective(std::string const& label, ParentEpochCapture parent = ParentEpochCapture{}) -> EpochType
- Create a collective epoch with a label.
- auto makeEpoch(std::string const& label, bool is_coll, UseDS use_ds = UseDS{false}, ParentEpochCapture parent = ParentEpochCapture{}) -> EpochType
- Create a new rooted or collective epoch with a label.
- void initializeCollectiveEpoch(EpochType const epoch, std::string const& label, ParentEpochCapture parent = ParentEpochCapture{})
- Setup a collective epoch with the epoch already generated.
- void initializeRootedEpoch(EpochType const epoch, std::string const& label, UseDS use_ds = UseDS{false}, ParentEpochCapture parent = ParentEpochCapture{})
- Setup a new rooted epoch with the epoch already generated.
- void finishedEpoch(EpochType const& epoch)
- Tell the termination detector that all initial work has been enqueued for a given epoch on this node.
- void activateEpoch(EpochType const& epoch)
- Activate an epoch; start detecting on it.
- void finishNoActivateEpoch(EpochType const& epoch)
- Finish an epoch without activating it (activation is what starts the work of detecting its termination).
- auto makeEpochRootedWave(ParentEpochCapture parent, std::string const& label = "") -> EpochType
- Create a new rooted epoch that uses the 4-counter wave algorithm.
- auto makeEpochRootedDS(ParentEpochCapture parent, std::string const& label = "") -> EpochType
- Create a new rooted epoch that uses the DS algorithm.
- void initializeRootedWaveEpoch(EpochType const epoch, ParentEpochCapture parent, std::string const& label = "")
- Setup a new rooted epoch that uses the 4-counter wave algorithm with an epoch already generated.
- void initializeRootedDSEpoch(EpochType const epoch, ParentEpochCapture parent, std::string const& label = "")
- Setup a new rooted epoch that uses the DS algorithm with the epoch already generated.
- void startEpochGraphBuild()
- Build the epoch graph. Typically called to produce diagnostic output for the user after a failure.
- void setLocalTerminated(bool const terminated, bool const no_propagate = true)
- Set whether the scheduler has locally terminated.
- void maybePropagate()
- Progress function to move state forward.
- auto getNumUnits() const -> TermCounterType
- Get number of units produced on global epoch.
- auto getNumTerminatedCollectiveEpochs() const -> std::size_t
- Get number of collective epochs that have terminated.
- auto testEpochTerminated(EpochType epoch) -> TermStatusEnum override
- Test if an epoch has terminated or not.
- auto isEpochTerminated(EpochType epoch) -> bool
- Check if an epoch has terminated.
- auto makeGraph() -> std::shared_ptr<EpochGraph>
- Make the local epoch graph.
- void addLocalDependency(EpochType epoch)
- Add a local work dependency on an epoch to stop propagation.
- void releaseLocalDependency(EpochType epoch)
- Release a local work dependency on an epoch to resume propagation.
- void addDependency(EpochType predecessor, EpochType successor)
- Make a dependency between two epochs.
- void disableTD(EpochType in_epoch = any_epoch_sentinel)
- Disable termination detection on an epoch. Local counting is still enabled, but any non-local progress is halted until termination detection is re-enabled.
- void enableTD(EpochType in_epoch = any_epoch_sentinel)
- Enable termination detection on an epoch.
- auto getEpochState() -> EpochContainerType<TermStateType> const &
- auto getEpochReadySet() -> std::unordered_set<EpochType> const &
- auto getEpochWaitSet() -> std::unordered_set<EpochType> const &
- template<typename SerializerT> void serialize(SerializerT& s)
- auto getEpoch() const -> EpochType
- void pushEpoch(EpochType epoch)
- auto popEpoch(EpochType epoch = no_epoch) -> EpochType
- void pushEpochFast(EpochType epoch)
- void popEpochFast()
- auto getEpochStack() -> EpochStackType&
Public variables
Function documentation
void vt::term::TerminationDetector::produce(EpochType epoch = any_epoch_sentinel, TermCounterType num_units = 1, NodeType node = uninitialized_destination)
Produce on an epoch—increase the produce counter.
Parameters | |
---|---|
epoch in | the epoch to produce; if empty, produce on global epoch |
num_units in | number of units to produce |
node in | the node where this unit will be consumed (optional) |
void vt::term::TerminationDetector::consume(EpochType epoch = any_epoch_sentinel, TermCounterType num_units = 1, NodeType node = uninitialized_destination)
Consume on an epoch—increase the consume counter.
Parameters | |
---|---|
epoch in | the epoch to consume; if empty, consume on global epoch |
num_units in | number of units to consume |
node in | the node where this unit was produced (optional) |
TermStateDSType* vt::term::TerminationDetector::getDSTerm(EpochType epoch, bool is_root = false)
Get or create the DS terminator for an epoch.
Parameters | |
---|---|
epoch in | the epoch |
is_root in | whether this is the root (relevant when creating) |
Returns | the DS terminator manager |
EpochType vt::term::TerminationDetector::makeEpochRooted(UseDS use_ds = UseDS{true}, ParentEpochCapture parent = ParentEpochCapture{})
Create a new rooted epoch.
Parameters | |
---|---|
use_ds in | whether to use the Dijkstra-Scholten algorithm |
parent in | parent epoch that waits for this new epoch |
Returns | the new epoch |
EpochType vt::term::TerminationDetector::makeEpochCollective(ParentEpochCapture parent = ParentEpochCapture{})
Create a new collective epoch.
Parameters | |
---|---|
parent in | parent epoch that waits for this new epoch |
Returns | the new epoch |
EpochType vt::term::TerminationDetector::makeEpochRooted(std::string const& label, UseDS use_ds = UseDS{true}, ParentEpochCapture parent = ParentEpochCapture{})
Create a new rooted epoch with a label.
Parameters | |
---|---|
label in | epoch label for debugging purposes |
use_ds in | whether to use the Dijkstra-Scholten algorithm |
parent in | parent epoch that waits for this new epoch |
Returns | the new epoch |
EpochType vt::term::TerminationDetector::makeEpochCollective(std::string const& label, ParentEpochCapture parent = ParentEpochCapture{})
Create a collective epoch with a label.
Parameters | |
---|---|
label in | epoch label for debugging purposes |
parent in | parent epoch that waits for this new epoch |
Returns | the new epoch |
EpochType vt::term::TerminationDetector::makeEpoch(std::string const& label, bool is_coll, UseDS use_ds = UseDS{false}, ParentEpochCapture parent = ParentEpochCapture{})
Create a new rooted or collective epoch with a label.
Parameters | |
---|---|
label in | epoch label for debugging purposes |
is_coll in | whether to create a collective or rooted epoch |
use_ds in | whether to use the Dijkstra-Scholten algorithm |
parent in | parent epoch that waits for this new epoch |
Returns | the new epoch |
void vt::term::TerminationDetector::initializeCollectiveEpoch(EpochType const epoch, std::string const& label, ParentEpochCapture parent = ParentEpochCapture{})
Setup a collective epoch with the epoch already generated.
Parameters | |
---|---|
epoch in | the collective epoch already generated |
label in | epoch label for debugging purposes |
parent in | parent epoch that waits for this new epoch |
void vt::term::TerminationDetector::initializeRootedEpoch(EpochType const epoch, std::string const& label, UseDS use_ds = UseDS{false}, ParentEpochCapture parent = ParentEpochCapture{})
Setup a new rooted epoch with the epoch already generated.
Parameters | |
---|---|
epoch in | the rooted epoch already generated |
label in | epoch label for debugging purposes |
use_ds in | whether to use the Dijkstra-Scholten algorithm |
parent in | parent epoch that waits for this new epoch |
void vt::term::TerminationDetector::finishedEpoch(EpochType const& epoch)
Tell the termination detector that all initial work has been enqueued for a given epoch on this node.
Parameters | |
---|---|
epoch in | the finished epoch |
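One plausible reason an explicit finish step is needed: before any work has been enqueued, the produce and consume counts are both zero and therefore already equal. The toy model below sketches that idea for a single node. `ToyEpoch` and `lifecycleExample` are hypothetical names, and the rule that termination is only reported after the finish call is an assumption of this sketch, not a statement about VT's internals.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical single-node model of one epoch's lifecycle (not a VT type).
class ToyEpoch {
public:
  void produce(std::int64_t units = 1) { produced_ += units; }
  void consume(std::int64_t units = 1) { consumed_ += units; }

  // Models the "finished" notification: all initial work has been enqueued.
  void finish() { finished_ = true; }

  // Assumption of this sketch: termination is only reported once the epoch
  // is finished and every produced unit has been consumed.
  bool terminated() const { return finished_ && produced_ == consumed_; }

private:
  std::int64_t produced_ = 0;
  std::int64_t consumed_ = 0;
  bool finished_ = false;
};

inline bool lifecycleExample() {
  ToyEpoch ep;
  bool const early = ep.terminated();    // false: 0 == 0, but not finished
  ep.produce(2);                         // enqueue two units of initial work
  ep.finish();
  bool const pending = ep.terminated();  // false: two units outstanding
  ep.consume(2);
  return !early && !pending && ep.terminated();
}
```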
void vt::term::TerminationDetector::activateEpoch(EpochType const& epoch)
Activate an epoch; start detecting on it.
Parameters | |
---|---|
epoch in | the epoch to activate |
void vt::term::TerminationDetector::finishNoActivateEpoch(EpochType const& epoch)
Finish an epoch without activating it (activation is what starts the work of detecting its termination).
Parameters | |
---|---|
epoch in | the epoch that is finished |
EpochType vt::term::TerminationDetector::makeEpochRootedWave(ParentEpochCapture parent, std::string const& label = "")
Create a new rooted epoch that uses the 4-counter wave algorithm.
Parameters | |
---|---|
parent in | parent epoch that waits for this new epoch |
label in | epoch label for debugging purposes |
Returns | the new epoch |
EpochType vt::term::TerminationDetector::makeEpochRootedDS(ParentEpochCapture parent, std::string const& label = "")
Create a new rooted epoch that uses the DS algorithm.
Parameters | |
---|---|
parent in | parent epoch that waits for this new epoch |
label in | epoch label for debugging purposes |
Returns | the new epoch |
void vt::term::TerminationDetector::initializeRootedWaveEpoch(EpochType const epoch, ParentEpochCapture parent, std::string const& label = "")
Setup a new rooted epoch that uses the 4-counter wave algorithm with an epoch already generated.
Parameters | |
---|---|
epoch in | the wave epoch already generated |
parent in | parent epoch that waits for this new epoch |
label in | epoch label for debugging purposes |
void vt::term::TerminationDetector::initializeRootedDSEpoch(EpochType const epoch, ParentEpochCapture parent, std::string const& label = "")
Setup a new rooted epoch that uses the DS algorithm with the epoch already generated.
Parameters | |
---|---|
epoch in | the DS epoch already generated |
parent in | parent epoch that waits for this new epoch |
label in | epoch label for debugging purposes |
void vt::term::TerminationDetector::setLocalTerminated(bool const terminated, bool const no_propagate = true)
Set whether the scheduler has locally terminated.
Parameters | |
---|---|
terminated in | whether it has terminated |
no_propagate in | whether to skip propagating the state remotely |
TermCounterType vt::term::TerminationDetector::getNumUnits() const
Get number of units produced on global epoch.
Returns | number of produced units |
---|---|
std::size_t vt::term::TerminationDetector::getNumTerminatedCollectiveEpochs() const
Get number of collective epochs that have terminated.
Returns | number of epochs |
---|---|
TermStatusEnum vt::term::TerminationDetector::testEpochTerminated(EpochType epoch) override
Test if an epoch has terminated or not.
Parameters | |
---|---|
epoch in | the epoch to test |
Returns | status enum indicating the known state |
bool vt::term::TerminationDetector::isEpochTerminated(EpochType epoch)
Check if an epoch has terminated.
Parameters | |
---|---|
epoch in | the epoch to test |
Returns | whether it is known to be terminated |
std::shared_ptr<EpochGraph> vt::term::TerminationDetector::makeGraph()
Make the local epoch graph.
Returns | shared pointer to epoch graph |
---|---|
void vt::term::TerminationDetector::addLocalDependency(EpochType epoch)
Add a local work dependency on an epoch to stop propagation.
Parameters | |
---|---|
epoch in | the epoch |
void vt::term::TerminationDetector::releaseLocalDependency(EpochType epoch)
Release a local work dependency on an epoch to resume propagation.
Parameters | |
---|---|
epoch in | the epoch |
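The add/release pairing described above can be pictured as a gate in front of propagation. The following is a minimal sketch under the assumption that local dependencies are counted and that propagation resumes once the count returns to zero; `ToyPropagationGate` and `canPropagate` are hypothetical names, not part of VT.

```cpp
#include <cassert>

// Hypothetical model of a local-dependency gate for one epoch (not a VT type).
class ToyPropagationGate {
public:
  // Hold propagation for this epoch on the local node.
  void addLocalDependency() { ++deps_; }

  // Drop one previously added dependency.
  void releaseLocalDependency() {
    assert(deps_ > 0 && "release without a matching add");
    --deps_;
  }

  // Assumption of this sketch: propagation may proceed only while no local
  // dependency is held.
  bool canPropagate() const { return deps_ == 0; }

private:
  int deps_ = 0;
};
```

With a counted gate, two independent holders can each add and release a dependency without coordinating with one another; propagation stays stopped until both have released.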
void vt::term::TerminationDetector::addDependency(EpochType predecessor, EpochType successor)
Make a dependency between two epochs.
Parameters | |
---|---|
predecessor in | the predecessor epoch |
successor in | the successor epoch |
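As an illustration of what a predecessor/successor relation between epochs could mean, the sketch below models a small dependency graph. It assumes that a successor cannot be reported as terminated before each of its predecessors has terminated; the source does not spell out this rule, and `ToyEpochDeps`, `ToyEpochId`, and `markWorkDone` are hypothetical names.

```cpp
#include <cassert>
#include <unordered_map>
#include <unordered_set>
#include <vector>

using ToyEpochId = int;  // hypothetical stand-in for an epoch identifier

// Hypothetical model of epoch-to-epoch dependencies (not a VT type).
class ToyEpochDeps {
public:
  void addDependency(ToyEpochId predecessor, ToyEpochId successor) {
    preds_[successor].push_back(predecessor);
  }

  // Record that all of an epoch's own work has been consumed.
  void markWorkDone(ToyEpochId epoch) { done_.insert(epoch); }

  // An epoch counts as terminated when its own work is done and every
  // predecessor has terminated. Assumes the dependency graph is acyclic.
  bool terminated(ToyEpochId epoch) const {
    if (done_.count(epoch) == 0) {
      return false;
    }
    auto it = preds_.find(epoch);
    if (it == preds_.end()) {
      return true;
    }
    for (ToyEpochId p : it->second) {
      if (!terminated(p)) {
        return false;
      }
    }
    return true;
  }

private:
  std::unordered_map<ToyEpochId, std::vector<ToyEpochId>> preds_;
  std::unordered_set<ToyEpochId> done_;
};
```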