Control flow
The control flow of an application configured with PyConf, such as Moore, defines the order in which algorithms should run. Each step in the control flow evaluates a decision, which indicates whether the step was successful or not. The overall decision of the application depends on what the total control flow evaluates to.
Concretely, the control flow in the example of Moore looks something like this:
MooreNode (LAZY_AND)
*-- HLTLinesNode (NONLAZY_OR)
|   +-- Hlt2CharmPhysicsLineNode (LAZY_AND)
|   |   *-- PVFilter
|   |   *-- D2HHHCombiner
|   +-- Hlt2DiMuonPhysicsLineNode (LAZY_AND)
|   |   *-- PVFilter
|   |   *-- MuMuCombiner
|   +-- Hlt2LumiLineNode (LAZY_AND)
|   |   *-- ODINBeamFilter
|   |   *-- LumiCounter
|   +-- Hlt2InclusiveBPhysicsLineNode (LAZY_AND)
|       *-- PVFilter
|       *-- TwoBodyBCombiner
*-- PersistencyNode (LAZY_AND)
    *-- DecReports
    *-- TurboWriter
If we think about how the trigger needs to come to its decision, we can understand what the specific pieces mean and why the control flow looks like it does:
For a start, every trigger line should run, and should do so independently of the decisions of other lines.
Each line runs a sequence of steps to evaluate its own decision, e.g. it first requires some non-zero number of primary vertices, and then further requires some combiner to produce candidates.
If at least one line produces a positive decision (‘yes’, or ‘fired’), the event should be written out.
So, the control flow looks the way it does in order to evaluate the total trigger decision in the way we want.
Nodes and algorithms
A control flow node has some number of children, and makes its decision based on the combination of the decisions of its children. There are two ways we can combine decisions in one of these so-called composite nodes:
Boolean AND, where all of the children must produce a positive decision.
Boolean OR, where at least one of the children must produce a positive decision.
When evaluating a boolean expression, we can choose to ‘short circuit’ in certain cases. With an AND decision, we could choose to not run the next child if the current child gives a negative decision, because we know the total expression can now never be positive. With an OR decision, we could similarly stop as soon as one child has a positive decision. The LAZY and NONLAZY attributes on each node specify this behaviour.
The Moore node is LAZY_AND. This is because we don’t want to write anything out if the decision of the trigger lines was negative, so we short circuit in that case.
The HLTLines node is a NONLAZY_OR. If one line has a positive decision we already know the event will be saved, but we must evaluate all lines as they are independent. We always want to know what every line did in every event.
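The difference between lazy and non-lazy evaluation can be sketched in plain Python, independently of PyConf. This is only a toy illustration: the helper make_child and the child names are hypothetical stand-ins for real algorithms. It relies on the fact that all() over a generator stops at the first negative decision, whereas building a list first forces every child to run before any() combines the decisions.

```python
# Toy illustration of LAZY vs NONLAZY combination; this is not PyConf code.
executed = []  # records which children actually ran


def make_child(name, decision):
    """Return a hypothetical child that logs its execution and gives a fixed decision."""
    def run():
        executed.append(name)
        return decision
    return run


def lazy_and(children):
    # all() over a generator stops at the first False (short circuit).
    return all(child() for child in children)


def nonlazy_or(children):
    # The list comprehension runs every child before any() combines them.
    return any([child() for child in children])


# A line whose first step fails never runs its combiner.
line = [make_child("PVFilter", False), make_child("Combiner", True)]
line_decision = lazy_and(line)
ran_in_line = list(executed)
print(line_decision, ran_in_line)

# A NONLAZY_OR over lines runs all of them, even after the first positive decision.
executed.clear()
lines = [
    make_child("LineA", True),
    make_child("LineB", False),
    make_child("LineC", True),
]
lines_decision = nonlazy_or(lines)
ran_in_lines = list(executed)
print(lines_decision, ran_in_lines)
```

In the first case only the filter runs, which is the behaviour wanted inside a single line; in the second case every line runs, so the decision of each line is known for every event.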
We have one other type of component in the control flow, which has no children. These are algorithms, and it is these that ultimately make decisions. They typically take some input, and then return a ‘yes’ or ‘no’ based on the properties of that input.
A primary vertex filter algorithm will return a positive decision if there’s at least one PV in the event.
A prescaler algorithm takes no input, instead evaluating its decision based on the value of a random number.
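As a rough sketch of these two kinds of algorithm, the snippet below models them as plain Python callables. Everything named here is an assumption made for illustration (the functions pv_filter and make_prescaler, the accept_fraction parameter, and the dictionary used as an event); real algorithms are configured framework components rather than Python functions.

```python
import random


def pv_filter(event):
    """Positive decision if the event has at least one primary vertex."""
    return len(event["pvs"]) > 0


def make_prescaler(accept_fraction, seed=0):
    """Build a hypothetical prescaler accepting roughly `accept_fraction` of calls.

    It takes no event input; the decision depends only on a random number.
    A seeded generator keeps this toy reproducible.
    """
    rng = random.Random(seed)

    def prescaler():
        return rng.random() < accept_fraction

    return prescaler


event_with_pv = {"pvs": [(0.0, 0.0, 12.5)]}
event_without_pv = {"pvs": []}
print(pv_filter(event_with_pv), pv_filter(event_without_pv))

prescaler = make_prescaler(accept_fraction=0.1)
n_accepted = sum(prescaler() for _ in range(10_000))
print(n_accepted)  # roughly a tenth of the calls; the exact count depends on the seed
```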
Altogether, combining control flow nodes and algorithms allows us to express complex decision paths in a modular way.
Data flow
Implicit in the control flow is the data flow. Notice above that we don’t specify that the reconstruction should run, even though the PV filters need the output of the reconstruction!
In brief, satisfying data dependencies is the job of the scheduler, HLTControlFlowMgr. When the scheduler needs to run an algorithm, it first takes care of running the algorithms in that algorithm’s data dependency tree. (It’s clever enough to not run the same algorithm multiple times, in case it appears in multiple data dependency trees.)
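The idea of resolving data dependencies on demand can be illustrated with a small toy model. This is not how HLTControlFlowMgr is implemented; the INPUTS table, the function run_with_dependencies, and the algorithm names RawDecoder, Reconstruction and PVFinder are hypothetical and chosen only to show that each producer runs before its consumers and runs at most once.

```python
# Toy dependency resolution; names and structure are illustrative assumptions.
# Each algorithm lists the algorithms whose output it consumes.
INPUTS = {
    "RawDecoder": [],
    "Reconstruction": ["RawDecoder"],
    "PVFinder": ["Reconstruction"],
    "PVFilter": ["PVFinder"],
    "MuMuCombiner": ["Reconstruction", "PVFinder"],
}


def run_with_dependencies(algorithm, already_run, order):
    """Run `algorithm` after its data dependencies, skipping anything already run."""
    if algorithm in already_run:
        return
    for producer in INPUTS[algorithm]:
        run_with_dependencies(producer, already_run, order)
    already_run.add(algorithm)
    order.append(algorithm)


already_run, order = set(), []
# The control flow names only these two; everything else is pulled in as needed.
for requested in ["PVFilter", "MuMuCombiner"]:
    run_with_dependencies(requested, already_run, order)
print(order)
```

Although both requested algorithms depend on the reconstruction, it appears only once in the resulting execution order.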
We only need to explicitly take care of the control flow, which the scheduler is also responsible for executing.
API
The objects below are used to construct the control flow in the configuration. A CompositeNode instance represents a composite node. The NodeLogic enum is used to specify how child decisions should be combined, AND or OR, and whether to short circuit or not, LAZY or NONLAZY.
- class NodeLogic(value)
Node control flow behaviour.
Each node contains an ordered set of subnodes/child nodes. These are processed in order one by one until the node can return True or False. Whether a node can return depends on its control flow behaviour.
- LAZY_AND = 'LAZY_AND'
Return False and stop processing as soon as a subnode returns False
- LAZY_OR = 'LAZY_OR'
Return True and stop processing as soon as a subnode returns True
- NONLAZY_AND = 'NONLAZY_AND'
Return False if any subnode returns False, but do process all subnodes
- NONLAZY_OR = 'NONLAZY_OR'
Return True if any subnode returns True, but do process all subnodes
- NOT = 'NOT'
Return the negation of the subnode
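To make the documented behaviours concrete, the sketch below mirrors the NodeLogic values with a stand-in enum and uses a simplified stand-in CompositeNode, so that it runs without PyConf installed. The evaluate function, the constructor arguments, the leaf functions, and the use of plain callables as algorithms are all assumptions for illustration; in the real application the scheduler executes the control flow, not the configuration objects.

```python
from dataclasses import dataclass
from enum import Enum


class NodeLogic(Enum):
    """Stand-in copy of the documented values (not imported from PyConf)."""
    LAZY_AND = "LAZY_AND"
    LAZY_OR = "LAZY_OR"
    NONLAZY_AND = "NONLAZY_AND"
    NONLAZY_OR = "NONLAZY_OR"
    NOT = "NOT"


@dataclass
class CompositeNode:
    """Simplified stand-in: a name, ordered children, and a combination logic."""
    name: str
    children: list
    combine_logic: NodeLogic = NodeLogic.LAZY_AND


def evaluate(node, event):
    """Return the decision of a node; leaves are callables taking the event."""
    if not isinstance(node, CompositeNode):
        return bool(node(event))
    logic = node.combine_logic
    if logic is NodeLogic.NOT:
        (child,) = node.children  # this sketch requires exactly one child for NOT
        return not evaluate(child, event)
    decisions = (evaluate(child, event) for child in node.children)
    if logic in (NodeLogic.NONLAZY_AND, NodeLogic.NONLAZY_OR):
        decisions = list(decisions)  # force every child to be processed
    if logic in (NodeLogic.LAZY_AND, NodeLogic.NONLAZY_AND):
        return all(decisions)
    return any(decisions)


written = []  # events that reached the hypothetical writer


def pv_filter(event):
    return event["n_pvs"] > 0


def dimuon_combiner(event):
    return event["n_dimuons"] > 0


def writer(event):
    written.append(event)
    return True


dimuon_line = CompositeNode("Hlt2DiMuonPhysicsLineNode", [pv_filter, dimuon_combiner])
lines = CompositeNode("HLTLinesNode", [dimuon_line], NodeLogic.NONLAZY_OR)
persistency = CompositeNode("PersistencyNode", [writer])
top = CompositeNode("MooreNode", [lines, persistency], NodeLogic.LAZY_AND)

accepted = evaluate(top, {"n_pvs": 1, "n_dimuons": 2})
rejected = evaluate(top, {"n_pvs": 1, "n_dimuons": 0})
print(accepted, rejected, len(written))
```

Because the top node is LAZY_AND, the writer is reached only for the event in which a line gave a positive decision.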