Control flow

The control flow of an application configured with PyConf, such as Moore, defines the order in which algorithms should run. Each step in the control flow evaluates a decision, which indicates whether the step was successful or not. The overall decision of the application depends on what the total control flow evaluates to.

Concretely, the control flow in the example of Moore looks something like this:

MooreNode (LAZY_AND)
*-- HLTLinesNode (NONLAZY_OR)
|   +-- Hlt2CharmPhysicsLineNode (LAZY_AND)
|   |   *-- PVFilter
|   |   *-- D2HHHCombiner
|   +-- Hlt2DiMuonPhysicsLineNode (LAZY_AND)
|   |   *-- PVFilter
|   |   *-- MuMuCombiner
|   +-- Hlt2LumiLineNode (LAZY_AND)
|   |   *-- ODINBeamFilter
|   |   *-- LumiCounter
|   +-- Hlt2InclusiveBPhysicsLineNode (LAZY_AND)
|       *-- PVFilter
|       *-- TwoBodyBCombiner
*-- PersistencyNode (LAZY_AND)
    *-- DecReports
    *-- TurboWriter

If we think about how the trigger needs to come to its decision, we can understand what the specific pieces mean and why the control flow looks like it does:

  1. For a start, every trigger line should run, and should do so independently of the decisions of other lines.

  2. Each line runs a sequence of steps to evaluate its own decision, e.g. it first requires some non-zero number of primary vertices, and then further requires some combiner to produce candidates.

  3. If at least one line produces a positive decision (‘yes’ rather than ‘no’, ‘fired’ rather than ‘did not fire’), the event should be written out.

So, the control flow looks the way it does in order to evaluate the total trigger decision in the way we want.

Nodes and algorithms

A control flow node has some number of children, and makes its decision based on the combination of the decisions of its children. There are two ways we can combine decisions in one of these so-called composite nodes:

  1. Boolean AND, where all of the children must produce a positive decision.

  2. Boolean OR, where at least one of the children must produce a positive decision.

When evaluating a boolean expression, we can choose to ‘short circuit’ in certain cases. With an AND decision, we could choose to not run the next child if the current child gives a negative decision, because we know the total expression can now never be positive. With an OR decision, we could similarly stop as soon as one child has a positive decision. The LAZY and NONLAZY attributes on each node specify this behaviour.
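The short-circuit behaviour described above can be sketched in plain Python. This is an illustration only, not the scheduler's implementation: the `NodeLogic` enum here is a local stand-in that reuses the names from this page, and `evaluate` is a hypothetical helper that also records which children were actually run.

```python
from enum import Enum


class NodeLogic(Enum):
    # Local stand-in reusing the names on this page; not imported from PyConf.
    LAZY_AND = "LAZY_AND"
    LAZY_OR = "LAZY_OR"
    NONLAZY_AND = "NONLAZY_AND"
    NONLAZY_OR = "NONLAZY_OR"


def evaluate(logic, children):
    """Combine child decisions; return (decision, names of children that ran).

    `children` is an ordered list of (name, callable) pairs, where each
    callable returns True or False.
    """
    is_and = logic in (NodeLogic.LAZY_AND, NodeLogic.NONLAZY_AND)
    is_lazy = logic in (NodeLogic.LAZY_AND, NodeLogic.LAZY_OR)
    decision = is_and  # identity element: True for AND, False for OR
    executed = []
    for name, child in children:
        result = child()
        executed.append(name)
        decision = (decision and result) if is_and else (decision or result)
        # Short circuit: a False settles an AND, a True settles an OR.
        if is_lazy and result != is_and:
            break
    return decision, executed


# A line whose first step fails: the lazy node never runs the combiner.
line = [("PVFilter", lambda: False), ("D2HHHCombiner", lambda: True)]
lazy = evaluate(NodeLogic.LAZY_AND, line)
nonlazy = evaluate(NodeLogic.NONLAZY_AND, line)
print(lazy)     # (False, ['PVFilter'])
print(nonlazy)  # (False, ['PVFilter', 'D2HHHCombiner'])
```

Both variants reach the same decision; they differ only in how many children are executed along the way.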

The Moore node is LAZY_AND. This is because we don’t want to write anything out if the decision of the trigger lines was negative, so we short circuit in that case.

The HLTLines node is a NONLAZY_OR. If one line has a positive decision we already know the event will be saved, but we must evaluate all lines as they are independent. We always want to know what every line did in every event.

We have one other type of component in the control flow, which has no children. These are algorithms, and it is these that ultimately make decisions. They typically take some input, and then return a ‘yes’ or ‘no’ based on the properties of that input.

A primary vertex filter algorithm will return a positive decision if there’s at least one PV in the event.

A prescaler algorithm takes no input, instead evaluating its decision based on the value of a random number.
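As a rough illustration of these two kinds of algorithm, the sketch below writes each one as a plain Python function. The names `pv_filter` and `make_prescaler`, the `accept_fraction` parameter, and the seeded generator are assumptions made for this example; real algorithms are configured components, not Python callables.

```python
import random


def pv_filter(primary_vertices):
    """Positive decision if the event has at least one primary vertex."""
    return len(primary_vertices) > 0


def make_prescaler(accept_fraction, seed=0):
    """Return a no-input algorithm that accepts roughly `accept_fraction` of calls."""
    rng = random.Random(seed)  # seeded so the sketch is reproducible

    def prescaler():
        # The decision depends only on a random number, not on event data.
        return rng.random() < accept_fraction

    return prescaler


prescale = make_prescaler(accept_fraction=0.1)
accepted = sum(prescale() for _ in range(10_000))
print(pv_filter([]), pv_filter(["PV0"]), accepted)
```

With an accept fraction of 0.1, about one call in ten should give a positive decision.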

Altogether, combining control flow nodes and algorithms allows us to express complex decision paths in a modular way.

Data flow

Implicit in the control flow is the data flow. Notice above that we don’t specify that the reconstruction should run, even though we need the reconstruction to run the PV filters!

In brief, satisfying data dependencies is the job of the scheduler, HLTControlFlowMgr. When the scheduler needs to run an algorithm, it first runs the algorithms in that algorithm’s data dependency tree. (It’s clever enough not to run the same algorithm more than once, even if it appears in several data dependency trees.)
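The idea of running each data dependency exactly once can be sketched as a depth-first walk with a record of what has already been visited. This is a simplified model, not how HLTControlFlowMgr is implemented: the `execution_order` helper, the `producers_of` map, and the reconstruction algorithm names are all hypothetical, and the sketch ignores control flow decisions entirely.

```python
def execution_order(requested, producers_of):
    """Return the requested algorithms plus their data dependencies.

    Each algorithm appears once, and producers come before consumers.
    """
    order = []
    seen = set()

    def visit(alg):
        if alg in seen:
            return  # already scheduled via another dependency tree
        seen.add(alg)
        for dep in producers_of.get(alg, []):
            visit(dep)
        order.append(alg)

    for alg in requested:
        visit(alg)
    return order


# Hypothetical map: algorithm -> algorithms that produce its inputs.
producers_of = {
    "PVFilter": ["PVReconstruction"],
    "PVReconstruction": ["TrackReconstruction"],
    "D2HHHCombiner": ["TrackReconstruction"],
}
order = execution_order(["PVFilter", "D2HHHCombiner"], producers_of)
print(order)
# ['TrackReconstruction', 'PVReconstruction', 'PVFilter', 'D2HHHCombiner']
```

Here the hypothetical track reconstruction is needed by both requested algorithms but appears in the order only once.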

We only need to explicitly take care of the control flow, which the scheduler is also responsible for executing.

API

The objects below are used to construct the control flow in the configuration. A CompositeNode instance represents a composite node. The NodeLogic enum specifies how child decisions should be combined (AND or OR) and whether to short circuit or not (LAZY or NONLAZY).

class NodeLogic(value)

Node control flow behaviour.

Each node contains an ordered set of subnodes/child nodes. These are processed in order one by one until the node can return True or False. Whether a node can return depends on its control flow behaviour.

LAZY_AND = 'LAZY_AND'

Return False and stop processing as soon as a subnode returns False

LAZY_OR = 'LAZY_OR'

Return True and stop processing as soon as a subnode returns True

NONLAZY_AND = 'NONLAZY_AND'

Return False if any subnode returns False, but do process all subnodes

NONLAZY_OR = 'NONLAZY_OR'

Return True if any subnode returns True, but do process all subnodes

NOT = 'NOT'

Return the negation of the subnode

class CompositeNode(name, children, combine_logic=NodeLogic.LAZY_AND, force_order=True)

A container for a set of subnodes/child nodes.
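To show how these objects fit together, the sketch below rebuilds the Moore control flow tree from the top of this page. So that the example runs on its own, `NodeLogic` and `CompositeNode` are local stand-ins that copy the documented names and signature; they are not the PyConf classes. Algorithms are represented by plain strings as placeholders, and `make_line` and `describe` are hypothetical helpers.

```python
from dataclasses import dataclass
from enum import Enum


class NodeLogic(Enum):
    # Stand-in copying the documented member names; not the PyConf enum.
    LAZY_AND = "LAZY_AND"
    LAZY_OR = "LAZY_OR"
    NONLAZY_AND = "NONLAZY_AND"
    NONLAZY_OR = "NONLAZY_OR"
    NOT = "NOT"


@dataclass
class CompositeNode:
    # Stand-in copying the documented signature; not the PyConf class.
    name: str
    children: list
    combine_logic: NodeLogic = NodeLogic.LAZY_AND
    force_order: bool = True


def make_line(name, algorithms):
    """A line is a LAZY_AND node (the default) over its ordered steps."""
    return CompositeNode(name, algorithms)


lines = CompositeNode(
    "HLTLinesNode",
    [
        make_line("Hlt2CharmPhysicsLineNode", ["PVFilter", "D2HHHCombiner"]),
        make_line("Hlt2DiMuonPhysicsLineNode", ["PVFilter", "MuMuCombiner"]),
        make_line("Hlt2LumiLineNode", ["ODINBeamFilter", "LumiCounter"]),
        make_line("Hlt2InclusiveBPhysicsLineNode", ["PVFilter", "TwoBodyBCombiner"]),
    ],
    combine_logic=NodeLogic.NONLAZY_OR,  # every line runs in every event
)
persistency = CompositeNode("PersistencyNode", ["DecReports", "TurboWriter"])
moore = CompositeNode("MooreNode", [lines, persistency])  # LAZY_AND by default


def describe(node, depth=0):
    """Return one indented text line per node or algorithm in the tree."""
    pad = "    " * depth
    if isinstance(node, CompositeNode):
        out = [f"{pad}{node.name} ({node.combine_logic.value})"]
        for child in node.children:
            out.extend(describe(child, depth + 1))
        return out
    return [pad + str(node)]


print("\n".join(describe(moore)))
```

In a real configuration the children would be algorithm objects or other nodes rather than strings; the nesting and the choice of `combine_logic` are the parts this sketch is meant to convey.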