Compact Muon Solenoid
LHC, CERN

CMS-TAU-20-001 ; CERN-EP-2021-257
Identification of hadronic tau lepton decays using a deep neural network
JINST 17 (2022) P07023
Abstract: A new algorithm is presented to discriminate reconstructed hadronic decays of tau leptons ($ \tau_\mathrm{h} $) that originate from genuine tau leptons in the CMS detector against $ \tau_\mathrm{h} $ candidates that originate from quark or gluon jets, electrons, or muons. The algorithm inputs information from all reconstructed particles in the vicinity of a $ \tau_\mathrm{h} $ candidate and employs a deep neural network with convolutional layers to efficiently process the inputs. This algorithm leads to a significantly improved performance compared with the previously used one. For example, the efficiency for a genuine $ \tau_\mathrm{h} $ to pass the discriminator against jets increases by 10--30% for a given efficiency for quark and gluon jets. Furthermore, a more efficient $ \tau_\mathrm{h} $ reconstruction is introduced that incorporates additional hadronic decay modes. The superior performance of the new algorithm to discriminate against jets, electrons, and muons and the improved $ \tau_\mathrm{h} $ reconstruction method are validated with LHC proton-proton collision data at $ \sqrt{s} = $ 13 TeV.
Figures

Figure 1:
Decay mode confusion matrix. For a given generated decay mode, the fractions of reconstructed $ \tau_\mathrm{h} $ in different decay modes are given, as well as the fraction of generated $ \tau_\mathrm{h} $ that are not reconstructed. Both the generated and reconstructed $ \tau_\mathrm{h} $ need to fulfil $ p_{\mathrm{T}} > $ 20 GeV and $ |\eta| < $ 2.3. The $ \tau_\mathrm{h} $ candidates come from a $ \mathrm{Z}\to\tau\tau $ event sample with $ m_{\tau\tau} > $ 50 GeV. Decay modes with the same numbers of charged hadrons and one or two $ \pi^{0} $s are combined and labelled as ``$ \pi^{0} $s''.

Figure 2:
Layout of the grids in $ \eta-\phi $ space around the reconstructed $ \tau_\mathrm{h} $ axis used to process the particle-level inputs for the convolutional layers of the DNN. The inner grid comprises 11 $ {\times} $ 11 cells with a grid size of 0.02 $ {\times} $ 0.02 and contains the signal cone with a radius of 0.05--0.1, which is defined in the $ \tau_\mathrm{h} $ reconstruction (the charged hadrons and $ \pi^{0} $ candidates used to reconstruct the $ \tau_\mathrm{h} $ candidate need to be within the signal cone). For high-$ p_{\mathrm{T}} $ quark and gluon jets, the finer grid is also able to resolve the dense core of the jet. The outer grid comprises 21 $ {\times} $ 21 cells with a grid size of 0.05 $ {\times} $ 0.05 and contains the isolation cone with a radius of 0.5 that is used to define higher-level observables that correlate with quark or gluon jet activity.
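The grid layout can be illustrated with a minimal sketch that maps a particle to its cell in either grid. The function names and the convention that the central cell is centred on the $ \tau_\mathrm{h} $ axis are assumptions made for illustration; only the cell counts and cell sizes come from the caption.

```python
import math

# Grid parameters quoted in the caption.
INNER = dict(n_cells=11, cell_size=0.02)
OUTER = dict(n_cells=21, cell_size=0.05)

def delta_phi(phi, phi_axis):
    """Signed azimuthal difference wrapped into [-pi, pi)."""
    return (phi - phi_axis + math.pi) % (2.0 * math.pi) - math.pi

def cell_index(eta, phi, eta_axis, phi_axis, n_cells, cell_size):
    """Return (i_eta, i_phi) of the cell holding the particle, or None if outside.

    Assumption: the central cell is centred on the tau_h axis, so the grid
    spans +/- n_cells * cell_size / 2 in both eta and phi.
    """
    half_width = 0.5 * n_cells * cell_size
    d_eta = eta - eta_axis
    d_phi = delta_phi(phi, phi_axis)
    if abs(d_eta) >= half_width or abs(d_phi) >= half_width:
        return None
    # Clamp to guard against floating-point rounding at the upper edge.
    i_eta = min(int(math.floor((d_eta + half_width) / cell_size)), n_cells - 1)
    i_phi = min(int(math.floor((d_phi + half_width) / cell_size)), n_cells - 1)
    return i_eta, i_phi
```

With these conventions, a particle on the axis falls into cell (5, 5) of the inner grid and cell (10, 10) of the outer grid, and a particle at $ \Delta\eta = $ 0.3 is outside the inner grid but inside the outer one.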

Figure 3:
The DNN architecture. The three sets of input variables (inner cells, outer cells, and high-level features) are first processed separately through different subnetworks, whose outputs are then concatenated and processed through five fully connected layers before the output is calculated, which gives the probabilities for a candidate to be either a $ \tau_\mathrm{h} $, an electron, a muon, or a quark or gluon jet. The subnetwork for the high-level inputs consists of three fully connected layers with decreasing numbers of nodes, taking 47 inputs and yielding 57 outputs. The features of both the inner and outer cells are input to complex subnetworks. In the first part, the observables in each grid cell are processed through a set of fully connected layers, first separately for electrons/photons (containing both the features for PF electrons and electrons from the standalone reconstruction), muons (similarly containing both features from PF and standalone muons), and charged/neutral hadrons, passing through three fully connected layers each. The outputs are concatenated and passed through four additional fully connected layers, yielding 64 outputs for each cell. The grids are then processed with convolutional layers, which successively reduce the size of the grid. For the inner cells, there are hence 5 convolutional layers that reduce the grid from 11 $ {\times} $ 11 to a single cell; for the outer cells, there are 10 convolutional layers that reduce the grid from 21 $ {\times} $ 21 to a single cell. The numbers of trainable parameters (TP) are also given for the different subnetworks.
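The quoted numbers of convolutional layers follow from simple arithmetic if each layer uses an unpadded window with unit stride. The sketch below assumes a 3 $ {\times} $ 3 window, which is not stated in the caption but reproduces the counts of 5 and 10; the function names are illustrative.

```python
def n_conv_layers(grid_size, kernel_size=3):
    """Number of unpadded, stride-1 convolutions that shrink N x N to 1 x 1.

    Each such layer removes (kernel_size - 1) cells per dimension.
    The 3 x 3 default is an assumption, not taken from the caption.
    """
    shrink = kernel_size - 1
    if grid_size < 1 or shrink < 1 or (grid_size - 1) % shrink != 0:
        raise ValueError("grid cannot be reduced to exactly one cell")
    return (grid_size - 1) // shrink

def grid_sizes(grid_size, kernel_size=3):
    """Grid side length after each successive convolutional layer."""
    n_layers = n_conv_layers(grid_size, kernel_size)  # also validates the inputs
    return [grid_size - k * (kernel_size - 1) for k in range(n_layers + 1)]

print(n_conv_layers(11), grid_sizes(11))  # inner grid
print(n_conv_layers(21), grid_sizes(21))  # outer grid
```

Under this assumption the inner grid shrinks as 11, 9, 7, 5, 3, 1 and the outer grid needs twice as many steps, in agreement with the caption.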

Figure 4:
Efficiency for quark and gluon jets to pass different tau identification discriminators versus the efficiency for genuine $ \tau_\mathrm{h} $. The upper two plots are obtained with jets from the W+jets simulated sample and the lower two plots with jets from the $ \mathrm{t} \overline{\mathrm{t}} $ sample. The left two plots include jets and genuine $ \tau_\mathrm{h} $ with $ p_{\mathrm{T}} < $ 100 GeV, whereas the right two plots include those with $ p_{\mathrm{T}} > $ 100 GeV. The working points are indicated as full circles. The efficiency to pass the discriminators is higher for jets from the W+jets event sample, which is enriched in quark jets, than for jets from the $ \mathrm{t} \overline{\mathrm{t}} $ event sample, which has a larger fraction of gluon and b-quark jets. The jet efficiency for a given $ \tau_\mathrm{h} $ efficiency is larger for jets and $ \tau_\mathrm{h} $ with $ p_{\mathrm{T}} < $ 100 GeV than for those with $ p_{\mathrm{T}} > $ 100 GeV. Compared with the previously used MVA discriminator, the DEEPTAU discriminator consistently reduces the jet efficiency for a given $ \tau_\mathrm{h} $ efficiency by more than a factor of 1.8, and by a larger factor at high $ \tau_\mathrm{h} $ efficiency. The additional gain at high $ p_{\mathrm{T}} $ comes from the inclusion of updated decay modes in the $ \tau_\mathrm{h} $ reconstruction, as illustrated by the curves for the previously used MVA discriminator applied to reconstructed $ \tau_\mathrm{h} $ candidates that include the additional decay modes.
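A reduction factor of this kind is read off by comparing two curves at the same $ \tau_\mathrm{h} $ efficiency. The following sketch shows one way to do so with linear interpolation between curve points; the function names are illustrative and the curve points are toy numbers, not CMS results.

```python
from bisect import bisect_left

def jet_eff_at(tau_eff, roc):
    """Linearly interpolate the jet efficiency at a given tau_h efficiency.

    roc: list of (tau_eff, jet_eff) points sorted by increasing tau_eff.
    """
    xs = [point[0] for point in roc]
    ys = [point[1] for point in roc]
    if not xs[0] <= tau_eff <= xs[-1]:
        raise ValueError("tau_h efficiency outside the range covered by the curve")
    k = bisect_left(xs, tau_eff)  # first point with xs[k] >= tau_eff
    if xs[k] == tau_eff:
        return ys[k]
    frac = (tau_eff - xs[k - 1]) / (xs[k] - xs[k - 1])
    return ys[k - 1] + frac * (ys[k] - ys[k - 1])

def reduction_factor(tau_eff, roc_old, roc_new):
    """Ratio of old to new jet efficiency at a fixed tau_h efficiency."""
    return jet_eff_at(tau_eff, roc_old) / jet_eff_at(tau_eff, roc_new)

# Toy curves for illustration only (hypothetical values).
toy_old = [(0.4, 0.002), (0.6, 0.006), (0.8, 0.020)]
toy_new = [(0.4, 0.001), (0.6, 0.003), (0.8, 0.008)]
print(reduction_factor(0.6, toy_old, toy_new))
```

Because jet efficiencies span orders of magnitude, interpolating in the logarithm of the jet efficiency may be a better choice for sparsely sampled curves.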

Figure 5:
Efficiencies for simulated $ \tau_\mathrm{h} $ decays with $ |\eta| < $ 2.3 to pass the following reconstruction and identification requirements: to be reconstructed in any decay mode with $ p_{\mathrm{T}} > $ 20 GeV and $ |\eta| < $ 2.3 (black dashed line); to be reconstructed in any decay mode except those with missing charged hadrons, labelled ``2-prong'' (full black line); and to be reconstructed in any decay mode except the 2-prong ones and to pass the Loose, Medium, or Tight working point of the $ D_\text{jet} $ discriminator (blue lines). The efficiencies are obtained with a $ \mathrm{Z}\to\tau\tau $ event sample and are shown as a function of the visible $ p_{\mathrm{T}} $ of the genuine $ \tau_\mathrm{h} $, obtained from the simulated decay products.

Figure 6:
Efficiency for electrons versus efficiency for genuine $ \tau_\mathrm{h} $ to pass the MVA and $ D_\mathrm{e} $ discriminators, separately for electrons and $ \tau_\mathrm{h} $ with 20 $ < p_{\mathrm{T}} < $ 100 GeV (left) and $ p_{\mathrm{T}} > $ 100 GeV (right). Vertical bars correspond to the statistical uncertainties. The $ \tau_\mathrm{h} $ candidates are reconstructed in one of the $ \tau_\mathrm{h} $ decay modes without missing charged hadrons. Compared with the MVA discriminator, the $ D_\mathrm{e} $ discriminator reduces the electron efficiency by more than a factor of two for a $ \tau_\mathrm{h} $ efficiency of 70% and by more than a factor of 10 for $ \tau_\mathrm{h} $ efficiencies larger than 88%. Furthermore, working points (indicated as full circles) are now provided for previously inaccessible $ \tau_\mathrm{h} $ efficiencies larger than 90%, for a misidentification efficiency between 0.3 and 8%.

Figure 7:
Efficiency for muons versus efficiency for simulated $ \tau_\mathrm{h} $ to pass the cutoff-based and $ D_\mu $ discriminators, separately for muons and $ \tau_\mathrm{h} $ with 20 $ < p_{\mathrm{T}} < $ 100 GeV (left) and $ p_{\mathrm{T}} > $ 100 GeV (right). The four working points are indicated as full circles. Vertical bars correspond to the statistical uncertainties. In both $ p_{\mathrm{T}} $ regimes, the $ D_\mu $ discriminator rejects up to a factor of 10 more muons at $ \tau_\mathrm{h} $ efficiencies around 99%, and it leads to an increase of the $ \tau_\mathrm{h} $ efficiency for a similar background rejection by about 0.5%.

Figure 8:
Data-to-simulation scale factors for genuine $ \tau_\mathrm{h} $ to be reconstructed as $ \tau_\mathrm{h} $ candidates and to pass the Loose, Medium, and Tight working points of the $ D_\text{jet} $ discriminator as a function of the $ \tau_\mathrm{h} $ candidate $ p_{\mathrm{T}} $ (left). Vertical bars correspond to the combined statistical and systematic uncertainties in the individual scale factors. The red hatched bands indicate the uncertainties for $ p_{\mathrm{T}} > $ 40 GeV, obtained from a combination of the individual measurements. The right plot shows the data-to-simulation scale factors for the $ \tau_\mathrm{h} $ candidates with $ p_{\mathrm{T}} > $ 40 GeV to pass the Medium $ D_\text{jet} $ working point as a function of reconstructed $ \tau_\mathrm{h} $ decay mode. The efficiencies are obtained with 2018 data and the corresponding simulated events using a likelihood fit to the distribution of the reconstructed $ m_\text{vis}(\mu, \tau_\mathrm{h}) $. The scale factors are shown separately for data taken in 2016, 2017, and 2018 (and the corresponding simulated events) and for the four main $ \tau_\mathrm{h} $ decay modes.
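A data-to-simulation scale factor is the ratio of the efficiency measured in data to the efficiency in simulated events. The sketch below is a simplified illustration of that ratio with uncorrelated uncertainty propagation; it is not the likelihood fit used for the figure, and the function name and input values are hypothetical.

```python
import math

def scale_factor(eff_data, err_data, eff_sim, err_sim):
    """Return (SF, uncertainty) with SF = eff_data / eff_sim.

    Assumption: the two efficiencies are uncorrelated, so their relative
    uncertainties are added in quadrature.
    """
    if eff_data <= 0.0 or eff_sim <= 0.0:
        raise ValueError("efficiencies must be positive")
    sf = eff_data / eff_sim
    rel_err = math.hypot(err_data / eff_data, err_sim / eff_sim)
    return sf, sf * rel_err

# Hypothetical inputs for illustration only.
sf, sf_err = scale_factor(0.57, 0.03, 0.60, 0.0)
print(f"SF = {sf:.3f} +/- {sf_err:.3f}")
```

A scale factor compatible with unity within its uncertainty indicates that the simulation describes the efficiency in data.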

Figure 9:
Relative difference between $ \tau_\mathrm{h} $ energy obtained in data and simulated events for the four main reconstructed $ \tau_\mathrm{h} $ decay modes for the 2018 data set. The results are obtained from fits to the distribution of either the reconstructed $ m_\text{vis}(\mu, \tau_\mathrm{h}) $ (blue lines) or $ m_{\tau_\mathrm{h}} $ (black lines). The horizontal bars represent the uncertainties in the measurements. The measured values are consistent with no shift of the $ \tau_\mathrm{h} $ energy scale between data and simulation, with the largest difference amounting to 1.5 standard deviations.

Figure 10:
Distribution of the reconstructed visible invariant mass of the $ \mu\tau_\mathrm{h} $ system, $ m_\text{vis}(\mu, \tau_\mathrm{h}) $ (left) and of the visible invariant $ \tau_\mathrm{h} $ mass (right). Vertical bars correspond to the statistical uncertainties. The event selection corresponds to the one in the measurement of the $ \tau_\mathrm{h} $ reconstruction and identification efficiencies and uses the Tight working point of the $ D_\text{jet} $ discriminator. The distributions incorporate all measured scale factors and energy corrections and are scaled to the outcome of a maximum likelihood fit to the observed data with the $ \mathrm{Z}\to\tau\tau $ contribution freely floating. The electroweak background combines contributions from single top quark, diboson, and W+jets processes as well as $ \mathrm{Z}(\to\tau\tau) $+jets events where the reconstructed $ \tau_\mathrm{h} $ originates from a jet misidentified as a $ \tau_\mathrm{h} $ candidate instead of one of the produced tau leptons. In the $ m(\tau_\mathrm{h}) $ distribution, the $ \mathrm{Z}\to\tau\tau $ contributions are shown separately for the different $ \tau_\mathrm{h} $ decay modes.

Figure 11:
Observed and expected efficiencies for electrons to pass different $ D_\mathrm{e} $ working points. Vertical bars correspond to the combined statistical and systematic uncertainties in the individual scale factors. The electrons are required to be reconstructed as $ \tau_\mathrm{h} $ candidates and to pass the Medium $ D_\text{jet} $ working point. The efficiencies are shown separately for electrons with $ |\eta| < $ 1.46 (left), corresponding to the ECAL barrel region, and $ |\eta| > $ 1.56 (right), corresponding to the ECAL endcap regions.

Figure 12:
Observed and expected efficiencies for muons to pass the Loose (left) and Tight (right) $ D_\mu $ working points. Vertical bars correspond to the combined statistical and systematic uncertainties in the individual scale factors. The muons are required to be reconstructed as $ \tau_\mathrm{h} $ candidates and to pass the Medium $ D_\text{jet} $ working point. The efficiencies are shown for several bins in $ |\eta| $.

Figure 13:
Observed and expected efficiencies for quark and gluon jets with $ |\eta| < $ 2.3 to be reconstructed as $ \tau_\mathrm{h} $ candidates with $ p_{\mathrm{T}} > $ 20 GeV and to pass the Loose (left) and Tight (right) $ D_\text{jet} $ working points. The efficiencies are shown as a function of reconstructed jet $ p_{\mathrm{T}} $ and are obtained with data recorded in 2018 and the corresponding simulated events. The shaded uncertainty band includes contributions from the limited number of simulated events and from uncertainties in the jet energy scale and the $ p_{\mathrm{T}}^\text{miss} $ description. Besides the statistical uncertainty in the observed events, the error bars in the ratio of data to simulation also include uncertainties from the subtraction of events with genuine $ \tau_\mathrm{h} $ candidates, electrons, or muons. The efficiency first rises with $ p_{\mathrm{T}} $ near the 20 GeV threshold because it becomes more likely for a jet to give rise to a reconstructed $ \tau_\mathrm{h} $ candidate that passes this threshold. At higher $ p_{\mathrm{T}} $, the particle multiplicity in a quark or gluon jet increases with $ p_{\mathrm{T}} $. The jets therefore become easier to distinguish from genuine $ \tau_\mathrm{h} $ candidates, and the probability for quark or gluon jets to pass the $ D_\text{jet} $ discriminator decreases with $ p_{\mathrm{T}} $.

Tables

Table 1:
Decays of $ \tau $ leptons and their branching fractions ($ \mathcal{B} $) in % [60]. The known intermediate resonances of all the listed hadrons are indicated where appropriate. Charged hadrons are denoted by the symbol $ \mathrm{h}^{\pm} $. Although only $ \tau^{-} $ decays are shown, the decays and values of the branching fractions are identical for charge-conjugate decays.

Table 2:
Input variables used for the various kinds of particles that are contained in a given cell. For each type of particle, basic kinematic quantities ($ p_{\mathrm{T}}, \eta, \phi $) are included but not listed below. Similarly, the reconstructed charge is included for all charged particles. An estimated per-particle probability for the particle to come from a pileup interaction using the pileup identification (PUPPI) algorithm [61] is labelled as PUPPI. A number of input variables that give the compatibility of the track with the primary interaction vertex (PV) or a possible secondary vertex (SV) from the $ \tau_\mathrm{h} $ reconstruction are denoted as ``Track PV'' and ``Track SV''.
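As an illustration of how particles can be assigned to the cells referred to in this caption, the sketch below maps a particle to a cell of a square grid in $ (\eta, \phi) $ centred on the $ \tau_\mathrm{h} $ axis. The grid dimensions (`n_cells`, `cell_size`), the function names, and the dictionary layout of a particle are assumptions made for this example; they are not taken from the caption.

```python
import math


def delta_phi(phi1, phi2):
    """Signed azimuthal difference phi1 - phi2, wrapped into [-pi, pi)."""
    return (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi


def cell_index(particle_eta, particle_phi, tau_eta, tau_phi,
               n_cells=11, cell_size=0.02):
    """Return the (i_eta, i_phi) cell containing the particle, or None.

    The grid is centred on the tau_h axis. n_cells and cell_size are
    illustrative assumptions, not values quoted in the caption.
    """
    half_width = 0.5 * n_cells * cell_size
    d_eta = particle_eta - tau_eta
    d_phi = delta_phi(particle_phi, tau_phi)
    if abs(d_eta) >= half_width or abs(d_phi) >= half_width:
        return None  # particle lies outside the grid
    # Clamp to guard against floating-point rounding at the upper edge.
    i_eta = min(n_cells - 1, int(math.floor((d_eta + half_width) / cell_size)))
    i_phi = min(n_cells - 1, int(math.floor((d_phi + half_width) / cell_size)))
    return i_eta, i_phi


def fill_grid(particles, tau_eta, tau_phi, **grid_options):
    """Group particles (dicts with 'eta' and 'phi' keys) by grid cell."""
    grid = {}
    for particle in particles:
        index = cell_index(particle["eta"], particle["phi"],
                           tau_eta, tau_phi, **grid_options)
        if index is not None:
            grid.setdefault(index, []).append(particle)
    return grid
```

With these assumed settings, a particle exactly on the $ \tau_\mathrm{h} $ axis lands in the central cell, and the azimuthal wrap-around at $ \pm\pi $ is handled by `delta_phi`.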

Table 3:
Target $ \tau_\mathrm{h} $ identification efficiencies for the different working points defined for the three different discriminators. The target efficiencies are evaluated with the $ \mathrm{H}\to\tau\tau $ event sample for $ \tau_\mathrm{h} $ with $ p_{\mathrm{T}} \in $ [30, 70] GeV.
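A working point defined by a target efficiency corresponds to a threshold on the discriminator output. The sketch below shows one simple way such a threshold could be derived from a list of discriminator scores of genuine $ \tau_\mathrm{h} $ candidates. The function names and the example scores are hypothetical, and the procedure is an assumption for illustration rather than the one used in the paper.

```python
import math


def threshold_for_efficiency(scores, target_efficiency):
    """Largest threshold for which at least the target fraction of
    candidates satisfies score >= threshold."""
    if not scores:
        raise ValueError("scores must not be empty")
    if not 0.0 < target_efficiency <= 1.0:
        raise ValueError("target_efficiency must be in (0, 1]")
    ordered = sorted(scores, reverse=True)
    # Small tolerance so that e.g. 0.7 * 10 is not rounded up to 8.
    n_pass = max(1, math.ceil(target_efficiency * len(ordered) - 1e-9))
    return ordered[n_pass - 1]


def efficiency(scores, threshold):
    """Fraction of candidates with score >= threshold."""
    return sum(score >= threshold for score in scores) / len(scores)


# Hypothetical discriminator scores of genuine tau_h candidates.
example_scores = [0.05, 0.15, 0.25, 0.35, 0.45, 0.55, 0.65, 0.75, 0.85, 0.95]
medium_threshold = threshold_for_efficiency(example_scores, 0.5)
```

Applying the same threshold to candidates from jets, electrons, or muons would then give the corresponding misidentification efficiency via `efficiency`.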

Table A1:
Parameters used in the definition of the loss function for the training of the DeepTau algorithm.
Summary
A new algorithm has been introduced to discriminate hadronic tau lepton decays ($ \tau_\mathrm{h} $) against jets, electrons, and muons. The algorithm is based on a deep neural network and combines fully connected and convolutional layers. As input, the algorithm combines information from individual reconstructed particles near the $ \tau_\mathrm{h} $ axis with information about the reconstructed $ \tau_\mathrm{h} $ candidate and other high-level variables. In addition, an improved $ \tau_\mathrm{h} $ reconstruction algorithm is introduced that increases the overall efficiency of the reconstruction by explicitly considering the $ \tau_\mathrm{h} $ decay mode with three charged hadrons and a neutral pion, and by applying looser quality criteria for the charged hadrons in the case of three-prong $ \tau_\mathrm{h} $ decays. The new $ \tau_\mathrm{h} $ identification and reconstruction algorithms perform significantly better than the previously used algorithms, in particular in terms of discrimination against the background from jets and electrons. For a given jet rejection level, the efficiency for genuine $ \tau_\mathrm{h} $ candidates increases by 10--30%. Similarly, the efficiency for genuine $ \tau_\mathrm{h} $ candidates to pass the discriminator against electrons increases by 14% for the loosest working point that is employed in many analyses. Owing to this superior performance, CMS physics analyses with tau leptons will significantly increase their sensitivities when using the new algorithm. The superior performance of the algorithm is validated with collision data. The observed efficiencies for genuine $ \tau_\mathrm{h} $, jets, and electrons to be identified as $ \tau_\mathrm{h} $ typically agree within 10% with the expected efficiencies from simulated events. The agreement is similar to the one observed with previous algorithms and confirms the improvements.
Additional Figures

Additional Figure 1:
Decay mode confusion matrix for high-$ p_{\mathrm{T}} \tau_\mathrm{h} $. For a given generated decay mode, the fractions of reconstructed $ \tau_\mathrm{h} $ in different decay modes are given, as well as the fraction of generated $ \tau_\mathrm{h} $ that are not reconstructed. The generated $ \tau_\mathrm{h} $ needs to fulfil $ p_{\mathrm{T}} > $ 100 GeV whereas the reconstructed $ \tau_\mathrm{h} $ is required to fulfil $ p_{\mathrm{T}} > $ 20 GeV. Both the generated and reconstructed $ \tau_\mathrm{h} $ need to fulfil $ |\eta| < $ 2.3. The $ \tau_\mathrm{h} $ candidates come from a $ \mathrm{Z}\to\tau\tau $ event sample with $ m_{\tau\tau} > $ 50 GeV. Decay modes with the same numbers of charged hadrons and one or two $ \pi^{0} $s are combined and labelled as ``$ \pi^{0} $s''.

Additional Figure 2:
Data-to-simulation scale factors for genuine $ \tau_\mathrm{h} $ to be reconstructed as $ \tau_\mathrm{h} $ candidates, to fulfil additional loose pre-selection criteria discussed in the paper, and to pass the loose, medium, and tight working points of the $ D_\text{jet} $ discriminator as a function of the $ \tau_\mathrm{h} $ candidate $ p_{\mathrm{T}} $ for the 2016 data set. Vertical bars correspond to the combined statistical and systematic uncertainties in the individual scale factors. The red hatched bands indicate the uncertainties for $ p_{\mathrm{T}} > $ 40 GeV, obtained from a combination of the individual measurements.

Additional Figure 3:
Data-to-simulation scale factors for genuine $ \tau_\mathrm{h} $ to be reconstructed as $ \tau_\mathrm{h} $ candidates, to fulfil additional loose pre-selection criteria discussed in the paper, and to pass the loose, medium, and tight working points of the $ D_\text{jet} $ discriminator as a function of the $ \tau_\mathrm{h} $ candidate $ p_{\mathrm{T}} $ for the 2017 data set. Vertical bars correspond to the combined statistical and systematic uncertainties in the individual scale factors. The red hatched bands indicate the uncertainties for $ p_{\mathrm{T}} > $ 40 GeV, obtained from a combination of the individual measurements.
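For orientation, a data-to-simulation scale factor is the ratio of the efficiency observed in data to the efficiency expected from simulation. The sketch below computes such a ratio with first-order propagation of uncorrelated uncertainties, and combines several measurements with an inverse-variance weighted average. The function names and all numbers are hypothetical, and the weighted average is only one common way to combine measurements; it is an assumption here, not necessarily the procedure behind the hatched bands.

```python
import math


def scale_factor(eff_data, err_data, eff_sim, err_sim):
    """Ratio eff_data / eff_sim with its uncertainty, assuming the two
    uncertainties are uncorrelated (first-order error propagation)."""
    sf = eff_data / eff_sim
    relative_error = math.hypot(err_data / eff_data, err_sim / eff_sim)
    return sf, sf * relative_error


def combine(values, errors):
    """Inverse-variance weighted average of independent measurements."""
    weights = [1.0 / error ** 2 for error in errors]
    total_weight = sum(weights)
    mean = sum(w * v for w, v in zip(weights, values)) / total_weight
    return mean, 1.0 / math.sqrt(total_weight)


# Hypothetical efficiencies in one p_T bin (not values from the paper).
sf, sf_err = scale_factor(0.54, 0.027, 0.60, 0.006)
# Hypothetical scale factors in two p_T bins above 40 GeV.
combined_sf, combined_err = combine([1.0, 0.9], [0.1, 0.1])
```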

Additional Figure 4:
Relative difference between $ \tau_\mathrm{h} $ energy obtained in data and simulated events for the four main reconstructed $ \tau_\mathrm{h} $ decay modes for the 2016 data set. The results are obtained from fits to the distribution of either the reconstructed $ m_\text{vis}(\mu, \tau_\mathrm{h}) $ (blue lines) or $ m_{\tau_\mathrm{h}} $ (black lines). The horizontal bars represent the uncertainties in the measurements. The measured values are consistent with no shift of the $ \tau_\mathrm{h} $ energy scale between data and simulation, with the largest difference amounting to 1.5 standard deviations.

Additional Figure 5:
Relative difference between $ \tau_\mathrm{h} $ energy obtained in data and simulated events for the four main reconstructed $ \tau_\mathrm{h} $ decay modes for the 2017 data set. The results are obtained from fits to the distribution of either the reconstructed $ m_\text{vis}(\mu, \tau_\mathrm{h}) $ (blue lines) or $ m_{\tau_\mathrm{h}} $ (black lines). The horizontal bars represent the uncertainties in the measurements. The measured values are consistent with no shift of the $ \tau_\mathrm{h} $ energy scale between data and simulation, with the largest difference amounting to 1.5 standard deviations.

Additional Figure 6:
Distribution of the reconstructed visible invariant mass of the $ \mu\tau_\mathrm{h} $ system, $ m_\text{vis}(\mu, \tau_\mathrm{h}) $, for the 2016 data set. Vertical bars correspond to the statistical uncertainties. The event selection corresponds to the one in the measurement of the $ \tau_\mathrm{h} $ reconstruction and identification efficiencies and uses the Tight working point of the $ D_\text{jet} $ discriminator. The distributions incorporate all measured scale factors and energy corrections and are scaled to the outcome of a maximum likelihood fit to the observed data with the $ \mathrm{Z}\to\tau\tau $ contribution freely floating. The electroweak background combines contributions from single top quark, diboson, and W+jets processes as well as $ \mathrm{Z}(\to\tau\tau) $+jets events where the reconstructed $ \tau_\mathrm{h} $ originates from a jet misidentified as a $ \tau_\mathrm{h} $ candidate instead of one of the produced tau leptons.

Additional Figure 7:
Distribution of the visible invariant $ \tau_\mathrm{h} $ mass for the 2016 data set. Vertical bars correspond to the statistical uncertainties. The event selection corresponds to the one in the measurement of the $ \tau_\mathrm{h} $ reconstruction and identification efficiencies and uses the Tight working point of the $ D_\text{jet} $ discriminator. The distributions incorporate all measured scale factors and energy corrections and are scaled to the outcome of a maximum likelihood fit to the observed data with the $ \mathrm{Z}\to\tau\tau $ contribution freely floating. The electroweak background combines contributions from single top quark, diboson, and W+jets processes as well as $ \mathrm{Z}(\to\tau\tau) $+jets events where the reconstructed $ \tau_\mathrm{h} $ originates from a jet misidentified as a $ \tau_\mathrm{h} $ candidate instead of one of the produced tau leptons. The $ \mathrm{Z}\to\tau\tau $ contributions are shown separately for the different $ \tau_\mathrm{h} $ decay modes.

Additional Figure 8:
Distribution of the reconstructed visible invariant mass of the $ \mu\tau_\mathrm{h} $ system, $ m_\text{vis}(\mu, \tau_\mathrm{h}) $, for the 2017 data set. Vertical bars correspond to the statistical uncertainties. The event selection corresponds to the one in the measurement of the $ \tau_\mathrm{h} $ reconstruction and identification efficiencies and uses the Tight working point of the $ D_\text{jet} $ discriminator. The distributions incorporate all measured scale factors and energy corrections and are scaled to the outcome of a maximum likelihood fit to the observed data with the $ \mathrm{Z}\to\tau\tau $ contribution freely floating. The electroweak background combines contributions from single top quark, diboson, and W+jets processes as well as $ \mathrm{Z}(\to\tau\tau) $+jets events where the reconstructed $ \tau_\mathrm{h} $ originates from a jet misidentified as a $ \tau_\mathrm{h} $ candidate instead of one of the produced tau leptons.

Additional Figure 9:
Distribution of the visible invariant $ \tau_\mathrm{h} $ mass for the 2017 data set. Vertical bars correspond to the statistical uncertainties. The event selection corresponds to the one in the measurement of the $ \tau_\mathrm{h} $ reconstruction and identification efficiencies and uses the Tight working point of the $ D_\text{jet} $ discriminator. The distributions incorporate all measured scale factors and energy corrections and are scaled to the outcome of a maximum likelihood fit to the observed data with the $ \mathrm{Z}\to\tau\tau $ contribution freely floating. The electroweak background combines contributions from single top quark, diboson, and W+jets processes as well as $ \mathrm{Z}(\to\tau\tau) $+jets events where the reconstructed $ \tau_\mathrm{h} $ originates from a jet misidentified as a $ \tau_\mathrm{h} $ candidate instead of one of the produced tau leptons. The $ \mathrm{Z}\to\tau\tau $ contributions are shown separately for the different $ \tau_\mathrm{h} $ decay modes.

Additional Figure 10:
Observed and expected efficiencies for electrons to pass different $ D_\mathrm{e} $ working points for the 2016 data set. Vertical bars correspond to the combined statistical and systematic uncertainties in the individual scale factors. All considered electrons are required to be reconstructed as a $ \tau_\mathrm{h} $ candidate and to pass the medium $ D_\text{jet} $ working point. The efficiencies are shown for electrons with $ |\eta| < $ 1.46, corresponding to the ECAL barrel region.

Additional Figure 11:
Observed and expected efficiencies for electrons to pass different $ D_\mathrm{e} $ working points for the 2016 data set. Vertical bars correspond to the combined statistical and systematic uncertainties in the individual scale factors. All considered electrons are required to be reconstructed as a $ \tau_\mathrm{h} $ candidate and to pass the medium $ D_\text{jet} $ working point. The efficiencies are shown for electrons with $ |\eta| > $ 1.56, corresponding to the ECAL endcap regions.

Additional Figure 12:
Observed and expected efficiencies for electrons to pass different $ D_\mathrm{e} $ working points for the 2017 data set. Vertical bars correspond to the combined statistical and systematic uncertainties in the individual scale factors. All considered electrons are required to be reconstructed as a $ \tau_\mathrm{h} $ candidate and to pass the medium $ D_\text{jet} $ working point. The efficiencies are shown for electrons with $ |\eta| < $ 1.46, corresponding to the ECAL barrel region.

Additional Figure 13:
Observed and expected efficiencies for electrons to pass different $ D_\mathrm{e} $ working points for the 2017 data set. Vertical bars correspond to the combined statistical and systematic uncertainties in the individual scale factors. All considered electrons are required to be reconstructed as a $ \tau_\mathrm{h} $ candidate and to pass the medium $ D_\text{jet} $ working point. The efficiencies are shown for electrons with $ |\eta| > $ 1.56, corresponding to the ECAL endcap regions.

Additional Figure 14:
Observed and expected efficiencies for muons to pass the loose $ D_\mu $ working point for the 2016 data set. Vertical bars correspond to the combined statistical and systematic uncertainties in the individual scale factors. All considered muons are required to be reconstructed as $ \tau_\mathrm{h} $ candidates and to pass the Medium $ D_\text{jet} $ working point. The efficiencies are shown for several bins in $ |\eta| $. The disagreement in the last bin is discussed in the paper.

Additional Figure 15:
Observed and expected efficiencies for muons to pass the tight $ D_\mu $ working point for the 2016 data set. Vertical bars correspond to the combined statistical and systematic uncertainties in the individual scale factors. All considered muons are required to be reconstructed as $ \tau_\mathrm{h} $ candidates and to pass the Medium $ D_\text{jet} $ working point. The efficiencies are shown for several bins in $ |\eta| $. The disagreement in the last bin is discussed in the paper.

Additional Figure 16:
Observed and expected efficiencies for muons to pass the loose $ D_\mu $ working point for the 2017 data set. Vertical bars correspond to the combined statistical and systematic uncertainties in the individual scale factors. All considered muons are required to be reconstructed as $ \tau_\mathrm{h} $ candidates and to pass the Medium $ D_\text{jet} $ working point. The efficiencies are shown for several bins in $ |\eta| $. The disagreement in the last bin is discussed in the paper.

Additional Figure 17:
Observed and expected efficiencies for muons to pass the tight $ D_\mu $ working point for the 2017 data set. Vertical bars correspond to the combined statistical and systematic uncertainties in the individual scale factors. All considered muons are required to be reconstructed as $ \tau_\mathrm{h} $ candidates and to pass the Medium $ D_\text{jet} $ working point. The efficiencies are shown for several bins in $ |\eta| $. The disagreement in the last bin is discussed in the paper.

Additional Figure 18:
Observed and expected efficiencies for quark and gluon jets with $ |\eta| < $ 2.3 to be reconstructed as $ \tau_\mathrm{h} $ candidates with $ p_{\mathrm{T}} > $ 20 GeV and to pass the loose $ D_\text{jet} $ working point. The efficiencies are shown as a function of reconstructed jet $ p_{\mathrm{T}} $ and are obtained with data recorded in 2016 and the corresponding simulated events. The shaded uncertainty band includes contributions from the limited number of simulated events and from uncertainties in the jet energy scale and the $ p_{\mathrm{T}}^\text{miss} $ description. Besides the statistical uncertainty in the observed events, the error bars in the ratio of data to simulation also include uncertainties from the subtraction of events with genuine $ \tau_\mathrm{h} $ candidates, electrons, or muons. The efficiency first rises with $ p_{\mathrm{T}} $ near the 20 GeV threshold because it becomes more likely for a jet to give rise to a reconstructed $ \tau_\mathrm{h} $ candidate that passes this threshold. For higher $ p_{\mathrm{T}} $, the particle multiplicity in a quark or gluon jet increases with $ p_{\mathrm{T}} $. Therefore, the jets become easier to distinguish from genuine $ \tau_\mathrm{h} $ candidates and the probability for quark or gluon jets to pass the $ D_\text{jet} $ discriminator decreases with $ p_{\mathrm{T}} $.

Additional Figure 19:
Observed and expected efficiencies for quark and gluon jets with $ |\eta| < $ 2.3 to be reconstructed as $ \tau_\mathrm{h} $ candidates with $ p_{\mathrm{T}} > $ 20 GeV and to pass the tight $ D_\text{jet} $ working point. The efficiencies are shown as a function of reconstructed jet $ p_{\mathrm{T}} $ and are obtained with data recorded in 2016 and the corresponding simulated events. The shaded uncertainty band includes contributions from the limited number of simulated events and from uncertainties in the jet energy scale and the $ p_{\mathrm{T}}^\text{miss} $ description. Besides the statistical uncertainty in the observed events, the error bars in the ratio of data to simulation also include uncertainties from the subtraction of events with genuine $ \tau_\mathrm{h} $ candidates, electrons, or muons. The efficiency first rises with $ p_{\mathrm{T}} $ near the 20 GeV threshold because it becomes more likely for a jet to give rise to a reconstructed $ \tau_\mathrm{h} $ candidate that passes this threshold. For higher $ p_{\mathrm{T}} $, the particle multiplicity in a quark or gluon jet increases with $ p_{\mathrm{T}} $. Therefore, the jets become easier to distinguish from genuine $ \tau_\mathrm{h} $ candidates and the probability for quark or gluon jets to pass the $ D_\text{jet} $ discriminator decreases with $ p_{\mathrm{T}} $.

Additional Figure 20:
Observed and expected efficiencies for quark and gluon jets with $ |\eta| < $ 2.3 to be reconstructed as $ \tau_\mathrm{h} $ candidates with $ p_{\mathrm{T}} > $ 20 GeV and to pass the loose $ D_\text{jet} $ working point. The efficiencies are shown as a function of reconstructed jet $ p_{\mathrm{T}} $ and are obtained with data recorded in 2017 and the corresponding simulated events. The shaded uncertainty band includes contributions from the limited number of simulated events and from uncertainties in the jet energy scale and the $ p_{\mathrm{T}}^\text{miss} $ description. Besides the statistical uncertainty in the observed events, the error bars in the ratio of data to simulation also include uncertainties from the subtraction of events with genuine $ \tau_\mathrm{h} $ candidates, electrons, or muons. The efficiency first rises with $ p_{\mathrm{T}} $ near the 20 GeV threshold because it becomes more likely for a jet to give rise to a reconstructed $ \tau_\mathrm{h} $ candidate that passes this threshold. For higher $ p_{\mathrm{T}} $, the particle multiplicity in a quark or gluon jet increases with $ p_{\mathrm{T}} $. Therefore, the jets become easier to distinguish from genuine $ \tau_\mathrm{h} $ candidates and the probability for quark or gluon jets to pass the $ D_\text{jet} $ discriminator decreases with $ p_{\mathrm{T}} $.

Additional Figure 21:
Observed and expected efficiencies for quark and gluon jets with $ |\eta| < $ 2.3 to be reconstructed as $ \tau_\mathrm{h} $ candidates with $ p_{\mathrm{T}} > $ 20 GeV and to pass the tight $ D_\text{jet} $ working point. The efficiencies are shown as a function of reconstructed jet $ p_{\mathrm{T}} $ and are obtained with data recorded in 2017 and the corresponding simulated events. The shaded uncertainty band includes contributions from the limited number of simulated events and from uncertainties in the jet energy scale and the $ p_{\mathrm{T}}^\text{miss} $ description. Besides the statistical uncertainty in the observed events, the error bars in the ratio of data to simulation also include uncertainties from the subtraction of events with genuine $ \tau_\mathrm{h} $ candidates, electrons, or muons. The efficiency first rises with $ p_{\mathrm{T}} $ near the 20 GeV threshold because it becomes more likely for a jet to give rise to a reconstructed $ \tau_\mathrm{h} $ candidate that passes this threshold. For higher $ p_{\mathrm{T}} $, the particle multiplicity in a quark or gluon jet increases with $ p_{\mathrm{T}} $. Therefore, the jets become easier to distinguish from genuine $ \tau_\mathrm{h} $ candidates and the probability for quark or gluon jets to pass the $ D_\text{jet} $ discriminator decreases with $ p_{\mathrm{T}} $.

Additional Figure 22:
Expected misidentification probabilities for quark- or gluon-induced jets from W-plus-jets production to be reconstructed in any of the $ \tau_\mathrm{h} $ decay modes and pass various working points of the DeepTau vs. jets discriminant. The misidentification probabilities are shown as a function of the generated jet $ p_{\mathrm{T}} $.

Additional Figure 23:
Expected misidentification probabilities for jets from $ {\mathrm{t}\overline{\mathrm{t}}} $-plus-jets production to be reconstructed in any of the $ \tau_\mathrm{h} $ decay modes and pass various working points of the DeepTau vs. jets discriminant. The misidentification probabilities are shown as a function of the generated jet $ p_{\mathrm{T}} $.

Additional Figure 24:
Correct charge assignment probability as a function of the generated $ \tau_\mathrm{h} $ $ p_{\mathrm{T}} $. Reconstructed $ \tau_\mathrm{h} $ are required to pass the medium $ D_\text{jet} $ working point. The blue points represent the selection most commonly used in analyses, where reconstructed $ \tau_\mathrm{h} $ decay modes with missing charged hadrons (2-prong) are not considered. The black points represent the case where these $ \tau_\mathrm{h} $ decay modes are considered. For $ p_{\mathrm{T}} \lesssim $ 100 GeV, the correct charge identification probability is close to 100%. For increasing $ p_{\mathrm{T}} $, the probability drops to 90% at $ p_{\mathrm{T}}\approx $ 1 TeV. The red points show the specific case where 3-prong taus are reconstructed in a decay mode with a missing charged hadron. The correct charge assignment probability is $ {\approx}2/3 $, compatible with the expectation that the hadron that is not reconstructed or identified is not correlated with the $ \tau_\mathrm{h} $ charge.
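The value of 2/3 can be reproduced with a simple counting argument, sketched below. It assumes that the charge of a 2-prong candidate is taken from one of the two remaining charged hadrons, and that both the lost hadron and the hadron used for the charge are picked without any correlation with their charge. These assumptions and the function name are illustrative and are not stated in the caption.

```python
from fractions import Fraction


def correct_charge_probability(tau_charge=+1):
    """Probability that a hadron chosen from the two remaining ones
    carries the tau charge, when one of three hadrons is lost at random."""
    # A three-prong decay has two hadrons with the tau charge and one
    # with the opposite charge.
    hadron_charges = [tau_charge, tau_charge, -tau_charge]
    n_correct = 0
    n_total = 0
    for lost in range(len(hadron_charges)):
        remaining = [q for i, q in enumerate(hadron_charges) if i != lost]
        for chosen in remaining:
            n_total += 1
            if chosen == tau_charge:
                n_correct += 1
    return Fraction(n_correct, n_total)


probability = correct_charge_probability()
```

Of the six equally likely combinations, four give the correct charge, so `probability` evaluates to `Fraction(2, 3)`.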
Additional Tables

Additional Table 1:
Efficiencies for genuine $ \tau_\mathrm{h} $, quark- or gluon-induced jets, electrons, and muons to pass the working points defined for the three different DeepTau discriminants. The target $ \tau_\mathrm{h} $ efficiencies are evaluated with an $ \mathrm{H}\to \tau\tau $ event sample for $ \tau_\mathrm{h} $ with $ p_{\mathrm{T}}\in $ [20, 100] GeV. The misidentification efficiencies for quark- or gluon-induced jets are evaluated with jets from an inclusive W-plus-jets event sample with reconstructed $ \tau_\mathrm{h} $ in the same $ p_{\mathrm{T}} $ range. The efficiencies for genuine electrons (muons) are evaluated with an inclusive $ \mathrm{Z}\to\mathrm{e}\mathrm{e} $ ($ \mu\mu $) event sample, again using electrons (muons) that give rise to a reconstructed $ \tau_\mathrm{h} $ with $ p_{\mathrm{T}}\in $ [20, 100] GeV. While the efficiencies for genuine $ \tau_\mathrm{h} $, electrons, and muons are generally representative also for different samples and $ p_{\mathrm{T}} $ ranges, the efficiencies for quark- or gluon-induced jets strongly depend on the jet flavour and $ p_{\mathrm{T}} $. For this reason, these numbers, which are integrated over the $ p_{\mathrm{T}} $, $ \eta $, and jet flavour spectra, are given for illustrative purposes only.
References
1 CMS Collaboration Observation of the Higgs boson decay to a pair of $ \tau $ leptons with the CMS detector PLB 779 (2018) 283 CMS-HIG-16-043
1708.00373
2 CMS Collaboration Search for Higgs boson pair production in events with two bottom quarks and two tau leptons in proton--proton collisions at $ \sqrt{s}= $ 13 TeV PLB 778 (2018) 101 CMS-HIG-17-002
1707.02909
3 ATLAS Collaboration Cross-section measurements of the Higgs boson decaying into a pair of $ \tau $-leptons in proton-proton collisions at $ \sqrt{s}= $ 13 TeV with the ATLAS detector PRD 99 (2019) 072001 1811.08856
4 ATLAS Collaboration Search for resonant and non-resonant Higgs boson pair production in the $ {\text{b}\bar{\text{b}}\tau^+\tau^-} $ decay channel in pp collisions at $ \sqrt{s}= $ 13 TeV with the ATLAS detector PRL 121 (2018) 191801 1808.00336
5 ATLAS Collaboration Test of CP invariance in vector-boson fusion production of the Higgs boson in the $ \text{H}\rightarrow\tau\tau $ channel in proton-proton collisions at $ \sqrt{s} $ = 13 TeV with the ATLAS detector PLB 805 (2020) 135426 2002.05315
6 CMS Collaboration Search for additional neutral MSSM Higgs bosons in the $ \tau\tau $ final state in proton-proton collisions at $ \sqrt{s}= $ 13 TeV JHEP 09 (2018) 007 CMS-HIG-17-020
1803.06553
7 ATLAS Collaboration Search for charged Higgs bosons decaying via H$ ^{\pm} \to \tau^{\pm}\nu_{\tau} $ in the $ \tau $+jets and $ \tau $+lepton final states with 36 fb$ ^{-1} $ of pp collision data recorded at $ \sqrt{s} = $ 13 TeV with the ATLAS experiment JHEP 09 (2018) 139 1807.07915
8 CMS Collaboration Search for an exotic decay of the Higgs boson to a pair of light pseudoscalars in the final state with two b quarks and two $ \tau $ leptons in proton-proton collisions at $ \sqrt{s}= $ 13 TeV PLB 785 (2018) 462 CMS-HIG-17-024
1805.10191
9 CMS Collaboration Search for a heavy pseudoscalar Higgs boson decaying into a 125 GeV Higgs boson and a Z boson in final states with two tau and two light leptons at $ \sqrt{s}= $ 13 TeV JHEP 03 (2020) 065 CMS-HIG-18-023
1910.11634
10 CMS Collaboration Search for lepton flavour violating decays of a neutral heavy Higgs boson to $ \mu\tau $ and e$ \tau $ in proton-proton collisions at $ \sqrt{s}= $ 13 TeV JHEP 03 (2020) 103 CMS-HIG-18-017
1911.10267
11 CMS Collaboration Search for a low-mass $ \tau^+\tau^- $ resonance in association with a bottom quark in proton-proton collisions at $ \sqrt{s}= $ 13 TeV JHEP 05 (2019) 210 CMS-HIG-17-014
1903.10228
12 CMS Collaboration Search for charged Higgs bosons in the H$ ^{\pm} \to \tau^{\pm}\nu_\tau $ decay channel in proton-proton collisions at $ \sqrt{s} = $ 13 TeV JHEP 07 (2019) 142 CMS-HIG-18-014
1903.04560
13 ATLAS Collaboration Search for heavy Higgs bosons decaying into two tau leptons with the ATLAS detector using pp collisions at $ \sqrt{s}= $ 13 TeV PRL 125 (2020) 051801 2002.12223
14 CMS Collaboration Search for direct pair production of supersymmetric partners to the $ \tau $ lepton in proton-proton collisions at $ \sqrt{s}= $ 13 TeV EPJC 80 (2020) 189 CMS-SUS-18-006
1907.13179
15 CMS Collaboration Search for heavy neutrinos and third-generation leptoquarks in hadronic states of two $ \tau $ leptons and two jets in proton-proton collisions at $ \sqrt{s} = $ 13 TeV JHEP 03 (2019) 170 CMS-EXO-17-016
1811.00806
16 ATLAS Collaboration Searches for third-generation scalar leptoquarks in $ \sqrt{s} $ = 13 TeV pp collisions with the ATLAS detector JHEP 06 (2019) 144 1902.08103
17 CMS Collaboration Performance of tau-lepton reconstruction and identification in CMS JINST 7 (2012) P01001 CMS-TAU-11-001
1109.6034
18 CMS Collaboration Reconstruction and identification of $ \tau $ lepton decays to hadrons and $ \nu_\tau $ at CMS JINST 11 (2016) P01019 CMS-TAU-14-001
1510.07488
19 CMS Collaboration Performance of reconstruction and identification of $ \tau $ leptons decaying to hadrons and $ \nu_\tau $ in pp collisions at $ \sqrt{s}= $ 13 TeV JINST 13 (2018) P10005 CMS-TAU-16-003
1809.02816
20 ATLAS Collaboration Identification and energy calibration of hadronically decaying tau leptons with the ATLAS experiment in pp collisions at $ \sqrt{s}= $ 8 TeV EPJC 75 (2015) 303 1412.7086
21 ATLAS Collaboration Reconstruction of hadronic decay products of tau leptons with the ATLAS experiment EPJC 76 (2016) 295 1512.05955
22 Y. LeCun et al. Backpropagation applied to handwritten zip code recognition Neural Comput. 1 (1989) 541
23 Y. LeCun et al. Handwritten digit recognition with a back-propagation network in Advances in Neural Information Processing Systems 2, Morgan Kaufmann, 1990
24 Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner Gradient-based learning applied to document recognition Proc. IEEE 86 (1998) 2278
25 V. Innocente, Y. F. Wang, and Z. P. Zhang Identification of tau decays using a neural network NIM A 323 (1992) 647
26 ATLAS Collaboration Identification of hadronic tau lepton decays using neural networks in the ATLAS experiment ATLAS PUB Note ATL-PHYS-PUB-2019-033, 2019
27 CMS Collaboration Identification of heavy-flavour jets with the CMS detector in pp collisions at 13 TeV JINST 13 (2018) P05011 CMS-BTV-16-002
1712.07158
28 ATLAS Collaboration Identification of jets containing b-hadrons with recurrent neural networks at the ATLAS experiment ATLAS PUB Note ATL-PHYS-PUB-2017-003, 2017
29 ATLAS Collaboration Deep sets based neural networks for impact parameter flavour tagging in ATLAS ATLAS PUB Note ATL-PHYS-PUB-2020-014, 2020
30 E. Bols et al. Jet flavour classification using DeepJet JINST 15 (2020) P12012 2008.10519
31 CMS Collaboration Identification of heavy, energetic, hadronically decaying particles using machine-learning techniques JINST 15 (2020) P06005 CMS-JME-18-002
2004.08262
32 CMS Collaboration HEPData record for this analysis link
33 CMS Collaboration The CMS experiment at the CERN LHC JINST 3 (2008) S08004
34 CMS Collaboration Particle-flow reconstruction and global event description with the CMS detector JINST 12 (2017) P10003 CMS-PRF-14-001
1706.04965
35 M. Cacciari, G. P. Salam, and G. Soyez The anti-$ k_{\mathrm{T}} $ jet clustering algorithm JHEP 04 (2008) 063 0802.1189
36 M. Cacciari, G. P. Salam, and G. Soyez FastJet user manual EPJC 72 (2012) 1896 1111.6097
37 CMS Collaboration Pileup mitigation at CMS in 13 TeV data JINST 15 (2020) P09018 CMS-JME-18-001
2003.00503
38 CMS Collaboration Jet energy scale and resolution in the CMS experiment in pp collisions at 8 TeV JINST 12 (2017) P02014 CMS-JME-13-004
1607.03663
39 CMS Collaboration Performance of missing transverse momentum reconstruction in proton-proton collisions at $ \sqrt{s} = $ 13 TeV using the CMS detector JINST 14 (2019) P07004 CMS-JME-17-001
1903.06078
40 CMS Collaboration Performance of the CMS muon detector and muon reconstruction with proton-proton collisions at $ \sqrt{s}= $ 13 TeV JINST 13 (2018) P06015 CMS-MUO-16-001
1804.04528
41 CMS Collaboration Performance of electron reconstruction and selection with the CMS detector in proton-proton collisions at $ \sqrt{s} = $ 8 TeV JINST 10 (2015) P06005 CMS-EGM-13-001
1502.02701
42 CMS Collaboration Performance of the CMS Level-1 trigger in proton-proton collisions at $ \sqrt{s} = $ 13 TeV JINST 15 (2020) P10017 CMS-TRG-17-001
2006.10165
43 CMS Collaboration The CMS trigger system JINST 12 (2017) P01020 CMS-TRG-12-001
1609.02366
44 J. Alwall et al. The automated computation of tree-level and next-to-leading order differential cross sections, and their matching to parton shower simulations JHEP 07 (2014) 079 1405.0301
45 J. Alwall et al. Comparative study of various algorithms for the merging of parton showers and matrix elements in hadronic collisions EPJC 53 (2008) 473 0706.2569
46 R. Frederix and S. Frixione Merging meets matching in MC@NLO JHEP 12 (2012) 061 1209.6215
47 P. Nason A new method for combining NLO QCD with shower Monte Carlo algorithms JHEP 11 (2004) 040 hep-ph/0409146
48 S. Frixione, P. Nason, and C. Oleari Matching NLO QCD computations with parton shower simulations: the POWHEG method JHEP 11 (2007) 070 0709.2092
49 S. Alioli, P. Nason, C. Oleari, and E. Re NLO single-top production matched with shower in POWHEG: s- and t-channel contributions JHEP 09 (2009) 111 0907.4076
50 S. Alioli, P. Nason, C. Oleari, and E. Re A general framework for implementing NLO calculations in shower Monte Carlo programs: the POWHEG BOX JHEP 06 (2010) 043 1002.2581
51 E. Re Single-top Wt-channel production matched with parton showers using the POWHEG method EPJC 71 (2011) 1547 1009.2450
52 J. M. Campbell, R. K. Ellis, P. Nason, and E. Re Top-pair production and decay at NLO matched with parton showers JHEP 04 (2015) 114 1412.1828
53 T. Sjöstrand et al. An introduction to PYTHIA 8.2 Comput. Phys. Commun. 191 (2015) 159 1410.3012
54 S. Alioli, P. Nason, C. Oleari, and E. Re NLO Higgs boson production via gluon fusion matched with shower in POWHEG JHEP 04 (2009) 002 0812.0578
55 NNPDF Collaboration Parton distributions for the LHC Run II JHEP 04 (2015) 040 1410.8849
56 NNPDF Collaboration Parton distributions from high-precision collider data EPJC 77 (2017) 663 1706.00428
57 CMS Collaboration Extraction and validation of a new set of CMS PYTHIA8 tunes from underlying-event measurements EPJC 80 (2020) 4 CMS-GEN-17-001
1903.12179
58 N. Davidson et al. Universal interface of TAUOLA technical and physics documentation Comput. Phys. Commun. 183 (2012) 821 1002.0543
59 GEANT4 Collaboration GEANT 4---a simulation toolkit NIM A 506 (2003) 250
60 Particle Data Group, P. A. Zyla et al. Review of particle physics Prog. Theor. Exp. Phys. 2020 (2020) 083C01
61 D. Bertolini, P. Harris, M. Low, and N. Tran Pileup per particle identification JHEP 10 (2014) 059 1407.6013
62 I. J. Goodfellow, Y. Bengio, and A. Courville Deep Learning MIT Press, Cambridge, MA, USA, 2016
link
63 S. Ioffe and C. Szegedy Batch normalization: Accelerating deep network training by reducing internal covariate shift 1502.03167
64 G. E. Hinton et al. Improving neural networks by preventing co-adaptation of feature detectors 2012
link
1207.0580
65 N. Srivastava et al. Dropout: A simple way to prevent neural networks from overfitting J. Mach. Learn. Res. 15 (2014) 1929
66 K. He, X. Zhang, S. Ren, and J. Sun Delving deep into rectifiers: Surpassing human-level performance on ImageNet classification link 1502.01852
67 T.-Y. Lin et al. Focal loss for dense object detection TPAMI 42 (2018) 318 1708.02002
68 T. Dozat Incorporating Nesterov momentum into ADAM in Conference Track Proceedings, 4th International Conference on Learning Representations (ICLR), 2016
link
69 D. P. Kingma and J. Ba Adam: A method for stochastic optimization in Conference Track Proceedings, 3rd International Conference on Learning Representations (ICLR), 2015 1412.6980
70 F. Chollet et al. Keras 2015
71 M. Abadi et al. TensorFlow: Large-scale machine learning on heterogeneous systems 2015. Software available at
link
72 CMS Collaboration Measurements of Higgs boson production in the decay channel with a pair of $ \tau $ leptons in proton-proton collisions at $ \sqrt{s} $ = 13 TeV Submitted to Eur. Phys. J. C, 2022 CMS-HIG-19-010
2204.12957
73 CMS Collaboration Measurement of the Higgs boson production rate in association with top quarks in final states with electrons, muons, and hadronically decaying tau leptons at $ \sqrt{s} = $ 13 TeV EPJC 81 (2021) 378 CMS-HIG-19-008
2011.03652
74 CMS Collaboration Search for singly and pair-produced leptoquarks coupling to third-generation fermions in proton-proton collisions at $ \sqrt{s} = $ 13 TeV PLB 819 (2021) 136446 CMS-EXO-19-015
2012.04178
75 CMS Collaboration Precision luminosity measurement in proton-proton collisions at $ \sqrt{s} = $ 13 TeV in 2015 and 2016 at CMS EPJC 81 (2021) 800 CMS-LUM-17-003
2104.01927
76 CMS Collaboration CMS luminosity measurement for the 2017 data-taking period at $ \sqrt{s} = $ 13 TeV CMS Physics Analysis Summary, 2018
CMS-PAS-LUM-17-004
77 CMS Collaboration CMS luminosity measurement for the 2018 data-taking period at $ \sqrt{s} = $ 13 TeV CMS Physics Analysis Summary, 2019
CMS-PAS-LUM-18-002
78 G. Cowan, K. Cranmer, E. Gross, and O. Vitells Asymptotic formulae for likelihood-based tests of new physics EPJC 71 (2011) 1554 1007.1727
79 CMS Collaboration Measurements of inclusive W and Z cross sections in pp collisions at $ \sqrt{s}= $ 7 TeV JHEP 01 (2011) 080 CMS-EWK-10-002
1012.2466