Compact Muon Solenoid
LHC, CERN

CMS-HIG-18-027 ; CERN-EP-2019-261
A deep neural network for simultaneous estimation of b jet energy and resolution
Computing and Software for Big Science 4 (2020) 10
Abstract: We describe a method to obtain point and dispersion estimates for the energies of jets arising from b quarks produced in proton-proton collisions at an energy of $\sqrt{s} = 13$ TeV at the CERN LHC. The algorithm is trained on a large simulated sample of b jets and validated on data recorded by the CMS detector in 2017 corresponding to an integrated luminosity of 41 fb$^{-1}$. A multivariate regression algorithm based on a deep feed-forward neural network employs jet composition and shape information, and the properties of reconstructed secondary vertices associated with the jet. The results of the algorithm are used to improve the sensitivity of analyses that make use of b jets in the final state, such as the observation of Higgs boson decay to $\mathrm{b\bar{b}}$.
Figures

Figure 1:
(left) The $ {{p_{\mathrm {T}}} ^\text {reco}}$ distribution for reconstructed b jets in an MC ${\mathrm{t} {}\mathrm{\bar{t}}}$ sample. (right) Distribution of the regression target for the MC ${\mathrm{t} {}\mathrm{\bar{t}}}$ training sample.

Figure 1-a:
The $ {{p_{\mathrm {T}}} ^\text {reco}}$ distribution for reconstructed b jets in an MC ${\mathrm{t} {}\mathrm{\bar{t}}}$ sample.

Figure 1-b:
Distribution of the regression target for the MC ${\mathrm{t} {}\mathrm{\bar{t}}}$ training sample.

Figure 2:
The 25, 40, 50, and 75% quantiles are shown for the b jet energy scale $ {{p_{\mathrm {T}}} ^{\text {gen}}}/ {{p_{\mathrm {T}}} ^\text {reco}}$ distribution before (blue dashdot) and after (red solid) applying the regression correction as a function of jet ${p_{\mathrm {T}}}$ (left), $\eta $ (center), and $\rho $ (right).

Figure 2-a:
The 25, 40, 50, and 75% quantiles are shown for the b jet energy scale $ {{p_{\mathrm {T}}} ^{\text {gen}}}/ {{p_{\mathrm {T}}} ^\text {reco}}$ distribution before (blue dashdot) and after (red solid) applying the regression correction as a function of jet ${p_{\mathrm {T}}}$.

Figure 2-b:
The 25, 40, 50, and 75% quantiles are shown for the b jet energy scale $ {{p_{\mathrm {T}}} ^{\text {gen}}}/ {{p_{\mathrm {T}}} ^\text {reco}}$ distribution before (blue dashdot) and after (red solid) applying the regression correction as a function of $\eta $.

Figure 2-c:
The 25, 40, 50, and 75% quantiles are shown for the b jet energy scale $ {{p_{\mathrm {T}}} ^{\text {gen}}}/ {{p_{\mathrm {T}}} ^\text {reco}}$ distribution before (blue dashdot) and after (red solid) applying the regression correction as a function of $\rho $.

Figure 3:
Relative jet energy resolution, ${\overline {\mathrm {s}}}$, as a function of generator-level jet $ {{p_{\mathrm {T}}} ^{\text {gen}}}$ (left), $\eta $ (center), and $\rho $ (right) for b jets from ${\mathrm{t} {}\mathrm{\bar{t}}}$ MC events. The average ${p_{\mathrm {T}}}$ of these b jets is 80 GeV. The blue stars and red squares represent ${\overline {\mathrm {s}}}$ before and after the DNN correction, respectively. The relative difference $\Delta {\overline {\mathrm {s}}} / {\overline {\mathrm {s}}} _{\text {baseline}}$ between the ${\overline {\mathrm {s}}}$ values before and after DNN corrections is shown in the lower panels.

Figure 3-a:
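Relative jet energy resolution, ${\overline {\mathrm {s}}}$, as a function of generator-level jet $ {{p_{\mathrm {T}}} ^{\text {gen}}}$ for b jets from ${\mathrm{t} {}\mathrm{\bar{t}}}$ MC events.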

Figure 3-b:
Relative jet energy resolution, ${\overline {\mathrm {s}}}$, as a function of $\eta $ for b jets from ${\mathrm{t} {}\mathrm{\bar{t}}}$ MC events.

Figure 3-c:
Relative jet energy resolution, ${\overline {\mathrm {s}}}$, as a function of $\rho $ for b jets from ${\mathrm{t} {}\mathrm{\bar{t}}}$ MC events.

Figure 4:
Correlation between jet energy resolution $\mathrm {s}$ and the average jet energy resolution estimator $\langle \hat{\mathrm {s}} \rangle $ for b jets from ${\mathrm{t} {}\mathrm{\bar{t}}}$ MC events. The blue circles correspond to the inclusive ${p_{\mathrm {T}}}$ spectrum, while the blue band represents 20% up and down variations of the fitted $\langle \hat{\mathrm {s}} \rangle $ trend for the inclusive ${p_{\mathrm {T}}}$ spectrum. The red stars correspond to jets with ${p_{\mathrm {T}}}$ $\in $ [30, 50] GeV, orange diamonds to ${p_{\mathrm {T}}}$ $\in $ [50, 70] GeV, and green crosses to ${p_{\mathrm {T}}}$ $\in $ [110, 120] GeV.

Figure 5:
Dijet invariant mass distributions for simulated samples of ${\mathrm{Z} (\to \ell ^+\ell ^-)\mathrm{H} (\to \mathrm{b} \mathrm{\bar{b}})}$ events, where two jets and two leptons were selected. Distributions are shown before (dotted blue) and after (solid red) applying the b jet energy corrections. A Bukin function [40] was used to fit the distribution. The fitted mean and width of the core of each distribution are displayed in the figure.

Figure 6:
Distribution of the ratio between the transverse momentum of the leading b-tagged jet and that of the dilepton system from the decay of the Z boson. Distributions are shown before (left) and after (right) applying the b jet energy corrections. The ${\overline {\mathrm {s}}}$ values of the core distributions are included in the figures. The black points and histogram show the distributions for data and simulated events, respectively.

Figure 6-a:
Distribution of the ratio between the transverse momentum of the leading b-tagged jet and that of the dilepton system from the decay of the Z boson. Distributions are shown before applying the b jet energy corrections. The ${\overline {\mathrm {s}}}$ values of the core distributions are included in the figures. The black points and histogram show the distributions for data and simulated events, respectively.

Figure 6-b:
Distribution of the ratio between the transverse momentum of the leading b-tagged jet and that of the dilepton system from the decay of the Z boson. Distributions are shown after applying the b jet energy corrections. The ${\overline {\mathrm {s}}}$ values of the core distributions are included in the figures. The black points and histogram show the distributions for data and simulated events, respectively.
Tables

Table 1:
Relative differences $\Delta {\overline {\mathrm {s}}} / {\overline {\mathrm {s}}} _\text {baseline}$ between the ${\overline {\mathrm {s}}}$ values obtained before and after applying the DNN energy correction for b jets produced in the different physics processes indicated.
Summary
We have described an algorithm that makes it possible to obtain point and dispersion estimates of the energy of jets arising from b quarks in proton-proton collisions. We trained a deep, feed-forward neural network on a sample of simulated b jets arising from the decays of top quark-antiquark pairs, with inputs based on jet composition and shape information, and on properties of the associated reconstructed secondary vertex. The neural network simultaneously provides a robust mean estimator and 25% and 75% quantile estimators for the energy of a b jet. The mean estimator is based on the Huber loss function and is used as an energy correction, while the 25% and 75% quantile estimators are used to build a jet-by-jet resolution estimator, defined as half the difference between these quantiles.
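As an illustration only, the ingredients named above can be sketched in plain Python: a Huber loss for the mean estimator, the standard quantile (pinball) loss for the 25% and 75% quantile estimators, and the per-jet resolution estimator as half the difference between the two predicted quantiles. The function names, the Huber threshold `delta`, and the equal weighting of the three terms in `combined_loss` are assumptions made for this sketch; they are not taken from the text.

```python
def huber_loss(residual, delta=1.0):
    """Huber loss: quadratic for |residual| <= delta, linear beyond it.

    delta is a hypothetical threshold chosen for illustration.
    """
    a = abs(residual)
    if a <= delta:
        return 0.5 * residual ** 2
    return delta * (a - 0.5 * delta)


def quantile_loss(residual, tau):
    """Pinball loss for quantile level tau, with residual = target - prediction."""
    if residual >= 0:
        return tau * residual
    return (tau - 1.0) * residual


def combined_loss(target, mean_pred, q25_pred, q75_pred, delta=1.0):
    """Sum of the three per-jet loss terms (equal weights are an assumption)."""
    return (
        huber_loss(target - mean_pred, delta)
        + quantile_loss(target - q25_pred, 0.25)
        + quantile_loss(target - q75_pred, 0.75)
    )


def resolution_estimate(q25_pred, q75_pred):
    """Per-jet resolution estimator: half the difference between the quantiles."""
    return 0.5 * (q75_pred - q25_pred)
```

For example, `resolution_estimate(0.9, 1.1)` gives a per-jet resolution estimate of about 0.1 in units of the regression target.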

The DNN-based algorithm leverages the information contained in a large training data set consisting of nearly 100 million simulated b jets, and improves the resolution of the b jet energy by 12-15% relative to that obtained after baseline corrections. An improvement of about 20% is observed in the resolution of the invariant mass of b jet pairs resulting from the decay of a Higgs boson produced in association with a Z boson. Events containing a dilepton decay of a Z boson produced in association with a b jet are used to validate the performance of the algorithm on proton-proton collision data recorded with the CMS detector. The jet energy resolution improvement observed in data is consistent with that found in simulation. The resolution estimator is further shown to predict the resolution of b jets with an accuracy of 20% over a ${p_{\mathrm{T}}}$ range between 30 and 350 GeV.
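A relative resolution improvement of this kind can be illustrated with a quantile-based width computed before and after a correction. The sketch below takes the width as half the difference between the 75% and 25% empirical quantiles of a response distribution, mirroring the definition of the per-jet estimator; any further normalization of the width, the linear-interpolation quantile, the helper names, and the sign convention (positive means narrower after correction) are assumptions made for this example.

```python
def quantile(values, q):
    """Empirical quantile with linear interpolation between order statistics."""
    xs = sorted(values)
    if not xs:
        raise ValueError("need at least one value")
    pos = q * (len(xs) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(xs) - 1)
    frac = pos - lo
    return xs[lo] + frac * (xs[hi] - xs[lo])


def half_iqr(values):
    """Width estimate: half the difference between the 75% and 25% quantiles."""
    return 0.5 * (quantile(values, 0.75) - quantile(values, 0.25))


def relative_improvement(baseline, corrected):
    """(s_baseline - s_corrected) / s_baseline; positive means a narrower core."""
    s_base = half_iqr(baseline)
    s_corr = half_iqr(corrected)
    return (s_base - s_corr) / s_base


# Toy response values (illustrative numbers only, not from the paper).
baseline = [0.8, 0.9, 1.0, 1.1, 1.2]
corrected = [0.9, 0.95, 1.0, 1.05, 1.1]
improvement = relative_improvement(baseline, corrected)
```

With these toy inputs the width is halved, so `improvement` is approximately 0.5.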

The results described here are being used by the CMS Collaboration in several physics analyses targeting final states containing b jets, including the observation of the Higgs boson decay to $\mathrm{b\bar{b}}$ [13].
References
1 ATLAS Collaboration Observation of a new particle in the search for the standard model Higgs boson with the ATLAS detector at the LHC PLB 716 (2012) 1 1207.7214
2 CMS Collaboration Observation of a new boson at a mass of 125 GeV with the CMS experiment at the LHC PLB 716 (2012) 30 CMS-HIG-12-028 1207.7235
3 CMS Collaboration A new boson with a mass of 125 GeV observed with the CMS experiment at the Large Hadron Collider Science 338 (2012) 1569
4 ATLAS Collaboration Measurements of Higgs boson production and couplings in the four-lepton channel in pp collisions at center-of-mass energies of 7 and 8 TeV with the ATLAS detector PRD 91 (2015) 012006 1408.5191
5 ATLAS Collaboration Observation and measurement of Higgs boson decays to WW$ ^* $ with the ATLAS detector PRD 92 (2015) 012006 1412.2641
6 ATLAS Collaboration Measurement of Higgs boson production in the diphoton decay channel in pp collisions at center-of-mass energies of 7 and 8 TeV with the ATLAS detector PRD 90 (2014) 112015 1408.7084
7 CMS Collaboration Measurement of the properties of a Higgs boson in the four-lepton final state PRD 89 (2014) 092007 CMS-HIG-13-002 1312.5353
8 CMS Collaboration Measurement of Higgs boson production and properties in the WW decay channel with leptonic final states JHEP 01 (2014) 096 CMS-HIG-13-023 1312.1129
9 CMS Collaboration Observation of the diphoton decay of the Higgs boson and measurement of its properties EPJC 74 (2014) 3076 CMS-HIG-13-001 1407.0558
10 CMS Collaboration Observation of the Higgs boson decay to a pair of $ \tau $ leptons with the CMS detector PLB 779 (2018) 283 CMS-HIG-16-043 1708.00373
11 ATLAS Collaboration Observation of Higgs boson production in association with a top quark pair at the LHC with the ATLAS detector PLB 784 (2018) 173 1806.00425
12 CMS Collaboration Observation of $ \mathrm{t\overline{t}} $H production PRL 120 (2018) 231801 CMS-HIG-17-035 1804.02610
13 CMS Collaboration Observation of Higgs boson decay to bottom quarks PRL 121 (2018) 121801 CMS-HIG-18-016 1808.08242
14 ATLAS Collaboration Observation of $ \mathrm{H \rightarrow \mathrm{b\bar{b}}} $ decays and VH production with the ATLAS detector PLB 786 (2018) 59 1808.08238
15 CDF Collaboration Search for the standard model Higgs boson decaying to a $ \mathrm{b\bar{b}} $ pair in events with one charged lepton and large missing transverse energy using the full CDF data set PRL 109 (2012) 111804 1207.1703
16 CMS Collaboration Search for the standard model Higgs boson produced through vector boson fusion and decaying to $ \mathrm{b\bar{b}} $ PRD 92 (2015) 032008 CMS-HIG-14-004 1506.01010
17 P. J. Huber Robust estimation of a location parameter Ann. Math. Statist. 35 (1964) 73
18 R. W. Koenker and G. Bassett Regression quantiles Econometrica 46 (1978) 33
19 CMS Collaboration The CMS experiment at the CERN LHC JINST 3 (2008) S08004 CMS-00-001
20 CMS Collaboration Particle-flow reconstruction and global event description with the CMS detector JINST 12 (2017) P10003 CMS-PRF-14-001 1706.04965
21 M. Cacciari, G. P. Salam, and G. Soyez The anti-$ {k_{\mathrm{T}}} $ jet clustering algorithm JHEP 04 (2008) 063 0802.1189
22 M. Cacciari, G. P. Salam, and G. Soyez FastJet user manual EPJC 72 (2012) 1896 1111.6097
23 CMS Collaboration Jet energy scale and resolution in the CMS experiment in pp collisions at 8 TeV JINST 12 (2017) P02014 CMS-JME-13-004 1607.03663
24 CMS Collaboration Determination of jet energy calibration and transverse momentum resolution in CMS JINST 6 (2011) P11002 CMS-JME-10-011 1107.4277
25 J. M. Campbell, R. K. Ellis, P. Nason, and E. Re Top-pair production and decay at NLO matched with parton showers JHEP 04 (2015) 114 1412.1828
26 J. Alwall et al. The automated computation of tree-level and next-to-leading order differential cross sections, and their matching to parton shower simulations JHEP 07 (2014) 079 1405.0301
27 CMS Collaboration The CMS trigger system JINST 12 (2017) P01020 CMS-TRG-12-001 1609.02366
28 T. Sjöstrand et al. An introduction to PYTHIA 8.2 CPC 191 (2015) 159 1410.3012
29 CMS Collaboration Event generator tunes obtained from underlying event and multiparton scattering measurements EPJC 76 (2016) 155 CMS-GEN-14-001 1512.00815
30 GEANT4 Collaboration GEANT4---a simulation toolkit NIMA 506 (2003) 250
31 M. Cacciari and G. P. Salam Pileup subtraction using jet areas PLB 659 (2008) 119 0707.1378
32 CMS Collaboration Description and performance of track and primary-vertex reconstruction with the CMS tracker JINST 9 (2014) P10009 CMS-TRK-11-001 1405.6569
33 CMS Collaboration Identification of heavy-flavour jets with the CMS detector in pp collisions at 13 TeV JINST 13 (2018) P05011 CMS-BTV-16-002 1712.07158
34 S. Ioffe and C. Szegedy Batch normalization: accelerating deep network training by reducing internal covariate shift in Proceedings of Machine Learning Research, vol. 37, 2015 1502.03167
35 A. L. Maas et al. Rectifier nonlinearities improve neural network acoustic models
36 F. Chollet et al. Keras Software available from keras.io (2015)
37 M. Abadi et al. TensorFlow: Large-scale machine learning on heterogeneous systems Software available from tensorflow.org (2015)
38 D. P. Kingma and J. Ba Adam: A method for stochastic optimization 1412.6980
39 T. Hastie, R. Tibshirani, and J. Friedman The Elements of Statistical Learning Springer-Verlag New York, 2nd edition
40 A. D. Bukin Fitting function for asymmetric peaks 0711.4449
41 CMS Collaboration Performance of the CMS missing transverse momentum reconstruction in pp data at $ \sqrt{s} = $ 8 TeV JINST 10 (2015) P02006 CMS-JME-13-003 1411.0511