Compact Muon Solenoid
LHC, CERN

CMS-PAS-HIG-18-030
Measurement of $\mathrm{t\overline{t}H}$ production in the $\mathrm{H\rightarrow b\overline{b}}$ decay channel in 41.5 fb$^{-1}$ of proton-proton collision data at $\sqrt{s}=$ 13 TeV
Abstract: A measurement of the associated production of a standard model Higgs boson with a top quark-antiquark pair ($\mathrm{t\overline{t}H}$) in proton-proton collisions at $\sqrt{s}=$ 13 TeV is presented. The result is based on data recorded with the CMS detector at the CERN LHC in 2017 and corresponds to an integrated luminosity of 41.5 fb$^{-1}$. Candidate $\mathrm{t\overline{t}H}$ events are selected based on the number of leptons in the event, targeting all $\mathrm{t\overline{t}}$ decay channels, and are categorised according to the number of jets. Multivariate analysis techniques are employed to further categorise the events and finally to discriminate between signal and background. A combined fit of multivariate discriminant distributions in all categories results in a best-fit value of the $\mathrm{t\overline{t}H}$ signal strength relative to the standard model cross section, $\mu = \sigma/\sigma_{\mathrm{SM}}$, of $\hat{\mu} = $ 1.49 $^{+0.21}_{-0.20}$ (stat) $^{+0.39}_{-0.35}$ (syst), corresponding to an observed (expected) significance of 3.7 (2.6) standard deviations. Combined with previous results obtained with 36.9 fb$^{-1}$ of data recorded in 2016, a best-fit value of $\hat{\mu} = $ 1.15 $^{+0.15}_{-0.15}$ (stat) $^{+0.28}_{-0.25}$ (syst) is found, corresponding to an observed (expected) significance of 3.9 (3.5) standard deviations above the background-only hypothesis.
Figures

Figure 1:
Distribution of $\Delta \eta _{\text {jets}}$ for events with 8 jets and $\geq $4 b-tags in an extended signal region (SRext), which corresponds to the regular SR without the requirement of $\Delta \eta _{\text {jets}} \leq $ 2.52 for this category. The expected background contributions (filled histograms) are stacked, and the expected signal distribution (line) is superimposed. Each contribution is normalised to an integrated luminosity of 41.5 fb$^{-1}$, and the signal distribution is additionally scaled by a factor of 15 for better visibility. The hatched uncertainty bands correspond to the total statistical and systematic uncertainties. The distribution observed in data (markers) is overlaid. The lower plot shows the ratio of the data to the background prediction.
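The normalisation mentioned in the caption can be illustrated with a minimal sketch. It assumes an unweighted simulated sample, so each event receives the weight cross section times luminosity divided by the number of generated events; the cross section and event counts in the usage note are hypothetical placeholders, not values from the analysis. Only the luminosity (41.5 fb$^{-1}$) and the display factor (15) come from the caption.

```python
# Illustrative sketch only: function names and example inputs are hypothetical.
LUMI_FB = 41.5        # integrated luminosity quoted in the caption, in fb^-1
SIGNAL_SCALE = 15.0   # display-only scale factor quoted in the caption


def event_weight(xsec_fb, n_generated, lumi_fb=LUMI_FB):
    """Per-event weight normalising an unweighted simulated sample to lumi_fb."""
    if n_generated <= 0:
        raise ValueError("n_generated must be positive")
    return xsec_fb * lumi_fb / n_generated


def expected_yield(n_selected, xsec_fb, n_generated, scale=1.0):
    """Expected number of selected events, optionally scaled for display."""
    return scale * n_selected * event_weight(xsec_fb, n_generated)
```

For example, a hypothetical process with a cross section of 500 fb, one million generated events, and 2000 selected events would give `expected_yield(2000, 500.0, 1_000_000)` of about 41.5 events, and about 622.5 when drawn with `scale=SIGNAL_SCALE`. The scale factor affects only the plotted signal line, not the fit.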

Figure 2:
Distributions of the QGLR after excluding the first three (left) and first four (right) b-tagged jets (ranked by the DeepCSV output value) for the calculation in the fully-hadronic channel after the baseline selection. The expected background contributions (filled histograms) are stacked, and the expected signal distribution (line) is superimposed. Each contribution is normalised to an integrated luminosity of 41.5 fb$^{-1}$, and the signal distribution is additionally scaled by a factor of 15 for better visibility. The hatched uncertainty bands correspond to the total statistical and systematic uncertainties (excluding the 50% uncertainties on the normalisation of the ${\mathrm{t} {}\mathrm{\bar{t}}}$+hf processes) added in quadrature. The distributions observed in data (markers) are overlaid. The last bin includes overflow events. The lower plots show the ratio of the data to the background prediction.

Figure 2-a:
Distribution of the QGLR after excluding the first three b-tagged jets (ranked by the DeepCSV output value) for the calculation in the fully-hadronic channel after the baseline selection. The expected background contributions (filled histograms) are stacked, and the expected signal distribution (line) is superimposed. Each contribution is normalised to an integrated luminosity of 41.5 fb$^{-1}$, and the signal distribution is additionally scaled by a factor of 15 for better visibility. The hatched uncertainty bands correspond to the total statistical and systematic uncertainties (excluding the 50% uncertainties on the normalisation of the ${\mathrm{t} {}\mathrm{\bar{t}}}$+hf processes) added in quadrature. The distribution observed in data (markers) is overlaid. The last bin includes overflow events. The lower plot shows the ratio of the data to the background prediction.

Figure 2-b:
Distribution of the QGLR after excluding the first four b-tagged jets (ranked by the DeepCSV output value) for the calculation in the fully-hadronic channel after the baseline selection. The expected background contributions (filled histograms) are stacked, and the expected signal distribution (line) is superimposed. Each contribution is normalised to an integrated luminosity of 41.5 fb$^{-1}$, and the signal distribution is additionally scaled by a factor of 15 for better visibility. The hatched uncertainty bands correspond to the total statistical and systematic uncertainties (excluding the 50% uncertainties on the normalisation of the ${\mathrm{t} {}\mathrm{\bar{t}}}$+hf processes) added in quadrature. The distribution observed in data (markers) is overlaid. The last bin includes overflow events. The lower plot shows the ratio of the data to the background prediction.
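Two conventions recur in these captions: overflow events are included in the last bin, and the lower panel shows the bin-by-bin ratio of data to the background prediction. The following sketch shows one way to implement both in plain Python. The function names and the choice to return `None` for bins with no predicted background are assumptions made for illustration, not the plotting code used in the analysis.

```python
def fill_histogram(values, edges, fold_overflow=True, fold_underflow=False):
    """Count values in bins [edges[i], edges[i+1]); optionally fold
    overflow into the last bin and underflow into the first bin."""
    counts = [0.0] * (len(edges) - 1)
    for v in values:
        if v < edges[0]:
            if fold_underflow:
                counts[0] += 1.0
            continue
        if v >= edges[-1]:
            if fold_overflow:
                counts[-1] += 1.0
            continue
        for i in range(len(counts)):
            if edges[i] <= v < edges[i + 1]:
                counts[i] += 1.0
                break
    return counts


def data_to_background_ratio(data, background):
    """Bin-by-bin ratio; None where the background prediction is zero."""
    return [d / b if b > 0 else None for d, b in zip(data, background)]
```

With `edges = [0.0, 0.5, 1.0]`, the value 1.5 lies above the last edge and is counted in the last bin, while a value of -0.2 is dropped unless `fold_underflow=True` is requested, as in the captions that state that the first bin includes underflow events.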

Figure 3:
Distributions of representative variables used as input to the ANN in the $\geq $6 jets, $\geq $3 b-tags category of the single-lepton (SL) channel: likelihood ratio discriminating between events with 4 b quark jets and 2 b quark jets (BLR), sum of the masses of all jets normalised to the number of dijet pairs in the event (${m'_{\text {j}}}$), MEM discriminant (MEM), and scalar sum of ${p_{\mathrm {T}}}$ of b-tagged jets (${H_{\text {T}}^{\text {b}}}$). The background and signal contributions (filled histograms) are stacked, and the hatched uncertainty bands correspond to the total statistical and systematic uncertainties. Shown are the post-fit contributions, where the model parameters are obtained from the final fit of the discriminant distributions to data, described in Section 7, and applied to the shown input variable distributions. The distributions observed in data (markers) are overlaid. In addition, the SM ${{\mathrm{t} {}\mathrm{\bar{t}}} \mathrm{H}}$ signal expectation (line) is overlaid (scaled by a factor of 15 for better visibility). The first and the last bins include underflow and overflow events, respectively. The lower plots show the ratio of the data to the background prediction.

Figure 3-a:
Distribution of the likelihood ratio discriminating between events with 4 b quark jets and 2 b quark jets (BLR) in the $\geq $6 jets, $\geq $3 b-tags category of the single-lepton (SL) channel. The background and signal contributions (filled histograms) are stacked, and the hatched uncertainty bands correspond to the total statistical and systematic uncertainties. Shown are the post-fit contributions, where the model parameters are obtained from the final fit of the discriminant distributions to data, described in Section 7, and applied to the shown input variable distributions. The distribution observed in data (markers) is overlaid. In addition, the SM ${{\mathrm{t} {}\mathrm{\bar{t}}} \mathrm{H}}$ signal expectation (line) is overlaid (scaled by a factor of 15 for better visibility). The first and the last bins include underflow and overflow events, respectively. The lower plot shows the ratio of the data to the background prediction.

Figure 3-b:
Distribution of the sum of the masses of all jets normalised to the number of dijet pairs in the event (${m'_{\text {j}}}$) in the $\geq $6 jets, $\geq $3 b-tags category of the single-lepton (SL) channel. The background and signal contributions (filled histograms) are stacked, and the hatched uncertainty bands correspond to the total statistical and systematic uncertainties. Shown are the post-fit contributions, where the model parameters are obtained from the final fit of the discriminant distributions to data, described in Section 7, and applied to the shown input variable distributions. The distribution observed in data (markers) is overlaid. In addition, the SM ${{\mathrm{t} {}\mathrm{\bar{t}}} \mathrm{H}}$ signal expectation (line) is overlaid (scaled by a factor of 15 for better visibility). The first and the last bins include underflow and overflow events, respectively. The lower plot shows the ratio of the data to the background prediction.

Figure 3-c:
Distribution of the MEM discriminant (MEM) in the $\geq $6 jets, $\geq $3 b-tags category of the single-lepton (SL) channel. The background and signal contributions (filled histograms) are stacked, and the hatched uncertainty bands correspond to the total statistical and systematic uncertainties. Shown are the post-fit contributions, where the model parameters are obtained from the final fit of the discriminant distributions to data, described in Section 7, and applied to the shown input variable distributions. The distribution observed in data (markers) is overlaid. In addition, the SM ${{\mathrm{t} {}\mathrm{\bar{t}}} \mathrm{H}}$ signal expectation (line) is overlaid (scaled by a factor of 15 for better visibility). The first and the last bins include underflow and overflow events, respectively. The lower plot shows the ratio of the data to the background prediction.

Figure 3-d:
Distribution of the scalar sum of ${p_{\mathrm {T}}}$ of b-tagged jets (${H_{\text {T}}^{\text {b}}}$) in the $\geq $6 jets, $\geq $3 b-tags category of the single-lepton (SL) channel. The background and signal contributions (filled histograms) are stacked, and the hatched uncertainty bands correspond to the total statistical and systematic uncertainties. Shown are the post-fit contributions, where the model parameters are obtained from the final fit of the discriminant distributions to data, described in Section 7, and applied to the shown input variable distributions. The distribution observed in data (markers) is overlaid. In addition, the SM ${{\mathrm{t} {}\mathrm{\bar{t}}} \mathrm{H}}$ signal expectation (line) is overlaid (scaled by a factor of 15 for better visibility). The first and the last bins include underflow and overflow events, respectively. The lower plot shows the ratio of the data to the background prediction.

Figure 4:
Distributions of representative variables used as input to the BDT in the $\geq $4 jets, 3 b-tags (left) and $\geq $4 jets, $\geq$4 b-tags (right) categories of the dilepton channel: MEM discriminant (MEM), average b-tagging discriminant value of all b-tagged jets normalised to the total number of jets (average DeepCSV value (b-jets)), and maximum $\Delta \eta $ between any two b-tagged jets ($\Delta \eta ^{\text {max}}_{\text {b},\text {b}}$). The background and signal contributions (filled histograms) are stacked, and the hatched uncertainty bands correspond to the total statistical and systematic uncertainties. Shown are the post-fit contributions, where the model parameters are obtained from the final fit of the discriminant distributions to data, described in Section 7, and applied to the shown input variable distributions. The distributions observed in data (markers) are overlaid. In addition, the SM ${{\mathrm{t} {}\mathrm{\bar{t}}} \mathrm{H}}$ signal expectation (line) is overlaid (scaled by a factor of 15 for better visibility). The first and the last bins include underflow and overflow events, respectively. The lower plots show the ratio of the data to the background prediction.

Figure 4-a:
Panel of Figure 4; the caption of Figure 4 applies.

Figure 4-b:
Panel of Figure 4; the caption of Figure 4 applies.

Figure 4-c:
Panel of Figure 4; the caption of Figure 4 applies.

Figure 4-d:
Panel of Figure 4; the caption of Figure 4 applies.

Figure 5:
Final discriminant shapes in the categories with the highest sensitivity in the fully-hadronic (top), single-lepton (middle), and dilepton (bottom) channels before (left) and after (right) the fit to data. The expected background contributions (filled histograms) are stacked. In the pre-fit case, the expected signal contribution (line), scaled by a factor of 15, is superimposed. In the post-fit case, the fitted signal contribution is also stacked. The hatched uncertainty bands include the total uncertainty of the fit model. The distributions observed in data (markers) are overlaid. The first and the last bins include underflow and overflow events, respectively. The lower plots show the ratio of the data to the background prediction.

Figure 5-a:
Panel of Figure 5; the caption of Figure 5 applies.

Figure 5-b:
Panel of Figure 5; the caption of Figure 5 applies.

Figure 5-c:
Panel of Figure 5; the caption of Figure 5 applies.

Figure 5-d:
Panel of Figure 5; the caption of Figure 5 applies.

Figure 5-e:
Panel of Figure 5; the caption of Figure 5 applies.

Figure 5-f:
Panel of Figure 5; the caption of Figure 5 applies.

Figure 6:
Post-fit pull and best-fit value of the nuisance parameters included in the fit to the 2017 data as well as their impact on the signal strength $\mu $, ordered by their impact. Only the 20 highest ranked parameters are shown. The two highest-ranked nuisance parameters related to the jet energy scale uncertainty sources are shown as indicated in parentheses. The pulls of the nuisance parameters (black markers) are computed relative to their pre-fit values $\theta _{0}$ and uncertainties $\Delta \theta $. The impact $\Delta \hat{\mu}$ is computed as the difference of the nominal best fit value of $\mu $ and the best fit value obtained when fixing the nuisance parameter under scrutiny to its best fit value $\hat{\theta}$ plus/minus its post-fit uncertainty (coloured areas).
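The two quantities defined in this caption can be written out as a short sketch. The pull is taken as $(\hat{\theta} - \theta_{0})/\Delta\theta$, and the impact as the shift in the best-fit $\mu$ when the nuisance parameter is fixed to $\hat{\theta}$ plus or minus its post-fit uncertainty. The sign convention (shifted fit minus nominal fit) and the function names are assumptions made for illustration; the refitted values of $\mu$ would come from the actual likelihood fit, which is not reproduced here.

```python
def pull(theta_hat, theta_0, delta_theta):
    """Post-fit pull of a nuisance parameter relative to its pre-fit
    value theta_0, in units of its pre-fit uncertainty delta_theta."""
    if delta_theta <= 0:
        raise ValueError("delta_theta must be positive")
    return (theta_hat - theta_0) / delta_theta


def impacts(mu_nominal, mu_theta_up, mu_theta_down):
    """Impact on mu-hat: best-fit mu with the parameter fixed at
    theta_hat +/- its post-fit uncertainty, minus the nominal mu-hat."""
    return (mu_theta_up - mu_nominal, mu_theta_down - mu_nominal)
```

As a purely hypothetical example, a parameter fitted at 0.3 with a pre-fit value of 0 and a pre-fit uncertainty of 1 has a pull of 0.3. If refitting with the parameter shifted up and down gave $\mu$ values of 1.25 and 1.07 against a nominal 1.15, the impacts would be about +0.10 and -0.08.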

Figure 7:
Best-fit values of the signal strength modifiers $\mu $ obtained in the fit of the 2017 dataset (left) and in the combined fit of the 2016 and 2017 datasets (right), shown per channel and dataset and for the full combination. Also shown are the 68% expected confidence intervals (outer error bars), split into their statistical (inner error bars) and systematic components.

Figure 7-a:
Panel of Figure 7; the caption of Figure 7 applies.

Figure 7-b:
Panel of Figure 7; the caption of Figure 7 applies.

Figure 8:
Post-fit pull and best-fit value of the constrained (text in black) and unconstrained (text in grey) nuisance parameters included in the fit to the 2016 plus 2017 data as well as their impact on the signal strength $\mu $, ordered by their impact. Only the 20 highest ranked parameters are shown. The pulls of the nuisance parameters (black markers) are computed relative to their pre-fit values $\theta _{0}$ and uncertainties $\Delta \theta $. The impact $\Delta \hat{\mu}$ is computed as the difference of the nominal best fit value of $\mu $ and the best fit value obtained when fixing the nuisance parameter under scrutiny to its best fit value $\hat{\theta}$ plus/minus its post-fit uncertainty (coloured areas).
Tables

Table 1:
Baseline event selection criteria in the fully-hadronic (FH), single-lepton (SL), and dilepton (DL) channels.

Table 2:
Event yields observed in data and predicted by the simulation after the baseline selection in the fully-hadronic (FH), single-lepton (SL), and dilepton (DL) channels prior to the fit to data. Here, the QCD prediction is taken from simulation. The quoted uncertainties correspond to the total statistical and systematic uncertainties (excluding the 50% uncertainties on the normalisation of the ${\mathrm{t} {}\mathrm{\bar{t}}}$+hf processes).

Table 3:
Definition and description of the four mutually exclusive regions in the analysis.

Table 4:
Systematic uncertainties considered in the analysis.

Table 5:
Best fit value of the signal strength modifier $\mu $ and the corresponding observed (obs) and expected (exp) significance in standard deviations in the fully-hadronic (FH), single-lepton (SL), and dilepton (DL) channels and in the channel combination.
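A significance quoted in standard deviations can be translated into a p-value for the background-only hypothesis. The sketch below assumes the one-sided Gaussian tail convention that is common in high-energy physics; the source does not state the convention explicitly, so this is an illustrative assumption.

```python
import math


def p_value_from_significance(z):
    """One-sided Gaussian tail probability for a significance of z
    standard deviations (assumed convention, see lead-in)."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))
```

Under this convention, the observed significances of 3.7 and 3.9 standard deviations quoted in the abstract correspond to p-values of roughly $1 \times 10^{-4}$ and $5 \times 10^{-5}$, respectively.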

Table 6:
Contributions of different sources of uncertainties to the result for the combined fit to the 2016 and 2017 datasets (observed) and to the expectation from simulation (expected). The quoted uncertainties $\Delta \hat{\mu}$ in $\hat{\mu}$ are obtained by fixing the listed sources of uncertainties to their post-fit values in the fit and subtracting the obtained result in quadrature from the result of the full fit. The statistical uncertainty is evaluated by fixing all nuisance parameters to their post-fit values. The quadratic sum of the contributions is different from the total uncertainty because of correlations between the nuisance parameters.
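The subtraction in quadrature described in this caption can be sketched as follows. The function takes the uncertainty on $\hat{\mu}$ from the full fit and the uncertainty obtained with one group of sources fixed to its post-fit values, and returns the contribution attributed to that group. The function name and the clamping of small negative differences to zero are assumptions for illustration, and the numbers in the usage note are hypothetical.

```python
import math


def contribution_in_quadrature(total_unc, unc_with_source_fixed):
    """Uncertainty attributed to a group of sources: the full-fit
    uncertainty minus, in quadrature, the uncertainty obtained with
    that group fixed to its post-fit values."""
    diff = total_unc ** 2 - unc_with_source_fixed ** 2
    # Guard against tiny negative values from numerical noise (assumption).
    return math.sqrt(max(diff, 0.0))
```

For instance, a hypothetical total uncertainty of 0.5 that shrinks to 0.4 when a group is fixed would assign that group a contribution of 0.3. Because the nuisance parameters are correlated, contributions computed this way need not add up in quadrature to the total, as the caption notes.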
Summary
A measurement of the associated production of a Higgs boson and a top quark-antiquark pair (${\mathrm{t\bar{t}}\mathrm{H}}$) in the $\mathrm{b\bar{b}}$ final state of the Higgs boson has been presented. All decay channels of the $\mathrm{t\bar{t}}$ system are considered.

The analysis has been performed in 41.5 fb$^{-1}$ of pp collision data recorded with the CMS detector at a centre-of-mass energy of 13 TeV in 2017. Candidate events are selected in mutually exclusive categories according to the $\mathrm{t\bar{t}}$ decay channel and jet multiplicity. Multivariate discriminants are used to further categorise the events and to separate the ${\mathrm{t\bar{t}}\mathrm{H}}$ signal from the $\mathrm{t\bar{t}}$-dominated background contributions. The signal is extracted in a simultaneous fit of the classifier distributions to the data across all categories and channels.

The best-fit value of the ${\mathrm{t\bar{t}}\mathrm{H}}$ signal strength, defined as the cross section relative to the SM expectation, obtained from the 2017 dataset is $\hat{\mu} = $ 1.49 $^{+0.21}_{-0.20}$ (stat) $^{+0.39}_{-0.35}$ (syst), corresponding to an observed (expected) significance of 3.7 (2.6) standard deviations above the background-only hypothesis. Combined with previous results obtained with 36.9 fb$^{-1}$ of data recorded in 2016, a best-fit value of $\hat{\mu} = $ 1.15 $^{+0.15}_{-0.15}$ (stat) $^{+0.28}_{-0.25}$ (syst) is found, corresponding to an observed (expected) significance of 3.9 (3.5) standard deviations above the background-only hypothesis.
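To get a feel for the overall precision, the statistical and systematic components quoted above can be combined into an approximate total uncertainty. The sketch assumes the two components are independent and adds them in quadrature separately for the upward and downward directions. This is a common approximation and not necessarily the total uncertainty the collaboration would quote from the full fit.

```python
import math


def total_uncertainty(stat, syst):
    """Approximate total uncertainty, assuming independent statistical
    and systematic components added in quadrature."""
    return math.sqrt(stat ** 2 + syst ** 2)


# Components quoted in the text: (stat up, syst up), (stat down, syst down)
up_2017 = total_uncertainty(0.21, 0.39)        # approx. 0.44
down_2017 = total_uncertainty(0.20, 0.35)      # approx. 0.40
up_combined = total_uncertainty(0.15, 0.28)    # approx. 0.32
down_combined = total_uncertainty(0.15, 0.25)  # approx. 0.29
```

Under this approximation, the combination with the 2016 data reduces the total uncertainty on $\hat{\mu}$ from roughly 0.4 to roughly 0.3.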

The presented result, which improves on previous CMS measurements in this channel owing to the increase in integrated luminosity and the usage of a more performant b tagging algorithm as well as refined analysis methods, constitutes the first evidence for ${\mathrm{t\bar{t}}\mathrm{H}}$ production in the $\mathrm{b\bar{b}}$ decay mode of the Higgs boson.
Additional Figures

Additional Figure 1:
Jet (left) and b-tagged jet (right) multiplicity in the fully-hadronic (top), single-lepton (middle), and dilepton (bottom) channels after the baseline selection. The expected background contributions (filled histograms) are stacked, and the expected signal distribution (line) is superimposed. Here, the QCD-multijet prediction is taken from simulation. Each contribution is normalised to an integrated luminosity of 41.5 fb$^{-1}$, and the signal distribution is additionally scaled for better visibility. The hatched uncertainty bands correspond to the total statistical and systematic uncertainties (excluding the 50% uncertainties on the normalisation of the ${{\mathrm {t}\overline {\mathrm {t}}}}$+hf processes) added in quadrature. The distributions observed in data (markers) are overlaid. The last bin includes overflow events. The lower plots show the ratio of the data to the background prediction.

Additional Figure 1-a:
Jet multiplicity in the fully-hadronic channel after the baseline selection. The expected background contributions (filled histograms) are stacked, and the expected signal distribution (line) is superimposed. Here, the QCD-multijet prediction is taken from simulation. Each contribution is normalised to an integrated luminosity of 41.5 fb$^{-1}$, and the signal distribution is additionally scaled for better visibility. The hatched uncertainty bands correspond to the total statistical and systematic uncertainties (excluding the 50% uncertainties on the normalisation of the ${{\mathrm {t}\overline {\mathrm {t}}}}$+hf processes) added in quadrature. The distribution observed in data (markers) is overlaid. The last bin includes overflow events. The lower plot shows the ratio of the data to the background prediction.

Additional Figure 1-b:
b-tagged jet multiplicity in the fully-hadronic channel after the baseline selection. The expected background contributions (filled histograms) are stacked, and the expected signal distribution (line) is superimposed. Here, the QCD-multijet prediction is taken from simulation. Each contribution is normalised to an integrated luminosity of 41.5 fb$^{-1}$, and the signal distribution is additionally scaled for better visibility. The hatched uncertainty bands correspond to the total statistical and systematic uncertainties (excluding the 50% uncertainties on the normalisation of the ${{\mathrm {t}\overline {\mathrm {t}}}}$+hf processes) added in quadrature. The distribution observed in data (markers) is overlaid. The last bin includes overflow events. The lower plot shows the ratio of the data to the background prediction.

Additional Figure 1-c:
Jet multiplicity in the single-lepton channel after the baseline selection. The expected background contributions (filled histograms) are stacked, and the expected signal distribution (line) is superimposed. Here, the QCD-multijet prediction is taken from simulation. Each contribution is normalised to an integrated luminosity of 41.5 fb$^{-1}$, and the signal distribution is additionally scaled for better visibility. The hatched uncertainty bands correspond to the total statistical and systematic uncertainties (excluding the 50% uncertainties on the normalisation of the ${{\mathrm {t}\overline {\mathrm {t}}}}$+hf processes) added in quadrature. The distribution observed in data (markers) is overlaid. The last bin includes overflow events. The lower plot shows the ratio of the data to the background prediction.

Additional Figure 1-d:
b-tagged jet multiplicity in the single-lepton channel after the baseline selection. The expected background contributions (filled histograms) are stacked, and the expected signal distribution (line) is superimposed. Here, the QCD-multijet prediction is taken from simulation. Each contribution is normalised to an integrated luminosity of 41.5 fb$^{-1}$, and the signal distribution is additionally scaled for better visibility. The hatched uncertainty bands correspond to the total statistical and systematic uncertainties (excluding the 50% uncertainties on the normalisation of the ${{\mathrm {t}\overline {\mathrm {t}}}}$+hf processes) added in quadrature. The distribution observed in data (markers) is overlaid. The last bin includes overflow events. The lower plot shows the ratio of the data to the background prediction.

Additional Figure 1-e:
Jet multiplicity in the dilepton channel after the baseline selection. The expected background contributions (filled histograms) are stacked, and the expected signal distribution (line) is superimposed. Here, the QCD-multijet prediction is taken from simulation. Each contribution is normalised to an integrated luminosity of 41.5 fb$^{-1}$, and the signal distribution is additionally scaled for better visibility. The hatched uncertainty bands correspond to the total statistical and systematic uncertainties (excluding the 50% uncertainties on the normalisation of the ${{\mathrm {t}\overline {\mathrm {t}}}}$+hf processes) added in quadrature. The distribution observed in data (markers) is overlayed. The last bin includes overflow events. The lower plot shows the ratio of the data to the background prediction.

Additional Figure 1-f:
b-tagged jet multiplicity in the dilepton channel after the baseline selection. The expected background contributions (filled histograms) are stacked, and the expected signal distribution (line) is superimposed. Here, the QCD-multijet prediction is taken from simulation. Each contribution is normalised to an integrated luminosity of 41.5 fb$^{-1}$, and the signal distribution is additionally scaled for better visibility. The hatched uncertainty bands correspond to the total statistical and systematic uncertainties (excluding the 50% uncertainties on the normalisation of the ${{\mathrm {t}\overline {\mathrm {t}}}}$+hf processes) added in quadrature. The distribution observed in data (markers) is overlayed. The last bin includes overflow events. The lower plot shows the ratio of the data to the background prediction.

Additional Figure 2:
Expected fraction of signal and background events contributing to the analysis categories in the fully-hadronic (FH) channel before the fit to data. The QCD-multijet contribution is estimated from data.

Additional Figure 3:
Expected fraction of signal and background events contributing to the analysis categories in the single-lepton (SL) channel before the fit to data.

Additional Figure 3-a:
Expected fraction of signal and background events contributing to the analysis categories in the single-lepton (SL) channel before the fit to data.

Additional Figure 3-b:
Expected fraction of signal and background events contributing to the analysis categories in the single-lepton (SL) channel before the fit to data.

Additional Figure 3-c:
Expected fraction of signal and background events contributing to the analysis categories in the single-lepton (SL) channel before the fit to data.

Additional Figure 4:
Expected fraction of signal and background events contributing to the analysis categories in the dilepton (DL) channel before the fit to data.

Additional Figure 5:
Normalised distribution of $\Delta \eta _{\text {jets}}$ used for QCD rejection in the fully-hadronic (FH) channel for ${{{\mathrm {t}\overline {\mathrm {t}}}} {\mathrm {H}}}$, ${{\mathrm {t}\overline {\mathrm {t}}}}$+jets, and QCD multijet events (left) and background vs. signal selection efficiency for different requirements on $\Delta \eta _{\text {jets}}$, evaluated with ${{\mathrm {t}\overline {\mathrm {t}}}}$+jets or multijet events as background (right) in the 8 jets, $\geq $4 b-tags category in an extended signal region (SR ext) corresponding to the analysis signal region but without the requirement on $\Delta \eta _{\text {jets}}$ itself. The distributions for ${{{\mathrm {t}\overline {\mathrm {t}}}} {\mathrm {H}}}$ and ${{\mathrm {t}\overline {\mathrm {t}}}}$+jets are taken from simulation while the QCD-multijet background is estimated from data.

Additional Figure 5-a:
Normalised distribution of $\Delta \eta _{\text {jets}}$ used for QCD rejection in the fully-hadronic (FH) channel for ${{{\mathrm {t}\overline {\mathrm {t}}}} {\mathrm {H}}}$, ${{\mathrm {t}\overline {\mathrm {t}}}}$+jets, and QCD multijet events in the 8 jets, $\geq $4 b-tags category in an extended signal region (SR ext) corresponding to the analysis signal region but without the requirement on $\Delta \eta _{\text {jets}}$ itself. The distributions for ${{{\mathrm {t}\overline {\mathrm {t}}}} {\mathrm {H}}}$ and ${{\mathrm {t}\overline {\mathrm {t}}}}$+jets are taken from simulation while the QCD-multijet background is estimated from data.

Additional Figure 5-b:
Background vs. signal selection efficiency for different requirements on $\Delta \eta _{\text {jets}}$, evaluated with ${{\mathrm {t}\overline {\mathrm {t}}}}$+jets or multijet events as background in the 8 jets, $\geq $4 b-tags category in an extended signal region (SR ext) corresponding to the analysis signal region but without the requirement on $\Delta \eta _{\text {jets}}$ itself. The distributions for ${{{\mathrm {t}\overline {\mathrm {t}}}} {\mathrm {H}}}$ and ${{\mathrm {t}\overline {\mathrm {t}}}}$+jets are taken from simulation while the QCD-multijet background is estimated from data.
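
The efficiency scan behind such a curve can be sketched as follows. This is a minimal illustration, not the analysis code: it assumes unweighted events, an upper requirement of the form $\Delta \eta _{\text {jets}} \leq $ cut, and hypothetical function names and input values.

```python
def efficiency(values, cut):
    """Fraction of events passing the requirement value <= cut (unweighted, assumed)."""
    if not values:
        raise ValueError("need at least one event")
    return sum(1 for v in values if v <= cut) / len(values)


def efficiency_curve(signal_values, background_values, cuts):
    """Return (signal efficiency, background efficiency) pairs, one per cut value."""
    return [
        (efficiency(signal_values, c), efficiency(background_values, c))
        for c in cuts
    ]


# Hypothetical per-event values of the discriminating variable
signal = [0.5, 1.0, 1.5, 2.0]
background = [1.0, 2.0, 3.0, 4.0]
for sig_eff, bkg_eff in efficiency_curve(signal, background, [1.0, 2.0, 4.0]):
    print(sig_eff, bkg_eff)
```

Each cut value yields one point of the curve; a tighter requirement lowers both efficiencies, and a useful variable lowers the background efficiency faster than the signal efficiency.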

Additional Figure 6:
Illustration of the definition of the signal (SR) and control (CR) regions used to determine the QCD-multijet background shapes in the fully-hadronic channel, as well as the validation regions (CRval, VR).

Additional Figure 7:
MEM discriminant distribution in the validation region of the fully-hadronic channel (FH VR) in the 8 jets, $\geq $4 b-tags category for data (markers) and backgrounds (stacked distributions). The QCD-multijet background is estimated from data while the other backgrounds are taken from the simulation. The pulls, defined as the difference between data and the total background estimate divided by the quadratic sum of the statistical and systematic uncertainties, are shown below the main panel. The last bin includes overflows.
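
The per-bin pull described in this caption can be written as a short sketch. It assumes that "quadratic sum" means the square root of the sum of the squared statistical and systematic uncertainties; the function name `pull` and the numbers are illustrative and not taken from the analysis.

```python
import math


def pull(data, background, stat_unc, syst_unc):
    """(data - background) / sqrt(stat_unc^2 + syst_unc^2) for a single bin."""
    total_unc = math.hypot(stat_unc, syst_unc)  # uncertainties added in quadrature
    if total_unc == 0.0:
        raise ValueError("total uncertainty must be non-zero")
    return (data - background) / total_unc


# Hypothetical bin: 110 observed events, 100 expected, uncertainties 6 (stat) and 8 (syst)
print(pull(data=110.0, background=100.0, stat_unc=6.0, syst_unc=8.0))  # prints 1.0
```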

Additional Figure 8:
Normalised MEM discriminant distribution for ${{{\mathrm {t}\overline {\mathrm {t}}}} {\mathrm {H}}}$ and different background processes (left) and background vs. ${{{\mathrm {t}\overline {\mathrm {t}}}} {\mathrm {H}}}$ signal selection efficiency for different requirements on the MEM discriminant output, evaluated for different background processes (right) in the 8 jets, $\geq $4 b-tags category of the fully-hadronic (FH) channel. The distributions for ${{{\mathrm {t}\overline {\mathrm {t}}}} {\mathrm {H}}}$ and ${{\mathrm {t}\overline {\mathrm {t}}}}$+jets are taken from simulation while the QCD-multijet background is estimated from data.

Additional Figure 8-a:
Normalised MEM discriminant distribution for ${{{\mathrm {t}\overline {\mathrm {t}}}} {\mathrm {H}}}$ and different background processes in the 8 jets, $\geq $4 b-tags category of the fully-hadronic (FH) channel. The distributions for ${{{\mathrm {t}\overline {\mathrm {t}}}} {\mathrm {H}}}$ and ${{\mathrm {t}\overline {\mathrm {t}}}}$+jets are taken from simulation while the QCD-multijet background is estimated from data.

Additional Figure 8-b:
Background vs. ${{{\mathrm {t}\overline {\mathrm {t}}}} {\mathrm {H}}}$ signal selection efficiency for different requirements on the MEM discriminant output, evaluated for different background processes in the 8 jets, $\geq $4 b-tags category of the fully-hadronic (FH) channel. The distributions for ${{{\mathrm {t}\overline {\mathrm {t}}}} {\mathrm {H}}}$ and ${{\mathrm {t}\overline {\mathrm {t}}}}$+jets are taken from simulation while the QCD-multijet background is estimated from data.

Additional Figure 9:
Bins of the final discriminants as used in the fit of the 2017 dataset (left) and in the combined fit of the 2016 and 2017 datasets (right), reordered by the pre-fit expected signal-to-background ratio (S/B). Each of the shown bins includes multiple bins of the final discriminants with similar S/B. The fitted signal (cyan) is compared to the expectation for the SM Higgs boson $\mu = $ 1 (red).
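
One way to build such a regrouped distribution is sketched below. The caption only states that bins are reordered by pre-fit expected S/B and that bins with similar S/B are merged; the use of $\log_{10}$(S/B) with fixed edges, the function name `regroup_by_sb`, and all numbers are assumptions made for illustration.

```python
import math
from bisect import bisect_right


def regroup_by_sb(bins, edges):
    """Merge discriminant bins into groups of similar pre-fit log10(S/B).

    bins  : list of (expected signal, expected background, observed data) tuples
    edges : ascending log10(S/B) group edges (hypothetical choice)
    Returns one [signal, background, data] sum per group, ordered by increasing S/B.
    """
    n_groups = len(edges) - 1
    groups = [[0.0, 0.0, 0.0] for _ in range(n_groups)]
    for signal, background, data in bins:
        if signal <= 0.0 or background <= 0.0:
            continue  # log10(S/B) is undefined; such bins are skipped in this sketch
        x = math.log10(signal / background)
        # Values outside the edges are clamped into the first or last group
        idx = min(max(bisect_right(edges, x) - 1, 0), n_groups - 1)
        groups[idx][0] += signal
        groups[idx][1] += background
        groups[idx][2] += data
    return groups


# Hypothetical bins from several categories
bins = [(1.0, 100.0, 105.0), (2.0, 150.0, 149.0), (5.0, 50.0, 58.0)]
print(regroup_by_sb(bins, edges=[-3.0, -1.5, 0.0]))
```

With these inputs the first two bins (S/B near 0.01) are merged into one group and the third bin (S/B of 0.1) forms the second group.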

Additional Figure 9-a:
Bins of the final discriminants as used in the fit of the 2017 dataset, reordered by the pre-fit expected signal-to-background ratio (S/B). Each of the shown bins includes multiple bins of the final discriminants with similar S/B. The fitted signal (cyan) is compared to the expectation for the SM Higgs boson $\mu = $ 1 (red).

Additional Figure 9-b:
Bins of the final discriminants as used in the combined fit of the 2016 and 2017 datasets, reordered by the pre-fit expected signal-to-background ratio (S/B). Each of the shown bins includes multiple bins of the final discriminants with similar S/B. The fitted signal (cyan) is compared to the expectation for the SM Higgs boson $\mu = $ 1 (red).

Additional Figure 10:
Final discriminant shapes in the fully-hadronic (FH) channel before the fit to data: MEM discriminant in the jet-tag categories with 3 b-tagged jets (left) and $\geq $4 b-tagged jets (right) with 7, 8, and $\geq $9 jets (from top to bottom). The expected background contributions (filled histograms) are stacked, and the expected signal distribution (line) is superimposed. Each contribution is normalised to an integrated luminosity of 41.5 fb$^{-1}$, and the signal distribution is additionally scaled by a factor of 15 for better visibility. The hatched uncertainty bands include the total uncertainty of the fit model except the effect due to the freely-floating QCD-background normalisation. The distributions observed in data (markers) are overlayed. The first and the last bins include underflow and overflow events, respectively. The lower plots show the ratio of the data to the background prediction.

Additional Figure 10-a:
Final discriminant shapes in the fully-hadronic (FH) channel before the fit to data: MEM discriminant in the jet-tag categories with 3 b-tagged jets with 7 jets. The expected background contributions (filled histograms) are stacked, and the expected signal distribution (line) is superimposed. Each contribution is normalised to an integrated luminosity of 41.5 fb$^{-1}$, and the signal distribution is additionally scaled by a factor of 15 for better visibility. The hatched uncertainty bands include the total uncertainty of the fit model except the effect due to the freely-floating QCD-background normalisation. The distribution observed in data (markers) is overlayed. The first and the last bins include underflow and overflow events, respectively. The lower plot shows the ratio of the data to the background prediction.

Additional Figure 10-b:
Final discriminant shapes in the fully-hadronic (FH) channel before the fit to data: MEM discriminant in the jet-tag categories with $\geq $4 b-tagged jets with 7 jets. The expected background contributions (filled histograms) are stacked, and the expected signal distribution (line) is superimposed. Each contribution is normalised to an integrated luminosity of 41.5 fb$^{-1}$, and the signal distribution is additionally scaled by a factor of 15 for better visibility. The hatched uncertainty bands include the total uncertainty of the fit model except the effect due to the freely-floating QCD-background normalisation. The distribution observed in data (markers) is overlayed. The first and the last bins include underflow and overflow events, respectively. The lower plot shows the ratio of the data to the background prediction.

Additional Figure 10-c:
Final discriminant shapes in the fully-hadronic (FH) channel before the fit to data: MEM discriminant in the jet-tag categories with 3 b-tagged jets with 8 jets. The expected background contributions (filled histograms) are stacked, and the expected signal distribution (line) is superimposed. Each contribution is normalised to an integrated luminosity of 41.5 fb$^{-1}$, and the signal distribution is additionally scaled by a factor of 15 for better visibility. The hatched uncertainty bands include the total uncertainty of the fit model except the effect due to the freely-floating QCD-background normalisation. The distribution observed in data (markers) is overlayed. The first and the last bins include underflow and overflow events, respectively. The lower plot shows the ratio of the data to the background prediction.

Additional Figure 10-d:
Final discriminant shapes in the fully-hadronic (FH) channel before the fit to data: MEM discriminant in the jet-tag categories with $\geq $4 b-tagged jets with 8 jets. The expected background contributions (filled histograms) are stacked, and the expected signal distribution (line) is superimposed. Each contribution is normalised to an integrated luminosity of 41.5 fb$^{-1}$, and the signal distribution is additionally scaled by a factor of 15 for better visibility. The hatched uncertainty bands include the total uncertainty of the fit model except the effect due to the freely-floating QCD-background normalisation. The distribution observed in data (markers) is overlayed. The first and the last bins include underflow and overflow events, respectively. The lower plot shows the ratio of the data to the background prediction.

Additional Figure 10-e:
Final discriminant shapes in the fully-hadronic (FH) channel before the fit to data: MEM discriminant in the jet-tag categories with 3 b-tagged jets with $\geq $9 jets. The expected background contributions (filled histograms) are stacked, and the expected signal distribution (line) is superimposed. Each contribution is normalised to an integrated luminosity of 41.5 fb$^{-1}$, and the signal distribution is additionally scaled by a factor of 15 for better visibility. The hatched uncertainty bands include the total uncertainty of the fit model except the effect due to the freely-floating QCD-background normalisation. The distribution observed in data (markers) is overlayed. The first and the last bins include underflow and overflow events, respectively. The lower plot shows the ratio of the data to the background prediction.

Additional Figure 10-f:
Final discriminant shapes in the fully-hadronic (FH) channel before the fit to data: MEM discriminant in the jet-tag categories with $\geq $4 b-tagged jets with $\geq $9 jets. The expected background contributions (filled histograms) are stacked, and the expected signal distribution (line) is superimposed. Each contribution is normalised to an integrated luminosity of 41.5 fb$^{-1}$, and the signal distribution is additionally scaled by a factor of 15 for better visibility. The hatched uncertainty bands include the total uncertainty of the fit model except the effect due to the freely-floating QCD-background normalisation. The distribution observed in data (markers) is overlayed. The first and the last bins include underflow and overflow events, respectively. The lower plot shows the ratio of the data to the background prediction.

Additional Figure 11:
ANN discriminant shapes in the semi-leptonic (SL) channel before the fit to data in the jet-process categories with 4 jets, $\geq $3 b-tags. The expected background contributions (filled histograms) are stacked, and the expected signal distribution (line) is superimposed. Each contribution is normalised to an integrated luminosity of 41.5 fb$^{-1}$, and the signal distribution is additionally scaled by a factor of 15 for better visibility. The hatched uncertainty bands include the total uncertainty of the fit model. The distributions of the signal+background SM prediction (markers) are overlayed. The first and the last bins include underflow and overflow events, respectively. The lower plots show the ratio of the data to the background prediction.

Additional Figure 11-a:
ANN discriminant shapes in the semi-leptonic (SL) channel before the fit to data in the jet-process categories with 4 jets, $\geq $3 b-tags. The expected background contributions (filled histograms) are stacked, and the expected signal distribution (line) is superimposed. Each contribution is normalised to an integrated luminosity of 41.5 fb$^{-1}$, and the signal distribution is additionally scaled by a factor of 15 for better visibility. The hatched uncertainty bands include the total uncertainty of the fit model. The distributions of the signal+background SM prediction (markers) are overlayed. The first and the last bins include underflow and overflow events, respectively. The lower plots show the ratio of the data to the background prediction.

Additional Figure 11-b:
ANN discriminant shapes in the semi-leptonic (SL) channel before the fit to data in the jet-process categories with 4 jets, $\geq $3 b-tags. The expected background contributions (filled histograms) are stacked, and the expected signal distribution (line) is superimposed. Each contribution is normalised to an integrated luminosity of 41.5 fb$^{-1}$, and the signal distribution is additionally scaled by a factor of 15 for better visibility. The hatched uncertainty bands include the total uncertainty of the fit model. The distributions of the signal+background SM prediction (markers) are overlayed. The first and the last bins include underflow and overflow events, respectively. The lower plots show the ratio of the data to the background prediction.

Additional Figure 11-c:
ANN discriminant shapes in the semi-leptonic (SL) channel before the fit to data in the jet-process categories with 4 jets, $\geq $3 b-tags. The expected background contributions (filled histograms) are stacked, and the expected signal distribution (line) is superimposed. Each contribution is normalised to an integrated luminosity of 41.5 fb$^{-1}$, and the signal distribution is additionally scaled by a factor of 15 for better visibility. The hatched uncertainty bands include the total uncertainty of the fit model. The distributions of the signal+background SM prediction (markers) are overlayed. The first and the last bins include underflow and overflow events, respectively. The lower plots show the ratio of the data to the background prediction.

Additional Figure 11-d:
ANN discriminant shapes in the semi-leptonic (SL) channel before the fit to data in the jet-process categories with 4 jets, $\geq $3 b-tags. The expected background contributions (filled histograms) are stacked, and the expected signal distribution (line) is superimposed. Each contribution is normalised to an integrated luminosity of 41.5 fb$^{-1}$, and the signal distribution is additionally scaled by a factor of 15 for better visibility. The hatched uncertainty bands include the total uncertainty of the fit model. The distributions of the signal+background SM prediction (markers) are overlayed. The first and the last bins include underflow and overflow events, respectively. The lower plots show the ratio of the data to the background prediction.

Additional Figure 11-e:
ANN discriminant shapes in the semi-leptonic (SL) channel before the fit to data in the jet-process categories with 4 jets, $\geq $3 b-tags. The expected background contributions (filled histograms) are stacked, and the expected signal distribution (line) is superimposed. Each contribution is normalised to an integrated luminosity of 41.5 fb$^{-1}$, and the signal distribution is additionally scaled by a factor of 15 for better visibility. The hatched uncertainty bands include the total uncertainty of the fit model. The distributions of the signal+background SM prediction (markers) are overlayed. The first and the last bins include underflow and overflow events, respectively. The lower plots show the ratio of the data to the background prediction.

Additional Figure 11-f:
ANN discriminant shapes in the semi-leptonic (SL) channel before the fit to data in the jet-process categories with 4 jets, $\geq $3 b-tags. The expected background contributions (filled histograms) are stacked, and the expected signal distribution (line) is superimposed. Each contribution is normalised to an integrated luminosity of 41.5 fb$^{-1}$, and the signal distribution is additionally scaled by a factor of 15 for better visibility. The hatched uncertainty bands include the total uncertainty of the fit model. The distributions of the signal+background SM prediction (markers) are overlayed. The first and the last bins include underflow and overflow events, respectively. The lower plots show the ratio of the data to the background prediction.

Additional Figure 12:
ANN discriminant shapes in the semi-leptonic (SL) channel before the fit to data in the jet-process categories with 5 jets, $\geq $3 b-tags. The expected background contributions (filled histograms) are stacked, and the expected signal distribution (line) is superimposed. Each contribution is normalised to an integrated luminosity of 41.5 fb$^{-1}$, and the signal distribution is additionally scaled by a factor of 15 for better visibility. The hatched uncertainty bands include the total uncertainty of the fit model. The distributions of the signal+background SM prediction (markers) are overlayed. The first and the last bins include underflow and overflow events, respectively. The lower plots show the ratio of the data to the background prediction.

Additional Figure 12-a:
ANN discriminant shapes in the semi-leptonic (SL) channel before the fit to data in the jet-process categories with 5 jets, $\geq $3 b-tags. The expected background contributions (filled histograms) are stacked, and the expected signal distribution (line) is superimposed. Each contribution is normalised to an integrated luminosity of 41.5 fb$^{-1}$, and the signal distribution is additionally scaled by a factor of 15 for better visibility. The hatched uncertainty bands include the total uncertainty of the fit model. The distributions of the signal+background SM prediction (markers) are overlayed. The first and the last bins include underflow and overflow events, respectively. The lower plots show the ratio of the data to the background prediction.

Additional Figure 12-b:
ANN discriminant shapes in the semi-leptonic (SL) channel before the fit to data in the jet-process categories with 5 jets, $\geq $3 b-tags. The expected background contributions (filled histograms) are stacked, and the expected signal distribution (line) is superimposed. Each contribution is normalised to an integrated luminosity of 41.5 fb$^{-1}$, and the signal distribution is additionally scaled by a factor of 15 for better visibility. The hatched uncertainty bands include the total uncertainty of the fit model. The distributions of the signal+background SM prediction (markers) are overlayed. The first and the last bins include underflow and overflow events, respectively. The lower plots show the ratio of the data to the background prediction.

Additional Figure 12-c:
ANN discriminant shapes in the semi-leptonic (SL) channel before the fit to data in the jet-process categories with 5 jets, $\geq $3 b-tags. The expected background contributions (filled histograms) are stacked, and the expected signal distribution (line) is superimposed. Each contribution is normalised to an integrated luminosity of 41.5 fb$^{-1}$, and the signal distribution is additionally scaled by a factor of 15 for better visibility. The hatched uncertainty bands include the total uncertainty of the fit model. The distributions of the signal+background SM prediction (markers) are overlayed. The first and the last bins include underflow and overflow events, respectively. The lower plots show the ratio of the data to the background prediction.

Additional Figure 12-d:
ANN discriminant shapes in the semi-leptonic (SL) channel before the fit to data in the jet-process categories with 5 jets, $\geq $3 b-tags. The expected background contributions (filled histograms) are stacked, and the expected signal distribution (line) is superimposed. Each contribution is normalised to an integrated luminosity of 41.5 fb$^{-1}$, and the signal distribution is additionally scaled by a factor of 15 for better visibility. The hatched uncertainty bands include the total uncertainty of the fit model. The distributions of the signal+background SM prediction (markers) are overlayed. The first and the last bins include underflow and overflow events, respectively. The lower plots show the ratio of the data to the background prediction.

Additional Figure 12-e:
ANN discriminant shapes in the semi-leptonic (SL) channel before the fit to data in the jet-process categories with 5 jets, $\geq $3 b-tags. The expected background contributions (filled histograms) are stacked, and the expected signal distribution (line) is superimposed. Each contribution is normalised to an integrated luminosity of 41.5 fb$^{-1}$, and the signal distribution is additionally scaled by a factor of 15 for better visibility. The hatched uncertainty bands include the total uncertainty of the fit model. The distributions of the signal+background SM prediction (markers) are overlayed. The first and the last bins include underflow and overflow events, respectively. The lower plots show the ratio of the data to the background prediction.

Additional Figure 12-f:
ANN discriminant shapes in the semi-leptonic (SL) channel before the fit to data in the jet-process categories with 5 jets, $\geq $3 b-tags. The expected background contributions (filled histograms) are stacked, and the expected signal distribution (line) is superimposed. Each contribution is normalised to an integrated luminosity of 41.5 fb$^{-1}$, and the signal distribution is additionally scaled by a factor of 15 for better visibility. The hatched uncertainty bands include the total uncertainty of the fit model. The distributions of the signal+background SM prediction (markers) are overlayed. The first and the last bins include underflow and overflow events, respectively. The lower plots show the ratio of the data to the background prediction.

Additional Figure 13:
ANN discriminant shapes in the semi-leptonic (SL) channel before the fit to data in the jet-process categories with 6 jets, $\geq $3 b-tags. The expected background contributions (filled histograms) are stacked, and the expected signal distribution (line) is superimposed. Each contribution is normalised to an integrated luminosity of 41.5 fb$^{-1}$, and the signal distribution is additionally scaled by a factor of 15 for better visibility. The hatched uncertainty bands include the total uncertainty of the fit model. The distributions of the signal+background SM prediction (markers) are overlayed. The first and the last bins include underflow and overflow events, respectively. The lower plots show the ratio of the data to the background prediction.

Additional Figure 13-a:
ANN discriminant shapes in the semi-leptonic (SL) channel before the fit to data in the jet-process categories with 6 jets, $\geq $3 b-tags. The expected background contributions (filled histograms) are stacked, and the expected signal distribution (line) is superimposed. Each contribution is normalised to an integrated luminosity of 41.5 fb$^{-1}$, and the signal distribution is additionally scaled by a factor of 15 for better visibility. The hatched uncertainty bands include the total uncertainty of the fit model. The distributions of the signal+background SM prediction (markers) are overlayed. The first and the last bins include underflow and overflow events, respectively. The lower plots show the ratio of the data to the background prediction.

Additional Figure 13-b:
ANN discriminant shapes in the semi-leptonic (SL) channel before the fit to data in the jet-process categories with 6 jets, $\geq $3 b-tags. The expected background contributions (filled histograms) are stacked, and the expected signal distribution (line) is superimposed. Each contribution is normalised to an integrated luminosity of 41.5 fb$^{-1}$, and the signal distribution is additionally scaled by a factor of 15 for better visibility. The hatched uncertainty bands include the total uncertainty of the fit model. The distributions of the signal+background SM prediction (markers) are overlayed. The first and the last bins include underflow and overflow events, respectively. The lower plots show the ratio of the data to the background prediction.

Additional Figure 13-c:
ANN discriminant shapes in the semi-leptonic (SL) channel before the fit to data in the jet-process categories with 6 jets, $\geq $3 b-tags. The expected background contributions (filled histograms) are stacked, and the expected signal distribution (line) is superimposed. Each contribution is normalised to an integrated luminosity of 41.5 fb$^{-1}$, and the signal distribution is additionally scaled by a factor of 15 for better visibility. The hatched uncertainty bands include the total uncertainty of the fit model. The distributions of the signal+background SM prediction (markers) are overlayed. The first and the last bins include underflow and overflow events, respectively. The lower plots show the ratio of the data to the background prediction.

Additional Figure 13-d:
ANN discriminant shapes in the semi-leptonic (SL) channel before the fit to data in the jet-process categories with 6 jets, $\geq $3 b-tags. The expected background contributions (filled histograms) are stacked, and the expected signal distribution (line) is superimposed. Each contribution is normalised to an integrated luminosity of 41.5 fb$^{-1}$, and the signal distribution is additionally scaled by a factor of 15 for better visibility. The hatched uncertainty bands include the total uncertainty of the fit model. The distributions of the signal+background SM prediction (markers) are overlayed. The first and the last bins include underflow and overflow events, respectively. The lower plots show the ratio of the data to the background prediction.

Additional Figure 13-e:
ANN discriminant shapes in the semi-leptonic (SL) channel before the fit to data in the jet-process categories with 6 jets, $\geq $3 b-tags. The expected background contributions (filled histograms) are stacked, and the expected signal distribution (line) is superimposed. Each contribution is normalised to an integrated luminosity of 41.5 fb$^{-1}$, and the signal distribution is additionally scaled by a factor of 15 for better visibility. The hatched uncertainty bands include the total uncertainty of the fit model. The distributions of the signal+background SM prediction (markers) are overlayed. The first and the last bins include underflow and overflow events, respectively. The lower plots show the ratio of the data to the background prediction.

Additional Figure 13-f:
ANN discriminant shapes in the semi-leptonic (SL) channel before the fit to data in the jet-process categories with 6 jets, $\geq $3 b-tags. The expected background contributions (filled histograms) are stacked, and the expected signal distribution (line) is superimposed. Each contribution is normalised to an integrated luminosity of 41.5 fb$^{-1}$, and the signal distribution is additionally scaled by a factor of 15 for better visibility. The hatched uncertainty bands include the total uncertainty of the fit model. The distributions of the signal+background SM prediction (markers) are overlayed. The first and the last bins include underflow and overflow events, respectively. The lower plots show the ratio of the data to the background prediction.

Additional Figure 14:
Final discriminant shapes in the dilepton (DL) channel before the fit to data: BDT discriminant in the jet-tag categories with 3 jets (left) and $\geq $4 jets (right) with 2, 3, and $\geq $4 b-tagged jets (from top to bottom). The expected background contributions (filled histograms) are stacked, and the expected signal distribution (line) is superimposed. Each contribution is normalised to an integrated luminosity of 41.5 fb$^{-1}$, and the signal distribution is additionally scaled by a factor of 15 for better visibility. The hatched uncertainty bands include the total uncertainty of the fit model. The distributions observed in data (markers) are overlayed. The first and the last bins include underflow and overflow events, respectively. The lower plots show the ratio of the data to the background prediction.

Additional Figure 14-a:
Final discriminant shapes in the dilepton (DL) channel before the fit to data: BDT discriminant in the jet-tag categories with 3 jets with 2 b-tagged jets. The expected background contributions (filled histograms) are stacked, and the expected signal distribution (line) is superimposed. Each contribution is normalised to an integrated luminosity of 41.5 fb$^{-1}$, and the signal distribution is additionally scaled by a factor of 15 for better visibility. The hatched uncertainty bands include the total uncertainty of the fit model. The distribution observed in data (markers) is overlayed. The first and the last bins include underflow and overflow events, respectively. The lower plot shows the ratio of the data to the background prediction.

Additional Figure 14-b:
Final discriminant shapes in the dilepton (DL) channel before the fit to data: BDT discriminant in the jet-tag categories with $\geq $4 jets with 2 b-tagged jets. The expected background contributions (filled histograms) are stacked, and the expected signal distribution (line) is superimposed. Each contribution is normalised to an integrated luminosity of 41.5 fb$^{-1}$, and the signal distribution is additionally scaled by a factor of 15 for better visibility. The hatched uncertainty bands include the total uncertainty of the fit model. The distribution observed in data (markers) is overlayed. The first and the last bins include underflow and overflow events, respectively. The lower plot shows the ratio of the data to the background prediction.

Additional Figure 14-c:
Final discriminant shapes in the dilepton (DL) channel before the fit to data: BDT discriminant in the jet-tag categories with 3 jets with 3 b-tagged jets. The expected background contributions (filled histograms) are stacked, and the expected signal distribution (line) is superimposed. Each contribution is normalised to an integrated luminosity of 41.5 fb$^{-1}$, and the signal distribution is additionally scaled by a factor of 15 for better visibility. The hatched uncertainty bands include the total uncertainty of the fit model. The distribution observed in data (markers) is overlayed. The first and the last bins include underflow and overflow events, respectively. The lower plot shows the ratio of the data to the background prediction.

Additional Figure 14-d:
Final discriminant shapes in the dilepton (DL) channel before the fit to data: BDT discriminant in the jet-tag categories with $\geq $4 jets with 3 b-tagged jets. The expected background contributions (filled histograms) are stacked, and the expected signal distribution (line) is superimposed. Each contribution is normalised to an integrated luminosity of 41.5 fb$^{-1}$, and the signal distribution is additionally scaled by a factor of 15 for better visibility. The hatched uncertainty bands include the total uncertainty of the fit model. The distribution observed in data (markers) is overlayed. The first and the last bins include underflow and overflow events, respectively. The lower plot shows the ratio of the data to the background prediction.

Additional Figure 14-e:
Final discriminant shapes in the dilepton (DL) channel before the fit to data: BDT discriminant in the jet-tag categories with $\geq $4 jets with $\geq $4 b-tagged jets. The expected background contributions (filled histograms) are stacked, and the expected signal distribution (line) is superimposed. Each contribution is normalised to an integrated luminosity of 41.5 fb$^{-1}$, and the signal distribution is additionally scaled by a factor of 15 for better visibility. The hatched uncertainty bands include the total uncertainty of the fit model. The distribution observed in data (markers) is overlayed. The first and the last bins include underflow and overflow events, respectively. The lower plot shows the ratio of the data to the background prediction.

Additional Figure 15:
Final discriminant shapes in the fully-hadronic (FH) channel after the fit to data: MEM discriminant in the jet-tag categories with 3 b-tagged jets (left) and $\geq $4 b-tagged jets (right) with 7, 8, and $\geq $9 jets (from top to bottom). The expected signal and background contributions (filled histograms) are stacked. The hatched uncertainty bands include the total uncertainty of the fit model. The distributions observed in data (markers) are overlayed. The first and the last bins include underflow and overflow events, respectively. The lower plots show the ratio of the data to the background prediction.

Additional Figure 15-a:
Final discriminant shapes in the fully-hadronic (FH) channel after the fit to data: MEM discriminant in the jet-tag categories with 3 b-tagged jets with 7 jets. The expected signal and background contributions (filled histograms) are stacked. The hatched uncertainty bands include the total uncertainty of the fit model. The distribution observed in data (markers) is overlayed. The first and the last bins include underflow and overflow events, respectively. The lower plot shows the ratio of the data to the background prediction.

Additional Figure 15-b:
Final discriminant shapes in the fully-hadronic (FH) channel after the fit to data: MEM discriminant in the jet-tag categories with $\geq $4 b-tagged jets with 7 jets. The expected signal and background contributions (filled histograms) are stacked. The hatched uncertainty bands include the total uncertainty of the fit model. The distribution observed in data (markers) is overlayed. The first and the last bins include underflow and overflow events, respectively. The lower plot shows the ratio of the data to the background prediction.

Additional Figure 15-c:
Final discriminant shapes in the fully-hadronic (FH) channel after the fit to data: MEM discriminant in the jet-tag categories with 3 b-tagged jets with 8 jets. The expected signal and background contributions (filled histograms) are stacked. The hatched uncertainty bands include the total uncertainty of the fit model. The distribution observed in data (markers) is overlayed. The first and the last bins include underflow and overflow events, respectively. The lower plot shows the ratio of the data to the background prediction.

Additional Figure 15-d:
Final discriminant shapes in the fully-hadronic (FH) channel after the fit to data: MEM discriminant in the jet-tag categories with $\geq $4 b-tagged jets with 8 jets. The expected signal and background contributions (filled histograms) are stacked. The hatched uncertainty bands include the total uncertainty of the fit model. The distribution observed in data (markers) is overlayed. The first and the last bins include underflow and overflow events, respectively. The lower plot shows the ratio of the data to the background prediction.

Additional Figure 15-e:
Final discriminant shapes in the fully-hadronic (FH) channel after the fit to data: MEM discriminant in the jet-tag categories with 3 b-tagged jets with $\geq $9 jets. The expected signal and background contributions (filled histograms) are stacked. The hatched uncertainty bands include the total uncertainty of the fit model. The distribution observed in data (markers) is overlayed. The first and the last bins include underflow and overflow events, respectively. The lower plot shows the ratio of the data to the background prediction.

Additional Figure 15-f:
Final discriminant shapes in the fully-hadronic (FH) channel after the fit to data: MEM discriminant in the jet-tag categories with $\geq $4 b-tagged jets with $\geq $9 jets. The expected signal and background contributions (filled histograms) are stacked. The hatched uncertainty bands include the total uncertainty of the fit model. The distribution observed in data (markers) is overlayed. The first and the last bins include underflow and overflow events, respectively. The lower plot shows the ratio of the data to the background prediction.

Additional Figure 16:
ANN discriminant shapes in the semi-leptonic (SL) channel after the fit to data in the jet-process categories with 4 jets, $\geq $3 b-tags. The expected signal and background contributions (filled histograms) are stacked. The hatched uncertainty bands include the total uncertainty of the fit model. The distributions observed in data (markers) are overlayed. The first and the last bins include underflow and overflow events, respectively. The lower plots show the ratio of the data to the background prediction.

Additional Figure 16-a:
ANN discriminant shapes in the semi-leptonic (SL) channel after the fit to data in the jet-process categories with 4 jets, $\geq $3 b-tags. The expected signal and background contributions (filled histograms) are stacked. The hatched uncertainty bands include the total uncertainty of the fit model. The distributions observed in data (markers) are overlayed. The first and the last bins include underflow and overflow events, respectively. The lower plots show the ratio of the data to the background prediction.

Additional Figure 16-b:
ANN discriminant shapes in the semi-leptonic (SL) channel after the fit to data in the jet-process categories with 4 jets, $\geq $3 b-tags. The expected signal and background contributions (filled histograms) are stacked. The hatched uncertainty bands include the total uncertainty of the fit model. The distributions observed in data (markers) are overlayed. The first and the last bins include underflow and overflow events, respectively. The lower plots show the ratio of the data to the background prediction.

Additional Figure 16-c:
ANN discriminant shapes in the semi-leptonic (SL) channel after the fit to data in the jet-process categories with 4 jets, $\geq $3 b-tags. The expected signal and background contributions (filled histograms) are stacked. The hatched uncertainty bands include the total uncertainty of the fit model. The distributions observed in data (markers) are overlayed. The first and the last bins include underflow and overflow events, respectively. The lower plots show the ratio of the data to the background prediction.

Additional Figure 16-d:
ANN discriminant shapes in the semi-leptonic (SL) channel after the fit to data in the jet-process categories with 4 jets, $\geq $3 b-tags. The expected signal and background contributions (filled histograms) are stacked. The hatched uncertainty bands include the total uncertainty of the fit model. The distributions observed in data (markers) are overlayed. The first and the last bins include underflow and overflow events, respectively. The lower plots show the ratio of the data to the background prediction.

Additional Figure 16-e:
ANN discriminant shapes in the semi-leptonic (SL) channel after the fit to data in the jet-process categories with 4 jets, $\geq $3 b-tags. The expected signal and background contributions (filled histograms) are stacked. The hatched uncertainty bands include the total uncertainty of the fit model. The distributions observed in data (markers) are overlayed. The first and the last bins include underflow and overflow events, respectively. The lower plots show the ratio of the data to the background prediction.

Additional Figure 16-f:
ANN discriminant shapes in the semi-leptonic (SL) channel after the fit to data in the jet-process categories with 4 jets, $\geq $3 b-tags. The expected signal and background contributions (filled histograms) are stacked. The hatched uncertainty bands include the total uncertainty of the fit model. The distributions observed in data (markers) are overlayed. The first and the last bins include underflow and overflow events, respectively. The lower plots show the ratio of the data to the background prediction.

Additional Figure 17:
ANN discriminant shapes in the semi-leptonic (SL) channel after the fit to data in the jet-process categories with 5 jets, $\geq $3 b-tags. The expected signal and background contributions (filled histograms) are stacked. The hatched uncertainty bands include the total uncertainty of the fit model. The distributions observed in data (markers) are overlayed. The first and the last bins include underflow and overflow events, respectively. The lower plots show the ratio of the data to the background prediction.

Additional Figure 17-a:
ANN discriminant shapes in the semi-leptonic (SL) channel after the fit to data in the jet-process categories with 5 jets, $\geq $3 b-tags. The expected signal and background contributions (filled histograms) are stacked. The hatched uncertainty bands include the total uncertainty of the fit model. The distributions observed in data (markers) are overlayed. The first and the last bins include underflow and overflow events, respectively. The lower plots show the ratio of the data to the background prediction.

Additional Figure 17-b:
ANN discriminant shapes in the semi-leptonic (SL) channel after the fit to data in the jet-process categories with 5 jets, $\geq $3 b-tags. The expected signal and background contributions (filled histograms) are stacked. The hatched uncertainty bands include the total uncertainty of the fit model. The distributions observed in data (markers) are overlayed. The first and the last bins include underflow and overflow events, respectively. The lower plots show the ratio of the data to the background prediction.

Additional Figure 17-c:
ANN discriminant shapes in the semi-leptonic (SL) channel after the fit to data in the jet-process categories with 5 jets, $\geq $3 b-tags. The expected signal and background contributions (filled histograms) are stacked. The hatched uncertainty bands include the total uncertainty of the fit model. The distributions observed in data (markers) are overlayed. The first and the last bins include underflow and overflow events, respectively. The lower plots show the ratio of the data to the background prediction.

Additional Figure 17-d:
ANN discriminant shapes in the semi-leptonic (SL) channel after the fit to data in the jet-process categories with 5 jets, $\geq $3 b-tags. The expected signal and background contributions (filled histograms) are stacked. The hatched uncertainty bands include the total uncertainty of the fit model. The distributions observed in data (markers) are overlayed. The first and the last bins include underflow and overflow events, respectively. The lower plots show the ratio of the data to the background prediction.

Additional Figure 17-e:
ANN discriminant shapes in the semi-leptonic (SL) channel after the fit to data in the jet-process categories with 5 jets, $\geq $3 b-tags. The expected signal and background contributions (filled histograms) are stacked. The hatched uncertainty bands include the total uncertainty of the fit model. The distributions observed in data (markers) are overlayed. The first and the last bins include underflow and overflow events, respectively. The lower plots show the ratio of the data to the background prediction.

Additional Figure 17-f:
ANN discriminant shapes in the semi-leptonic (SL) channel after the fit to data in the jet-process categories with 5 jets, $\geq $3 b-tags. The expected signal and background contributions (filled histograms) are stacked. The hatched uncertainty bands include the total uncertainty of the fit model. The distributions observed in data (markers) are overlayed. The first and the last bins include underflow and overflow events, respectively. The lower plots show the ratio of the data to the background prediction.

Additional Figure 18:
ANN discriminant shapes in the semi-leptonic (SL) channel after the fit to data in the jet-process categories with 6 jets, $\geq $3 b-tags. The expected signal and background contributions (filled histograms) are stacked. The hatched uncertainty bands include the total uncertainty of the fit model. The distributions observed in data (markers) are overlayed. The first and the last bins include underflow and overflow events, respectively. The lower plots show the ratio of the data to the background prediction.

Additional Figure 18-a:
ANN discriminant shapes in the semi-leptonic (SL) channel after the fit to data in the jet-process categories with 6 jets, $\geq $3 b-tags. The expected signal and background contributions (filled histograms) are stacked. The hatched uncertainty bands include the total uncertainty of the fit model. The distributions observed in data (markers) are overlayed. The first and the last bins include underflow and overflow events, respectively. The lower plots show the ratio of the data to the background prediction.

Additional Figure 18-b:
ANN discriminant shapes in the semi-leptonic (SL) channel after the fit to data in the jet-process categories with 6 jets, $\geq $3 b-tags. The expected signal and background contributions (filled histograms) are stacked. The hatched uncertainty bands include the total uncertainty of the fit model. The distributions observed in data (markers) are overlayed. The first and the last bins include underflow and overflow events, respectively. The lower plots show the ratio of the data to the background prediction.

Additional Figure 18-c:
ANN discriminant shapes in the semi-leptonic (SL) channel after the fit to data in the jet-process categories with 6 jets, $\geq $3 b-tags. The expected signal and background contributions (filled histograms) are stacked. The hatched uncertainty bands include the total uncertainty of the fit model. The distributions observed in data (markers) are overlayed. The first and the last bins include underflow and overflow events, respectively. The lower plots show the ratio of the data to the background prediction.

Additional Figure 18-d:
ANN discriminant shapes in the semi-leptonic (SL) channel after the fit to data in the jet-process categories with 6 jets, $\geq $3 b-tags. The expected signal and background contributions (filled histograms) are stacked. The hatched uncertainty bands include the total uncertainty of the fit model. The distributions observed in data (markers) are overlayed. The first and the last bins include underflow and overflow events, respectively. The lower plots show the ratio of the data to the background prediction.

Additional Figure 18-e:
ANN discriminant shapes in the semi-leptonic (SL) channel after the fit to data in the jet-process categories with 6 jets, $\geq $3 b-tags. The expected signal and background contributions (filled histograms) are stacked. The hatched uncertainty bands include the total uncertainty of the fit model. The distributions observed in data (markers) are overlayed. The first and the last bins include underflow and overflow events, respectively. The lower plots show the ratio of the data to the background prediction.

Additional Figure 18-f:
ANN discriminant shapes in the semi-leptonic (SL) channel after the fit to data in the jet-process categories with 6 jets, $\geq $3 b-tags. The expected signal and background contributions (filled histograms) are stacked. The hatched uncertainty bands include the total uncertainty of the fit model. The distributions observed in data (markers) are overlayed. The first and the last bins include underflow and overflow events, respectively. The lower plots show the ratio of the data to the background prediction.

Additional Figure 19:
Final discriminant shapes in the dilepton (DL) channel after the fit to data: BDT discriminant in the jet-tag categories with 3 jets (left) and $\geq $4 jets (right) with 2, 3, and $\geq $4 b-tagged jets (from top to bottom). The expected signal and background contributions (filled histograms) are stacked. The hatched uncertainty bands include the total uncertainty of the fit model. The distributions observed in data (markers) are overlayed. The first and the last bins include underflow and overflow events, respectively. The lower plots show the ratio of the data to the background prediction.

Additional Figure 19-a:
Final discriminant shapes in the dilepton (DL) channel after the fit to data: BDT discriminant in the jet-tag categories with 3 jets with 2 b-tagged jets. The expected signal and background contributions (filled histograms) are stacked. The hatched uncertainty bands include the total uncertainty of the fit model. The distribution observed in data (markers) is overlayed. The first and the last bins include underflow and overflow events, respectively. The lower plot shows the ratio of the data to the background prediction.

Additional Figure 19-b:
Final discriminant shapes in the dilepton (DL) channel after the fit to data: BDT discriminant in the jet-tag categories with $\geq $4 jets with 2 b-tagged jets. The expected signal and background contributions (filled histograms) are stacked. The hatched uncertainty bands include the total uncertainty of the fit model. The distribution observed in data (markers) is overlayed. The first and the last bins include underflow and overflow events, respectively. The lower plot shows the ratio of the data to the background prediction.

Additional Figure 19-c:
Final discriminant shapes in the dilepton (DL) channel after the fit to data: BDT discriminant in the jet-tag categories with 3 jets with 3 b-tagged jets. The expected signal and background contributions (filled histograms) are stacked. The hatched uncertainty bands include the total uncertainty of the fit model. The distribution observed in data (markers) is overlayed. The first and the last bins include underflow and overflow events, respectively. The lower plot shows the ratio of the data to the background prediction.

Additional Figure 19-d:
Final discriminant shapes in the dilepton (DL) channel after the fit to data: BDT discriminant in the jet-tag categories with $\geq $4 jets with 3 b-tagged jets. The expected signal and background contributions (filled histograms) are stacked. The hatched uncertainty bands include the total uncertainty of the fit model. The distribution observed in data (markers) is overlayed. The first and the last bins include underflow and overflow events, respectively. The lower plot shows the ratio of the data to the background prediction.

Additional Figure 19-e:
Final discriminant shapes in the dilepton (DL) channel after the fit to data: BDT discriminant in the jet-tag categories with $\geq $4 jets with $\geq $4 b-tagged jets. The expected signal and background contributions (filled histograms) are stacked. The hatched uncertainty bands include the total uncertainty of the fit model. The distribution observed in data (markers) is overlayed. The first and the last bins include underflow and overflow events, respectively. The lower plot shows the ratio of the data to the background prediction.
Additional Tables

Additional Table 1:
Input variables used in the ANNs or BDTs in the different categories of the single-lepton (SL) and dilepton (DL) channels. Variables used in a specific multivariate method and analysis category are denoted by a "$+$'' and unused variables by a "$-$''. (Continued in Additional Table 2.)

Additional Table 2:
Continued from Additional Table 1 and continued in Additional Table 3.

Additional Table 3:
Continued from Additional Table 2.

Additional Table 4:
Hyperparameters and number of input variables of the neural networks per jet-multiplicity category in the single-lepton channel.

Additional Table 5:
BDT hyperparameters used in the five categories of the dilepton channel, followed by the AUC values of the corresponding ROC curves.

Additional Table 6:
Observed and expected event yields per jet-tag category in the fully-hadronic channel, prior to the fit to data (after the fit to data). The quoted uncertainties denote the total statistical and systematic uncertainty.

Additional Table 7:
Observed and expected event yields per jet-process category (node) in the single-lepton channel in the 4 jets, $\geq $3 b-tags category, prior to the fit to data (after the fit to data). The quoted uncertainties denote the total statistical and systematic components.

Additional Table 8:
Observed and expected event yields per jet-process category (node) in the single-lepton channel in the 5 jets, $\geq $3 b-tags category, prior to the fit to data (after the fit to data). The quoted uncertainties denote the total statistical and systematic components.

Additional Table 9:
Observed and expected event yields per jet-process category (node) in the single-lepton channel in the 6 jets, $\geq $3 b-tags category, prior to the fit to data (after the fit to data). The quoted uncertainties denote the total statistical and systematic components.

Additional Table 10:
Observed and expected event yields in the 3 jets, 2 b-tags and 3 jets, 3 b-tags categories of the dilepton channel, prior to the fit to data (after the fit to data). The quoted uncertainties denote the total statistical and systematic uncertainty.

Additional Table 11:
Observed and expected event yields in the $\geq $4 jets, 2 b-tags, $\geq $4 jets, 3 b-tags and $\geq $4 jets, $\geq $4 b-tags categories of the dilepton channel, prior to the fit to data (after the fit to data). The quoted uncertainties denote the total statistical and systematic uncertainty.