Results and Statistical Analysis

Overview

Teaching: 0 min
Exercises: 180 min

Questions

What is the unblinded result?

How do we constrain the ABCD relationship for the background in datacards?

Objectives

Understand how to make datacards and constrain the ABCD relationship for the background in them

Run Higgs Combine to get Asymptotic limits on BR from the datacards

Produce a limit plot as a function of the LLP mean proper decay length

Create datacards

Now that we have a well-developed analysis strategy and a robust background estimation method, we are ready to produce datacards, look at the unblinded result, and perform the statistical analysis.

In this exercise, we will walk you through applying all the signal region selections, adding systematic uncertainties, and writing the signal and background yields in the ABCD plane to datacards.

Open a notebook

For this part, open the notebook called create_datacard.ipynb to load the ntuples, apply signal region selection, add systematic uncertainties, and write the signal and background yield in the ABCD plane to datacards.
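As a reminder of the arithmetic behind the ABCD method, the sketch below predicts the background in the signal-enriched region from the three control regions. It is a minimal illustration, not the notebook's code: the function name, the generic region names, and the counts are all hypothetical, and the statistical uncertainty uses simple Poisson error propagation.

```python
import math

def abcd_prediction(n_adj1, n_adj2, n_opp):
    """Predict the signal-region background from three control-region counts.

    n_adj1, n_adj2: counts in the two regions adjacent to the signal region
    n_opp: count in the region diagonally opposite the signal region
    Returns (prediction, statistical uncertainty).
    """
    if min(n_adj1, n_adj2, n_opp) <= 0:
        raise ValueError("all control-region counts must be positive")
    pred = n_adj1 * n_adj2 / n_opp
    # Relative Poisson variances of the three counts add up for a product/ratio.
    rel_err = math.sqrt(1.0 / n_adj1 + 1.0 / n_adj2 + 1.0 / n_opp)
    return pred, pred * rel_err

# Hypothetical counts, for illustration only.
pred, err = abcd_prediction(20, 30, 300)
print(f"predicted background = {pred:.2f} +/- {err:.2f}")
```

Which of the bins A, B, C and D plays which role depends on how the plane is defined in your analysis, so check the notebook before plugging in numbers.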

Discussion 6.1

Do you understand the example datacards? Where is the ABCD relationship constrained for background?

Question 6.1

Do you know what the rateParam norm does, and why we add it to change the normalization of the signal yield?

Solution 6.1

The rateParam can in principle shift the normalization of the signal yield if it is allowed to float during the fit. However, we freeze it when we run Combine by adding the arguments --freezeParameters norm --setParameters norm=0.001.

We add a rateParam to scale the signal because the signal yield varies from O(1) to O(1000) for signals with different LLP lifetimes, while Combine only works well when the fitted signal strength is between 1 and 15. We therefore scale the signal yield up or down so that the fitted signal strength falls in that range, and we multiply the scale factor back in when we plot the limits.
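To make the scaling concrete, here is a small sketch that picks norm as the inverse of the signal yield in bin A and formats the matching Combine arguments. The helper names and the yields are hypothetical; only the --freezeParameters and --setParameters arguments come from the text above.

```python
def choose_norm(sig_yield_a):
    """Return a scale factor that brings the signal yield in bin A to about 1."""
    if sig_yield_a <= 0:
        raise ValueError("signal yield must be positive")
    return 1.0 / sig_yield_a

def combine_norm_args(norm):
    """Format the Combine arguments that fix the rateParam to the chosen value."""
    return ["--freezeParameters", "norm", "--setParameters", f"norm={norm:.6g}"]

# Hypothetical signal yields in bin A for three LLP lifetimes.
for sig_a in (2.0, 50.0, 1000.0):
    norm = choose_norm(sig_a)
    print(f"sigA={sig_a:g}  norm={norm:g}  scaled yield={sig_a * norm:g}")
    print("  " + " ".join(combine_norm_args(norm)))
```

With this choice the scaled yield is of order one for every lifetime, which is the point of the rescaling.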

Results

Based on the unblinded data, we can perform a background-only fit to see if the observation agrees with the background prediction.

Run the following commands to install Higgs Combine:

cd ${CMSSW_BASE}/src
cmsenv
git clone https://github.com/cms-analysis/HiggsAnalysis-CombinedLimit.git HiggsAnalysis/CombinedLimit
cd HiggsAnalysis/CombinedLimit
git fetch origin
git checkout v10.0.2
scramv1 b clean; scramv1 b # always make a clean build

Choose any of the datacards that you’ve produced and run the following commands:

combine -M MultiDimFit datacard.txt --saveWorkspace -n Snapshot --freezeParameters r,norm --setParameters r=0,norm=0.001

combine -M FitDiagnostics  --snapshotName MultiDimFit --bypassFrequentistFit higgsCombineSnapshot.MultiDimFit.mH120.root --saveNormalizations --saveShapes --saveWithUncertainties

You will get an output file called fitDiagnosticsTest.root. Open the ROOT file and navigate to the TH1F in shapes_fit_b/chA/total_background by running the following command:

root -l fitDiagnosticsTest.root
shapes_fit_b->cd()
chA->cd()
total_background->GetBinContent(1) // this will print the background prediction
total_background->GetBinError(1) // this will print the uncertainty on the background prediction

Does the background prediction agree with the observation?
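One rough way to quantify the agreement is a pull: the difference between the observed and predicted yields divided by the combined uncertainty. The sketch below assumes a Gaussian approximation, which is crude for small event counts, and uses hypothetical numbers; substitute the values you read from the ROOT file and your observed count in bin A.

```python
import math

def pull(observed, predicted, predicted_err):
    """Approximate number of standard deviations between data and prediction."""
    if predicted <= 0:
        raise ValueError("predicted yield must be positive")
    # Poisson variance of the expected count plus the squared fit uncertainty.
    sigma = math.sqrt(predicted + predicted_err ** 2)
    return (observed - predicted) / sigma

# Hypothetical numbers, for illustration only.
print(f"pull = {pull(observed=3, predicted=2.0, predicted_err=1.0):.2f}")
```

A pull well within about two standard deviations suggests that the observation is compatible with the background-only prediction.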

Run Higgs Combine to compute limits

To compute asymptotic frequentist limits, use the following command (replace test.txt with the path to your datacard):

combine -M AsymptoticLimits test.txt --freezeParameters norm --setParameters norm=1
# a good rule of thumb for norm is to set it equal to 1./sigA, where sigA is the signal yield in bin A

The program will print the limit on the signal strength r (number of signal events / number of expected signal events), e.g. Observed Limit: r < XXX @ 95% CL, the median expected limit Expected 50.0%: r < XXX, and the edges of the 68% and 95% ranges for the expected limits. It will also create a ROOT file, higgsCombineTest.AsymptoticLimits.mH120.root, containing a ROOT tree named limit with the limit values that we will use later to produce limit plots.
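Each entry of the limit tree carries a quantileExpected value that says which limit it is: 0.025, 0.16, 0.5, 0.84 and 0.975 for the expected band, and -1 for the observed limit. The sketch below labels the entries once the two branches have been read into Python sequences (for example with uproot or PyROOT, which is not shown here). The function name, the labels, and the limit values are hypothetical.

```python
# quantileExpected values written by AsymptoticLimits; -1 marks the observed limit.
QUANTILE_LABELS = {
    0.025: "exp-2",
    0.16: "exp-1",
    0.5: "exp0",
    0.84: "exp+1",
    0.975: "exp+2",
    -1.0: "obs",
}

def label_limits(quantiles, limits, tol=1e-3):
    """Map each limit to a label, matching quantiles within a tolerance.

    The tolerance is needed because the branch is stored in single precision,
    so 0.16 is read back as roughly 0.15999999642.
    """
    labelled = {}
    for q, lim in zip(quantiles, limits):
        for ref, name in QUANTILE_LABELS.items():
            if abs(q - ref) < tol:
                labelled[name] = lim
                break
    return labelled

# Hypothetical limit values, for illustration only.
quantiles = [0.025, 0.15999999642, 0.5, 0.8399999738, 0.975, -1.0]
limits = [1.2, 1.7, 2.5, 3.8, 5.4, 2.9]
print(label_limits(quantiles, limits))
```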

Since we have normalized our signal yield to assume BR(h$\rightarrow$ SS) = 1, the limit on the signal strength r from Combine essentially tells us the limit on BR(h$\rightarrow$ SS), modulo the normalization in signal yield.
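The conversion can be sketched as follows, assuming that norm multiplies the signal yield as a frozen rateParam and that the yields in the datacard correspond to BR = 1. Under those assumptions the effective signal is r × norm × (yield at BR = 1), so a limit on r translates into a limit on BR of r × norm. The function name and the numbers are hypothetical.

```python
def to_br_limits(r_limits, norm):
    """Convert limits on r into limits on BR(h -> SS).

    r_limits: dict mapping a label (e.g. "obs", "exp0") to the limit on r
    norm: value the rateParam was frozen to when Combine was run
    """
    if norm <= 0:
        raise ValueError("norm must be positive")
    return {name: r * norm for name, r in r_limits.items()}

# Hypothetical limits on r obtained with norm=0.001.
br = to_br_limits({"obs": 5.0, "exp0": 4.0}, norm=0.001)
for name, value in br.items():
    print(f"{name}: BR < {value:.3g}")
```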

Open a script

For this part, open the Python script scripts/run_combine.py to run over all of the datacards that you produced in the previous exercise and save all the ROOT files in a directory.

You will have to update the datacard directory in the script.
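For orientation, a loop over datacards could look like the sketch below. This is not the content of scripts/run_combine.py: the function names, the directory layout, and the per-datacard norms dictionary are assumptions made for illustration. It relies on Combine writing its output file into the current working directory and on the -n option setting the name used in that file name.

```python
import subprocess
from pathlib import Path

def build_command(datacard, norm, name):
    """Build the AsymptoticLimits command for one datacard."""
    return [
        "combine", "-M", "AsymptoticLimits", str(datacard),
        "--freezeParameters", "norm",
        "--setParameters", f"norm={norm:.6g}",
        "-n", name,
    ]

def run_all(datacard_dir, output_dir, norms):
    """Run Combine on every .txt datacard and collect the ROOT files.

    norms: dict mapping the datacard file stem to its norm (hypothetical layout);
    datacards without an entry fall back to norm=1.
    """
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    for card in sorted(Path(datacard_dir).glob("*.txt")):
        cmd = build_command(card.resolve(), norms.get(card.stem, 1.0), card.stem)
        # Running with cwd=out makes Combine write its ROOT file there.
        subprocess.run(cmd, cwd=out, check=True)

# Print one command without running it.
print(" ".join(build_command("datacard.txt", 0.001, "example")))
```

Calling run_all requires a working Combine installation in the current environment, so it is defined here but not executed.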

Make limit plots

In this exercise, we will use the limits that were saved in ROOT files produced in the previous exercise to calculate the limit on BR(h $\rightarrow$ SS) with respect to the LLP mean proper decay lengths.

Open a notebook

For this part, open the notebook called limitPlot.ipynb to plot the expected and observed limits. See if the limit agrees with the one shown on slide 25 of the introduction slides.

Discussion 6.2

Did you expect the shape of the limit to look like this? Why does the sensitivity decrease for very long or very short lifetimes?

Exercise

Now that you have the limit plot for 40 GeV, try to create datacards, run Combine, and make limit plots for 15 and 55 GeV. Try to plot them on the same canvas, like the one on slide 25 of the introduction slides.

Key Points

We can use Higgs Combine to perform statistical analysis and set limits

The value of the limits depends significantly on the LLP mean proper decay length, as the probability of an LLP decaying in the muon system correlates strongly with the LLP lifetime
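The lifetime dependence can be illustrated with the exponential decay law: the probability that a particle decays between distances r_in and r_out is exp(-r_in / (βγ cτ)) - exp(-r_out / (βγ cτ)). The sketch below is a simplified picture, assuming a single fixed boost and a straight-line distance; the boost and the 4 m to 7 m range are hypothetical numbers, not the real detector geometry.

```python
import math

def decay_probability(ctau, beta_gamma, r_in, r_out):
    """Probability that a particle decays between r_in and r_out.

    ctau, r_in and r_out must share the same length unit.
    """
    lab_length = beta_gamma * ctau  # mean decay length in the lab frame
    return math.exp(-r_in / lab_length) - math.exp(-r_out / lab_length)

# Hypothetical boost and fiducial range (in metres), for illustration only.
for ctau in (0.01, 0.1, 1.0, 10.0, 100.0):
    p = decay_probability(ctau, beta_gamma=3.0, r_in=4.0, r_out=7.0)
    print(f"ctau = {ctau:6.2f} m  ->  P(decay in range) = {p:.3g}")
```

For very short lifetimes nearly all particles decay before r_in, and for very long lifetimes nearly all escape beyond r_out, so the probability peaks at intermediate values.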


CMSDAS: Muon Detector Shower (MDS) Long Exercise
