This repository contains tools to evaluate the performance of the EMTF BDT after retraining.
source /cvmfs/cms.cern.ch/cmsset_default.sh
cmsrel CMSSW_10_6_1_patch2
cd CMSSW_10_6_1_patch2/src
cmsenv
git clone git@github.com:jrotter2/EMTF_BDT_PerformancePlotter.git
cd EMTF_BDT_PerformancePlotter
pip3 install -r requirements.txt --user
You should first fork this repository.
source /cvmfs/cms.cern.ch/cmsset_default.sh
cmsrel CMSSW_10_6_1_patch2
cd CMSSW_10_6_1_patch2/src
cmsenv
git clone git@github.com:<your_GitHub_username>/EMTF_BDT_PerformancePlotter.git
cd EMTF_BDT_PerformancePlotter
git checkout -b <your_branch_name>
git push origin <your_branch_name>
pip3 install -r requirements.txt --user
After you have made changes, you can push them to your branch using:
git add .
git commit -m "Some Message..."
git push
Once your changes are stable and complete, they can be merged via a PR to the master branch.
To access files from EOS, you will need to set up your environment for each session using the commands below. It is recommended that you add these to your bash profile.
source /cvmfs/cms.cern.ch/cmsset_default.sh
voms-proxy-init --voms cms
cd ~/path/to/your/directory/CMSSW_10_6_1_patch2/src/
cmsenv
The repository is structured so that each individual plotter can be run separately, or the general plotter can be called to make multiple different types of performance plots.
Additionally, there are helper classes in the helpers directory which can be used to store multiuse functions or useful calculations.
plotter.py is responsible for making general plots. It can make efficiency plots and resolution plots for different selections. It can be called by:
python3 plotter.py <options> outputDir outputFileName inputFile
The options -e (or --eff) will set a flag to create efficiency plots and -r (or --res) will set a flag to create resolution plots. Additional options can be seen by running python3 plotter.py --help.
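For illustration, a command-line interface with these options could be parsed as in the sketch below. This is not the repository's actual parser: only the flags -e/--eff and -r/--res and the three positional arguments come from the text above, while the function name build_parser and the example argument values are hypothetical.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the options described above."""
    parser = argparse.ArgumentParser(
        description="Make EMTF BDT performance plots.")
    parser.add_argument("-e", "--eff", action="store_true",
                        help="create efficiency plots")
    parser.add_argument("-r", "--res", action="store_true",
                        help="create resolution plots")
    # Positional arguments, in the order given in the usage line.
    parser.add_argument("outputDir")
    parser.add_argument("outputFileName")
    parser.add_argument("inputFile")
    return parser


# Request both plot types; the three values are placeholders.
args = build_parser().parse_args(
    ["-e", "-r", "plots/", "performance", "my_input_file"])
print(args.eff, args.res, args.outputDir, args.outputFileName, args.inputFile)
```

Using store_true flags means that omitting -e or -r simply leaves the corresponding plot type switched off.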
efficiencyPlotter.py is responsible for making efficiency plots. It can be called directly by:
python3 efficiencyPlotter.py <options> outputDir outputFileName inputFile
To see a full list of options, you can execute python3 efficiencyPlotter.py --help.
This plotter will generate efficiency vs pT, efficiency vs eta, and efficiency vs phi plots for multiple selections passed through <options>. These plots will be saved to a PDF specified by outputDir and outputFileName.
occupancyPlotter.py is responsible for making occupancy plots.
resolutionPlotter.py is responsible for making resolution plots, which are distributions of the difference between the generated and the assigned pT over a set of events (i.e. how precisely the trigger is estimating muon pT). The resolution plots also give more information on how to scale the pT assignment so that the efficiency turn-on reaches >=90% at the pT threshold. It can be called directly by:
python3 resolutionPlotter.py <options> outputDir inputFile
This plotter will generate resolutions using a Gaussian distribution.
Helper classes, stored in the helpers directory, are used to store multiuse functions or useful calculations.
One side effect of the GBDT regression algorithm is that the pT assignment will be 50% efficient at the pT threshold. As a convention of the L1 Trigger, the trigger should be >90% efficient at the pT threshold. Therefore, a scaling factor is implemented to make the BDT pT assignment fit the convention. The scaling is:
pT_xml = min(20, pT_unscaled)
pT_scaled = A * pT_unscaled / (1 + B * pT_xml)
For Run 2, A = 1.2 and B = 0.015. For Run 3, A = 1.3 and B = 0.004.
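As a minimal sketch, the scaling can be written as a small Python function. The formula and the (A, B) values are taken exactly as written above and have not been checked against the repository's helper code; the names SCALE_PARAMS and scale_pt are hypothetical and chosen only for this illustration.

```python
# (A, B) per run, as quoted in the text above.
SCALE_PARAMS = {"Run2": (1.2, 0.015), "Run3": (1.3, 0.004)}


def scale_pt(pt_unscaled, run="Run3"):
    """Apply the scaling formula as written in the text (hypothetical helper)."""
    a, b = SCALE_PARAMS[run]
    # The B term saturates once the unscaled pT reaches 20.
    pt_xml = min(20.0, pt_unscaled)
    return a * pt_unscaled / (1.0 + b * pt_xml)


# Compare the two parameter sets for a few unscaled pT values.
for pt in (5.0, 20.0, 40.0):
    print(pt, round(scale_pt(pt, "Run2"), 3), round(scale_pt(pt, "Run3"), 3))
```

Because pT_xml is capped at 20, the denominator is constant above that value and the scaled pT grows linearly with the unscaled pT.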
For each track in the input file there is unbinned information for GEN_pt, BDTG_AWB_sq, GEN_eta, GEN_phi, and TRK_hit_ids which are of interest to our plotters (these will be stored in unbinned_EVT_data).
Efficiency is calculated by first generating a set of tracks that meet certain denominator cuts (i.e. eta, phi, or track hit ids cuts), then generating a subset of those tracks by applying numerator cuts (i.e. pT cut on BDT). These can be thought of as the denominator and numerator sets respectively.
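The two-step selection can be sketched as follows. This is an illustration only: the toy track values, the cut values, the key BDT_pt (standing in for the pT assigned by the BDT), and the helper names are all assumptions, not the repository's actual code.

```python
# Toy per-track records; values are made up for illustration.
tracks = [
    {"GEN_pt": 30.0, "GEN_eta": 1.5, "BDT_pt": 28.0},
    {"GEN_pt": 30.0, "GEN_eta": 2.0, "BDT_pt": 18.0},
    {"GEN_pt": 10.0, "GEN_eta": 0.5, "BDT_pt": 25.0},
    {"GEN_pt": 50.0, "GEN_eta": -1.8, "BDT_pt": 45.0},
]

ETA_MIN, ETA_MAX = 1.24, 2.4  # hypothetical denominator cut on |eta|
PT_CUT = 22.0                 # hypothetical numerator cut on the BDT pT


def passes_denominator(trk):
    """Denominator cut: track lies inside the chosen |eta| window."""
    return ETA_MIN <= abs(trk["GEN_eta"]) <= ETA_MAX


def passes_numerator(trk):
    """Numerator cut: assigned pT is at or above the threshold."""
    return trk["BDT_pt"] >= PT_CUT


# The numerator is built from the denominator, so it is always a subset.
denominator = [t for t in tracks if passes_denominator(t)]
numerator = [t for t in denominator if passes_numerator(t)]
print(len(numerator), "of", len(denominator), "tracks pass")
```

Building the numerator from the denominator list, rather than from all tracks, guarantees that the per-bin ratio can never exceed one.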
Then the efficiency is generated by binning both the denominator and numerator sets and finding the ratio in each bin. The confidence interval for efficiency is based on the confidence interval for a binomial distribution for x successes in k trials (where x is the number of tracks in the numerator bin and k is the number of tracks in the denominator bin). This confidence interval is known as the Clopper-Pearson exact confidence interval.
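The binned ratio and its Clopper-Pearson interval can be sketched with the standard library alone. The bounds are found by bisection on the binomial tail probabilities; the 95% confidence level, the bin edges, the toy pT values, and all function names are assumptions made for this illustration and are not taken from the repository.

```python
from bisect import bisect_right
from math import comb


def bin_counts(values, edges):
    """Count values in the half-open bins [edges[i], edges[i+1])."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        i = bisect_right(edges, v) - 1
        if 0 <= i < len(counts):
            counts[i] += 1
    return counts


def binom_cdf(x, k, p):
    """P(X <= x) for X ~ Binomial(k, p)."""
    return sum(comb(k, i) * p**i * (1 - p) ** (k - i) for i in range(x + 1))


def _bisect(func, target):
    """Solve func(p) = target on [0, 1] for a func that increases with p."""
    lo, hi = 0.0, 1.0
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if func(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def clopper_pearson(x, k, cl=0.95):
    """Exact binomial interval for x successes in k trials."""
    alpha = 1.0 - cl
    # Lower bound: P(X >= x | p) = alpha / 2.
    lower = 0.0 if x == 0 else _bisect(
        lambda p: 1.0 - binom_cdf(x - 1, k, p), alpha / 2)
    # Upper bound: P(X <= x | p) = alpha / 2, i.e. P(X > x | p) = 1 - alpha / 2.
    upper = 1.0 if x == k else _bisect(
        lambda p: 1.0 - binom_cdf(x, k, p), 1.0 - alpha / 2)
    return lower, upper


edges = [0.0, 10.0, 20.0, 30.0]
den_pt = [2, 5, 8, 12, 15, 18, 19, 22, 25, 28]  # GEN_pt of denominator tracks
num_pt = [15, 18, 19, 22, 25, 28]               # GEN_pt of numerator tracks

den = bin_counts(den_pt, edges)
num = bin_counts(num_pt, edges)
for lo_edge, hi_edge, x, k in zip(edges[:-1], edges[1:], num, den):
    if k == 0:
        continue  # efficiency is undefined for an empty denominator bin
    low, up = clopper_pearson(x, k)
    print(f"[{lo_edge}, {hi_edge}): eff = {x / k:.3f}, CI = ({low:.3f}, {up:.3f})")
```

With SciPy available, the same bounds can be obtained from the beta distribution quantile function (scipy.stats.beta.ppf) instead of bisection.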
The resolution calculation starts from the same unbinned per-track information: GEN_pt, BDTG_AWB_sq, GEN_eta, GEN_phi, and TRK_hit_ids, stored in unbinned_EVT_data.
Resolution is calculated by creating an unbinned array of (GEN_pt - BDT_pt)/GEN_pt or (log(GEN_pt) - log(BDT_pt))/log(GEN_pt). The distribution should be roughly normal around zero.
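The two resolution variables can be sketched as below. The toy pT values and the function names are assumptions for illustration; note that the logarithmic form is undefined when GEN_pt equals 1, since log(GEN_pt) is then zero.

```python
from math import log
from statistics import mean, stdev


def relative_residuals(gen_pt, bdt_pt):
    """(GEN_pt - BDT_pt) / GEN_pt for each track."""
    return [(g - b) / g for g, b in zip(gen_pt, bdt_pt)]


def log_residuals(gen_pt, bdt_pt):
    """(log(GEN_pt) - log(BDT_pt)) / log(GEN_pt); requires GEN_pt != 1."""
    return [(log(g) - log(b)) / log(g) for g, b in zip(gen_pt, bdt_pt)]


# Toy values, made up for illustration.
gen = [10.0, 20.0, 30.0, 40.0]
bdt = [9.0, 22.0, 30.0, 38.0]

rel = relative_residuals(gen, bdt)
# A mean near zero indicates an unbiased assignment; the spread is the resolution.
print("mean:", mean(rel), "spread:", stdev(rel))
print("log form:", log_residuals(gen, bdt))
```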