This repository reproduces the CMS HGCAL L1 Stage2 reconstruction chain in Python for quick testing. It can generate an event visualization app. It was originally used for understanding and fixing the observed cluster splitting.
# setup conda environment
create -n <EnvName> python=3 pandas uproot pytables h5py
conda activate <EnvName>
# setup a ssh key if not yet done and clone the repository
git clone [email protected]:bfonta/bye_splits.git
# enforce git hooks locally (required for development)
git config core.hooksPath .githooks
To make the size of the files more manageable, a skimming step was implemented in C++
. It depends on the yaml-cpp
package (source, release used: version 0.7.0
). The instructions on the README
page of the project are a bit cryptic. Below follows a step-by-step guide:
# 1) Download the 0.7.0 '.zip' release
# 2) Unzip it
unzip yaml-cpp-yaml-cpp-0.7.0.zip
# 3) The package uses CMake for compilation. To avoid cluttering the same folder with many CMake-related files, create a new folder and build the project there
cd yaml-cpp-yaml-cpp-0.7.0
mkdir build
cd build
# 4) Compile a static library (the flag removes a -Werror issue)
cmake -DYAML_CPP_BUILD_TESTS=OFF ..
# 5) Build the package
cmake --build .
# 6) Verify the library 'libyaml-cpp.a' was created
ls -l
# 7) Check the package was correctly installed by compiling a test example (this assumes you have g++ installed):
g++ bye_splits/tests/test_yaml_cpp.cc -I <installation_path_yaml_cpp>/yaml-cpp-yaml-cpp-0.7.0/include/ -L <installation_path_yaml_cpp>/yaml-cpp-yaml-cpp-0.7.0/build/ -std=c++11 -lyaml-cpp -o test_yaml_cpp.o
# 8) Run the executable
./test_yaml_cpp.o
The above should print the contents stored in bye_splits/tests/params.yaml
.
Occasionally the following error message is printed: relocation R_X86_64_32 against symbol `_ZTVN4YAML9ExceptionE' can not be used when making a PIE object; recompile with -fPIE
. This is currently not understood but removing the yaml-cpp-yaml-cpp-0.7.0
folder (rm -rf
) and running the above from scratch solves the issue.
To run the skimming step, you will need to compile the C++
files stored under bye_splits/production
. Run make
from the top repository directory.
Run the following, specifying the particle type:
./produce.exe electrons
The above skims the input files, applying:
- Δ R matching between the generated particle and the clusters (currently works only for one generated particle, and throws an assertion error otherwise)
- a trigger cell energy threshold cut
- unconverted photon cut
- positive endcap cut (reduces the amount of data processed by a factor of 2)
- type conversion for
uproot
usage at later steps
This framework relies on photon-, electron- and pion-gun samples produced via CRAB. The most up to date versions are currently stored under:
Photons PU(0) | /dpm/in2p3.fr/home/cms/trivcat/store/user/lportale/DoublePhoton_FlatPt-1To100/GammaGun_Pt1_100_PU0_HLTSummer20ReRECOMiniAOD_2210_BCSTC-FE-studies_v3-29-1_realbcstc4/221025_153226/0000/ |
Electrons (PU0) | /dpm/in2p3.fr/home/cms/trivcat/store/user/lportale/DoubleElectron_FlatPt-1To100/ElectronGun_Pt1_100_PU200_HLTSummer20ReRECOMiniAOD_2210_BCSTC-FE-studies_v3-29-1_realbcstc4/221102_102633/0000/ |
Pions (PU0) | /dpm/in2p3.fr/home/cms/trivcat/store/user/lportale/SinglePion_PT0to200/SinglePion_Pt0_200_PU0_HLTSummer20ReRECOMiniAOD_2210_BCSTC-FE-studies_v3-29-1_realbcstc4/221102_103211/0000 |
The files above were merged and are stored under /data_CMS/cms/alves/L1HGCAL/
, accessible to LLR users and under /eos/user/b/bfontana/FPGAs/new_algos/
, accessible to all lxplus and LLR users. The latter is used since it is well interfaced with CERN services.
The chain is implemented in Python. To replicate the study of TC shifting run the following:
bash bye_splits/run_iterative_optimization.sh
where one can use the -h
flag to visualize available options. To use the steps separately in your own script use the functions defined under bye_splits/tasks/
, just as done in the iterative_optimization.py
script.
For plotting results as a function of the optimization trigger cell parameter:
python plot/meta_algorithm.py
The above will create html
files with interactive outputs.
The repository creates two web apps that can be visualized in a browser. The code is stored under bye_splits/plot
.
Please install the following from within the conda
environment you should have already created:
conda install -c conda-forge pyarrow
#if the above fails: python -m pip install pyarrow
python3 -m pip install --upgrade pip setuptools #to avoid annoying "Setuptools is replacing distutils." warning
Since browser usage directly in the server will necessarily be slow, we can:
Use LLR’s intranet at llruicms01.in2p3.fr:<port>/display
Forward it to our local machines via ssh
. To establish a connection between the local machine and the remote llruicms01
server, passing by the gate, use:
ssh -L <port>:llruicms01.in2p3.fr:<port> -N <llr_username>@llrgate01.in2p3.fr
# for instance: ssh -L 8080:lruicms01.in2p3.fr:8080 -N [email protected]
The two ports do not have to be the same, but it avoids possible confusion. Leave the terminal open and running (it will not produce any output).
In a new terminal window go to the llruicms01
mahcines and launch one of the apps, for instance:
bokeh serve bye_splits/plot/display/ --address llruicms01.in2p3.fr --port <port> --allow-websocket-origin=localhost:<port>
# if visualizing directly at LLR: --allow-websocket-origin=llruicms01.in2p3.fr:<port>
This uses the server-creation capabilities of bokeh
, a python
package for interactive visualization (docs). Note the port number must match. For further customisation of bokeh serve
see the serve documentation.
The above command should give access to the visualization under http://localhost:8080/display
. For debugging, just run python bye_splits/plot/display/main.py
and see that no errors are raised.
We use the S2I (Source to Image) service via CERN’s PaaS (Platform-as-a-Service) using OpenShift to deploy and host web apps in the CERN computing environment here. There are three ways to deploys such an app: S2I represents the easiest (but less flexible) of the three; instructions here. It effectively abstracts away the need for Dockerfiles.
We will use S2I’s simplest configuration possible under app.sh
. The image is created alongside the packages specified in requirements.txt
. The two latter definitions are documented here.
We are currently running a pod at https://viz2-hgcal-event-display.app.cern.ch/. The port being served by bokeh
in app.sh
must match the one the pod is listening to, specified at configuration time before deployment in the OpenShift management console at CERN. The network visibility was also updated to allow access from outside the CERN network.
Flask is a python micro web framework to simplify web development. It is considered “micro” because it’s lightweight and only provides essential components.
Given that plotly
’s dashboard framework, dash
, runs on top of flask
, and that bokeh
can produce html components programatically (which can be embedded in a flask
app), it should be possible to develop a flask
-powered web app mixing these two plotting packages. Having a common web framework also simplifies future integration.
The embedding of bokeh and plotly plots within flask is currently demonstrated in plot/join/app.py
. Two servers run: one from flask
and the other from bokeh
, so special care is required to ensure the browser where the app is being served listens to both ports. Listening to flask
’s port only will cause the html plot/join/templates/embed.html
to be rendered without bokeh plots.
Running a server is required when more advanced callbacks are needed. Currently only bokeh
has a server of its own; plotly
simply creates an html block with all the required information. If not-so-simple callbacks are required for plotly
plots, another port will have to be listened to.