lhcb / second-analysis-steps Goto Github PK

Level two LHCb data analysis lessons

Home Page: https://lhcb.github.io/second-analysis-steps/

second-analysis-steps's Issues

Explain difference between cc and CC

In the 01-building-decays section it would be nice if the difference between cc and CC in the context of a decay descriptor was explained.

Choice of topics

This is a meta-issue to curate a list of issues suggesting topics to cover in the second-analysis-steps material and also what to cover in an intermediate-kit. Those two don't have to be the same. (The starterkit as an event uses two repositories of material: analysis-essentials and first-analysis-steps.)

There are already: #7, #6 and #3

In addition, here is an unordered list of potentially interesting topics:

statistics (tools for limit setting and measurement)
machine learning tools
the material we moved from first-analysis-steps to here
hacking the LHCb software (like Brunel or DaVinci)
analysis automation (snakemake and friends)
using the scientific python ecosystem

Please add more ideas here, or link to the issue. The aim of this issue is to keep on top of the ideas that are out there and after some discussion converge on a set of topics.

No link from homepage to 02-lb-get

There's no link from the homepage to 02-lb-get, so it has to be typed in manually

Checklist of things to take into account when starting an analysis

It would be nice to have a reference list students can check with all the things they need to take into account when starting an analysis. For example

Run the Track Smearing
Run Momentum Calibration
...

Add next/prev button to pages

Can I suggest you put a next/previous button on the bottom of the pages. It's a little annoying having to go back and then find the next page.

Thanks

Iwan

External link checker

It's easy for links to become stale, see #24. We should automatically check all links in the CI.

The checker does already do some link checking, but it skips external links. I think that Travis CI has access to the Internet, so it should be OK to just curl -I the pages or something.

Downside is: it will break if a site is temporarily down or if the machine the tests are being run on doesn't have internet access (writing lessons on the 🚄 for example).
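A minimal sketch of what such a check could look like, using only the Python standard library. All names here (`find_external_links`, `head_status`, `check_links`) are hypothetical and not part of the existing checker. It sends a HEAD request per link, like `curl -I`, and reports unreachable sites separately from links that return an HTTP error, so that a missing internet connection could be treated as a warning rather than a failure.

```python
import re
import urllib.error
import urllib.request

# Matches absolute http(s) URLs; stops at whitespace and common closing delimiters.
URL_PATTERN = re.compile(r"https?://[^\s)>\"']+")


def find_external_links(text):
    """Return the unique external URLs found in text, in order of appearance."""
    seen = []
    for url in URL_PATTERN.findall(text):
        url = url.rstrip(".,;")  # drop trailing sentence punctuation
        if url not in seen:
            seen.append(url)
    return seen


def head_status(url, timeout=10):
    """Send a HEAD request and return the HTTP status, or None if unreachable."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # The server answered, but with an error code such as 404.
        return error.code
    except (urllib.error.URLError, OSError):
        # No answer at all: site down, DNS failure, or no internet access.
        return None


def check_links(text, fetch=head_status):
    """Split links into broken (HTTP status >= 400) and unreachable (no response)."""
    broken, unreachable = [], []
    for url in find_external_links(text):
        status = fetch(url)
        if status is None:
            unreachable.append(url)
        elif status >= 400:
            broken.append(url)
    return broken, unreachable
```

The `fetch` argument makes it possible to exercise the logic without network access. In the CI, one option would be to fail only on the `broken` list and merely print the `unreachable` one.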

Updating the HLT lessons

This is mostly for @roelaaij 😄

Is the introduction to the HLT up-to-date? Is all the content there that we want to teach?

One thing is to change the text about Run 2, for example

In Run II the reconstructions used in HLT2 and offline will be identical

Another thing to check is whether the HLT1 pT threshold changed in Run 2 (the text currently states that HLT1 only reconstructs pT > 500 MeV).

There are some broken links as well, like in the sentence:

We will use the script we created earlier as a starting point, and the file you downloaded before.

Would you have time to go through the text and update it @roelaaij?

Complaint about the title :)

I got feedback from some students that they would like to know more about how the LHCb software works behind the scenes, like writing your own tuple tool, understanding the tracking software, etc. So I was wondering if the topics and the title should include that.

Ganga page needs updating

The page on submitting cmake tasks with Ganga needs updating now that it is properly supported with GaudiExec.

"Ganga with cmake" does not work for new tools

I tried to use the Ganga with cmake tutorial to submit a job with a development application. It failed to import some custom tools, although they were defined within InstallArea/blabla/python/Application/ApplicationConf.py. It looks like the PYTHONPATH doesn't know about the modules. When I add InstallArea/blabla/python to it, at least a SetupProject + gaudirun.py works.

I don't know how ganga sets up the PYTHONPATH but maybe there is another workaround?
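For reference, the manual workaround described above can be sketched in Python as follows. This only reproduces the local PYTHONPATH fix and is not a statement about how Ganga sets up its environment. The helper name `env_with_python_dir` is hypothetical, and `InstallArea/blabla/python` is the placeholder path from the report.

```python
import os


def env_with_python_dir(python_dir, base_env=None):
    """Return a copy of the environment with python_dir prepended to PYTHONPATH."""
    env = dict(os.environ if base_env is None else base_env)
    python_dir = os.path.abspath(python_dir)
    existing = env.get("PYTHONPATH", "")
    # Keep the existing entries, skipping empty ones and duplicates of python_dir.
    parts = [p for p in existing.split(os.pathsep) if p and p != python_dir]
    env["PYTHONPATH"] = os.pathsep.join([python_dir] + parts)
    return env


# Placeholder path taken from the report above.
env = env_with_python_dir("InstallArea/blabla/python")
```

The resulting dictionary could then be passed to a local run, for example `subprocess.call(["gaudirun.py", "options.py"], env=env)`, assuming `gaudirun.py` is available in the current shell.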

Idea: Hackers guide to LHCb

This is just a quick idea to drop here for discussion:

I believe the main focus of the starterkits should be to get people to work productively with the existing software.

However, there are things that the community here would like to teach which are not yet fully supported by the framework. A typical example that caught a few people out is that ganga does not yet work with lbrun. Another one is the use of cmake, which at least to my mind is still very unstable.

So ... I'd like to suggest that the main lessons aim at presenting a solution that is working in analysis-daily-life now (even if we know/hope that it will be outdated in half a year). In addition there could be a section like The Hackers Guide to LHCb, where BETA features are presented, showing not only where the train is heading in the future, but also where interested people could make a valuable contribution.

Let me know what you think!

Use KaTeX for math rendering

The first-analysis-steps lessons use KaTeX to render $math$, after this commit. We should use that here as it looks better (IMO).

Add lesson on Ganga tasks

It was covered very briefly, showing a Gauss → DaVinci task, but it needs writing up.

Rerun s21 over s20 MC

S21 is a special case since the calo reconstruction was rerun so that it differs from the standard Reco14 used in S20. To be consistent with data, the following options should be appended:
https://svnweb.cern.ch/trac/lhcb/browser/DBASE/trunk/AppConfig/options/DaVinci/DV-RedoCaloPID-Stripping21.py.
Also the appropriate DB tags should be used (same as in s21 MC productions)

Moreover, since this is usually done on flagged MC, the global (stripping-dependent) cuts should also be applied. In the case of S21, these options should be inserted at the beginning:

# Tighten Trk Chi2 to <3
from CommonParticles.Utils import DefaultTrackingCuts
DefaultTrackingCuts().Cuts = { "Chi2Cut" : [ 0, 3 ],
                               "CloneDistCut" : [5000, 9e+99 ] }

Although the aim of this step is to teach the technical aspects, I would also recommend contacting the MC and Stripping liaisons before attempting any restripping of MC. There are details of which users are not always aware.

Stripping DaVinci__NXBodyDecays

It would be nice to have some documentation about how to write stripping lines and which methods should (not) be used. @apuignav

file links are wrong

Some file links do not work, like:
https://lhcb.github.io/second-analysis-steps/code/building-decays/01.historical.py
mentioned in
https://lhcb.github.io/second-analysis-steps/building-decays-part1.html.
And
https://lhcb.github.io/second-analysis-steps/code/building-decays/02.optimized.py
mentioned in
https://lhcb.github.io/second-analysis-steps/building-decays-part2.html

Improvements in Selection lesson

Coming from Vanya!

Please note that there are many new “wrappers” in PhysSel/PhysSelPython/python/PhysSelPython/Wrappers.py
Please teach them that “decay-tree-tuple” is ALSO a selection
Please teach them that the appearance of Input=… is a bad sign.

Add lesson on contributing to the software

The second analysis steps seem to be a good place for a lesson on contributing to the LHCb software.

Can start with some basic contribution guidelines ("Everyone is encouraged to contribute. You will receive help, etc.")

Should answer questions

What can I contribute?
Where can I contribute?
What are the rules?
What is the process?

For the contribution process, we could describe a merge request based workflow in Gitlab.

Create a lesson on the PID @ LHCb

Explain how the PID works.
Cover tools like PIDCalib, explain the basics of resampling etc

DIRAC CLI tools

@ibab said he knew of some handy DIRAC command line tools. It would be good to add a little lesson explaining them in PR #27.

Care to share @ibab?

Add a tutorial on using autocompletion with LHCb software

@bartoszmalecki was interested in enabling autocompletion with the LHCb software.
We could make a lesson in the second-analysis-steps that explains it.

Replace logo with new one

The logo in the header is the old ‘toolbox’ logo. It should be replaced with the hexagon as on the Starterkit website.

Add calorimeter processing options file to Stripping21 re-running

Issue reported by Ricardo Vazquez Gomez.

I have found that in the instructions describing how to rerun the stripping over the MC sample there is one important thing missing.

The example uses S21, which was special in many senses. One of them was that during the stripping a new reconstruction of the calorimeter was applied. This was the first (and last) time that such a thing was done, as it created a lot of problems.

The bottom line is that if the new calo reconstruction is not applied when redoing the stripping, then data and MC don’t have the same reconstruction version and cannot be compared (the same effect as having Reco12 MC and Reco14 data). This affects not only photons and electrons, but also all of the PID-related variables for every particle species.

The options to add the reconstruction for the calorimeter can be found in:

https://svnweb.cern.ch/trac/lhcb/browser/DBASE/trunk/AppConfig/options/DaVinci/DV-RedoCaloPID-Stripping21.py

and again, should only be used for S21.

The way to introduce them is simply by appending them to the gaudirun execution:

gaudirun.py options.py DV-RedoCaloPID-Stripping21.py
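Since the extra options must only be used for S21, a job script could guard the append step. The following is a minimal sketch under stated assumptions: the helper `gaudirun_command` and the accepted version spellings are hypothetical, and the calo options file is assumed to be reachable under the same name as in the command above.

```python
# File name as used in the gaudirun.py command above.
CALO_OPTIONS = "DV-RedoCaloPID-Stripping21.py"

# Hypothetical spellings of the stripping version that count as S21.
S21_NAMES = ("21", "s21", "stripping21")


def gaudirun_command(options_file, stripping_version):
    """Build the gaudirun.py argument list, adding the calo options only for S21."""
    command = ["gaudirun.py", options_file]
    if stripping_version.lower() in S21_NAMES:
        command.append(CALO_OPTIONS)
    return command
```

The returned list can be handed to `subprocess.call` in an environment where `gaudirun.py` is set up. Any other stripping version gives the plain `gaudirun.py options.py` call.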

DSTs instead of nTuples

This could be an interesting topic to cover.

Lucio pointed out during the computing workshop in Paris (Nov 2015) that you could simply write a DST instead of a DecayTreeTuple from your grid jobs. The average size per event should be comparable but you can access "all the information". Once you have your DST you can then remake your nTuple as many times as you wish, potentially even locally without the grid.

I think this is a nice idea to reduce the pain of having huge nTuples with all the branches, re-running things on the grid missing some events, etc.
