Comments (17)
👍
from second-analysis-steps.
Yes, that would be interesting. There are two options. The first is to make a simple filter on your stripping line and write out all events, which is quite easy to configure. The second is to write out a micro DST which contains only your candidate, which gives a much lower event size. I always wanted to learn how to do that but never got around to it. However, this is what is done in the stripping as well, so one can probably learn from that.
from second-analysis-steps.
Yes, we could use that to teach CombineParticles, FilterDesktop and so on... Kind of: apply your preselection on the DST and get a "preselected" DST and then you can do whatever you want.
This reminds me of the classic Ruf presentation:
http://lhcb-reconstruction.web.cern.ch/lhcb-reconstruction/Python/Dst_as_Ntuple.pdf
from second-analysis-steps.
The Ruf classic is what made me push so hard for the interactive DST exploring lesson.
I think doing your actual analysis on a DST is still a bit tedious though as it doesn't load fast enough. Dumping stuff into a pandas dataframe/numpy array/TTree is more agile/interactive.
So is the conclusion that we include this in this set of lessons? Let's decide that first before going off into dreamland of what could be.
from second-analysis-steps.
I'm not sure, honestly... As you say, it's more cumbersome than the ntuple format.
from second-analysis-steps.
Yeah, I think it is a bit "smarter" than dumping everything into a tree and cutting this down.
Though it might not be more efficient.
My wishful thinking is that this workflow might expose people more to the actual event model and the algorithms. And then people would be less afraid of working with the LHCb software and contributing to it. But that is a different discussion.
from second-analysis-steps.
I recently started playing around with this. I made a µDST with the output of two Turbo lines, and then ran a DecayTreeTuple over the local output. The DTT step was super fast, and the µDST step was pretty quick as well.
For a few thousand events that contained my signal, the input µDST from the bookkeeping was 4.6 GB and the output µDST was 70 MB.
The filtering step was straightforward for my use case, though I suspect adding things like raw banks (for flavour tagging and stuff) takes a little more care.
from second-analysis-steps.
my 2 grumpy cents: i'd be sad to let ntuples go, as TTree::Draw is extremely powerful once you exploit that both the variable you draw and the weight (NB: the second argument is not a cut string, it returns a float!) support bool, int and float operations:
Draw("B0_M*(B0_BKGCAT==0)+B0_P*(B0_BKGCAT>0)","parSigYield_sw*((1<<8)&TCK)")
htemp->GetMean()
not that this particular example is any use…
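For anyone who wants to see what that arithmetic does without opening ROOT, here is a minimal pure-Python sketch. It assumes the operators in the Draw call are multiplications, and the three entries are made-up values for illustration only (the branch names are the ones from the Draw call; the helper names `draw_value`, `draw_weight` and `weighted_mean` are hypothetical). It is not how ROOT evaluates the expression internally, just the same per-entry logic written out.

```python
# Toy stand-in for three ntuple entries. The numbers are invented for
# illustration; only the branch names come from the Draw call.
events = [
    {"B0_M": 5280.0, "B0_P": 90000.0, "B0_BKGCAT": 0, "parSigYield_sw": 1.0, "TCK": 0x100},
    {"B0_M": 5300.0, "B0_P": 100000.0, "B0_BKGCAT": 20, "parSigYield_sw": 0.5, "TCK": 0x100},
    {"B0_M": 5270.0, "B0_P": 80000.0, "B0_BKGCAT": 0, "parSigYield_sw": 2.0, "TCK": 0x000},
]


def draw_value(ev):
    """First Draw argument: a comparison evaluates to 1 or 0, so it
    switches each term on or off entry by entry."""
    return ev["B0_M"] * (ev["B0_BKGCAT"] == 0) + ev["B0_P"] * (ev["B0_BKGCAT"] > 0)


def draw_weight(ev):
    """Second Draw argument: a number, not a yes/no cut. The bit mask
    yields 256 or 0 here, which scales the sWeight rather than just
    accepting or rejecting the entry."""
    return ev["parSigYield_sw"] * ((1 << 8) & ev["TCK"])


def weighted_mean(evts):
    """Roughly what the mean of the resulting histogram would reflect,
    ignoring binning and axis range."""
    total_weight = sum(draw_weight(ev) for ev in evts)
    if total_weight == 0:
        raise ValueError("all weights are zero")
    return sum(draw_value(ev) * draw_weight(ev) for ev in evts) / total_weight


if __name__ == "__main__":
    for ev in events:
        print(draw_value(ev), draw_weight(ev))
    print(weighted_mean(events))
```

Note that the common factor of 256 from the bit mask cancels in the weighted mean, but it would still scale the histogram contents.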
also RooFit cannot import DST at the moment, can it? (yes, a dst is a root ntuple, but most interfaces fail once you have more complex things than int/float/double on the branches)
from second-analysis-steps.
Reading the initial post, I would say that this is meant less as "DSTs instead of nTuples" and more as "DSTs in addition to nTuples".
It allows you to be more economical when it comes to which variables you want to put in the nTuple (E.g. no need to save the signal decay tree as a matrix).
It also allows you to incorporate updates of the LHCb software and changes to your LHCb-side code into your data very quickly.
from second-analysis-steps.
@ibab is right, the (µ)DST step is meant as an intermediary to making ntuples as one would normally. The primary use case, at least for me, is rerunning my ntuple creation when I realise I'm missing some variables, or am asked to investigate things I hadn't foreseen (e.g. running DecayTreeFitter). Saving the trimmed µDST means you can rerun over them very quickly, and probably without using the Grid.
from second-analysis-steps.
ack.
which reminds me, i think christian (rostock) once told me he'd use µDST as OO ntuple. maybe one can ask for longterm experience.
also realised another advantage: if you use µDST anyhow as ntuple, you don't have code which only works in DaVinci and code which only works on ntuples. (and then you have to translate your ntuple code once you need its output as stripping variable or relatedinfo on µDST).
from second-analysis-steps.
However, working with µDST (which I like in principle) makes things more difficult due to compatibility, the need for the software stack, etc.
I like having only ROOT because I can work on the Mac or offline, but if there was a portable DaVinci (don't bring up CernVM) that I could directly use (with reasonable complication), I wouldn't see the need for ntuples :-)
BTW: this would be a DREAM come true.
from second-analysis-steps.
so (back to topic):
we cannot (for now) remove the ntuple lessons. but it seems desirable to add a lesson for the extended workflow:
DST→(DaVinci grid)→µDST→(DaVinci local)→tuple
which is cool in itself for potentially quicker turnaround cycles in the second step
and then add a lesson "cool stuff you can do with a µDST instead of writing an ntuple"
from second-analysis-steps.
Agreed! 👍
from second-analysis-steps.
👍
from second-analysis-steps.
Not even sure we'd want to have "replace your nTuple with a uDST". Maybe something for the lhcb-magicians-kit
from second-analysis-steps.
Agreed, the title should be different. Paul summarized very well what I had in mind.
from second-analysis-steps.