This could be an interesting topic to cover. Lucio pointed out durin

DSTs instead of nTuples about second-analysis-steps HOT 17 OPEN

betatim commented on June 23, 2024

DSTs instead of nTuples

from second-analysis-steps.

Comments (17)

kdungs commented on June 23, 2024

👍

saschastahl commented on June 23, 2024

Yes, that would be interesting. There are two options. The first is to make a simple filter on your stripping line and write out all events; this is quite easy to configure. The second is to write out a micro DST which contains only your candidate, which gives a much lower event size. I always wanted to learn how to do that but never got around to it. However, this is what is done in the stripping as well, so one can probably learn from that.

apuignav commented on June 23, 2024

Yes, we could use that to teach CombineParticles, FilterDesktop and so
on... Kind of: apply your preselection on the DST and get a "preselected"
DST and then you can do whatever you want.

This reminds me of the classic Ruf presentation:
http://lhcb-reconstruction.web.cern.ch/lhcb-reconstruction/Python/Dst_as_Ntuple.pdf

betatim commented on June 23, 2024

The Ruf classic is what made me push so hard for the interactive DST exploring lesson.

I think doing your actual analysis on a DST is still a bit tedious though as it doesn't load fast enough. Dumping stuff into a pandas dataframe/numpy array/TTree is more agile/interactive.

So conclusion is we include this in this set of lessons? Let's decide that first before going off into dreamland of what could be.

apuignav commented on June 23, 2024

I'm not sure, honestly... As you say, it's more cumbersome than
ntuple-format.

saschastahl commented on June 23, 2024

Yeah, I think it is a bit "smarter" than dumping everything into a tree and cutting this down.
Though it might not be more efficient.
My wishful thinking is that this workflow might expose people more to the actual event model and the algorithms. And then people would be less afraid of working with the LHCb software and contributing to it. But that is a different discussion.

alexpearce commented on June 23, 2024

I recently started playing around with this. I made a µDST with the output of two Turbo lines, and then ran a DecayTreeTuple over the local output. The DTT step was super fast, and the µDST step was pretty quick as well.

For a few thousand events that contained my signal, the input µDST from the bookkeeping was 4.6 GB and the output µDST was 70 MB.

The filtering step was straight-forward for my use case, though I suspect adding things like raw banks (for flavour tagging and stuff) takes a little more care.
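
To put the sizes quoted above in perspective, here is a quick back-of-the-envelope calculation. It only uses the two numbers from this comment and assumes 1 GB = 1000 MB; the variable names are made up for illustration.

```python
# Sizes quoted in the comment above; 1 GB = 1000 MB is an assumption.
input_mb = 4.6 * 1000   # input µDST from the bookkeeping
output_mb = 70.0        # trimmed output µDST

reduction_factor = input_mb / output_mb
output_fraction = output_mb / input_mb

print(f"reduction factor: about {reduction_factor:.0f}x")
print(f"output is about {output_fraction:.1%} of the input size")
```

A file of that size is small enough to keep on a local disk, which is what makes rerunning the tuple step without the Grid plausible.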

pseyfert commented on June 23, 2024

my 2 grumpy cents: i'd be sad to let ntuples go, as TTree::Draw is extremely powerful once you exploit that both the variable you draw and the weight (NB: the second argument is not a cut string, it returns a float!) support bool, int and float operations:
Draw("B0_M*(B0_BKGCAT==0)+B0_P*(B0_BKGCAT>0)","parSigYield_sw*((1<<8)&TCK)")
htemp->GetMean()
not that this particular example is any use…
also RooFit cannot import DST at the moment, can it? (yes, a dst is a root ntuple, but most interfaces fail once you have more complex things than int/float/double on the branches)
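
For readers who have not used this trick, the following is a minimal pure-Python sketch of what such a Draw call computes: a per-event value built from boolean arithmetic, a per-event numeric weight, and the weighted mean that the resulting histogram would report (ignoring binning). The branch names are taken from the example above; the three toy events and the helper functions `value`, `weight` and `weighted_mean` are invented for illustration and are not part of ROOT.

```python
# Toy events: branch names follow the Draw example, the values are invented.
events = [
    {"B0_M": 5280.0, "B0_P": 90000.0, "B0_BKGCAT": 0,
     "parSigYield_sw": 1.2, "TCK": 0x0100},
    {"B0_M": 5300.0, "B0_P": 60000.0, "B0_BKGCAT": 20,
     "parSigYield_sw": 0.5, "TCK": 0x0100},
    {"B0_M": 5250.0, "B0_P": 40000.0, "B0_BKGCAT": 0,
     "parSigYield_sw": 0.8, "TCK": 0x0001},
]


def value(ev):
    # Booleans behave as 0/1 in arithmetic, so this selects B0_M when
    # BKGCAT == 0 and B0_P when BKGCAT > 0.
    return ev["B0_M"] * (ev["B0_BKGCAT"] == 0) + ev["B0_P"] * (ev["B0_BKGCAT"] > 0)


def weight(ev):
    # sWeight times the masked TCK bit. The result is a number, not a
    # pass/fail flag: it is 0 when bit 8 is unset, sWeight * 256 otherwise.
    return ev["parSigYield_sw"] * ((1 << 8) & ev["TCK"])


def weighted_mean(evts):
    # Weighted mean of the drawn value, the unbinned analogue of GetMean().
    total = sum(weight(ev) for ev in evts)
    return sum(value(ev) * weight(ev) for ev in evts) / total


print(weighted_mean(events))
```

The third toy event has bit 8 of TCK unset, so its weight is zero and it does not contribute, which is how a numeric weight doubles as a selection.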

ibab commented on June 23, 2024

Reading the initial post, I would say that this is meant less as "DSTs instead of nTuples" and more as "DSTs in addition to nTuples".
It allows you to be more economical when it comes to which variables you want to put in the nTuple (E.g. no need to save the signal decay tree as a matrix).
It also allows you to incorporate updates of the LHCb software and changes to your LHCb-side code into your data very quickly.

alexpearce commented on June 23, 2024

@ibab is right, the (µ)DST step is meant as an intermediary to making ntuples as one would normally. The primary use case, at least for me, is rerunning my ntuple creation when I realise I'm missing some variables, or am asked to investigate things I hadn't foreseen (e.g. running DecayTreeFitter). Saving the trimmed µDST means you can rerun over them very quickly, and probably without using the Grid.

pseyfert commented on June 23, 2024

ack.

which reminds me, i think christian (rostock) once told me he'd use µDST as OO ntuple. maybe one can ask for longterm experience.

also realised another advantage: if you use µDST anyhow as ntuple, you don't have code which only works in DaVinci and code which only works on ntuples. (and then you have to translate your ntuple code once you need its output as stripping variable or relatedinfo on µDST).

apuignav commented on June 23, 2024

However, working with µDST (which I like in principle) makes things more
difficult due to compatibility, need of software stack, etc.

I like having only ROOT because I can work on the Mac or offline, but if
there was a portable DaVinci (don't bring up CernVM) that I could directly
use (with reasonable complication), I wouldn't see the need for ntuples :-)

BTW: this would be a DREAM come true.

pseyfert commented on June 23, 2024

so (back to topic):
we cannot (for now) remove the ntuple lessons. but it seems desirable to add a lesson for the extended workflow:
DST→(DaVinci grid)→µDST→(DaVinci local)→tuple
which is cool itself for potentially quicker turnaround cycles in the second step

and then add a lesson "cool stuff you can do with a µDST instead of writing an ntuple"

alexpearce commented on June 23, 2024

Agreed! 👍

saschastahl commented on June 23, 2024

👍

betatim commented on June 23, 2024

Not even sure we'd want to have "replace your nTuple with a uDST". Maybe
something for the lhcb-magicians-kit

saschastahl commented on June 23, 2024

Agreed, the title should be different. Paul summarized very well what I had in mind.
