
api's People

Contributors

jpata, jpivarski

api's Issues

Object model

How should we represent ROOT objects in Julia?

I quite like the approach of just using the bare Cxx types https://github.com/JuliaHEP/ROOTFramework.jl/blob/master/src/tdirectory.jl#L38

TFile() = @cxx TFile()

Pros:

  • no passthrough required, can give this type directly to C++ functions
  • no type generation required, guaranteed correct correspondence

Cons:

  • cannot easily be used for template arguments (maybe typealias?).
  • creating the constructor methods requires knowing the ROOT arguments and default values.

TFile, TDirectory model

How best to write objects to a TFile or TDirectory? ROOT has the concept of a current working directory, which creates a lot of confusion. I would propose not exposing that part of the API at all (i.e. TObject::SetDirectory, TDirectory::Add), and providing a saner interface instead.

HDF5.jl might be an inspiration: https://github.com/JuliaIO/HDF5.jl#quickstart

h5write("/tmp/test2.h5", "mygroup2/A", A)

#or
h5open("mydata.h5", "w") do file
    write(file, "A", A)  # alternatively, say "@write file A"
end

#or
using HDF5

h5open("test.h5", "w") do file
    g = g_create(file, "mygroup") # create a group
    g["dset1"] = 3.2              # create a scalar dataset inside the group
    attrs(g)["Description"] = "This group contains only a single dataset" # an attribute
end
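
As a rough, language-neutral sketch of the same idea (written in Python only to keep it short and self-contained), the snippet below writes objects to an explicitly named directory handle instead of relying on a global current directory. Everything here is hypothetical: `Directory`, `open_file` and the `store` dict are illustrative stand-ins, not ROOT, HDF5.jl or any existing binding.

```python
from contextlib import contextmanager


class Directory:
    """Toy in-memory directory; a stand-in for TFile/TDirectory, not a ROOT binding."""

    def __init__(self):
        self.entries = {}

    def __setitem__(self, path, obj):
        # "a/b/name" creates nested directories as needed, then stores obj.
        head, _, rest = path.partition("/")
        if rest:
            sub = self.entries.setdefault(head, Directory())
            sub[rest] = obj
        else:
            self.entries[head] = obj

    def __getitem__(self, path):
        head, _, rest = path.partition("/")
        item = self.entries[head]
        return item[rest] if rest else item


@contextmanager
def open_file(store, name):
    # `store` is a hypothetical dict playing the role of the filesystem.
    f = Directory()
    try:
        yield f
    finally:
        store[name] = f  # the file is stored when the block exits


disk = {}
with open_file(disk, "test.root") as f:
    # The target directory is always spelled out; no global state is consulted.
    f["mygroup/hist"] = [1, 2, 3]
```

The point of the design is that an object never ends up in a file because of hidden global state: it is stored only where the caller explicitly assigns it.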

TTree model

How best to access (r/w) a TTree? There are numerous proposed models out there, and we should learn from them. I would adopt something very close to raw ROOT, plus an additional layer with Julian AbstractDataFrame semantics.

default PyROOT (read-only)

via __getattr__:

objs = tree.myBranch1
print(objs)

The object schema is generated on the fly, i.e. if a branch contains a complex class (std::vector, pat::Electron), it will be loaded with Cling.

rootpy

http://www.rootpy.org/auto_examples/tree/model_simple.html

  • read: like PyROOT, but buffered, so that accessing tree.myBranch1 loads only that branch using TBranch.GetEntry, and only once
  • write: create a TTree based on a model, set branch values using tree.__setattr__, fill row-by-row as usual in ROOT
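
The buffered read described above can be sketched in plain Python as follows. This is not rootpy's implementation: `BufferedTree`, `get_entry` and the `reads` counter are hypothetical names, and the dict of lists merely simulates branches, with `reads` counting what would be TBranch.GetEntry calls.

```python
class BufferedTree:
    """Toy tree whose branches are read lazily and cached for the current entry."""

    def __init__(self, branches):
        # branches: dict mapping branch name -> list of per-entry values
        self._branches = branches
        self._entry = 0
        self._cache = {}
        self.reads = 0  # counts simulated TBranch.GetEntry calls

    def get_entry(self, i):
        # Moving to another entry invalidates the cache but reads nothing yet.
        self._entry = i
        self._cache.clear()

    def __getattr__(self, name):
        # Only invoked for attributes not found normally, i.e. branch names.
        branches = self.__dict__.get("_branches", {})
        if name not in branches:
            raise AttributeError(name)
        cache = self.__dict__["_cache"]
        if name not in cache:
            self.reads += 1  # the branch is read on first access only
            cache[name] = branches[name][self._entry]
        return cache[name]


tree = BufferedTree({"myBranch1": [1.0, 2.0], "myBranch2": [10, 20]})
tree.get_entry(1)
first = tree.myBranch1   # triggers one simulated read
second = tree.myBranch1  # served from the cache
```

Here myBranch2 is never read at all, which is the benefit over loading every branch for each entry.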

heppy

https://github.com/cbernet/heppy

  • read: like in PyROOT via tree.myBranch1
  • write: schedule an AutoFillTreeProducer, which knows how to translate a complex event model into a "flat ntuple" structure like
event.leptons = [Lepton(pt=120, eta=0.5, phi=0.2, mass=12), ...]
=> 
tree.nleptons # ::Int32, variable per row
tree.leptons_pt # (NTuple{NMAX, Float32}) with some predefined NMAX, each row has values up to tree.nleptons

Example of scheduling:


#example of how to save an object (with derived characteristics)
leptonTypeVHbb = NTupleObjectType("leptonTypeVHbb", baseObjectTypes = [ leptonType ],
    variables = [
        NTupleVariable("looseIdSusy", lambda x : x.looseIdSusy if hasattr(x, 'looseIdSusy') else -1, int, help="Loose ID for Susy ntuples (always true on selected leptons)"),
        NTupleVariable("looseIdPOG", lambda x : x.muonID("POG_ID_Loose") if abs(x.pdgId()) == 13 else -1, int, help="Loose ID for Susy ntuples (always true on selected leptons)"),
        ...
    ]
)

#putting it all together into a tree
treeProducer = cfg.Analyzer(
    class_object=AutoFillTreeProducer,
    defaultFloatType="F",
    verbose=False,
    vectorTree=True,
    globalVariables=[
        NTupleVariable("puWeightUp", lambda ev: getattr(ev, "puWeightPlus", 1.), help="Pileup up variation", mcOnly=True),
        NTupleVariable("puWeightDown", lambda ev: getattr(ev, "puWeightMinus", 1.), help="Pileup down variation", mcOnly=True),
        ...
    ],
    globalObjects={
        "met": NTupleObject("met", metType, help="PF E_{T}^{miss}, after default type 1 corrections"),
        ...
    },
    collections={
        "selectedLeptons": NTupleCollection("selLeptons", leptonTypeVHbb, 8, help="Leptons after the preselection"),
        ...
    }
)

https://github.com/cbernet/cmssw/blob/heppy_8_0_11_tutorial/PhysicsTools/Heppy/python/analyzers/core/AutoFillTreeProducer.py#L8
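
To make the "flat ntuple" translation concrete, here is a minimal Python sketch of turning a list of objects into a count plus fixed-length columns. It is not heppy code: `flatten`, `NMAX`, the zero fill value and the choice to drop objects beyond the capacity are all assumptions made for illustration.

```python
from dataclasses import dataclass

NMAX = 4  # hypothetical fixed capacity per row


@dataclass
class Lepton:
    pt: float
    eta: float
    phi: float
    mass: float


def flatten(name, objects, fields, nmax=NMAX, fill=0.0):
    """Turn a list of objects into a count plus one fixed-length column per field."""
    kept = objects[:nmax]  # assumption: objects beyond nmax are dropped
    row = {"n" + name: len(kept)}
    for field in fields:
        values = [getattr(obj, field) for obj in kept]
        # Pad so that every row has exactly nmax slots per column.
        row[name + "_" + field] = values + [fill] * (nmax - len(kept))
    return row


leptons = [Lepton(120, 0.5, 0.2, 12), Lepton(40, -1.1, 2.9, 0.1)]
row = flatten("leptons", leptons, ["pt", "eta", "phi", "mass"])
```

Only the first row["nleptons"] slots of each column are meaningful; the rest is padding, matching the "values up to tree.nleptons" convention described above.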

ROOTDataFrames.jl

  • columnar: tdf[:myBranch1] => Vector{Float32} loads the branch into an in-memory column
  • read row-by-row: auto-generated immutable row class with correct types based on the TTree branches
  • write: transform in-memory DataFrame to on-disk TTree using writetree(df::DataFrame)

Example of row-by-row access

df = TreeDataFrame(["file1.root"]; treename="tree")

for i=1:nrow(df)
    load_row(df, i) # load all branches using TTree::GetEntry
    n__jet = df.row.n__jet() # without load_row, only this branch is read, via TBranch::GetEntry(i - 1)
    jet__pt = df.row.jet__pt()[1:n__jet]
end

@oschulz I'm continuing the discussion from Gitter here.
