Object model about api HOT 29 OPEN

jpata avatar jpata commented on August 23, 2024
Object model

from api.

Comments (29)

oschulz avatar oschulz commented on August 23, 2024

Yes, it's done this way in ROOTFramework.jl, at the moment. I want to avoid pointers as much as possible.

jpata avatar jpata commented on August 23, 2024

Pointers can't entirely be avoided: in case you use TFile::Open you will necessarily get one. So we need to treat them as well

oschulz avatar oschulz commented on August 23, 2024

Pointers can't entirely be avoided [...] So we need to treat them as well

Yes - I'm doing that via Unions at the moment, but it's a bit verbose in code. Maybe some of it can be auto-generated, though. Mainly, I want Julia's GC to handle memory where possible - of course it's not possible everywhere due to ROOT's design.

jpata avatar jpata commented on August 23, 2024

I see:

typealias TDirectoryRef rcpp"TDirectory"
typealias TDirectoryPtr pcpp"TDirectory"
const TDirectoryInst = Union{TDirectoryRef, TDirectoryPtr}

So you treat references and pointers equivalently. That seems OK, I guess, given that julia is pass-by-reference and what people are used to from PyROOT.

What about values though? I see that in a different place:

export TFile, TFilePtr, TFileInst
typealias TFile cxxt"TFile"
typealias TFilePtr pcpp"TFile"
const TFileInst = Union{TFile, TFilePtr}

Treating values and pointers as equivalent doesn't seem correct. When will the object be copied if you pass it to a function?

oschulz avatar oschulz commented on August 23, 2024

@jpata : So you treat references and pointers equivalently.

Yes - thankfully, @cxx obj->method(...) doesn't care if it's a value, pointer or ref, so things can be written generically and the user won't have to worry about it (in many cases).

Treating values and pointers as equivalent doesn't seem correct.

I'm not treating them as equivalent. I'm defining a Union so many methods can be written universally for value, ref and pointer alike. That doesn't mean that every method will take a TFileInst value, some will of course explicitly require a TFile or TFilePtr. TFileInst is for when it does not matter what it is.

oschulz avatar oschulz commented on August 23, 2024

When will the object be copied if you pass it to a function

At least within the wrapped API, a ROOT object should (except when necessary for thread-safety or distributed computing) IMHO never be copied implicitly (as far as technically possible).

In general, I want to shield the Julia ROOT users as far as possible from ROOT's, ah, peculiar way of memory management - there are limits to what can be done, of course, but I try. For example, a THxx created with ROOTFramework will never automatically belong to ROOT, and will not automatically be part of any TDirectory. It belongs to Julia, Julia GC's it, it can be safely put in a Julia Array, etc.

jpata avatar jpata commented on August 23, 2024

I'm not treating them as equivalent. I'm defining a Union so many methods can be written universally for value, ref and pointer alike.

OK, I see, but how can a user be sure what happens when he/she passes something to a function such as f(file::TFileInst) = icxx""" $file->bla() """? Depending on if file::TFile or file::TFilePtr, the behaviour will be different. In particular, I believe in the former case you will call the copy constructor of TFile and create a new object (need to check), whereas in the latter case you are modifying the original object.

For example, a THxx created with ROOTFramework will never automatically belong to ROOT, and will not automatically be part of any TDirectory. It belongs to Julia, Julia GC's it, it can be safely put in a Julia Array, etc.

Indeed, classes deriving from TH1 are simpler to decouple from ROOT's internal memory management due to having access to TH1::AddDirectory(false). However, I'm not sure if we can do this easily also with non-TH1 classes, for example THnSparse. There we'd need some sort of mechanism to generate unique names for those objects when you create them, assign them to some either stack-or-heap kind of TDirectory in memory which has appropriate scoping for the julia gc. We could check in ROOT how they've done that for TH1.

jpata avatar jpata commented on August 23, 2024

Here's an example from rootpy for creating "nameless objects": https://github.com/rootpy/rootpy/blob/396ae7edb40712afe1ffad2336ff8e3cc1c2ef74/rootpy/base.py#L110

Essentially, use unique ID-s for the names. This gets rid of users inadvertently overwriting objects in "ROOT-memory", but it doesn't do anything about julian scoping: non-TH1 objects will AFAIK still be created in the gDirectory namespace, and will go out of scope when it is deleted.
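
As a rough illustration of the "unique IDs as names" idea, here is a minimal C++ sketch. It swaps in a simple atomic counter rather than whatever ID scheme rootpy actually uses, and `unique_object_name` and the `"obj"` prefix are hypothetical names made up for this example. It only addresses name clashes, not ownership or scoping.

```cpp
#include <atomic>
#include <cassert>
#include <string>

// Hypothetical helper: returns a process-unique name such as "obj_0", "obj_1", ...
// The prefix and the counter scheme are illustrative assumptions, not rootpy's or ROOT's.
inline std::string unique_object_name(const std::string& prefix = "obj") {
    static std::atomic<unsigned long> counter{0};
    // fetch_add returns the previous value, so concurrent callers never share an ID.
    return prefix + "_" + std::to_string(counter.fetch_add(1));
}
```

A wrapper constructor could pass such a name wherever ROOT requires one, so that two objects created by the wrapper never collide by name.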

oschulz avatar oschulz commented on August 23, 2024

Why do we need to give them names at all? Also, if the objects don't automatically live in a TDirectory - which we should avoid at all cost so Julia can do the GC - then there's no problem if TObjects have empty or identical names.

oschulz avatar oschulz commented on August 23, 2024

ROOTFramework histograms, for example, are not created in gDirectory. So far, this has worked out well - it certainly needs more careful study, to check that there are no unfortunate surprises with the approach (given ROOT's peculiarities), but it hasn't crashed on me so far. :-)

oschulz avatar oschulz commented on August 23, 2024

Do you think the "out of gDirectory" approach will not work with THnSparse?

oschulz avatar oschulz commented on August 23, 2024

@jpata : OK, I see, but how can a user be sure what happens when he/she passes something to a function such as f(file::TFileInst) = icxx""" $file->bla() """? Depending on if file::TFile or file::TFilePtr, the behaviour will be different. In particular, I believe in the former case you will call the copy constructor

No, no copy is made in either case just by invoking icxx""" $file->bla() """. Cxx.jl really is magic ...

jpata avatar jpata commented on August 23, 2024

What if you call a C++ function icxx""" func($file) """ that looks like

void func(TFile f) {
  //f will be copied
  //do something with object, you won't see the modifications on the original
}

Will Cxx dereference the pointer? A function like func2(TFile* f) {...} has a different meaning, so I'm still not sure that Unionizing values and pointers is correct.
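
To make the difference between the two signatures concrete, here is a small plain-C++ sketch. `FakeFile` is a stand-in type invented for this example (it is not a ROOT class), and the sketch says nothing about what Cxx.jl does when it splices a Julia value into such a call.

```cpp
#include <cassert>
#include <string>

// Stand-in for a copyable C++ class; purely illustrative, not a ROOT type.
struct FakeFile {
    std::string title;
};

// By-value parameter: the function receives a copy of the caller's object.
inline void set_title_by_value(FakeFile f, const std::string& t) {
    f.title = t;  // modifies the local copy only
}

// Pointer parameter: the function works on the caller's object.
inline void set_title_by_pointer(FakeFile* f, const std::string& t) {
    f->title = t;  // modifies the original
}
```

Calling `set_title_by_value(f, "changed")` leaves `f.title` untouched, while `set_title_by_pointer(&f, "changed")` changes it, which is why the two signatures cannot be treated as interchangeable on the C++ side.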

jpata avatar jpata commented on August 23, 2024

Do you think the "out of gDirectory" approach will not work with THnSparse?

Looking deeper into ROOT source code, it looks like only TH1 and a few other objects reference gDirectory, so indeed it might be enough to just use TH1::AddDirectory(false) and ignore names throughout.

So one would have to have some rule for creating the root object constructors. I'm also uneasy with hardcoding default values for arguments in julia when they already exist in C++.

oschulz avatar oschulz commented on August 23, 2024

Apart from histograms, this should be a (hopefully complete) list of classes that ROOT auto-adds to the current TDirectory:

TEntryList
TTree
TGraph2D
TEfficiency
TEveVSD
ROOT::TGenericClassInfo
TEventList
TDSet
TClass
TSystemDirectory
TChain

jpata avatar jpata commented on August 23, 2024

Thanks, that's useful to keep in mind! I thought we'd be in for a more hellish re-engineering with gDirectory removal, which I would support nevertheless.

oschulz avatar oschulz commented on August 23, 2024

Unfortunately, TH1::AddDirectory(false) is a global setting, and we shouldn't modify global settings if at all possible - it can break other code that is used in conjunction with ours and relies on it. Also, not all the classes above offer such a setting.

oschulz avatar oschulz commented on August 23, 2024

But the approach of using the default constructor and then setting name and title seems to work well - it keeps the objects out of gDirectory right from the moment of creation.

oschulz avatar oschulz commented on August 23, 2024

Of course we can always fall back to removal-after-creation - but that may be problematic when it comes to exception-safety and multi-threading. But if it's the only way for a certain class, so be it. :-)

jpata avatar jpata commented on August 23, 2024

But the approach of using the default constructor and then setting name and title seems to work well - it keeps the objects out of gDirectory right from the moment of creation.

I wish ROOT had this consistency but I'm not sure we can always rely on it:
https://github.com/root-mirror/root/blob/9a6d8a9a21aa0d799e333a14329ceb8a7eaeb850/hist/hist/src/TEfficiency.cxx#L618

Perhaps it's worth to declare in advance (as you've done, but in code some way) which classes have "side-effects on creation" and treat them separately.

oschulz avatar oschulz commented on August 23, 2024

Perhaps it's worth to declare in advance (as you've done, but in code some way) which classes have "side-effects on creation" and treat them separately.

That's actually fairly easy to check:

    (
        TH1::AddDirectoryStatus() &&
            (dynamic_cast<TH1*>(obj) != 0)
    ) || (
        TDirectory::AddDirectoryStatus() && (obj->IsA()->GetDirectoryAutoAdd() != nullptr)
    );

At least this has worked for me so far (see here: https://github.com/databricxx/databricxx/blob/e94bb0a34a9487ff8ce088e6b33edad6db128fda/src/rootiobrics.cxx#L427-L432).

jpata avatar jpata commented on August 23, 2024

OK, interesting, I didn't know about TDirectory::AddDirectory or the corresponding global method. I would actually support disabling auto-add out of the box for any TDirectory in an init method, that would seem to solve the problem.
Then if a user wants to go insane, they still can, but not by default.

oschulz avatar oschulz commented on August 23, 2024

I would avoid playing with global defaults if at all possible (and I think we won't need to). It's usually a bad idea. Also, it's not about whether the user wants to go insane - the user may well want/need to bring some existing ROOT/C++ code into the mix, and may have no idea what that code assumes about the global defaults.

jpata avatar jpata commented on August 23, 2024

Valid point, I had the impression that TDirectory::AddDirectory is an object not a global method.

To me, it's still an open question of how to actually successfully solve or implement the "create object with no side-effects" policy, given that ROOT doesn't have such a flag in the constructor. Maybe indeed the best would be to check: if it's a known unsafe object, remove it from the gDirectory.

Another unsolved point on the object model is pointer/value semantics (still not convinced about the Union{Ptr, Value}) and ownership. For the julia gc to handle C++ objects, AFAIK ROOT must never call its own delete methods on them (e.g. when a file is closed, the attached histograms are deleted). Do you already have experience with that?

oschulz avatar oschulz commented on August 23, 2024

To me, it's still an open question of how to actually successfully solve or implement the "create object with no side-effects"

I'd suggest the following: If the object is created via @cxx ..., we don't tamper with ROOT's default behaviour, so the user can predict (as far as ROOT lets them) what happens. If the object is created via a Julia-wrapper, we do our best to make Julia own it (if sensible - not for a TTree in a TFile, etc. of course). As each wrapper will usually deal with specific types, we can handle them individually (and fall back to auto-removal from gDirectory if not possible - I have decent code in databricxx that we should be able to use almost unmodified).

(still not convinced about the Union{Ptr, Value})

This was born from the fact that ROOT often forces a pointer on us (when ROOT has to own the object) but that, in contrast to ROOT's normal "philosophy" we want to use Julia-owned non-pointer objects as often as possible. I don't like the resulting number of Union types myself - I wouldn't mind a better alternative, if one can be found.

For the julia gc to handle C++ objects, AFAIK ROOT must never call its own delete methods on them (e.g. when a file is closed, the attached histograms are deleted). Do you already have experience with that?

It has worked for me so far, e.g. with histograms, even when writing them to TFiles. However a TCanvas, for example, must be owned by ROOT, because closing the window can result in object deletion and Julia will crash if it tries to GC afterwards.

Unfortunately, due to ROOT's design, I think it will be unavoidable to handle different kinds of types in different ways. :-(

jpata avatar jpata commented on August 23, 2024

I've experimented a bit.
https://github.com/jpata/ROOT.jl/blob/dcb9075dda90e7dfe75883cef616bd9e94e43c35/test/rootcxx.jl

I think as long as we explicitly differentiate between objects that are under julia control and objects that are under ROOT, defaulting on the former, we'll be safe.

In particular in

fi = TFile(...)
hi = TH1D(...)

I would make the TH1D constructor by default remove the object from TFile with hi->SetDirectory(0), requiring explicit steps to add it to the TFile. This would be a conscious choice to differentiate from standard ROOT behaviour.
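
The intent can be modelled without ROOT. The sketch below uses two stand-in types, `Directory` and `Hist`, invented for this example: the directory deletes whatever is still attached when it goes away, and the hypothetical wrapper `make_detached_hist` detaches the new object immediately so that the caller stays the only owner. It illustrates the policy, not ROOT's actual implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <memory>
#include <vector>

struct Hist;

// Stand-in for a directory that deletes every object still attached to it.
struct Directory {
    std::vector<Hist*> attached;
    ~Directory();
};

// Stand-in for a histogram-like object that may be attached to a directory.
struct Hist {
    Directory* dir = nullptr;

    // Mimics the idea behind SetDirectory(0): leave the old directory,
    // then join the new one if it is non-null.
    void set_directory(Directory* d) {
        if (dir) {
            auto& v = dir->attached;
            v.erase(std::remove(v.begin(), v.end(), this), v.end());
        }
        dir = d;
        if (dir) dir->attached.push_back(this);
    }
};

inline Directory::~Directory() {
    for (Hist* h : attached) delete h;  // the directory owns what is attached
}

// Hypothetical wrapper constructor: the caller owns the result.
inline std::unique_ptr<Hist> make_detached_hist(Directory& current) {
    std::unique_ptr<Hist> h(new Hist());
    h->set_directory(&current);  // what an auto-adding constructor would do
    h->set_directory(nullptr);   // detach so the directory never deletes it
    return h;
}
```

With this policy the object outlives the directory it was created "in", and attaching it again becomes an explicit step. Note that the attach-then-detach window is exactly the removal-after-creation pattern whose exception-safety and threading were questioned earlier in the thread.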

It would seem that either unique_ptr or shared_ptr is the appropriate way to keep track of heap-allocated ROOT objects. As long as we tell ROOT not to take ownership of those objects, the gc can take care of them. I'm not sure if unique_ptr assignment works correctly, so perhaps shared_ptr would work better by having similar semantics as julia references.

Additionally, since in stack allocation such as icxx""" TFile(...) """ the object is copied as it is returned to julia, it won't work for many objects. I'm wondering if we need stack allocated ROOT objects at all, as we have the julia gc to clean up unreferenced objects, which would allow to remove pointer/value punning.

Also:

The answer is that anything of type CppValue will be owned by julia and destructed upon GC. Right now that should be stable, but I will not guarantee that given future possible directions of the language, so for use cases like this I would recommend keeping the object on the C++ heap so you have explicit control over addresses.

(from https://groups.google.com/forum/#!topic/julia-users/miC28UNRAds)

oschulz avatar oschulz commented on August 23, 2024

I think as long as we explicitly differentiate between objects that are under julia control and objects that are under ROOT, defaulting on the former, we'll be safe.

Yes, I think so, too.

I would make the TH1D constructor by default remove the object from TFile

What's wrong with the current approach in ROOTFramework to construct histograms in a way that keeps them out of the current TDirectory in the first place? I think that's preferable to removing them after creation.

It would seem that either unique_ptr or shared_ptr is the appropriate way to keep track of heap-allocated ROOT objects

I thought long and hard about this (and experimented) when I started ROOTFramework.jl. In the end, I decided against it. I also looked at OpenCV.jl - they also use the Julia heap a lot instead of the C++ heap.

shared_ptr doesn't really come with a benefit, because ROOT doesn't support it anywhere. unique_ptr is, of course, an alternative to having the objects on the Julia heap. But it's also a complication - not only does it result in additional memory allocation, there's also the fact that C++ doesn't support co-/contravariance (a unique_ptr<TH1D> is not a subclass of unique_ptr<TH1>). That can make some things more difficult for the Julia wrappers. If ROOT made use of either pointer type, they'd be the obvious choice - sadly, ROOT does not. :-(
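
For reference, on the C++ side alone the two statements can be checked at compile time. This is a sketch with stand-in classes `Base` and `Derived` (not ROOT types): the two `unique_ptr` instantiations are unrelated class types, but an rvalue `unique_ptr<Derived>` converts implicitly to `unique_ptr<Base>`. Whether a Julia wrapper can take advantage of that conversion through dispatch is a separate question that this sketch does not answer.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <type_traits>
#include <utility>

struct Base {
    virtual ~Base() = default;
    virtual std::string name() const { return "Base"; }
};

struct Derived : Base {
    std::string name() const override { return "Derived"; }
};

// There is no inheritance relation between the two smart-pointer types ...
static_assert(!std::is_base_of<std::unique_ptr<Base>, std::unique_ptr<Derived>>::value,
              "unique_ptr<Derived> does not derive from unique_ptr<Base>");
// ... but a converting move from unique_ptr<Derived> to unique_ptr<Base> is allowed.
static_assert(std::is_convertible<std::unique_ptr<Derived>, std::unique_ptr<Base>>::value,
              "rvalue unique_ptr<Derived> converts to unique_ptr<Base>");

// Takes ownership of its argument and reports the dynamic type's name.
inline std::string consume(std::unique_ptr<Base> p) { return p->name(); }
```

Passing `std::move(d)` for a `std::unique_ptr<Derived> d` into `consume` compiles and leaves `d` empty afterwards.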

as icxx""" TFile(...) """ the object is copied as it is returned to julia

I haven't checked - is this also true for @cxx TFile()? Anyhow, so far, I haven't run into any trouble with it, and the cost of copying a basically empty object may well be less than the additional memory allocation for a unique_ptr. Also, even if the newly created object is "heavy", it is to be hoped that ROOT-6 will support C++11 move semantics in the future (as they can easily do so without breaking API compatibility) - which would basically eliminate the copy-cost.

if we need stack allocated ROOT objects at all

They're not stack-allocated, but heap-allocated on the Julia heap.

Also:

[...] but I will not guarantee that given future possible directions of the language, so for use cases like this I would recommend keeping the object on the C++ heap [...]

However, nothing like this is currently even on the horizon. And even if the GC was changed, that wouldn't be a big problem, as (from the same discussion):

Yichao Yu: We'd like to try out moving but even if it is implemented, we will definitely have a relatively easy way to pin the object.

jpata avatar jpata commented on August 23, 2024

What's wrong with the current approach in ROOTFramework to construct histograms in a way that keeps them out of the current TDirectory in the first place? I think that's preferable to removing them after creation.

I'm not actually sure where you ensure this in ROOTFramework, can you point it out?

For object init, what you do in ROOTFramework is basically a post-init if I'm not mistaken. It works for TH1*, but I'm not sure if it works for all ROOT objects, which need to be properly initialized in the constructor, like TFile.

shared_ptr doesn't really come with a benefit, because ROOT doesn't support it anywhere. unique_ptr is, of course, an alternative to having the objects on the Julia heap

Right, unfortunately we can't get ROOT to increment the reference counter when it puts the object in some global pool.

there's also the fact that C++ doesn't support co-/contravariance (a unique_ptr<TH1D> is not a subclass of unique_ptr<TH1>)

Can you give a specific example where this is a problem? This seems to work fine:

//a function with explicit pointer rules
void func2(unique_ptr<TH1> o) {
  cout << o->GetName() << endl;
}

//old-style ROOT function
void func3_unsafe(TH1* o) {
  cout << o->GetName() << endl;
}

auto h1 = unique_ptr<TH1D>(new TH1D("h1", "h1", 100, -5.0, 5.0));

// in case we know that ROOT won't try to free h1
func3_unsafe(h1.get());

// in case we're not sure if ROOT will free, h1 will be invalid
//func3_unsafe(h1.release());

//if we have our own C++ function where we make the ownership policy clear
//func2(move(h1));

The problem with unique_ptr on the julia side seems to be that it doesn't respect the copy/move semantics of the C++ unique_ptr, maybe there is a way to circumvent it though.

julia> h1 = icxx""" unique_ptr<TNamed>(new TNamed("a", "a"));"""
julia> h2 = h1
julia> println(unsafe_string(icxx"""$h2->GetName();"""))
a
julia> println(unsafe_string(icxx"""$h1->GetName();""")) #but h1 should be invalid!
a
julia> h2 = icxx"""std::move($h1);"""
julia> println(unsafe_string(icxx"""$h2->GetName();"""))
a
julia> println(unsafe_string(icxx"""$h1->GetName();""")) #but h1 should be invalid!
a

I haven't checked - is this also true for @cxx TFile()

It works via @cxx, but I'm not sure if you can reasonably rely on post-initializing a generic ROOT object.

julia> @cxx TFile()
(class TFile) {
}

Using icxx, you will try to make an implicit copy. Interestingly, despite an error, Cxx.jl returns some sort of a (valid?) object.

julia> icxx"""TFile();"""
In file included from :1:
:2:1: error: calling a private constructor of class 'TFile'
TFile();
^
/Users/joosep/Documents/root-build/include/TFile.h:142:4: note: declared private here
   TFile(const TFile &);            //Files cannot be copied
   ^
(class TFile) {
}

They're not stack-allocated, but heap-allocated on the Julia heap.

Do you mean the output of @cxx Obj() and @cxxnew Obj()? Or also icxx""" new Obj(...);"""? So julia does its own reference counting for those, and when passing a pointer to ROOT, you hope/rely that they don't change as discussed in the gc thread? How do you know that ROOT doesn't externally keep track of your object and try to delete it in that case? With unique_ptr you would clearly relinquish control of it using unique_ptr<T>::release(), or with a known "safe" function that doesn't have side effects that would cause future deletion, ::get().

oschulz avatar oschulz commented on August 23, 2024

I'm not actually sure where you ensure this in ROOTFramework, can you point it out?

Histograms are created with @cxx THxx(). When using the default constructor, a THxx is never added to the current TDirectory - ROOT doesn't register its existence anywhere. This way, we can create histograms that really belong to us. The same trick should work for the other "auto-added" classes - thankfully, it's a finite list.

It works via @cxx, but I'm not sure if you can reasonably rely on post-initializing a generic ROOT object. (@cxx TFile()) [...] Using icxx, you will try to make an implicit copy

It's not necessary for TFile, etc., these can be constructed with their arguments directly - and the resulting object is not copied! I don't know exactly what magic Keno uses to make it happen, but try this Julia program to test it - there's no copying, and everything ends up on the Julia heap.

For, e.g., TFile, ROOTFramework uses @cxx TFile(pointer(fname), pointer(mode), ...), and it has worked perfectly fine for me so far (without any errors or suspicious behavior). If you have some time, maybe try to run the ROOTFramework examples?

Do you mean the output of @cxx Obj() and @cxxnew Obj()? Or also icxx""" new Obj(...);"""? So julia does its own reference counting for those, and when passing a pointer to ROOT, you hope/rely that they don't change as discussed in the gc thread?

I mean the output of @cxx Obj(), resp. icxx""" Obj(); """, not @cxxnew. Yes, these are, for Julia, "normal" heap-allocated Julia objects, subject to Julia's GC (and no, Julia's GC doesn't use reference counting). And yes, with the current Julia GC their address is stable. And should Julia ever switch to a moving GC, we'll be able to pin them (according to Yichao Yu). Anyhow, this is only relevant for objects allocated with @cxx Obj() whose address ROOT memorizes long-term (between Cxx.jl calls). I'm pretty sure that during Cxx calls, object addresses will be stable even with a (theoretical) moving GC.

How do you know that ROOT doesn't externally keep track of your object and try to delete it in that case?

While it's true that ROOT's memory-management strategy is, uh, a bit unique, it's not entirely unpredictable. Depending on the object type, it's usually clear under which circumstances ROOT takes ownership. I would aim to avoid these where possible, use a plain pointer if ROOT will take ownership directly (e.g. a new window / TCanvas), and use a copy when calling a ROOT function that would take ownership of an existing object.

With unique_ptr you would clearly relinquish control of it using unique_ptr<T>::release(), or with a known "safe" function that doesn't have side effects that would cause future deletion, ::get().

Exactly - you still need to know when ROOT will take ownership (release()) and when not (get()). So I don't think it simplifies anything. On the other hand, since we won't wrap the whole ROOT API, users will, at times, create ROOT objects directly - and it'll be much more convenient for them to use a @cxx ... instead of an icxx""" std::unique_ptr<...>(new ...). To really take advantage of unique_ptr and use it comfortably, Cxx.jl would IMHO need some kind of direct support for creation and management of unique_ptr's. And even with that, I'm not sure there's really an advantage to using it, as Julia already does great memory management.
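
The get()/release() distinction can be sketched in plain C++ as follows. `Obj`, `OwningRegistry` and `peek` are stand-ins invented for this example, modelling "an API that takes ownership" and "an API that only looks"; they are not ROOT classes. The point is that the caller still has to know in advance which kind of API it is calling.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Stand-in object type; purely illustrative.
struct Obj {
    int value = 0;
};

// Stand-in for an API that takes ownership of every pointer handed to it.
struct OwningRegistry {
    std::vector<Obj*> owned;
    void adopt(Obj* o) { owned.push_back(o); }
    ~OwningRegistry() {
        for (Obj* o : owned) delete o;  // the registry frees what it adopted
    }
};

// Stand-in for an API that only reads the object and keeps no pointer to it.
inline int peek(const Obj* o) { return o->value; }
```

A call such as `peek(p.get())` leaves `p` as the owner, whereas `registry.adopt(p.release())` empties `p` and hands the object over. Mixing the two up gives either a double delete or a leak.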

Can you think of a few use cases where we really need to transfer ownership of an existing object from Julia to ROOT, and using a copy will not be possible - or too costly?
