symbolicml / DynamicExpressions.jl
Ridiculously fast symbolic expressions
Home Page: https://symbolicml.org/DynamicExpressions.jl/dev
License: Apache License 2.0
This issue is used to trigger TagBot; feel free to unsubscribe.
If you haven't already, you should update your TagBot.yml
to include issue comment triggers.
Please see this post on Discourse for instructions and more details.
If you'd like for me to do this for you, comment TagBot fix
on this issue.
I'll open a PR within a few hours, please be patient!
I realize a lot of functions could just be implemented as calls to a generic tree_map
function. For example,
tree_map(t -> 1, tree; merge=max)
would calculate the depth of a tree. The merge function would be used to aggregate the results of the left/right children of binary nodes. For example,
tree_map(t -> 1, tree; merge=(+))
would count the total number of nodes. Meanwhile,
tree_map(tree; merge=(+)) do t
    Int(t.degree == 2)
end
would count the number of binary operators. Then something like
tree_map(tree; merge=(l, r) -> [l..., r...]) do t
    if t.degree != 0 || !t.constant
        return []
    end
    return [t.val]
end
would return all constants in a tree (in depth-first traversal order).
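As written, merge=max over all-1s would always return 1, so for the depth example to work, merge presumably also needs to receive the parent's mapped value alongside the child results. A minimal sketch under that assumption, with a toy node type standing in for the real Node:

```julia
# Toy stand-in for DynamicExpressions' Node, just for illustration.
struct TNode
    degree::Int                  # 0 = leaf, 1 = unary, 2 = binary
    val::Float64
    l::Union{TNode,Nothing}
    r::Union{TNode,Nothing}
end

# Hypothetical semantics: `merge` combines the parent's mapped value
# with the results from its children.
function tree_map(f::F, t::TNode; merge::G) where {F,G}
    t.degree == 0 && return f(t)
    t.degree == 1 && return merge(f(t), tree_map(f, t.l; merge=merge))
    return merge(f(t), tree_map(f, t.l; merge=merge), tree_map(f, t.r; merge=merge))
end

# Depth: parent contributes 1 plus the deepest child.
depth(t) = tree_map(_ -> 1, t; merge=(p, c...) -> p + maximum(c))
# Node count: parent plus all children.
count_nodes(t) = tree_map(_ -> 1, t; merge=+)

leaf = TNode(0, 1.0, nothing, nothing)
tree = TNode(2, 0.0, leaf, TNode(1, 0.0, leaf, nothing))
```

With these definitions, depth and count_nodes both fall out of the same fold.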
@Moelf would this have been helpful for writing that NYT puzzle solver? What do you think of the API?
@AlCap23 any comment?
I wonder if it's possible to have a GPU-native implementation, with the tree stored as one-hot vectors, the dimension given by the total number of operators + value + feature + degree. You would evaluate all operators at each node in the tree and mask the outputs that are not used.
However, I'm not sure this would actually work: with deeply-nested trees, you would have to evaluate O(2^n) node slots for a depth of O(n), whereas dynamic expressions need only O(n) evaluations.
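The masking idea itself is easy to sketch on the CPU; the operator set and one-hot layout here are made up for illustration:

```julia
# Toy CPU sketch of the masked-evaluation idea: apply every candidate
# operator at a node, then select the active one via a one-hot mask.
ops = (x -> x + 1.0, x -> 2.0 * x, cos)   # all candidate unary operators
onehot = [0.0, 1.0, 0.0]                  # this node uses operator 2
x = 0.5
outputs = [op(x) for op in ops]           # evaluate everything...
y = sum(onehot .* outputs)                # ...then mask out the unused outputs
```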
@JuliaRegistrator register
https://juliahub.com/ui/Packages/General/AbstractTrees
The only two functions that are mandatory to overload are children and nodevalue.
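For example, hooking a toy node type (hypothetical, not the package's Node) into AbstractTrees takes just those two methods:

```julia
using AbstractTrees

# Toy expression node for illustration only.
struct ExprNode
    value::String
    children::Vector{ExprNode}
end

# The two mandatory overloads:
AbstractTrees.children(n::ExprNode) = n.children
AbstractTrees.nodevalue(n::ExprNode) = n.value

tree = ExprNode("+", [ExprNode("x1", ExprNode[]),
                      ExprNode("cos", [ExprNode("x2", ExprNode[])])])
print_tree(tree)   # pretty-prints the tree using only the two overloads
```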
Currently the functions are written as:
function foo(args...; kwargs...)
    error("ExtX.jl not loaded")
end

# In ExtX.jl
function foo(a::Int, b::Char)
    # do the correct thing
end
Now this works, but suppose I did using X, DynamicExpressions; foo(1, 2). I will get the error "ExtX.jl not loaded", which is somewhat confusing because X.jl is already loaded. For quite a few of my packages, the way I handle this is:
@inline _is_extension_loaded(::Val) = false

function _foo_internal end

function foo(args...; kwargs...)
    _is_extension_loaded(Val(:X)) && return _foo_internal(args...; kwargs...)
    error("ExtX.jl not loaded")
end

# In ExtX.jl
@inline _is_extension_loaded(::Val{:X}) = true
This does cause a minor increase in invalidations, but Julia compiles away the branch, so there is no runtime cost.
Turns out ForwardDiff.jl already works with DynamicExpressions, but we would want to turn off the type mismatch warnings.
using ForwardDiff, DynamicExpressions

operators = OperatorEnum(; binary_operators=[+, -, *], unary_operators=[cos]);
x1 = Node(; feature=1)
x2 = Node(; feature=2)
expr = x1 * cos(x2 - 3.2)
X = rand(2, 5)
ForwardDiff.gradient(X) do X
    return sum(abs2, first(eval_tree_array(expr, X, operators)))
end
┌ Warning: Warning: eval_tree_array received mixed types: tree=Float32 and data=ForwardDiff.Dual{ForwardDiff.Tag{var"#13#14", Float64}, Float64, 10}.
└ @ DynamicExpressions.EvaluateModule /mnt/research/lux/DynamicExpressions.jl/src/Evaluate.jl:95
2×5 Matrix{Float64}:
0.182636 0.882172 0.906966 0.161635 1.86929
-0.020271 -0.363872 -0.0764073 -0.0126587 -0.331082
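Until there's a dedicated switch for this, one blunt workaround is Julia's Logging stdlib, which can raise the minimum logging level for a block (note this hides all warnings inside the block, not just this one):

```julia
using Logging

# Route everything below Error level to nowhere for this block only.
result = with_logger(SimpleLogger(stderr, Logging.Error)) do
    @warn "stand-in for the eval_tree_array type-mismatch warning"
    1 + 1
end
```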
I think the generalization to more than 2 children would be best attempted with StaticArrays.MArray. So you could have a field:
struct Node{T,N}
    degree::UInt8
    ...
    _children::MVector{N,Node{T,N}}  # note: MVector takes the length parameter first
end
The beauty of this is you could still have undefined values (https://discourse.julialang.org/t/how-to-initialize-an-empty-staticarray/69942/11?u=milescranmer) and the max number of children would still be a compile time constant.
And calling children(node) could return a Vector according to the degree of the node. (Though we would still want special methods for each degree.)
Similarly, OperatorEnum could become a tuple of tuples, with one inner tuple per arity up to the maximum arity.
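A tiny sketch of that layout (the indexing convention is an assumption):

```julia
# Operators grouped by arity as a tuple of tuples: arity i lives at index i.
ops_by_arity = ((cos, sin),        # arity 1
                (+, *, -),         # arity 2
                (fma,))            # arity 3

# Look up an operator by (index, arity) and apply it.
apply(op_idx, arity, args...) = ops_by_arity[arity][op_idx](args...)
```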
@gca30 interested to hear your thoughts
Another package I wish I knew about a long time ago… https://docstringextensions.juliadocs.org/stable/
Right now, NaN and Inf checks are performed throughout the evaluation code. This is tied to SymbolicRegression.jl, but isn't needed in other cases, and it slows them down a bit. Thus, I think eval_tree_array and related functions should have a flag controlling whether NaN and Inf checks are performed.
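A sketch of what that flag could look like on a generic elementwise evaluator (the keyword name check_finite is hypothetical):

```julia
# Skip the finiteness guard entirely when the caller doesn't need it.
function eval_with_check(f, x::AbstractVector; check_finite::Bool=true)
    out = similar(x)
    @inbounds for i in eachindex(x)
        out[i] = f(x[i])
        if check_finite && !isfinite(out[i])
            return out, false    # early exit on NaN/Inf, as SymbolicRegression.jl expects
        end
    end
    return out, true
end
```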
I feel like it's a bit annoying that Float32 is the default, since then just typing 1.0 triggers a different type. It would be good to change this in the next version.
Hey! Thanks a lot for this, I really like the package! :)
It seems like an overflow in one of the samples causes the whole batch to be turned into NaNs:
using DynamicExpressions
T = Float64
x = Node{T}(feature=1)
ops = OperatorEnum(binary_operators=[*])
expr = x*2
julia> X = ones(1,2)
julia> expr(X, ops)
2-element Vector{Float64}:
2.0
2.0
julia> X[2] = floatmax(T)
julia> expr(X, ops)
2-element Vector{Float64}:
NaN
NaN
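For reference, floatmax overflows to Inf under this multiplication, and a per-sample mask (a sketch of alternative behavior, not what the package currently does) would keep the finite entries:

```julia
# Mask only the offending entries instead of NaN-ing the whole batch.
function mask_nonfinite!(y::AbstractVector)
    any_bad = false
    @inbounds for i in eachindex(y)
        if !isfinite(y[i])
            y[i] = NaN
            any_bad = true
        end
    end
    return y, !any_bad
end

# Second entry overflows to Inf; only it gets replaced.
y, ok = mask_nonfinite!([2.0, floatmax(Float64) * 2])
```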
Right now the simplification routines will break any shared subexpressions. This means if you have an expression:
inner = cos(x1) - 3.2
expression = exp(inner) + inner
where the same inner is used at each stage. If you go to simplify it, it will break this connection, and you will end up with:
expression = exp(cos(x1) - 3.2) + cos(x1) - 3.2
However, there is a way to get around this.
Whenever there is a shared node, it should be split into a system of equations. Then, each individual equation can be treated normally with simplification. At the end, the system of equations can be sewn together with the same structure as before, preserving shared variables.
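The first step, finding the shared nodes, can be sketched with object identity and an IdDict; the node type here is a toy stand-in:

```julia
# Toy mutable node; field names mirror the discussion above.
mutable struct GNode
    degree::Int
    l::Union{GNode,Nothing}
    r::Union{GNode,Nothing}
end

# Count how many paths reach each node object; nodes reached more
# than once are the shared subexpressions to split out.
function shared_nodes(tree::GNode)
    counts = IdDict{GNode,Int}()
    function visit(t::GNode)
        counts[t] = get(counts, t, 0) + 1
        counts[t] > 1 && return      # already visited this exact object
        t.degree >= 1 && visit(t.l)
        t.degree == 2 && visit(t.r)
        return nothing
    end
    visit(tree)
    return [t for (t, c) in counts if c > 1]
end

inner = GNode(1, GNode(0, nothing, nothing), nothing)  # plays the role of `inner`
expr = GNode(2, inner, inner)                          # both children alias it
```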
@AlCap23 in case of interest
Hey @MilesCranmer !
I've been thinking about possible extensions and wonder how you think about this:
1. Parametric Operations
In short, I want to be able to supply functions which hold tuneable constants, e.g. f(x, y, p) = x / (y + p), where p is a constant with a tuneable value. This could maybe be done with predefined patterns of nodes, but it might also be doable somewhat differently?
2. N-ary Operations
This one seems a little harder, but might be doable as well (major refactoring). My reasoning for allowing this is basically steering more into the field of program synthesis and allowing chunks of expressions to be parsed. A possible application might be the inference of systems of equations based on "building blocks" which contain typical patterns. I know that in theory this is also possible using just binary ops, but the chance of discovery might increase given the structural prior.
3. Arbitrary Expression Graphs
This plays along with 2., and reuse of patterns is key here. In many (natural) systems, recurrence of features is common (in classical mechanics this would be the sin(q) and cos(q) describing the translational movement of an angular joint). Within a classical binary tree, each expression needs to be sampled individually, while in a DAG (possibly "just" a topologically sorted DAG), the influence of a given node could extend beyond its direct parent.
I've made some prototypes using Symbolics / MTK for this, but it is rather slow (given that I need to build the function for each candidate).
Something of a parallel note: I've been working on some alternatives to GA/GP for solving symbolic regression, partly motivated by the latest trend of using RNNs to sample the graph, and also related to MINLP. If you're interested, we could have a chat about this :).
Cheers!
Edit: Uh, I just noticed that this might all be doable quite easily, given that you generate dispatches using the OperatorEnum! Nice.
i.e., is there any way to parse a string output by string_tree back into a valid DynamicExpressions expression?
Or is there any other convenient way to save a large number of dynamic expressions that can be loaded back for later use?
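In the meantime, one generic way to persist many expressions without string round-trips is the Serialization stdlib (the payload below is a stand-in; note the format is not guaranteed stable across Julia versions):

```julia
using Serialization

# Round-trip an arbitrary structure through a file.
exprs = [(tree = "x1 * cos(x2 - 3.2)", consts = [3.2])]  # stand-in payload
path = tempname()
serialize(path, exprs)
loaded = deserialize(path)
```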
Apparently the proper way to build in differentiability is to define a rule for ChainRulesCore.jl: https://github.com/JuliaDiff/ChainRulesCore.jl. Specifically, we would define an frule (forward) and an rrule (reverse). Then, evaluations inside DynamicExpressions.jl would be able to link in a "chain" in a larger AD pipeline, which might make a lot of other stuff easier.
Maybe if we do this it would be relatively easy to get even higher-order gradients in DynamicExpressions.jl?
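The shape of such a rule, sketched on a toy evaluator rather than the real eval_tree_array:

```julia
using ChainRulesCore

# `toy_eval` stands in for tree evaluation; a real rrule would compute
# pullbacks w.r.t. the tree's constants and the data X.
toy_eval(c, x) = c * x

function ChainRulesCore.rrule(::typeof(toy_eval), c, x)
    y = toy_eval(c, x)
    function toy_eval_pullback(ȳ)
        return NoTangent(), ȳ * x, ȳ * c   # (∂ w.r.t. function, c, x)
    end
    return y, toy_eval_pullback
end
```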
@kazewong check this out
Is there a way I can make the convenience functions more robust? Right now they are created with Base.MainInclude.eval(...) from inside DynamicExpressions.OperatorEnum(...). Is there a way I can get a basic @eval to work here (and still extend the user-defined operators)? Perhaps I need to simply export the functions (or non-implemented versions of the functions)?
@odow any ideas/tips?
In all the examples we define a variable operators but don't use it explicitly. Can we document it or make an explicit API?
It seems doable enough to add support for CUDA.jl and Metal.jl. To do this, it might be enough to create an eval_tree_array method for the corresponding array types.
Should be able to overload printing of constants, so that in SR we can print with reduced precision, and also print with units.