linjianma / autohoot
Automatic High-Order Optimization for Tensors
License: Apache License 2.0
NumPy's einsum path has multiple bugs. I plan to change everything to depend on opt_einsum rather than NumPy.
When generating the optimized contraction path, the current implementation generates random tensors and passes them to NumPy's einsum path as inputs. This is an issue for large-scale experiments on parallel systems, where NumPy creates the big tensors on each process, resulting in OOM.
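One way to avoid materializing random inputs is to build the path from shapes alone: opt_einsum's contract_path accepts bare shapes via shapes=True in recent versions. A minimal sketch, with illustrative shapes:
import opt_einsum as oe
# Compute the contraction path from shapes only; no tensors are allocated.
path, info = oe.contract_path('ab,bc,cd->ad', (100, 200), (200, 300), (300, 50), shapes=True)
print(info)  # contraction order, FLOP estimate, largest intermediate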
For the input graphs:
new_A = ad.einsum('ba,ca,da,ebc->ed',B,C,ad.tensorinv(ad.einsum('ab,ac,db,dc->bc',B,B,C,C), ind=1),input_tensor)
new_B = ad.einsum('ba,ca,da,bec->ed',A,C,ad.tensorinv(ad.einsum('ab,ac,db,dc->bc',A,A,C,C), ind=1),input_tensor)
new_C = ad.einsum('ba,ca,da,bce->ed',A,B,ad.tensorinv(ad.einsum('ab,ac,db,dc->bc',A,A,B,B), ind=1),input_tensor)
after we run
new_A, new_B, new_C = generate_sequential_optiaml_tree({
new_A: A,
new_B: B,
new_C: C
})
the resulting graphs will be:
new_A = ad.einsum('abe,da,ba->ed',ad.einsum('ebc,ca->abe',input_tensor,C),ad.tensorinv(ad.einsum('ab,ac,db,dc->bc',B,B,C,C), ind=1),B)
new_B = ad.einsum('abe,da,ba->ed',ad.einsum('bec,ca->abe',input_tensor,C),ad.tensorinv(ad.einsum('ab,ac,db,dc->bc',A,A,C,C), ind=1),A)
new_C = ad.einsum('ba,ca,da,bce->ed',A,B,ad.tensorinv(ad.einsum('ab,ac,db,dc->bc',A,A,B,B), ind=1),input_tensor)
Note that the new new_A and new_B expressions contain
ad.einsum('ebc,ca->abe',input_tensor,C)
and ad.einsum('bec,ca->abe',input_tensor,C)
which differ only by a transpose and therefore cannot be deduplicated further.
Some linalg operations, such as svd, live in different submodules in different libraries. For example, one can call T.svd for CTF, but needs to call T.linalg.svd for NumPy.
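A hypothetical dispatch wrapper sketching how the backend difference could be hidden (svd here names the wrapper, not an existing AutoHOOT function):
def svd(T, tensor):
    # CTF exposes svd at the top level; NumPy nests it under linalg.
    if hasattr(T, "svd"):
        return T.svd(tensor)
    return T.linalg.svd(tensor)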
Currently, the dedup function regards T.einsum('cb,ca->ab',B,B)
and T.einsum('cb,ca->ba',B,B)
as different expressions. However, both compute BᵀB, which is symmetric, so their results are identical and they should be treated as duplicates.
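A quick NumPy check of the claim that the two expressions agree:
import numpy as np
B = np.random.rand(4, 3)
lhs = np.einsum('cb,ca->ab', B, B)  # B^T B
rhs = np.einsum('cb,ca->ba', B, B)  # (B^T B)^T
assert np.allclose(lhs, rhs)  # equal because B^T B is symmetric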
Currently, einsum expression capacity is bottlenecked by the number of characters allowed. One easy fix is similar to what's done in opt_einsum (https://optimized-einsum.readthedocs.io/en/stable/_modules/opt_einsum/parser.html#get_symbol): allow unicode characters, and be careful when calling functions like numpy.einsum, since that einsum only allows plain letters. This will increase the number of allowed indices to at least 10^6.
I suggest we go in this direction, since I believe only minor refactoring is needed for this modification.
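For reference, opt_einsum's get_symbol (linked above) is essentially the following sketch, paraphrased from the linked source:
import string
_symbols = string.ascii_lowercase + string.ascii_uppercase
def get_symbol(i):
    # The usual 52 letters first, then unicode starting at chr(192).
    if i < 52:
        return _symbols[i]
    return chr(i + 140)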
The gradients of sum-like einsum operations, such as b = einsum("ij->", a), are not calculated correctly.
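For reference, the expected gradient of a full-sum einsum is the upstream scalar broadcast to the input's shape; a NumPy sketch of the reference behavior:
import numpy as np
a = np.random.rand(2, 3)
# b = einsum("ij->", a) sums all entries, so db/da is all ones;
# the gradient contribution from an upstream scalar g is g broadcast to a's shape.
g = 1.0
grad_a = g * np.ones_like(a)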
As title
This makes the test log hard to read.
We can use SymPy: https://docs.sympy.org/latest/tutorial/intro.html#the-power-of-symbolic-computation. It can do things like
simplify('x + x') = 2*x
simplify('x - (x - xxxxx)') = xxxxx
Though, we would need to regenerate the node from its name.
WDYT?
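A quick check of the SymPy behavior above (xxxxx is simply parsed as a distinct symbol):
import sympy as sp
print(sp.simplify('x + x'))            # 2*x
print(sp.simplify('x - (x - xxxxx)'))  # xxxxx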
When the expression is too long, the linearize function raises RecursionError: maximum recursion depth exceeded while calling a Python object.
The following example can reproduce the error:
import autodiff as ad
from graph_ops.graph_transformer import linearize
from examples.cpd import cpd_graph
A, B, C, input_tensor, loss, residual = cpd_graph(100, 100)
hessian = ad.hessian(loss, [A, B, C])
linearize(hessian[0][1])
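A temporary workaround (not a fix for the underlying deep recursion) is to raise Python's recursion limit before the call:
import sys
sys.setrecursionlimit(10000)  # default is about 1000
linearize(hessian[0][1])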
I get the following error:
(python_env) ➜ software python -c "import autohoot"
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/mnt/home/mfishman/software/AutoHOOT/autohoot/__init__.py", line 15, in <module>
from . import autodiff
File "/mnt/home/mfishman/software/AutoHOOT/autohoot/autodiff.py", line 20, in <module>
from autohoot.utils import find_topo_sort, sum_node_list, inner_product, find_topo_sort_p
File "/mnt/home/mfishman/software/AutoHOOT/autohoot/utils.py", line 236, in <module>
@attr.s(eq=False)
TypeError: attrs() got an unexpected keyword argument 'eq'
with attrs 19.1.0. Upgrading to attrs 21.2.0 fixes this. Maybe the version should be specified in requirements.txt?
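A possible pin, assuming the eq keyword to attr.s() first appeared in attrs 19.2 (worth verifying against the attrs changelog):
# requirements.txt
attrs>=19.2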
Our gradients function only supports the case where the output node represents a scalar. However, it is not easy to enforce this restriction in the current code to prevent wrong calculations, because our current computation graph doesn't contain dimensionality information.
Hi,
I am trying to compute the Hessian of xᵀ A x w.r.t. x, then optimize the computation graph, expecting the result to be A + Aᵀ. However, the call to optimize crashes with a ValueError (please see the MWE below). Interestingly, if I replace xᵀ A x by xᵀ A B x, the optimization works and yields the desired result A B + (A B)ᵀ.
Best,
Felix
from autohoot import autodiff as ad
from autohoot.graph_ops import graph_transformer
dim = 3
x = ad.Variable(name="x", shape=[dim])
A = ad.Variable(name="A", shape=[dim, dim])
B = ad.Variable(name="B", shape=[dim, dim])
# ✔ Compute the Hessian of `y = xᵀ A B x` w.r.t. `x`
y = ad.einsum("i,ij,jk,k->", x, A, B, x)
Hx_y = ad.hessian(y, [x])[0][0]
print(Hx_y)
# >>> (T.einsum('ac,cb->ab',T.identity(3),T.einsum('ab,bc->ca',A,B))+T.einsum('ac,cb->ab',T.identity(3),T.einsum('ab,bc->ac',A,B)))
# ✔ Optimize the graph to get `A B + (A B)ᵀ`
Hx_y_opt = graph_transformer.optimize(Hx_y)
print(Hx_y_opt)
# >>> (T.einsum('ab,bc->ca',A,B)+T.einsum('ab,bc->ac',A,B))
# ✔ Compute the Hessian of `z = xᵀ A x` w.r.t. `x`
z = ad.einsum("i,ij,j->", x, A, x)
Hx_z = ad.hessian(z, [x])[0][0]
print(Hx_z)
# >>> (the analogous expression involving only A)
# ❎ Optimize the graph to get `A + Aᵀ`
Hx_z_opt = graph_transformer.optimize(Hx_z)
print(Hx_z_opt)
# >>> ValueError: Output character 'd' did not appear in the input
Considering that all our future optimizations will mainly be based on einsum, we need to replace most of the matrix operations (mul, matmul, transpose, norm, sum) with einsum.
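For reference, the einsum equivalents of those operations are straightforward (NumPy notation, illustrative):
import numpy as np
A = np.random.rand(3, 4)
B = np.random.rand(4, 5)
x = np.random.rand(3)
np.einsum('ij,jk->ik', A, B)       # matmul: A @ B
np.einsum('ij,ij->ij', A, A)       # elementwise mul: A * A
np.einsum('ij->ji', A)             # transpose: A.T
np.einsum('ij->', A)               # sum: A.sum()
np.sqrt(np.einsum('i,i->', x, x))  # 2-norm: np.linalg.norm(x)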
A symmetry rule will benefit the optimization. For example, the gradient of xᵀHx is Hx + Hᵀx; if H is symmetric, the two terms can be collapsed into 2Hx.
Need to add optimization rules for orthogonal matrices, e.g. einsum("ab,cb->ac", A, A) = I if A has orthonormal rows.
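A NumPy check of the proposed rule:
import numpy as np
Q = np.linalg.qr(np.random.rand(4, 4))[0]  # orthogonal 4x4 matrix
A = Q[:3]                                  # three orthonormal rows
assert np.allclose(np.einsum('ab,cb->ac', A, A), np.eye(3))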
The first deliverable could be merging the optimizer/simplify API.