pymc-devs / pytensor

PyTensor allows you to define, optimize, and efficiently evaluate mathematical expressions involving multi-dimensional arrays.

Home Page: https://pytensor.readthedocs.io

License: Other

Shell 0.01% C++ 1.78% Python 96.19% C 1.66% CSS 0.02% Cython 0.33%
ai bayesian-inference computational-science deep-learning statistics

pytensor's People

Contributors

aalmah, abalkin, abergeron, affanv14, amrithasuresh, ballasn, brandonwillard, breuleux, caglar, carriepl, chienlima, dwf, gvtulder, harlouci, hengjean, jaberg, jlowin, khaotik, lamblin, michaelosthege, nicolasbouchard, notoraptor, nouiz, pascanur, reyhaneaskari, ricardov94, royxue, sentient07, slefrancois, turian


pytensor's Issues

`local_subtensor_merge` can complicate graphs

Description

The `local_subtensor_merge` rewrite often makes graphs worse instead of better:
https://github.com/pymc-devs/pytensor/blob/main/pytensor/tensor/rewriting/subtensor.py#L475

import pytensor.tensor as pt
import pytensor

x = pt.dvector("x")
y = x[1:-1][1:-1][1:-1]

# Raise the rewrite cap so the repeated subtensor merges don't trip the
# EquilibriumGraphRewriter's max-use-ratio safeguard while compiling this example.
pytensor.config.optdb__max_use_ratio = 20
func = pytensor.function([x], y)

# Before rewriting:
"""
Subtensor{int64:int64:} [id A]
 |Subtensor{int64:int64:} [id B]
 | |Subtensor{int64:int64:} [id C]
 | | |x [id D]
 | | |ScalarConstant{1} [id E]
 | | |ScalarConstant{-1} [id F]
 | |ScalarConstant{1} [id G]
 | |ScalarConstant{-1} [id H]
 |ScalarConstant{1} [id I]
 |ScalarConstant{-1} [id J]
"""

After:

DeepCopyOp [id A] 27
 |Subtensor{int64:int64:int8} [id B] 26
   |x [id C]
   |ScalarFromTensor [id D] 24
   | |Elemwise{Composite{Switch(i0, 0, minimum((i1 + i2), i3))}}[(0, 2)] [id E] 22
   |   |Elemwise{Composite{LE((i0 - i1), 0)}} [id F] 21
   |   | |Elemwise{Composite{Switch(LT(Composite{Switch(GE(i0, i1), i1, i0)}(Composite{Switch(LT(i0, i1), i2, i0)}(Composite{Switch(LT(i0, i1), i2, i0)}(i0, 0, i1), 0, -1), i2), 0), 0, Composite{Switch(GE(i0, i1), i1, i0)}(Composite{Switch(LT(i0, i1), i2, i0)}(Composite{Switch(LT(i0, i1), i2, i0)}(i0, 0, i1), 0, -1), i2))}}[(0, 0)] [id G] 19
   |   | | |Elemwise{Composite{Switch(i0, 0, minimum((i1 + i2), i3))}}[(0, 1)] [id H] 12
   |   | | | |Elemwise{Composite{LE((i0 - i1), 0)}} [id I] 10
   |   | | | | |Elemwise{Composite{Switch(LT(Composite{Switch(GE(i0, i1), i1, i0)}(Composite{Switch(LT(i0, i1), i2, i0)}((i0 - 1), 0, -1), i0), 0), 0, Composite{Switch(GE(i0, i1), i1, i0)}(Composite{Switch(LT(i0, i1), i2, i0)}((i0 - 1), 0, -1), i0))}} [id J] 8
   |   | | | | | |Elemwise{sub,no_inplace} [id K] 7
   |   | | | | |   |Elemwise{Composite{Switch(LT(Composite{Switch(GE(i0, i1), i1, i0)}(Composite{Switch(LT(i0, i1), i2, i0)}((i0 - 1), 0, -1), i0), 0), 0, Composite{Switch(GE(i0, i1), i1, i0)}(Composite{Switch(LT(i0, i1), i2, i0)}((i0 - 1), 0, -1), i0))}} [id L] 4
   |   | | | | |   | |Elemwise{sub,no_inplace} [id M] 3
   |   | | | | |   |   |Elemwise{Composite{Switch(LT(Composite{Switch(GE(i0, i1), i1, i0)}(Composite{Switch(LT(i0, i1), i2, i0)}((i0 - 1), 0, -1), i0), 0), 0, Composite{Switch(GE(i0, i1), i1, i0)}(Composite{Switch(LT(i0, i1), i2, i0)}((i0 - 1), 0, -1), i0))}} [id N] 1
   |   | | | | |   |   | |Shape_i{0} [id O] 0
   |   | | | | |   |   |   |x [id C]
   |   | | | | |   |   |Elemwise{Composite{Switch(LT(Composite{Switch(LT(i0, i1), i1, i0)}(Composite{Switch(GE(i0, i1), i1, i0)}(1, i0), 0), i1), Composite{Switch(LT(i0, i1), i1, i0)}(Composite{Switch(GE(i0, i1), i1, i0)}(1, i0), 0), i1)}}[(0, 0)] [id P] 2
   |   | | | | |   |     |Shape_i{0} [id O] 0
   |   | | | | |   |     |Elemwise{Composite{Switch(LT(Composite{Switch(GE(i0, i1), i1, i0)}(Composite{Switch(LT(i0, i1), i2, i0)}((i0 - 1), 0, -1), i0), 0), 0, Composite{Switch(GE(i0, i1), i1, i0)}(Composite{Switch(LT(i0, i1), i2, i0)}((i0 - 1), 0, -1), i0))}} [id N] 1
   |   | | | | |   |Elemwise{Composite{Switch(LT(Composite{Switch(LT(i0, i1), i1, i0)}(Composite{Switch(GE(i0, i1), i1, i0)}(1, i0), 0), i1), Composite{Switch(LT(i0, i1), i1, i0)}(Composite{Switch(GE(i0, i1), i1, i0)}(1, i0), 0), i1)}} [id Q] 6
   |   | | | | |     |Elemwise{sub,no_inplace} [id M] 3
   |   | | | | |     |Elemwise{Composite{Switch(LT(Composite{Switch(GE(i0, i1), i1, i0)}(Composite{Switch(LT(i0, i1), i2, i0)}((i0 - 1), 0, -1), i0), 0), 0, Composite{Switch(GE(i0, i1), i1, i0)}(Composite{Switch(LT(i0, i1), i2, i0)}((i0 - 1), 0, -1), i0))}} [id L] 4
   |   | | | | |Elemwise{Composite{Switch(LT(Composite{Switch(LT(i0, i1), i1, i0)}(Composite{Switch(GE(i0, i1), i1, i0)}(1, i0), 0), i1), Composite{Switch(LT(i0, i1), i1, i0)}(Composite{Switch(GE(i0, i1), i1, i0)}(1, i0), 0), i1)}}[(0, 0)] [id R] 9
   |   | | | |   |Elemwise{sub,no_inplace} [id K] 7
   |   | | | |   |Elemwise{Composite{Switch(LT(Composite{Switch(GE(i0, i1), i1, i0)}(Composite{Switch(LT(i0, i1), i2, i0)}((i0 - 1), 0, -1), i0), 0), 0, Composite{Switch(GE(i0, i1), i1, i0)}(Composite{Switch(LT(i0, i1), i2, i0)}((i0 - 1), 0, -1), i0))}} [id J] 8
   |   | | | |Elemwise{Composite{Switch(LT(Composite{Switch(LT(i0, i1), i1, i0)}(Composite{Switch(GE(i0, i1), i1, i0)}(1, i0), 0), i1), Composite{Switch(LT(i0, i1), i1, i0)}(Composite{Switch(GE(i0, i1), i1, i0)}(1, i0), 0), i1)}} [id Q] 6
   |   | | | |Elemwise{Composite{Switch(LT(Composite{Switch(GE(i0, i1), i1, i0)}(Composite{Switch(LT(i0, i1), i2, i0)}((i0 - 1), 0, -1), i0), 0), 0, Composite{Switch(GE(i0, i1), i1, i0)}(Composite{Switch(LT(i0, i1), i2, i0)}((i0 - 1), 0, -1), i0))}} [id J] 8
   |   | | | |Elemwise{Composite{Switch(LT(Composite{Switch(GE(i0, i1), i1, i0)}(Composite{Switch(LT(i0, i1), i2, i0)}((i0 - 1), 0, -1), i0), 0), 0, Composite{Switch(GE(i0, i1), i1, i0)}(Composite{Switch(LT(i0, i1), i2, i0)}((i0 - 1), 0, -1), i0))}} [id L] 4
   |   | | |TensorFromScalar [id S] 18
   |   | | | |add [id T] 16
   |   | | |   |ScalarFromTensor [id U] 14
   |   | | |   | |Elemwise{Composite{Switch(i0, 0, minimum((i1 + i2), i3))}}[(0, 1)] [id H] 12
   |   | | |   |ScalarFromTensor [id V] 5
   |   | | |     |Elemwise{sub,no_inplace} [id M] 3
   |   | | |Elemwise{sub,no_inplace} [id M] 3
   |   | |Elemwise{Composite{Switch(LT(Composite{Switch(LT(i0, i1), i1, i0)}(Composite{Switch(GE(i0, i1), i1, i0)}(Composite{Switch(LT(i0, i1), i1, i0)}(Composite{Switch(LT(i0, i1), i2, i0)}(i0, 0, i1), 0), i2), 0), i3), Composite{Switch(LT(i0, i1), i1, i0)}(Composite{Switch(GE(i0, i1), i1, i0)}(Composite{Switch(LT(i0, i1), i1, i0)}(Composite{Switch(LT(i0, i1), i2, i0)}(i0, 0, i1), 0), i2), 0), i3)}}[(0, 0)] [id W] 20
   |   |   |Elemwise{Composite{Switch(i0, 0, minimum((i1 + i2), i3))}}[(0, 2)] [id X] 11
   |   |   | |Elemwise{Composite{LE((i0 - i1), 0)}} [id I] 10
   |   |   | |Elemwise{Composite{Switch(LT(Composite{Switch(LT(i0, i1), i1, i0)}(Composite{Switch(GE(i0, i1), i1, i0)}(1, i0), 0), i1), Composite{Switch(LT(i0, i1), i1, i0)}(Composite{Switch(GE(i0, i1), i1, i0)}(1, i0), 0), i1)}} [id Q] 6
   |   |   | |Elemwise{Composite{Switch(LT(Composite{Switch(LT(i0, i1), i1, i0)}(Composite{Switch(GE(i0, i1), i1, i0)}(1, i0), 0), i1), Composite{Switch(LT(i0, i1), i1, i0)}(Composite{Switch(GE(i0, i1), i1, i0)}(1, i0), 0), i1)}}[(0, 0)] [id R] 9
   |   |   | |Elemwise{Composite{Switch(LT(Composite{Switch(GE(i0, i1), i1, i0)}(Composite{Switch(LT(i0, i1), i2, i0)}((i0 - 1), 0, -1), i0), 0), 0, Composite{Switch(GE(i0, i1), i1, i0)}(Composite{Switch(LT(i0, i1), i2, i0)}((i0 - 1), 0, -1), i0))}} [id L] 4
   |   |   |TensorFromScalar [id Y] 17
   |   |   | |add [id Z] 15
   |   |   |   |ScalarFromTensor [id BA] 13
   |   |   |   | |Elemwise{Composite{Switch(i0, 0, minimum((i1 + i2), i3))}}[(0, 2)] [id X] 11
   |   |   |   |ScalarFromTensor [id V] 5
   |   |   |Elemwise{sub,no_inplace} [id M] 3
   |   |   |Elemwise{Composite{Switch(LT(Composite{Switch(GE(i0, i1), i1, i0)}(Composite{Switch(LT(i0, i1), i2, i0)}(Composite{Switch(LT(i0, i1), i2, i0)}(i0, 0, i1), 0, -1), i2), 0), 0, Composite{Switch(GE(i0, i1), i1, i0)}(Composite{Switch(LT(i0, i1), i2, i0)}(Composite{Switch(LT(i0, i1), i2, i0)}(i0, 0, i1), 0, -1), i2))}}[(0, 0)] [id G] 19
   |   |Elemwise{Composite{Switch(LT(Composite{Switch(LT(i0, i1), i1, i0)}(Composite{Switch(GE(i0, i1), i1, i0)}(1, i0), 0), i1), Composite{Switch(LT(i0, i1), i1, i0)}(Composite{Switch(GE(i0, i1), i1, i0)}(1, i0), 0), i1)}}[(0, 0)] [id P] 2
   |   |Elemwise{Composite{Switch(LT(Composite{Switch(LT(i0, i1), i1, i0)}(Composite{Switch(GE(i0, i1), i1, i0)}(Composite{Switch(LT(i0, i1), i1, i0)}(Composite{Switch(LT(i0, i1), i2, i0)}(i0, 0, i1), 0), i2), 0), i3), Composite{Switch(LT(i0, i1), i1, i0)}(Composite{Switch(GE(i0, i1), i1, i0)}(Composite{Switch(LT(i0, i1), i1, i0)}(Composite{Switch(LT(i0, i1), i2, i0)}(i0, 0, i1), 0), i2), 0), i3)}}[(0, 0)] [id W] 20
   |   |Elemwise{Composite{Switch(LT(Composite{Switch(GE(i0, i1), i1, i0)}(Composite{Switch(LT(i0, i1), i2, i0)}((i0 - 1), 0, -1), i0), 0), 0, Composite{Switch(GE(i0, i1), i1, i0)}(Composite{Switch(LT(i0, i1), i2, i0)}((i0 - 1), 0, -1), i0))}} [id N] 1
   |ScalarFromTensor [id BB] 25
   | |Elemwise{Composite{Switch(i0, 0, minimum((i1 + i2), i3))}}[(0, 1)] [id BC] 23
   |   |Elemwise{Composite{LE((i0 - i1), 0)}} [id F] 21
   |   |Elemwise{Composite{Switch(LT(Composite{Switch(LT(i0, i1), i1, i0)}(Composite{Switch(GE(i0, i1), i1, i0)}(1, i0), 0), i1), Composite{Switch(LT(i0, i1), i1, i0)}(Composite{Switch(GE(i0, i1), i1, i0)}(1, i0), 0), i1)}}[(0, 0)] [id P] 2
   |   |Elemwise{Composite{Switch(LT(Composite{Switch(GE(i0, i1), i1, i0)}(Composite{Switch(LT(i0, i1), i2, i0)}(Composite{Switch(LT(i0, i1), i2, i0)}(i0, 0, i1), 0, -1), i2), 0), 0, Composite{Switch(GE(i0, i1), i1, i0)}(Composite{Switch(LT(i0, i1), i2, i0)}(Composite{Switch(LT(i0, i1), i2, i0)}(i0, 0, i1), 0, -1), i2))}}[(0, 0)] [id G] 19
   |   |Elemwise{Composite{Switch(LT(Composite{Switch(GE(i0, i1), i1, i0)}(Composite{Switch(LT(i0, i1), i2, i0)}((i0 - 1), 0, -1), i0), 0), 0, Composite{Switch(GE(i0, i1), i1, i0)}(Composite{Switch(LT(i0, i1), i2, i0)}((i0 - 1), 0, -1), i0))}} [id N] 1
   |ScalarConstant{1} [id BD]

I think this rewrite might be fine in some special cases with known shapes/indices, but in general I don't see why we would apply it.
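For comparison, the smaller pre-rewrite graph can be recovered by compiling with this rewrite excluded. A minimal sketch continuing the example above, assuming the rewrite is registered under its own name as a tag:

from pytensor.compile.mode import get_default_mode

# Exclude the merge rewrite (assumed to be registered under this tag) and recompile.
mode = get_default_mode().excluding("local_subtensor_merge")
func_no_merge = pytensor.function([x], y, mode=mode)
pytensor.dprint(func_no_merge)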

Optimize `Sum`s of `MakeVector`s and `Join`s

Please describe the purpose of filing this issue

Not a drastic improvement by any means, but something we can keep in mind:

reduce(at.concatenate(*tensors)) -> reduce(reduce(tensor) for tensor in tensors)

Ignoring any axis complexities

import pytensor
import pytensor.tensor as pt
import numpy as np

x = pt.vector("x")
y = pt.vector("y")

f1 = pytensor.function([x, y], pt.sum(pt.concatenate((x, y))))
f2 = pytensor.function([x, y], pt.sum((pt.sum(x), pt.sum(y))))
f3 = pytensor.function([x, y], pt.add(pt.sum(x), pt.sum(y)))

pytensor.dprint(f1)
print()
pytensor.dprint(f2)
print()
pytensor.dprint(f3)

x_val = np.random.rand(100_000)
y_val = np.random.rand(200_000)

%timeit f1(x_val, y_val)
%timeit f2(x_val, y_val)
%timeit f3(x_val, y_val)
Sum{acc_dtype=float64} [id A] ''   1
 |Join [id B] ''   0
   |TensorConstant{0} [id C]
   |x [id D]
   |y [id E]

Sum{acc_dtype=float64} [id A] ''   3
 |MakeVector{dtype='float64'} [id B] ''   2
   |Sum{acc_dtype=float64} [id C] ''   1
   | |x [id D]
   |Sum{acc_dtype=float64} [id E] ''   0
     |y [id F]

Elemwise{Add}[(0, 0)] [id A] ''   2
 |Sum{acc_dtype=float64} [id B] ''   1
 | |x [id C]
 |Sum{acc_dtype=float64} [id D] ''   0
   |y [id E]
544 µs ± 27.5 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
270 µs ± 5.11 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
270 µs ± 8.86 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
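A minimal sketch of how such a rewrite could be written (my own illustration, not an existing PyTensor rewrite; it assumes the standard `node_rewriter` machinery and only handles full reductions of a `Join`):

import pytensor.tensor as pt
from pytensor.graph.rewriting.basic import node_rewriter
from pytensor.tensor.basic import Join
from pytensor.tensor.math import Sum

@node_rewriter([Sum])
def local_sum_of_join(fgraph, node):
    # Only handle a full reduction (axis=None) of a Join, for simplicity.
    if node.op.axis is not None:
        return None
    [joined] = node.inputs
    if not (joined.owner and isinstance(joined.owner.op, Join)):
        return None
    _join_axis, *tensors = joined.owner.inputs
    if len(tensors) < 2:
        return None
    # sum(join(t1, t2, ...)) -> sum of the per-tensor sums.
    return [pt.add(*[t.sum() for t in tensors])]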

`integers` and `randint` raise TypeError if rng is not provided

We should correct these methods to create the right type of RNG when one is not provided by the user (that's what `super().make_node()` would do anyway):

def make_node(self, rng, *args, **kwargs):
    if not isinstance(
        getattr(rng, "type", None), (RandomStateType, RandomStateSharedVariable)
    ):
        raise TypeError("`randint` is only available for `RandomStateType`s")
    return super().make_node(rng, *args, **kwargs)

def make_node(self, rng, *args, **kwargs):
    if not isinstance(
        getattr(rng, "type", None),
        (RandomGeneratorType, RandomGeneratorSharedVariable),
    ):
        raise TypeError("`integers` is only available for `RandomGeneratorType`s")
    return super().make_node(rng, *args, **kwargs)
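A minimal sketch of the kind of fix being proposed, in the same fragment form as the snippets above (the default-RNG creation is my assumption, not the actual implementation):

import numpy as np
import pytensor

def make_node(self, rng, *args, **kwargs):
    if rng is None:
        # Assumed fix: create a Generator-backed shared variable instead of raising
        # (the issue notes super().make_node() would do this anyway).
        rng = pytensor.shared(np.random.default_rng())
    elif not isinstance(
        getattr(rng, "type", None),
        (RandomGeneratorType, RandomGeneratorSharedVariable),
    ):
        raise TypeError("`integers` is only available for `RandomGeneratorType`s")
    return super().make_node(rng, *args, **kwargs)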

Scan requires random variables to be referenced as non sequences

Works:

import pytensor
import pytensor.tensor as at
import numpy as np

data = at.constant(np.random.randn(64))
srng = pytensor.tensor.random.RandomStream()
index = srng.integers(64, size=10)
datai = data[index]
var = at.vector("var")
scan = pytensor.scan(
    lambda v, *_: ((datai-v)**2).sum(), 
    sequences=var, non_sequences=[index], 
    strict=True
)
print(scan[0].eval({var: np.array([1., 2.])}))
print(pytensor.grad(scan[0].sum(), var).eval({var: [1, 1]}))

Raises an error that is somewhere between uninformative and fairly informative:

import pytensor
import pytensor.tensor as at
import numpy as np

data = at.constant(np.random.randn(64))
srng = pytensor.tensor.random.RandomStream()
index = srng.integers(64, size=10)
datai = data[index]
var = at.vector("var")
scan = pytensor.scan(
    lambda v: ((datai-v)**2).sum(), 
    sequences=var, 
    strict=True
)
print(scan[0].eval({var: np.array([1., 2.])}))
print(pytensor.grad(scan[0].sum(), var).eval({var: [1, 1]}))
---------------------------------------------------------------------------
MissingInputError                         Traceback (most recent call last)
Cell In [129], line 10
      8 datai = data[index]
      9 var = at.vector("var")
---> 10 scan = aesara.scan(
     11     lambda v, *_: ((datai-v)**2).sum(), 
     12     sequences=var, #non_sequences=[index],
     13     strict=True
     14 )
     15 print(scan[0].eval({var: np.array([1., 2.])}))
     16 print(aesara.grad(scan[0].sum(), var).eval({var: [1, 1]}))

File ~/micromamba/envs/bayes/lib/python3.9/site-packages/aesara/scan/basic.py:1140, in scan(fn, sequences, outputs_info, non_sequences, n_steps, truncate_gradient, go_backwards, mode, name, profile, allow_gc, strict, return_list)
   1126     allow_gc = config.scan__allow_gc
   1128 info = ScanInfo(
   1129     n_seqs=n_seqs,
   1130     mit_mot_in_slices=(),
   (...)
   1137     as_while=as_while,
   1138 )
-> 1140 local_op = Scan(
   1141     inner_inputs,
   1142     new_outs,
   1143     info,
   1144     mode=mode,
   1145     truncate_gradient=truncate_gradient,
   1146     name=name,
   1147     profile=profile,
   1148     allow_gc=allow_gc,
   1149     strict=strict,
   1150 )
   1152 ##
   1153 # Step 8. Compute the outputs using the scan op
   1154 ##
   1155 _scan_inputs = (
   1156     scan_seqs
   1157     + mit_mot_scan_inputs
   (...)
   1163     + other_scan_args
   1164 )

File ~/micromamba/envs/bayes/lib/python3.9/site-packages/aesara/scan/op.py:859, in Scan.__init__(self, inputs, outputs, info, mode, typeConstructor, truncate_gradient, name, as_while, profile, allow_gc, strict)
    856 self.n_outer_inputs = info.n_outer_inputs
    857 self.n_outer_outputs = info.n_outer_outputs
--> 859 self.fgraph = FunctionGraph(inputs, outputs, clone=False)
    861 _ = self.prepare_fgraph(self.fgraph)
    863 if any(node.op.destroy_map for node in self.fgraph.apply_nodes):

File ~/micromamba/envs/bayes/lib/python3.9/site-packages/aesara/graph/fg.py:153, in FunctionGraph.__init__(self, inputs, outputs, features, clone, update_mapping, **clone_kwds)
    150     self.add_input(in_var, check=False)
    152 for output in outputs:
--> 153     self.add_output(output, reason="init")
    155 self.profile = None
    156 self.update_mapping = update_mapping

File ~/micromamba/envs/bayes/lib/python3.9/site-packages/aesara/graph/fg.py:163, in FunctionGraph.add_output(self, var, reason, import_missing)
    161 """Add a new variable as an output to this `FunctionGraph`."""
    162 self.outputs.append(var)
--> 163 self.import_var(var, reason=reason, import_missing=import_missing)
    164 self.clients[var].append(("output", len(self.outputs) - 1))

File ~/micromamba/envs/bayes/lib/python3.9/site-packages/aesara/graph/fg.py:304, in FunctionGraph.import_var(self, var, reason, import_missing)
    302 # Imports the owners of the variables
    303 if var.owner and var.owner not in self.apply_nodes:
--> 304     self.import_node(var.owner, reason=reason, import_missing=import_missing)
    305 elif (
    306     var.owner is None
    307     and not isinstance(var, AtomicVariable)
    308     and var not in self.inputs
    309 ):
    310     from aesara.graph.null_type import NullType

File ~/micromamba/envs/bayes/lib/python3.9/site-packages/aesara/graph/fg.py:369, in FunctionGraph.import_node(self, apply_node, check, reason, import_missing)
    360                 else:
    361                     error_msg = (
    362                         f"Input {node.inputs.index(var)} ({var})"
    363                         " of the graph (indices start "
   (...)
    367                         "for more information on this error."
    368                     )
--> 369                     raise MissingInputError(error_msg, variable=var)
    371 for node in new_nodes:
    372     assert node not in self.apply_nodes

MissingInputError: Input 0 (RandomGeneratorSharedVariable(<Generator(PCG64) at 0x7FCBAFBA03C0>)) of the graph (indices start from 0), used to compute integers_rv{0, (0, 0), int64, False}(RandomGeneratorSharedVariable(<Generator(PCG64) at 0x7FCBAFBA03C0>), TensorConstant{(1,) of 10}, TensorConstant{4}, TensorConstant{0}, TensorConstant{64}), was not provided and not given a value. Use the Aesara flag exception_verbosity='high', for more information on this error.
 
Backtrace when that variable is created:

  File "/home/mkochurov/micromamba/envs/bayes/lib/python3.9/site-packages/ipykernel/zmqshell.py", line 528, in run_cell
    return super().run_cell(*args, **kwargs)
  File "/home/mkochurov/micromamba/envs/bayes/lib/python3.9/site-packages/IPython/core/interactiveshell.py", line 2940, in run_cell
    result = self._run_cell(
  File "/home/mkochurov/micromamba/envs/bayes/lib/python3.9/site-packages/IPython/core/interactiveshell.py", line 2995, in _run_cell
    return runner(coro)
  File "/home/mkochurov/micromamba/envs/bayes/lib/python3.9/site-packages/IPython/core/async_helpers.py", line 129, in _pseudo_sync_runner
    coro.send(None)
  File "/home/mkochurov/micromamba/envs/bayes/lib/python3.9/site-packages/IPython/core/interactiveshell.py", line 3194, in run_cell_async
    has_raised = await self.run_ast_nodes(code_ast.body, cell_name,
  File "/home/mkochurov/micromamba/envs/bayes/lib/python3.9/site-packages/IPython/core/interactiveshell.py", line 3373, in run_ast_nodes
    if await self.run_code(code, result, async_=asy):
  File "/home/mkochurov/micromamba/envs/bayes/lib/python3.9/site-packages/IPython/core/interactiveshell.py", line 3433, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "/tmp/ipykernel_441370/4148489949.py", line 7, in <module>
    index = srng.integers(64, size=10)

Scan silently passes, and `pytensor.grad` fails miserably later, if `strict=False`:

import pytensor
import pytensor.tensor as at
import numpy as np

data = at.constant(np.random.randn(64))
srng = pytensor.tensor.random.RandomStream()
index = srng.integers(64, size=10)
datai = data[index]
var = at.vector("var")
scan = pytensor.scan(
    lambda v: ((datai-v)**2).sum(), 
    sequences=var,
    strict=False
)
print(scan[0].eval({var: np.array([1., 2.])}))
print(pytensor.grad(scan[0].sum(), var).eval({var: [1, 1]}))
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In [130], line 16
     10 scan = aesara.scan(
     11     lambda v, *_: ((datai-v)**2).sum(), 
     12     sequences=var, #non_sequences=[index],
     13     strict=False
     14 )
     15 print(scan[0].eval({var: np.array([1., 2.])}))
---> 16 print(aesara.grad(scan[0].sum(), var).eval({var: [1, 1]}))

File ~/micromamba/envs/bayes/lib/python3.9/site-packages/aesara/gradient.py:623, in grad(cost, wrt, consider_constant, disconnected_inputs, add_names, known_grads, return_disconnected, null_gradients)
    620     if hasattr(g.type, "dtype"):
    621         assert g.type.dtype in aesara.tensor.type.float_dtypes
--> 623 _rval: Sequence[Variable] = _populate_grad_dict(
    624     var_to_app_to_idx, grad_dict, _wrt, cost_name
    625 )
    627 rval: MutableSequence[Optional[Variable]] = list(_rval)
    629 for i in range(len(_rval)):

File ~/micromamba/envs/bayes/lib/python3.9/site-packages/aesara/gradient.py:1434, in _populate_grad_dict(var_to_app_to_idx, grad_dict, wrt, cost_name)
   1431     # end if cache miss
   1432     return grad_dict[var]
-> 1434 rval = [access_grad_cache(elem) for elem in wrt]
   1436 return rval

File ~/micromamba/envs/bayes/lib/python3.9/site-packages/aesara/gradient.py:1434, in <listcomp>(.0)
   1431     # end if cache miss
   1432     return grad_dict[var]
-> 1434 rval = [access_grad_cache(elem) for elem in wrt]
   1436 return rval

File ~/micromamba/envs/bayes/lib/python3.9/site-packages/aesara/gradient.py:1387, in _populate_grad_dict.<locals>.access_grad_cache(var)
   1384 for node in node_to_idx:
   1385     for idx in node_to_idx[node]:
-> 1387         term = access_term_cache(node)[idx]
   1389         if not isinstance(term, Variable):
   1390             raise TypeError(
   1391                 f"{node.op}.grad returned {type(term)}, expected"
   1392                 " Variable instance."
   1393             )

File ~/micromamba/envs/bayes/lib/python3.9/site-packages/aesara/gradient.py:1058, in _populate_grad_dict.<locals>.access_term_cache(node)
   1054 if node not in term_dict:
   1056     inputs = node.inputs
-> 1058     output_grads = [access_grad_cache(var) for var in node.outputs]
   1060     # list of bools indicating if each output is connected to the cost
   1061     outputs_connected = [
   1062         not isinstance(g.type, DisconnectedType) for g in output_grads
   1063     ]

File ~/micromamba/envs/bayes/lib/python3.9/site-packages/aesara/gradient.py:1058, in <listcomp>(.0)
   1054 if node not in term_dict:
   1056     inputs = node.inputs
-> 1058     output_grads = [access_grad_cache(var) for var in node.outputs]
   1060     # list of bools indicating if each output is connected to the cost
   1061     outputs_connected = [
   1062         not isinstance(g.type, DisconnectedType) for g in output_grads
   1063     ]

File ~/micromamba/envs/bayes/lib/python3.9/site-packages/aesara/gradient.py:1387, in _populate_grad_dict.<locals>.access_grad_cache(var)
   1384 for node in node_to_idx:
   1385     for idx in node_to_idx[node]:
-> 1387         term = access_term_cache(node)[idx]
   1389         if not isinstance(term, Variable):
   1390             raise TypeError(
   1391                 f"{node.op}.grad returned {type(term)}, expected"
   1392                 " Variable instance."
   1393             )

File ~/micromamba/envs/bayes/lib/python3.9/site-packages/aesara/gradient.py:1058, in _populate_grad_dict.<locals>.access_term_cache(node)
   1054 if node not in term_dict:
   1056     inputs = node.inputs
-> 1058     output_grads = [access_grad_cache(var) for var in node.outputs]
   1060     # list of bools indicating if each output is connected to the cost
   1061     outputs_connected = [
   1062         not isinstance(g.type, DisconnectedType) for g in output_grads
   1063     ]

File ~/micromamba/envs/bayes/lib/python3.9/site-packages/aesara/gradient.py:1058, in <listcomp>(.0)
   1054 if node not in term_dict:
   1056     inputs = node.inputs
-> 1058     output_grads = [access_grad_cache(var) for var in node.outputs]
   1060     # list of bools indicating if each output is connected to the cost
   1061     outputs_connected = [
   1062         not isinstance(g.type, DisconnectedType) for g in output_grads
   1063     ]

File ~/micromamba/envs/bayes/lib/python3.9/site-packages/aesara/gradient.py:1387, in _populate_grad_dict.<locals>.access_grad_cache(var)
   1384 for node in node_to_idx:
   1385     for idx in node_to_idx[node]:
-> 1387         term = access_term_cache(node)[idx]
   1389         if not isinstance(term, Variable):
   1390             raise TypeError(
   1391                 f"{node.op}.grad returned {type(term)}, expected"
   1392                 " Variable instance."
   1393             )

File ~/micromamba/envs/bayes/lib/python3.9/site-packages/aesara/gradient.py:1213, in _populate_grad_dict.<locals>.access_term_cache(node)
   1205         if o_shape != g_shape:
   1206             raise ValueError(
   1207                 "Got a gradient of shape "
   1208                 + str(o_shape)
   1209                 + " on an output of shape "
   1210                 + str(g_shape)
   1211             )
-> 1213 input_grads = node.op.L_op(inputs, node.outputs, new_output_grads)
   1215 if input_grads is None:
   1216     raise TypeError(
   1217         f"{node.op}.grad returned NoneType, expected iterable."
   1218     )

File ~/micromamba/envs/bayes/lib/python3.9/site-packages/aesara/scan/op.py:2613, in Scan.L_op(self, inputs, outs, dC_douts)
   2611 for dx in range(len(dC_dinps_t)):
   2612     if not dC_dinps_t[dx]:
-> 2613         dC_dinps_t[dx] = at.zeros_like(diff_inputs[dx])
   2614     else:
   2615         disconnected_dC_dinps_t[dx] = False

File ~/micromamba/envs/bayes/lib/python3.9/site-packages/aesara/tensor/basic.py:798, in zeros_like(model, dtype, opt)
    782 def zeros_like(model, dtype=None, opt=False):
    783     """equivalent of numpy.zeros_like
    784     Parameters
    785     ----------
   (...)
    795         tensor the shape of model containing zeros of the type of dtype.
    796     """
--> 798     _model = as_tensor_variable(model)
    800     if dtype is None:
    801         dtype = _model.type.dtype

File ~/micromamba/envs/bayes/lib/python3.9/site-packages/aesara/tensor/__init__.py:49, in as_tensor_variable(x, name, ndim, **kwargs)
     17 def as_tensor_variable(
     18     x: TensorLike, name: Optional[str] = None, ndim: Optional[int] = None, **kwargs
     19 ) -> "TensorVariable":
     20     """Convert `x` into an equivalent `TensorVariable`.
     21 
     22     This function can be used to turn ndarrays, numbers, `ScalarType` instances,
   (...)
     47 
     48     """
---> 49     return _as_tensor_variable(x, name, ndim, **kwargs)

File ~/micromamba/envs/bayes/lib/python3.9/functools.py:888, in singledispatch.<locals>.wrapper(*args, **kw)
    884 if not args:
    885     raise TypeError(f'{funcname} requires at least '
    886                     '1 positional argument')
--> 888 return dispatch(args[0].__class__)(*args, **kw)

File ~/micromamba/envs/bayes/lib/python3.9/site-packages/aesara/tensor/basic.py:100, in _as_tensor_Variable(x, name, ndim, **kwargs)
     97 @_as_tensor_variable.register(Variable)
     98 def _as_tensor_Variable(x, name, ndim, **kwargs):
     99     if not isinstance(x.type, TensorType):
--> 100         raise TypeError(
    101             f"Tensor type field must be a TensorType; found {type(x.type)}."
    102         )
    104     if ndim is None:
    105         return x

TypeError: Tensor type field must be a TensorType; found <class 'aesara.tensor.random.type.RandomGeneratorType'>.

Versions and main components

  • Aesara version: '2.8.7'
  • Python version: '3.9.13 | packaged by conda-forge | (main, May 27 2022, 16:58:50) [GCC 10.3.0]'
  • Operating system: Ubuntu
  • How did you install Aesara: (conda/pip) conda
pytensor config:
floatX ({'float64', 'float16', 'float32'}) 
    Doc:  Default floating-point precision for python casts.

Note: float16 support is experimental, use at your own risk.
    Value:  float64

warn_float64 ({'ignore', 'pdb', 'warn', 'raise'}) 
    Doc:  Do an action when a tensor variable with float64 dtype is created.
    Value:  ignore

pickle_test_value (<bound method BoolParam._apply of <aesara.configparser.BoolParam object at 0x7fcc2c19ac10>>) 
    Doc:  Dump test values while pickling model. If True, test values will be dumped with model.
    Value:  True

cast_policy ({'custom', 'numpy+floatX'}) 
    Doc:  Rules for implicit type casting
    Value:  custom

deterministic ({'more', 'default'}) 
    Doc:  If `more`, sometimes we will select some implementation that are more deterministic, but slower.  Also see the dnn.conv.algo* flags to cover more cases.
    Value:  default

device (cpu)
    Doc:  Default device for computations. only cpu is supported for now
    Value:  cpu

force_device (<bound method BoolParam._apply of <aesara.configparser.BoolParam object at 0x7fcc2c19a610>>) 
    Doc:  Raise an error if we can't use the specified device
    Value:  False

conv__assert_shape (<bound method BoolParam._apply of <aesara.configparser.BoolParam object at 0x7fcc2c19a5b0>>) 
    Doc:  If True, AbstractConv* ops will verify that user-provided shapes match the runtime shapes (debugging option, may slow down compilation)
    Value:  False

print_global_stats (<bound method BoolParam._apply of <aesara.configparser.BoolParam object at 0x7fcbc5baa670>>) 
    Doc:  Print some global statistics (time spent) at the end
    Value:  False

assert_no_cpu_op ({'ignore', 'pdb', 'warn', 'raise'}) 
    Doc:  Raise an error/warning if there is a CPU op in the computational graph.
    Value:  ignore

unpickle_function (<bound method BoolParam._apply of <aesara.configparser.BoolParam object at 0x7fcbc5baa850>>) 
    Doc:  Replace unpickled Aesara functions with None. This is useful to unpickle old graphs that pickled them when it shouldn't
    Value:  True

<aesara.configparser.ConfigParam object at 0x7fcbc5baa8b0>
    Doc:  Default compilation mode
    Value:  Mode

cxx (<class 'str'>) 
    Doc:  The C++ compiler to use. Currently only g++ is supported, but supporting additional compilers should not be too difficult. If it is empty, no C++ code is compiled.
    Value:  /usr/bin/g++

linker ({'vm_nogc', 'c|py_nogc', 'c', 'py', 'c|py', 'vm', 'cvm_nogc', 'cvm'}) 
    Doc:  Default linker used if the aesara flags mode is Mode
    Value:  cvm

allow_gc (<bound method BoolParam._apply of <aesara.configparser.BoolParam object at 0x7fcbc5baaa30>>) 
    Doc:  Do we default to delete intermediate results during Aesara function calls? Doing so lowers the memory requirement, but asks that we reallocate memory at the next function call. This is implemented for the default linker, but may not work for all linkers.
    Value:  True

optimizer ({'o2', 'o1', 'o3', 'None', 'fast_compile', 'o4', 'fast_run', 'unsafe', 'merge'}) 
    Doc:  Default optimizer. If not None, will use this optimizer with the Mode
    Value:  o4

optimizer_verbose (<bound method BoolParam._apply of <aesara.configparser.BoolParam object at 0x7fcbc5b41550>>) 
    Doc:  If True, we print all optimization being applied
    Value:  False

on_opt_error ({'ignore', 'pdb', 'warn', 'raise'}) 
    Doc:  What to do when an optimization crashes: warn and skip it, raise the exception, or fall into the pdb debugger.
    Value:  warn

nocleanup (<bound method BoolParam._apply of <aesara.configparser.BoolParam object at 0x7fcbc5b415b0>>) 
    Doc:  Suppress the deletion of code files that did not compile cleanly
    Value:  False

on_unused_input ({'ignore', 'warn', 'raise'}) 
    Doc:  What to do if a variable in the 'inputs' list of  aesara.function() is not used in the graph.
    Value:  raise

gcc__cxxflags (<class 'str'>) 
    Doc:  Extra compiler flags for gcc
    Value:   -Wno-c++11-narrowing -fno-exceptions -fno-unwind-tables -fno-asynchronous-unwind-tables

cmodule__warn_no_version (<bound method BoolParam._apply of <aesara.configparser.BoolParam object at 0x7fcbc5b41640>>) 
    Doc:  If True, will print a warning when compiling one or more Op with C code that can't be cached because there is no c_code_cache_version() function associated to at least one of those Ops.
    Value:  False

cmodule__remove_gxx_opt (<bound method BoolParam._apply of <aesara.configparser.BoolParam object at 0x7fcbc5b41400>>) 
    Doc:  If True, will remove the -O* parameter passed to g++.This is useful to debug in gdb modules compiled by Aesara.The parameter -g is passed by default to g++
    Value:  False

cmodule__compilation_warning (<bound method BoolParam._apply of <aesara.configparser.BoolParam object at 0x7fcbc5b41670>>) 
    Doc:  If True, will print compilation warnings.
    Value:  False

cmodule__preload_cache (<bound method BoolParam._apply of <aesara.configparser.BoolParam object at 0x7fcbc5b416a0>>) 
    Doc:  If set to True, will preload the C module cache at import time
    Value:  False

cmodule__age_thresh_use (<class 'int'>) 
    Doc:  In seconds. The time after which Aesara won't reuse a compile c module.
    Value:  2073600

cmodule__debug (<bound method BoolParam._apply of <aesara.configparser.BoolParam object at 0x7fcbc5b41700>>) 
    Doc:  If True, define a DEBUG macro (if not exists) for any compiled C code.
    Value:  False

compile__wait (<class 'int'>) 
    Doc:  Time to wait before retrying to acquire the compile lock.
    Value:  5

compile__timeout (<class 'int'>) 
    Doc:  In seconds, time that a process will wait before deciding to
    override an existing lock. An override only happens when the existing
    lock is held by the same owner *and* has not been 'refreshed' by this
    owner for more than this period. Refreshes are done every half timeout
    period for running processes.
    Value:  120

ctc__root (<class 'str'>) 
    Doc:  Directory which contains the root of Baidu CTC library. It is assumed         that the compiled library is either inside the build, lib or lib64         subdirectory, and the header inside the include directory.
    Value:  

tensor__cmp_sloppy (<class 'int'>) 
    Doc:  Relax aesara.tensor.math._allclose (0) not at all, (1) a bit, (2) more
    Value:  0

tensor__local_elemwise_fusion (<bound method BoolParam._apply of <aesara.configparser.BoolParam object at 0x7fcbc5b418b0>>) 
    Doc:  Enable or not in fast_run mode(fast_run optimization) the elemwise fusion optimization
    Value:  True

lib__amblibm (<bound method BoolParam._apply of <aesara.configparser.BoolParam object at 0x7fcbc5b41940>>) 
    Doc:  Use amd's amdlibm numerical library
    Value:  False

tensor__insert_inplace_optimizer_validate_nb (<class 'int'>) 
    Doc:  -1: auto, if graph have less then 500 nodes 1, else 10
    Value:  -1

traceback__limit (<class 'int'>) 
    Doc:  The number of stack to trace. -1 mean all.
    Value:  8

traceback__compile_limit (<class 'int'>) 
    Doc:  The number of stack to trace to keep during compilation. -1 mean all. If greater then 0, will also make us save Aesara internal stack trace.
    Value:  0

experimental__local_alloc_elemwise (<bound method BoolParam._apply of <aesara.configparser.BoolParam object at 0x7fcbc5b41ac0>>) 
    Doc:  DEPRECATED: If True, enable the experimental optimization local_alloc_elemwise. Generates error if not True. Use optimizer_excluding=local_alloc_elemwise to disable.
    Value:  True

experimental__local_alloc_elemwise_assert (<bound method BoolParam._apply of <aesara.configparser.BoolParam object at 0x7fcbc5b41af0>>) 
    Doc:  When the local_alloc_elemwise is applied, add an assert to highlight shape errors.
    Value:  True

warn__ignore_bug_before ({'0.8.2', '0.10', 'all', '0.5', '0.7', '1.0.3', '1.0.5', 'None', '0.8', '1.0.2', '0.3', '1.0', '0.4', '0.6', '0.9', '0.4.1', '1.0.1', '0.8.1', '1.0.4'}) 
    Doc:  If 'None', we warn about all Aesara bugs found by default. If 'all', we don't warn about Aesara bugs found by default. If a version, we print only the warnings relative to Aesara bugs found after that version. Warning for specific bugs can be configured with specific [warn] flags.
    Value:  0.9

exception_verbosity ({'high', 'low'}) 
    Doc:  If 'low', the text of exceptions will generally refer to apply nodes with short names such as Elemwise{add_no_inplace}. If 'high', some exceptions will also refer to apply nodes with long descriptions  like:
        A. Elemwise{add_no_inplace}
                B. log_likelihood_v_given_h
                C. log_likelihood_h
    Value:  low

print_test_value (<bound method BoolParam._apply of <aesara.configparser.BoolParam object at 0x7fcbc5b41c40>>) 
    Doc:  If 'True', the __eval__ of an Aesara variable will return its test_value when this is available. This has the practical conseguence that, e.g., in debugging `my_var` will print the same as `my_var.tag.test_value` when a test value is defined.
    Value:  False

compute_test_value ({'raise', 'ignore', 'pdb', 'warn', 'off'}) 
    Doc:  If 'True', Aesara will run each op at graph build time, using Constants, SharedVariables and the tag 'test_value' as inputs to the function. This helps the user track down problems in the graph before it gets optimized.
    Value:  off

compute_test_value_opt ({'raise', 'ignore', 'pdb', 'warn', 'off'}) 
    Doc:  For debugging Aesara optimization only. Same as compute_test_value, but is used during Aesara optimization
    Value:  off

check_input (<bound method BoolParam._apply of <aesara.configparser.BoolParam object at 0x7fcbc5b41cd0>>) 
    Doc:  Specify if types should check their input in their C code. It can be used to speed up compilation, reduce overhead (particularly for scalars) and reduce the number of generated C files.
    Value:  True

NanGuardMode__nan_is_error (<bound method BoolParam._apply of <aesara.configparser.BoolParam object at 0x7fcbc5b41d00>>) 
    Doc:  Default value for nan_is_error
    Value:  True

NanGuardMode__inf_is_error (<bound method BoolParam._apply of <aesara.configparser.BoolParam object at 0x7fcbc5b41d30>>) 
    Doc:  Default value for inf_is_error
    Value:  True

NanGuardMode__big_is_error (<bound method BoolParam._apply of <aesara.configparser.BoolParam object at 0x7fcbc5b41dc0>>) 
    Doc:  Default value for big_is_error
    Value:  True

NanGuardMode__action ({'pdb', 'warn', 'raise'}) 
    Doc:  What NanGuardMode does when it finds a problem
    Value:  raise

DebugMode__patience (<class 'int'>) 
    Doc:  Optimize graph this many times to detect inconsistency
    Value:  10

DebugMode__check_c (<bound method BoolParam._apply of <aesara.configparser.BoolParam object at 0x7fcbc5b41e50>>) 
    Doc:  Run C implementations where possible
    Value:  True

DebugMode__check_py (<bound method BoolParam._apply of <aesara.configparser.BoolParam object at 0x7fcbc5b41ee0>>) 
    Doc:  Run Python implementations where possible
    Value:  True

DebugMode__check_finite (<bound method BoolParam._apply of <aesara.configparser.BoolParam object at 0x7fcbc5b41f10>>) 
    Doc:  True -> complain about NaN/Inf results
    Value:  True

DebugMode__check_strides (<class 'int'>) 
    Doc:  Check that Python- and C-produced ndarrays have same strides. On difference: (0) - ignore, (1) warn, or (2) raise error
    Value:  0

DebugMode__warn_input_not_reused (<bound method BoolParam._apply of <aesara.configparser.BoolParam object at 0x7fcbc5b41f70>>) 
    Doc:  Generate a warning when destroy_map or view_map says that an op works inplace, but the op did not reuse the input for its output.
    Value:  True

DebugMode__check_preallocated_output (<class 'str'>) 
    Doc:  Test thunks with pre-allocated memory as output storage. This is a list of strings separated by ":". Valid values are: "initial" (initial storage in storage map, happens with Scan),"previous" (previously-returned memory), "c_contiguous", "f_contiguous", "strided" (positive and negative strides), "wrong_size" (larger and smaller dimensions), and "ALL" (all of the above).
    Value:  

DebugMode__check_preallocated_output_ndim (<class 'int'>) 
    Doc:  When testing with "strided" preallocated output memory, test all combinations of strides over that number of (inner-most) dimensions. You may want to reduce that number to reduce memory or time usage, but it is advised to keep a minimum of 2.
    Value:  4

profiling__time_thunks (<bound method BoolParam._apply of <aesara.configparser.BoolParam object at 0x7fcbc5b41e80>>) 
    Doc:  Time individual thunks when profiling
    Value:  True

profiling__n_apply (<class 'int'>) 
    Doc:  Number of Apply instances to print by default
    Value:  20

profiling__n_ops (<class 'int'>) 
    Doc:  Number of Ops to print by default
    Value:  20

profiling__output_line_width (<class 'int'>) 
    Doc:  Max line width for the profiling output
    Value:  512

profiling__min_memory_size (<class 'int'>) 
    Doc:  For the memory profile, do not print Apply nodes if the size
                 of their outputs (in bytes) is lower than this threshold
    Value:  1024

profiling__min_peak_memory (<bound method BoolParam._apply of <aesara.configparser.BoolParam object at 0x7fcbc5b52190>>) 
    Doc:  The min peak memory usage of the order
    Value:  False

profiling__destination (<class 'str'>) 
    Doc:  File destination of the profiling output
    Value:  stderr

profiling__debugprint (<bound method BoolParam._apply of <aesara.configparser.BoolParam object at 0x7fcbc5b521f0>>) 
    Doc:  Do a debugprint of the profiled functions
    Value:  False

profiling__ignore_first_call (<bound method BoolParam._apply of <aesara.configparser.BoolParam object at 0x7fcbc5b52220>>) 
    Doc:  Do we ignore the first call of an Aesara function.
    Value:  False

on_shape_error ({'warn', 'raise'}) 
    Doc:  warn: print a warning and use the default value. raise: raise an error
    Value:  warn

openmp (<bound method BoolParam._apply of <aesara.configparser.BoolParam object at 0x7fcbc5b52280>>) 
    Doc:  Allow (or not) parallel computation on the CPU with OpenMP. This is the default value used when creating an Op that supports OpenMP parallelization. It is preferable to define it via the Aesara configuration file ~/.aesararc or with the environment variable AESARA_FLAGS. Parallelization is only done for some operations that implement it, and even for operations that implement parallelism, each operation is free to respect this flag or not. You can control the number of threads used with the environment variable OMP_NUM_THREADS. If it is set to 1, we disable openmp in Aesara by default.
    Value:  False

openmp_elemwise_minsize (<class 'int'>) 
    Doc:  If OpenMP is enabled, this is the minimum size of vectors for which the openmp parallelization is enabled in element wise ops.
    Value:  200000

optimizer_excluding (<class 'str'>) 
    Doc:  When using the default mode, we will remove optimizer with these tags. Separate tags with ':'.
    Value:  

optimizer_including (<class 'str'>) 
    Doc:  When using the default mode, we will add optimizer with these tags. Separate tags with ':'.
    Value:  

optimizer_requiring (<class 'str'>) 
    Doc:  When using the default mode, we will require optimizer with these tags. Separate tags with ':'.
    Value:  

optdb__position_cutoff (<class 'float'>) 
    Doc:  Where to stop eariler during optimization. It represent the position of the optimizer where to stop.
    Value:  inf

optdb__max_use_ratio (<class 'float'>) 
    Doc:  A ratio that prevent infinite loop in EquilibriumGraphRewriter.
    Value:  8.0

cycle_detection ({'fast', 'regular'}) 
    Doc:  If cycle_detection is set to regular, most inplaces are allowed,but it is slower. If cycle_detection is set to faster, less inplacesare allowed, but it makes the compilation faster.The interaction of which one give the lower peak memory usage iscomplicated and not predictable, so if you are close to the peakmemory usage, triyng both could give you a small gain.
    Value:  regular

check_stack_trace ({'log', 'off', 'warn', 'raise'}) 
    Doc:  A flag for checking the stack trace during the optimization process. default (off): does not check the stack trace of any optimization log: inserts a dummy stack trace that identifies the optimizationthat inserted the variable that had an empty stack trace.warn: prints a warning if a stack trace is missing and also a dummystack trace is inserted that indicates which optimization insertedthe variable that had an empty stack trace.raise: raises an exception if a stack trace is missing
    Value:  off

metaopt__verbose (<class 'int'>) 
    Doc:  0 for silent, 1 for only warnings, 2 for full output withtimings and selected implementation
    Value:  0

metaopt__optimizer_excluding (<class 'str'>) 
    Doc:  exclude optimizers with these tags. Separate tags with ':'.
    Value:  

metaopt__optimizer_including (<class 'str'>) 
    Doc:  include optimizers with these tags. Separate tags with ':'.
    Value:  

profile (<bound method BoolParam._apply of <aesara.configparser.BoolParam object at 0x7fcbc5b52580>>) 
    Doc:  If VM should collect profile information
    Value:  False

profile_optimizer (<bound method BoolParam._apply of <aesara.configparser.BoolParam object at 0x7fcbc5b525b0>>) 
    Doc:  If VM should collect optimizer profile information
    Value:  False

profile_memory (<bound method BoolParam._apply of <aesara.configparser.BoolParam object at 0x7fcbc5b525e0>>) 
    Doc:  If VM should collect memory profile information and print it
    Value:  False

<aesara.configparser.ConfigParam object at 0x7fcbc5b52610>
    Doc:  Useful only for the VM Linkers. When lazy is None, auto detect if lazy evaluation is needed and use the appropriate version. If the C loop isn't being used and lazy is True, use the Stack VM; otherwise, use the Loop VM.
    Value:  None

unittests__rseed (<class 'str'>) 
    Doc:  Seed to use for randomized unit tests. Special value 'random' means using a seed of None.
    Value:  666

warn__round (<bound method BoolParam._apply of <aesara.configparser.BoolParam object at 0x7fcbc5b526d0>>) 
    Doc:  Warn when using `tensor.round` with the default mode. Round changed its default from `half_away_from_zero` to `half_to_even` to have the same default as NumPy.
    Value:  False

numba__vectorize_target ({'cuda', 'parallel', 'cpu'}) 
    Doc:  Default target for numba.vectorize.
    Value:  cpu

numba__fastmath (<bound method BoolParam._apply of <aesara.configparser.BoolParam object at 0x7fcbc5b52790>>) 
    Doc:  If True, use Numba's fastmath mode.
    Value:  True

numba__cache (<bound method BoolParam._apply of <aesara.configparser.BoolParam object at 0x7fcbc5b52820>>) 
    Doc:  If True, use Numba's file based caching.
    Value:  True

compiledir_format (<class 'str'>) 
    Doc:  Format string for platform-dependent compiled module subdirectory
(relative to base_compiledir). Available keys: aesara_version, device,
gxx_version, hostname, numpy_version, platform, processor,
python_bitwidth, python_int_bitwidth, python_version, short_platform.
Defaults to compiledir_%(short_platform)s-%(processor)s-%(python_versi
on)s-%(python_bitwidth)s.
    Value:  compiledir_%(short_platform)s-%(processor)s-%(python_version)s-%(python_bitwidth)s

<aesara.configparser.ConfigParam object at 0x7fcbc5b528e0>
    Doc:  platform-independent root directory for compiled modules
    Value:  /home/mkochurov/.aesara

<aesara.configparser.ConfigParam object at 0x7fcbc5b527f0>
    Doc:  platform-dependent cache directory for compiled modules
    Value:  /home/mkochurov/.aesara/compiledir_Linux-5.4--generic-x86_64-with-glibc2.31-x86_64-3.9.13-64

blas__ldflags (<class 'str'>) 
    Doc:  lib[s] to include for [Fortran] level-3 blas implementation
    Value:  -L/home/mkochurov/micromamba/envs/bayes/lib -lmkl_core -lmkl_intel_thread -lmkl_rt -Wl,-rpath,/home/mkochurov/micromamba/envs/bayes/lib

blas__check_openmp (<bound method BoolParam._apply of <aesara.configparser.BoolParam object at 0x7fcc05b91640>>) 
    Doc:  Check for openmp library conflict.
WARNING: Setting this to False leaves you open to wrong results in blas-related operations.
    Value:  True

scan__allow_gc (<bound method BoolParam._apply of <aesara.configparser.BoolParam object at 0x7fcbba30df10>>) 
    Doc:  Allow/disallow gc inside of Scan (default: False)
    Value:  False

scan__allow_output_prealloc (<bound method BoolParam._apply of <aesara.configparser.BoolParam object at 0x7fcc3034d1c0>>) 
    Doc:  Allow/disallow memory preallocation for outputs inside of scan (default: True)
    Value:  True

Allow shape specification in tensor constructor helpers

Please describe the purpose of filing this issue

It would be nice if we could pass the static shape to the tensor constructors, now that it is supported:

import pytensor.tensor as pt
pt.vector(shape=(3,))

For some types that constrain the shape, we could assert they are compatible

pt.col(shape=(5, 1))
pt.col(shape=(1, 5))  # Raise ValueError

In addition, it would be nice if the more general `pt.tensor` had a default dtype of floatX, which is the most common case:

pt.tensor("float64", shape=(None, None, 5))  # Fine
pt.tensor(shape=(None, None, 5))
# TypeError: TensorType.__init__() missing 1 required positional argument: 'dtype'

These are implemented here:

def tensor(*args, **kwargs):
    name = kwargs.pop("name", None)
    return TensorType(*args, **kwargs)(name=name)
cscalar = TensorType("complex64", ())
zscalar = TensorType("complex128", ())
fscalar = TensorType("float32", ())
dscalar = TensorType("float64", ())
bscalar = TensorType("int8", ())
wscalar = TensorType("int16", ())
iscalar = TensorType("int32", ())
lscalar = TensorType("int64", ())
ubscalar = TensorType("uint8", ())
uwscalar = TensorType("uint16", ())
uiscalar = TensorType("uint32", ())
ulscalar = TensorType("uint64", ())
def scalar(name=None, dtype=None):
    """Return a symbolic scalar variable.
    Parameters
    ----------
    dtype: numeric
        None means to use pytensor.config.floatX.
    name
        A name to attach to this variable.
    """
    if dtype is None:
        dtype = config.floatX
    type = TensorType(dtype, ())
    return type(name)
scalars, fscalars, dscalars, iscalars, lscalars = apply_across_args(
    scalar, fscalar, dscalar, iscalar, lscalar
)
int_types = bscalar, wscalar, iscalar, lscalar
float_types = fscalar, dscalar
complex_types = cscalar, zscalar
int_scalar_types = int_types
float_scalar_types = float_types
complex_scalar_types = complex_types
cvector = TensorType("complex64", shape=(None,))
zvector = TensorType("complex128", shape=(None,))
fvector = TensorType("float32", shape=(None,))
dvector = TensorType("float64", shape=(None,))
bvector = TensorType("int8", shape=(None,))
wvector = TensorType("int16", shape=(None,))
ivector = TensorType("int32", shape=(None,))
lvector = TensorType("int64", shape=(None,))
def vector(name=None, dtype=None):
    """Return a symbolic vector variable.
    Parameters
    ----------
    dtype: numeric
        None means to use pytensor.config.floatX.
    name
        A name to attach to this variable
    """
    if dtype is None:
        dtype = config.floatX
    type = TensorType(dtype, shape=(None,))
    return type(name)
vectors, fvectors, dvectors, ivectors, lvectors = apply_across_args(
    vector, fvector, dvector, ivector, lvector
)
int_vector_types = bvector, wvector, ivector, lvector
float_vector_types = fvector, dvector
complex_vector_types = cvector, zvector
cmatrix = TensorType("complex64", shape=(None, None))
zmatrix = TensorType("complex128", shape=(None, None))
fmatrix = TensorType("float32", shape=(None, None))
dmatrix = TensorType("float64", shape=(None, None))
bmatrix = TensorType("int8", shape=(None, None))
wmatrix = TensorType("int16", shape=(None, None))
imatrix = TensorType("int32", shape=(None, None))
lmatrix = TensorType("int64", shape=(None, None))
def matrix(name=None, dtype=None):
    """Return a symbolic matrix variable.
    Parameters
    ----------
    dtype: numeric
        None means to use pytensor.config.floatX.
    name
        A name to attach to this variable.
    """
    if dtype is None:
        dtype = config.floatX
    type = TensorType(dtype, shape=(None, None))
    return type(name)
matrices, fmatrices, dmatrices, imatrices, lmatrices = apply_across_args(
    matrix, fmatrix, dmatrix, imatrix, lmatrix
)
int_matrix_types = bmatrix, wmatrix, imatrix, lmatrix
float_matrix_types = fmatrix, dmatrix
complex_matrix_types = cmatrix, zmatrix
crow = TensorType("complex64", shape=(1, None))
zrow = TensorType("complex128", shape=(1, None))
frow = TensorType("float32", shape=(1, None))
drow = TensorType("float64", shape=(1, None))
brow = TensorType("int8", shape=(1, None))
wrow = TensorType("int16", shape=(1, None))
irow = TensorType("int32", shape=(1, None))
lrow = TensorType("int64", shape=(1, None))
def row(name=None, dtype=None):
    """Return a symbolic row variable (i.e. shape ``(1, None)``).
    Parameters
    ----------
    dtype: numeric type
        None means to use pytensor.config.floatX.
    name
        A name to attach to this variable.
    """
    if dtype is None:
        dtype = config.floatX
    type = TensorType(dtype, shape=(1, None))
    return type(name)
rows, frows, drows, irows, lrows = apply_across_args(row, frow, drow, irow, lrow)
ccol = TensorType("complex64", shape=(None, 1))
zcol = TensorType("complex128", shape=(None, 1))
fcol = TensorType("float32", shape=(None, 1))
dcol = TensorType("float64", shape=(None, 1))
bcol = TensorType("int8", shape=(None, 1))
wcol = TensorType("int16", shape=(None, 1))
icol = TensorType("int32", shape=(None, 1))
lcol = TensorType("int64", shape=(None, 1))
def col(
    name: Optional[str] = None, dtype: Optional["DTypeLike"] = None
) -> "TensorVariable":
    """Return a symbolic column variable (i.e. shape ``(None, 1)``).
    Parameters
    ----------
    name
        A name to attach to this variable.
    dtype
        ``None`` means to use `pytensor.config.floatX`.
    """
    if dtype is None:
        dtype = config.floatX
    type = TensorType(dtype, shape=(None, 1))
    return type(name)
cols, fcols, dcols, icols, lcols = apply_across_args(col, fcol, dcol, icol, lcol)
ctensor3 = TensorType("complex64", shape=((None,) * 3))
ztensor3 = TensorType("complex128", shape=((None,) * 3))
ftensor3 = TensorType("float32", shape=((None,) * 3))
dtensor3 = TensorType("float64", shape=((None,) * 3))
btensor3 = TensorType("int8", shape=((None,) * 3))
wtensor3 = TensorType("int16", shape=((None,) * 3))
itensor3 = TensorType("int32", shape=((None,) * 3))
ltensor3 = TensorType("int64", shape=((None,) * 3))
def tensor3(
    name: Optional[str] = None, dtype: Optional["DTypeLike"] = None
) -> "TensorVariable":
    """Return a symbolic 3D variable.
    Parameters
    ----------
    name
        A name to attach to this variable.
    dtype
        ``None`` means to use `pytensor.config.floatX`.
    """
    if dtype is None:
        dtype = config.floatX
    type = TensorType(dtype, shape=(None, None, None))
    return type(name)
tensor3s, ftensor3s, dtensor3s, itensor3s, ltensor3s = apply_across_args(
    tensor3, ftensor3, dtensor3, itensor3, ltensor3
)
ctensor4 = TensorType("complex64", shape=((None,) * 4))
ztensor4 = TensorType("complex128", shape=((None,) * 4))
ftensor4 = TensorType("float32", shape=((None,) * 4))
dtensor4 = TensorType("float64", shape=((None,) * 4))
btensor4 = TensorType("int8", shape=((None,) * 4))
wtensor4 = TensorType("int16", shape=((None,) * 4))
itensor4 = TensorType("int32", shape=((None,) * 4))
ltensor4 = TensorType("int64", shape=((None,) * 4))
def tensor4(
    name: Optional[str] = None, dtype: Optional["DTypeLike"] = None
) -> "TensorVariable":
    """Return a symbolic 4D variable.
    Parameters
    ----------
    name
        A name to attach to this variable.
    dtype
        ``None`` means to use `pytensor.config.floatX`.
    """
    if dtype is None:
        dtype = config.floatX
    type = TensorType(dtype, shape=(None, None, None, None))
    return type(name)
tensor4s, ftensor4s, dtensor4s, itensor4s, ltensor4s = apply_across_args(
    tensor4, ftensor4, dtensor4, itensor4, ltensor4
)
ctensor5 = TensorType("complex64", shape=((None,) * 5))
ztensor5 = TensorType("complex128", shape=((None,) * 5))
ftensor5 = TensorType("float32", shape=((None,) * 5))
dtensor5 = TensorType("float64", shape=((None,) * 5))
btensor5 = TensorType("int8", shape=((None,) * 5))
wtensor5 = TensorType("int16", shape=((None,) * 5))
itensor5 = TensorType("int32", shape=((None,) * 5))
ltensor5 = TensorType("int64", shape=((None,) * 5))
def tensor5(
    name: Optional[str] = None, dtype: Optional["DTypeLike"] = None
) -> "TensorVariable":
    """Return a symbolic 5D variable.

    Parameters
    ----------
    name
        A name to attach to this variable.
    dtype
        ``None`` means to use `pytensor.config.floatX`.

    """
    if dtype is None:
        dtype = config.floatX
    type = TensorType(dtype, shape=(None, None, None, None, None))
    return type(name)
tensor5s, ftensor5s, dtensor5s, itensor5s, ltensor5s = apply_across_args(
tensor5, ftensor5, dtensor5, itensor5, ltensor5
)
ctensor6 = TensorType("complex64", shape=((None,) * 6))
ztensor6 = TensorType("complex128", shape=((None,) * 6))
ftensor6 = TensorType("float32", shape=((None,) * 6))
dtensor6 = TensorType("float64", shape=((None,) * 6))
btensor6 = TensorType("int8", shape=((None,) * 6))
wtensor6 = TensorType("int16", shape=((None,) * 6))
itensor6 = TensorType("int32", shape=((None,) * 6))
ltensor6 = TensorType("int64", shape=((None,) * 6))
def tensor6(
    name: Optional[str] = None, dtype: Optional["DTypeLike"] = None
) -> "TensorVariable":
    """Return a symbolic 6D variable.

    Parameters
    ----------
    name
        A name to attach to this variable.
    dtype
        ``None`` means to use `pytensor.config.floatX`.

    """
    if dtype is None:
        dtype = config.floatX
    type = TensorType(dtype, shape=(None,) * 6)
    return type(name)
tensor6s, ftensor6s, dtensor6s, itensor6s, ltensor6s = apply_across_args(
tensor6, ftensor6, dtensor6, itensor6, ltensor6
)
ctensor7 = TensorType("complex64", shape=((None,) * 7))
ztensor7 = TensorType("complex128", shape=((None,) * 7))
ftensor7 = TensorType("float32", shape=((None,) * 7))
dtensor7 = TensorType("float64", shape=((None,) * 7))
btensor7 = TensorType("int8", shape=((None,) * 7))
wtensor7 = TensorType("int16", shape=((None,) * 7))
itensor7 = TensorType("int32", shape=((None,) * 7))
ltensor7 = TensorType("int64", shape=((None,) * 7))
def tensor7(
    name: Optional[str] = None, dtype: Optional["DTypeLike"] = None
) -> "TensorVariable":
    """Return a symbolic 7-D variable.

    Parameters
    ----------
    name
        A name to attach to this variable.
    dtype
        ``None`` means to use `pytensor.config.floatX`.

    """
    if dtype is None:
        dtype = config.floatX
    type = TensorType(dtype, shape=(None,) * 7)
    return type(name)
tensor7s, ftensor7s, dtensor7s, itensor7s, ltensor7s = apply_across_args(
tensor7, ftensor7, dtensor7, itensor7, ltensor7
)

Wrong shape inference in AdvancedSubtensor

Description

This was brought up in pymc-devs/pymc#6380

import pytensor.tensor as pt

mu0 = pt.zeros((1, 2, 3))
mu1 = mu0[:, [0, 1, 1], :]
print(tuple(mu1.shape.eval()), mu1.eval().shape)
(3, 1, 3) (1, 3, 3)

The issue seems to come from the rightmost empty slices

ENH: Symbolic Vectorization

fg = FunctionGraph([inputs], [outputs])
# follow the JAX API https://jax.readthedocs.io/en/latest/_autosummary/jax.vmap.html
vfg = pytensor.graph.vectorize(fg, in_axis=[0], out_axis=[0], axis_name={0: "batch"})

# additionally, some collective Ops seem to be useful
...
# no-op placeholders until pytensor.graph.vectorize is called
sum_batch = pytensor.tensor.collective.Sum(tensor, axis="batch")
mean_batch = pytensor.tensor.collective.Mean(tensor, axis="batch")
...

Context for the issue:

Graph rewriting to vectorize operations in a symbolic way is a huge step to improve pymc/pytensor user experience.

Example use cases:

ENH: Implement more robust JAX dispatch for Arange

import jax.numpy as jnp

import pytensor.tensor as at
from pytensor.link.jax.dispatch import jax_funcify
from pytensor.graph.basic import Constant
from pytensor.tensor.basic import ARange
from pytensor.tensor.exceptions import NotScalarConstantError

@jax_funcify.register(ARange)
def jax_funcify_ARange(op, node, **kwargs):
    try:
        # If the bounds are compile-time constants, bake them in so the
        # resulting function also works under `jax.jit`
        start, stop, step = (at.get_scalar_constant_value(inp) for inp in node.inputs)

        def arange(*_):
            return jnp.arange(start, stop, step, dtype=op.dtype)

    except NotScalarConstantError:
        # Fall back to dynamic bounds; alternatively, raise an informative
        # error here, since `jnp.arange` with traced inputs fails under `jax.jit`
        def arange(start, stop, step):
            return jnp.arange(start, stop, step, dtype=op.dtype)

    return arange

Canonicalize `Subtensor` slices

Please describe the purpose of filing this issue

Unless I am missing something subtle about how slices work, I think those could all be treated equally:

import pytensor
import pytensor.tensor as pt

x = pt.vector("x")
y1 = x[:-1]
y2 = x[0:-1]
y3 = x[0:-1:1]

f = pytensor.function([x], [y1, y2, y3])
pytensor.dprint(f)
DeepCopyOp [id A] ''   1
 |Subtensor{:int64:} [id B] ''   0
   |x [id C]
   |ScalarConstant{-1} [id D]
DeepCopyOp [id E] ''   3
 |Subtensor{int64:int64:} [id F] ''   2
   |x [id C]
   |ScalarConstant{0} [id G]
   |ScalarConstant{-1} [id D]
DeepCopyOp [id H] ''   5
 |Subtensor{int64:int64:int64} [id I] ''   4
   |x [id C]
   |ScalarConstant{0} [id G]
   |ScalarConstant{-1} [id D]
   |ScalarConstant{1} [id J]

Gradient of OpFromGraph fails

The gradients of OpFromGraph seem a bit fragile. I saw the following failures:

Multiple output

from pytensor.compile.builders import OpFromGraph
import pytensor.tensor as at

x, y = at.scalars("x", "y")
out1 = x + y
out2 = x * y
op = OpFromGraph([x, y], [out1, out2])
outs = op(x, y)
at.grad(outs[0].sum(), x)
Traceback (most recent call last):
  File "/home/ricardo/Documents/Projects/aesara/venv/lib/python3.8/site-packages/IPython/core/interactiveshell.py", line 3441, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-2-ebcb546bdac3>", line 9, in <module>
    at.grad(outs[0].sum(), x)
  File "/home/ricardo/Documents/Projects/aesara/aesara/gradient.py", line 623, in grad
    _rval: Sequence[Variable] = _populate_grad_dict(
  File "/home/ricardo/Documents/Projects/aesara/aesara/gradient.py", line 1434, in _populate_grad_dict
    rval = [access_grad_cache(elem) for elem in wrt]
  File "/home/ricardo/Documents/Projects/aesara/aesara/gradient.py", line 1434, in <listcomp>
    rval = [access_grad_cache(elem) for elem in wrt]
  File "/home/ricardo/Documents/Projects/aesara/aesara/gradient.py", line 1387, in access_grad_cache
    term = access_term_cache(node)[idx]
  File "/home/ricardo/Documents/Projects/aesara/aesara/gradient.py", line 1213, in access_term_cache
    input_grads = node.op.L_op(inputs, node.outputs, new_output_grads)
  File "/home/ricardo/Documents/Projects/aesara/aesara/compile/builders.py", line 744, in L_op
    ret_ofg_l = self._lop_op(*inps, return_list=True)
  File "/home/ricardo/Documents/Projects/aesara/aesara/compile/builders.py", line 769, in __call__
    return super().__call__(*actual_inputs, **kwargs)
  File "/home/ricardo/Documents/Projects/aesara/aesara/graph/op.py", line 297, in __call__
    node = self.make_node(*inputs, **kwargs)
  File "/home/ricardo/Documents/Projects/aesara/aesara/compile/builders.py", line 784, in make_node
    non_shared_inputs = [
  File "/home/ricardo/Documents/Projects/aesara/aesara/compile/builders.py", line 785, in <listcomp>
    inp_t.filter_variable(inp)
  File "/home/ricardo/Documents/Projects/aesara/aesara/tensor/type.py", line 262, in filter_variable
    other2 = self.convert_variable(other)
  File "/home/ricardo/Documents/Projects/aesara/aesara/tensor/type.py", line 328, in convert_variable
    if (self.ndim == var.type.ndim) and (self.dtype == var.type.dtype):
AttributeError: 'DisconnectedType' object has no attribute 'ndim'

Single output, involving a discrete Elemwise input

from aesara.compile.builders import OpFromGraph
import aesara.tensor as at

x = at.scalar("x")
y = at.lscalar("y")
out1 = x + at.switch(at.eq(y, 0), -1, 1)
at.grad(out1, x)  # Fine

op = OpFromGraph([x, y], [out1])
out2 = op(x, y)
at.grad(out2, x)  # Fails
Traceback (most recent call last):
  File "/home/ricardo/miniconda3/envs/pymcx/lib/python3.10/site-packages/IPython/core/interactiveshell.py", line 3433, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-2-b1c4038d13ee>", line 11, in <module>
    at.grad(out2, x)
  File "/home/ricardo/miniconda3/envs/pymcx/lib/python3.10/site-packages/aesara/gradient.py", line 521, in grad
    var_to_app_to_idx = _populate_var_to_app_to_idx(outputs, _wrt, consider_constant)
  File "/home/ricardo/miniconda3/envs/pymcx/lib/python3.10/site-packages/aesara/gradient.py", line 968, in _populate_var_to_app_to_idx
    account_for(output)
  File "/home/ricardo/miniconda3/envs/pymcx/lib/python3.10/site-packages/aesara/gradient.py", line 939, in account_for
    connection_pattern = _node_to_pattern(app)
  File "/home/ricardo/miniconda3/envs/pymcx/lib/python3.10/site-packages/aesara/gradient.py", line 817, in _node_to_pattern
    connection_pattern = node.op.connection_pattern(node)
  File "/home/ricardo/miniconda3/envs/pymcx/lib/python3.10/site-packages/aesara/compile/builders.py", line 851, in connection_pattern
    lop_op = self.get_lop_op()
  File "/home/ricardo/miniconda3/envs/pymcx/lib/python3.10/site-packages/aesara/compile/builders.py", line 700, in get_lop_op
    self._recompute_lop_op()
  File "/home/ricardo/miniconda3/envs/pymcx/lib/python3.10/site-packages/aesara/configparser.py", line 47, in res
    return f(*args, **kwargs)
  File "/home/ricardo/miniconda3/envs/pymcx/lib/python3.10/site-packages/aesara/compile/builders.py", line 495, in _recompute_lop_op
    gdefaults_l = fn_grad(wrt=local_inputs)
  File "/home/ricardo/miniconda3/envs/pymcx/lib/python3.10/site-packages/aesara/gradient.py", line 623, in grad
    _rval: Sequence[Variable] = _populate_grad_dict(
  File "/home/ricardo/miniconda3/envs/pymcx/lib/python3.10/site-packages/aesara/gradient.py", line 1434, in _populate_grad_dict
    rval = [access_grad_cache(elem) for elem in wrt]
  File "/home/ricardo/miniconda3/envs/pymcx/lib/python3.10/site-packages/aesara/gradient.py", line 1434, in <listcomp>
    rval = [access_grad_cache(elem) for elem in wrt]
  File "/home/ricardo/miniconda3/envs/pymcx/lib/python3.10/site-packages/aesara/gradient.py", line 1387, in access_grad_cache
    term = access_term_cache(node)[idx]
  File "/home/ricardo/miniconda3/envs/pymcx/lib/python3.10/site-packages/aesara/gradient.py", line 1058, in access_term_cache
    output_grads = [access_grad_cache(var) for var in node.outputs]
  File "/home/ricardo/miniconda3/envs/pymcx/lib/python3.10/site-packages/aesara/gradient.py", line 1058, in <listcomp>
    output_grads = [access_grad_cache(var) for var in node.outputs]
  File "/home/ricardo/miniconda3/envs/pymcx/lib/python3.10/site-packages/aesara/gradient.py", line 1387, in access_grad_cache
    term = access_term_cache(node)[idx]
  File "/home/ricardo/miniconda3/envs/pymcx/lib/python3.10/site-packages/aesara/gradient.py", line 1213, in access_term_cache
    input_grads = node.op.L_op(inputs, node.outputs, new_output_grads)
  File "/home/ricardo/miniconda3/envs/pymcx/lib/python3.10/site-packages/aesara/tensor/elemwise.py", line 548, in L_op
    rval = self._bgrad(inputs, outs, ograds)
  File "/home/ricardo/miniconda3/envs/pymcx/lib/python3.10/site-packages/aesara/tensor/elemwise.py", line 648, in _bgrad
    ret.append(transform(scalar_igrad))
  File "/home/ricardo/miniconda3/envs/pymcx/lib/python3.10/site-packages/aesara/tensor/elemwise.py", line 621, in transform
    if isinstance(r.type, (NullType, DisconnectedType)):
AttributeError: 'float' object has no attribute 'type'

Extend lift rewrites to multivariate RVs

Please describe the purpose of filing this issue

We have two rewrites that move Dimshuffles and Subtensor operations up to the inputs of RVs.

DimShuffle lifting facilitates pattern matching of some graphs, while the Subtensor lifting allows more efficient graphs to be obtained from a general one, when we only compute some of the independent dimensions of batched RVs.

We already make use of the latter in auto-imputation of univariate RVs in PyMC: pymc-devs/pymc#5260, and I can imagine even more uses for efficient posterior predictive sampling.

@node_rewriter([DimShuffle])
def local_dimshuffle_rv_lift(fgraph, node):
    """Lift a ``DimShuffle`` through ``RandomVariable`` inputs.

    For example, ``normal(mu, std).T == normal(mu.T, std.T)``.

    The basic idea behind this rewrite is that we need to separate the
    ``DimShuffle``-ing into distinct ``DimShuffle``s that each occur in two
    distinct sub-spaces: the (set of independent) parameters and ``size``
    (i.e. replications) sub-spaces.

    If a ``DimShuffle`` exchanges dimensions across those two sub-spaces, then we
    don't do anything.

    Otherwise, if the ``DimShuffle`` only exchanges dimensions within each of
    those sub-spaces, we can break it apart and apply the parameter-space
    ``DimShuffle`` to the distribution parameters, and then apply the
    replications-space ``DimShuffle`` to the ``size`` tuple. The latter is a
    particularly simple rearranging of a tuple, but the former requires a
    little more work.

    TODO: Currently, multivariate support for this rewrite is disabled.

    """

    ds_op = node.op

    if not isinstance(ds_op, DimShuffle):
        return False

    base_rv = node.inputs[0]
    rv_node = base_rv.owner

    if not (
        rv_node and isinstance(rv_node.op, RandomVariable) and rv_node.op.ndim_supp == 0
    ):
        return False

    # If no one else is using the underlying `RandomVariable`, then we can
    # do this; otherwise, the graph would be internally inconsistent.
    if is_rv_used_in_graph(base_rv, node, fgraph):
        return False

    rv_op = rv_node.op
    rng, size, dtype, *dist_params = rv_node.inputs

    # We need to know the dimensions that were *not* added by the `size`
    # parameter (i.e. the dimensions corresponding to independent variates with
    # different parameter values)
    num_ind_dims = None
    if len(dist_params) == 1:
        num_ind_dims = dist_params[0].ndim
    else:
        # When there is more than one distribution parameter, assume that all
        # of them will broadcast to the maximum number of dimensions
        num_ind_dims = max(d.ndim for d in dist_params)

    # If the indices in `ds_new_order` are entirely within the replication
    # indices group or the independent variates indices group, then we can apply
    # this rewrite.

    ds_new_order = ds_op.new_order
    # Create a map from old index order to new/`DimShuffled` index order
    dim_orders = [(n, d) for n, d in enumerate(ds_new_order) if isinstance(d, int)]

    # Find the index at which the replications/independents split occurs
    reps_ind_split_idx = len(dim_orders) - (num_ind_dims + rv_op.ndim_supp)

    ds_reps_new_dims = dim_orders[:reps_ind_split_idx]
    ds_ind_new_dims = dim_orders[reps_ind_split_idx:]
    ds_in_ind_space = ds_ind_new_dims and all(
        d >= reps_ind_split_idx for n, d in ds_ind_new_dims
    )

    if ds_in_ind_space or (not ds_ind_new_dims and not ds_reps_new_dims):

        # Update the `size` array to reflect the `DimShuffle`d dimensions,
        # since the trailing dimensions in `size` represent the independent
        # variates dimensions (for univariate distributions, at least)
        has_size = get_vector_length(size) > 0
        new_size = (
            [constant(1, dtype="int64") if o == "x" else size[o] for o in ds_new_order]
            if has_size
            else size
        )

        # Compute the new axes parameter(s) for the `DimShuffle` that will be
        # applied to the `RandomVariable` parameters (they need to be offset)
        if ds_ind_new_dims:
            rv_params_new_order = [
                d - reps_ind_split_idx if isinstance(d, int) else d
                for d in ds_new_order[ds_ind_new_dims[0][0] :]
            ]

            if not has_size and len(ds_new_order[: ds_ind_new_dims[0][0]]) > 0:
                # Additional broadcast dimensions need to be added to the
                # independent dimensions (i.e. parameters), since there's no
                # `size` to which they can be added
                rv_params_new_order = (
                    list(ds_new_order[: ds_ind_new_dims[0][0]]) + rv_params_new_order
                )
        else:
            # This case is reached when, for example, `ds_new_order` only
            # consists of new broadcastable dimensions (i.e. `"x"`s)
            rv_params_new_order = ds_new_order

        # Lift the `DimShuffle`s into the parameters
        # NOTE: The parameters might not be broadcasted against each other, so
        # we can only apply the parts of the `DimShuffle` that are relevant.
        new_dist_params = []
        for d in dist_params:
            if d.ndim < len(ds_ind_new_dims):
                _rv_params_new_order = [
                    o
                    for o in rv_params_new_order
                    if (isinstance(o, int) and o < d.ndim) or o == "x"
                ]
            else:
                _rv_params_new_order = rv_params_new_order

            new_dist_params.append(
                type(ds_op)(d.type.broadcastable, _rv_params_new_order)(d)
            )

        new_node = rv_op.make_node(rng, new_size, dtype, *new_dist_params)

        if config.compute_test_value != "off":
            compute_test_value(new_node)

        out = new_node.outputs[1]
        if base_rv.name:
            out.name = f"{base_rv.name}_lifted"
        return [out]

    ds_in_reps_space = ds_reps_new_dims and all(
        d < reps_ind_split_idx for n, d in ds_reps_new_dims
    )

    if ds_in_reps_space:
        # Update the `size` array to reflect the `DimShuffle`d dimensions.
        # There should be no need to `DimShuffle` now.
        new_size = [
            constant(1, dtype="int64") if o == "x" else size[o] for o in ds_new_order
        ]

        new_node = rv_op.make_node(rng, new_size, dtype, *dist_params)

        if config.compute_test_value != "off":
            compute_test_value(new_node)

        out = new_node.outputs[1]
        if base_rv.name:
            out.name = f"{base_rv.name}_lifted"
        return [out]

    return False
@node_rewriter([Subtensor, AdvancedSubtensor1, AdvancedSubtensor])
def local_subtensor_rv_lift(fgraph, node):
    """Lift a ``*Subtensor`` through ``RandomVariable`` inputs.

    In a fashion similar to ``local_dimshuffle_rv_lift``, the indexed dimensions
    need to be separated into distinct replication-space and (independent)
    parameter-space ``*Subtensor``s.

    The replication-space ``*Subtensor`` can be used to determine a
    sub/super-set of the replication-space and, thus, a "smaller"/"larger"
    ``size`` tuple. The parameter-space ``*Subtensor`` is simply lifted and
    applied to the distribution parameters.

    Consider the following example graph:
    ``normal(mu, std, size=(d1, d2, d3))[idx1, idx2, idx3]``. The
    ``*Subtensor`` ``Op`` requests indices ``idx1``, ``idx2``, and ``idx3``,
    which correspond to all three ``size`` dimensions. Now, depending on the
    broadcasted dimensions of ``mu`` and ``std``, this ``*Subtensor`` ``Op``
    could be reducing the ``size`` parameter and/or sub-setting the independent
    ``mu`` and ``std`` parameters. Only once the dimensions are properly
    separated into the two replication/parameter subspaces can we determine how
    the ``*Subtensor`` indices are distributed.

    For instance, ``normal(mu, std, size=(d1, d2, d3))[idx1, idx2, idx3]``
    could become
    ``normal(mu[idx1], std[idx2], size=np.shape(idx1) + np.shape(idx2) + np.shape(idx3))``
    if ``mu.shape == std.shape == ()``

    ``normal`` is a rather simple case, because it's univariate. Multivariate
    cases require a mapping between the parameter space and the image of the
    random variable. This may not always be possible, but for many common
    distributions it is. For example, the dimensions of the multivariate
    normal's image can be mapped directly to each dimension of its parameters.
    We use these mappings to change a graph like ``multivariate_normal(mu, Sigma)[idx1]``
    into ``multivariate_normal(mu[idx1], Sigma[idx1, idx1])``.

    """

    st_op = node.op

    if not isinstance(st_op, (AdvancedSubtensor, AdvancedSubtensor1, Subtensor)):
        return False

    base_rv = node.inputs[0]
    rv_node = base_rv.owner

    if not (rv_node and isinstance(rv_node.op, RandomVariable)):
        return False

    # If no one else is using the underlying `RandomVariable`, then we can
    # do this; otherwise, the graph would be internally inconsistent.
    if is_rv_used_in_graph(base_rv, node, fgraph):
        return False

    rv_op = rv_node.op
    rng, size, dtype, *dist_params = rv_node.inputs

    # TODO: Remove this once the multi-dimensional changes described below are
    # in place.
    if rv_op.ndim_supp > 0:
        return False

    rv_op = base_rv.owner.op
    rng, size, dtype, *dist_params = base_rv.owner.inputs

    idx_list = getattr(st_op, "idx_list", None)
    if idx_list:
        cdata = get_idx_list(node.inputs, idx_list)
    else:
        cdata = node.inputs[1:]

    st_indices, st_is_bool = zip(
        *tuple(
            (as_index_variable(i), getattr(i, "dtype", None) == "bool") for i in cdata
        )
    )

    # We need to separate dimensions into replications and independents
    num_ind_dims = None
    if len(dist_params) == 1:
        num_ind_dims = dist_params[0].ndim
    else:
        # When there is more than one distribution parameter, assume that all
        # of them will broadcast to the maximum number of dimensions
        num_ind_dims = max(d.ndim for d in dist_params)

    reps_ind_split_idx = base_rv.ndim - (num_ind_dims + rv_op.ndim_supp)

    if len(st_indices) > reps_ind_split_idx:
        # These are the indices that need to be applied to the parameters
        ind_indices = tuple(st_indices[reps_ind_split_idx:])

        # We need to broadcast the parameters before applying the `*Subtensor*`
        # with these indices, because the indices could be referencing broadcast
        # dimensions that don't exist (yet)
        bcast_dist_params = broadcast_params(dist_params, rv_op.ndims_params)

        # TODO: For multidimensional distributions, we need a map that tells us
        # which dimensions of the parameters need to be indexed.
        #
        # For example, `multivariate_normal` would have the following:
        # `RandomVariable.param_to_image_dims = ((0,), (0, 1))`
        #
        # I.e. the first parameter's (i.e. mean's) first dimension maps directly to
        # the dimension of the RV's image, and its second parameter's
        # (i.e. covariance's) first and second dimensions map directly to the
        # dimension of the RV's image.

        args_lifted = tuple(p[ind_indices] for p in bcast_dist_params)
    else:
        # In this case, no indexing is applied to the parameters; only the
        # `size` parameter is affected.
        args_lifted = dist_params

    # TODO: Could use `ShapeFeature` info. We would need to be sure that
    # `node` isn't in the results, though.
    # if hasattr(fgraph, "shape_feature"):
    #     output_shape = fgraph.shape_feature.shape_of(node.outputs[0])
    # else:
    output_shape = indexed_result_shape(base_rv.shape, st_indices)

    size_lifted = (
        output_shape if rv_op.ndim_supp == 0 else output_shape[: -rv_op.ndim_supp]
    )

    # Boolean indices can actually change the `size` value (compared to just
    # *which* dimensions of `size` are used).
    if any(st_is_bool):
        size_lifted = tuple(
            at_sum(idx) if is_bool else s
            for s, is_bool, idx in zip(
                size_lifted, st_is_bool, st_indices[: (reps_ind_split_idx + 1)]
            )
        )

    new_node = rv_op.make_node(rng, size_lifted, dtype, *args_lifted)
    _, new_rv = new_node.outputs

    # Calling `Op.make_node` directly circumvents test value computations, so
    # we need to compute the test values manually
    if config.compute_test_value != "off":
        compute_test_value(new_node)

    return [new_rv]

Get a new project logo

We should replace the Aesara logo, which is still shown in the README, with a new logo that does not infringe on the Aesara brand.

ENH: implement graph substitution that preserves independent parts

Link to a discussion

An example implementation

https://gist.github.com/ferrine/70cbcf6d3b6f033ac070d70b10ac8d25

Before

import pytensor.tensor as at
import pytensor
import numpy as np

a = at.scalar("a")
b = at.scalar("b")
b2 = b * 2
a2 = a * 2

d = (a2 ** 2 + b2 ** 2).flatten()

assert is_in_ancestors(b2, [b])
assert is_in_ancestors(d, [b])
assert not is_in_ancestors(a2, [b])
assert a in independent_apply_nodes_between([b], [d])
assert a2 in independent_apply_nodes_between([b], [d])
assert b2 not in independent_apply_nodes_between([b], [d])
d_clone = pytensor.clone_replace([d], {b: b.clone()})[0]
assert not is_in_ancestors(d_clone, [b2])
assert is_in_ancestors(d_clone, [a])
assert is_in_ancestors(d_clone, [a2]) # fails, only inputs are preserved, the remaining subgraph is copied and node references are broken

After

import pytensor.tensor as at
import pytensor
import numpy as np
a = at.scalar("a")
b = at.scalar("b")
b2 = b * 2
a2 = a * 2

d = (a2 ** 2 + b2 ** 2).flatten()

assert is_in_ancestors(b2, [b])
assert is_in_ancestors(d, [b])
assert not is_in_ancestors(a2, [b])
assert a in independent_apply_nodes_between([b], [d])
assert a2 in independent_apply_nodes_between([b], [d])
assert b2 not in independent_apply_nodes_between([b], [d])
# graph_substitute name is just an example, we can get a better name
d_clone = pytensor.graph_substitute([d], {b: b.clone()})[0]
assert not is_in_ancestors(d_clone, [b2])
assert is_in_ancestors(d_clone, [a])
assert is_in_ancestors(d_clone, [a2])

Context for the issue:

@lucianopaz had an example where random variables lose their references, and the process of recreating them was cumbersome

import pytensor.tensor as at
import pytensor
import numpy as np

a = at.random.normal(loc=3, scale=0.01, name="a", size=2)
b = at.random.normal(loc=1, scale=0.01, name="b", size=(2, 2))
c = at.random.normal(loc=100, scale=0.01, name="c", size=(2, 2, 2))
d = at.random.normal(loc=(a + b + c).flatten(), scale=0.01)
d.name = "d"
d_clone = pytensor.graph.basic.clone_replace(
    [d], replace={c: at.zeros(c.shape, dtype=c.dtype)}
)[0]
f = pytensor.function([a, b], d_clone, on_unused_input="ignore")  # because impossible to reference a and b
f(a=np.zeros(2), b=np.zeros((2, 2)))

The proposed alternative graph substitution solves this issue

Add issue template for development / functionality request

Describe the issue:

There is currently no good fit. The bug template even adds a label immediately.

On another note, I have never seen a case where the (extended) Pytensor version information was needed. I think we can remove that.

Reproducible code example:

NA

Error message:

No response

Pytensor version information:

NA

Context for the issue:

No response

Update the CoC

  • Align with other CoCs from the PyMC project
  • Align with NumFOCUS CoC template

Use optional tensorflow implementation for missing JAX Ops

I tried to sample a truncated normal distribution with JAX and it threw an error. This worked fine with the non-JAX sampler, but was extremely slow. The error is as follows:

Please provide a minimal, self-contained, and reproducible example.

import numpy as np
import pymc as pm
from scipy.stats import truncnorm

numargs = truncnorm.numargs
a, b = 0, 10
quantile = np.arange(0.01, 1, 0.1)

# Random variates
R = truncnorm.rvs(a, b, size=1000)
x = R * np.random.randn(1000)
with pm.Model() as example_model:
    
    b = pm.Normal('b')
    
    mu = b*x
    
    sigma = pm.HalfNormal('sigma')
    
    eaches = pm.TruncatedNormal('predicted_eaches',
                                    mu=mu,
                                    sigma=sigma,
                                    lower=0,
                                    observed=R)

    idata = pm.sampling_jax.sample_numpyro_nuts(draws = 500, tune=500, target_accept = .95)

Please provide the full traceback.

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
/tmp/ipykernel_12814/714985784.py in <module>
     13                                     observed=R)
     14 
---> 15     idata = pm.sampling_jax.sample_numpyro_nuts(draws = 500, tune=500, target_accept = .95)

/opt/conda/lib/python3.7/site-packages/pymc/sampling_jax.py in sample_numpyro_nuts(draws, tune, chains, target_accept, random_seed, initvals, model, var_names, progress_bar, keep_untransformed, chain_method, postprocessing_backend, idata_kwargs, nuts_kwargs)
    481     )
    482 
--> 483     logp_fn = get_jaxified_logp(model, negative_logp=False)
    484 
    485     if nuts_kwargs is None:

/opt/conda/lib/python3.7/site-packages/pymc/sampling_jax.py in get_jaxified_logp(model, negative_logp)
    104     if not negative_logp:
    105         model_logp = -model_logp
--> 106     logp_fn = get_jaxified_graph(inputs=model.value_vars, outputs=[model_logp])
    107 
    108     def logp_fn_wrap(x):

/opt/conda/lib/python3.7/site-packages/pymc/sampling_jax.py in get_jaxified_graph(inputs, outputs)
     97 
     98     # We now jaxify the optimized fgraph
---> 99     return jax_funcify(fgraph)
    100 
    101 

/opt/conda/lib/python3.7/functools.py in wrapper(*args, **kw)
    838                             '1 positional argument')
    839 
--> 840         return dispatch(args[0].__class__)(*args, **kw)
    841 
    842     funcname = getattr(func, '__name__', 'singledispatch function')

/opt/conda/lib/python3.7/site-packages/aesara/link/jax/dispatch.py in jax_funcify_FunctionGraph(fgraph, node, fgraph_name, **kwargs)
    682         type_conversion_fn=jax_typify,
    683         fgraph_name=fgraph_name,
--> 684         **kwargs,
    685     )
    686 

/opt/conda/lib/python3.7/site-packages/aesara/link/utils.py in fgraph_to_python(fgraph, op_conversion_fn, type_conversion_fn, order, input_storage, output_storage, storage_map, fgraph_name, global_env, local_env, get_name_for_object, squeeze_output, **kwargs)
    740     for node in order:
    741         compiled_func = op_conversion_fn(
--> 742             node.op, node=node, storage_map=storage_map, **kwargs
    743         )
    744 

/opt/conda/lib/python3.7/functools.py in wrapper(*args, **kw)
    838                             '1 positional argument')
    839 
--> 840         return dispatch(args[0].__class__)(*args, **kw)
    841 
    842     funcname = getattr(func, '__name__', 'singledispatch function')

/opt/conda/lib/python3.7/site-packages/aesara/link/jax/dispatch.py in jax_funcify_Elemwise(op, **kwargs)
    399 def jax_funcify_Elemwise(op, **kwargs):
    400     scalar_op = op.scalar_op
--> 401     return jax_funcify(scalar_op, **kwargs)
    402 
    403 

/opt/conda/lib/python3.7/functools.py in wrapper(*args, **kw)
    838                             '1 positional argument')
    839 
--> 840         return dispatch(args[0].__class__)(*args, **kw)
    841 
    842     funcname = getattr(func, '__name__', 'singledispatch function')

/opt/conda/lib/python3.7/site-packages/aesara/link/jax/dispatch.py in jax_funcify_Composite(op, vectorize, **kwargs)
    404 @jax_funcify.register(Composite)
    405 def jax_funcify_Composite(op, vectorize=True, **kwargs):
--> 406     jax_impl = jax_funcify(op.fgraph)
    407 
    408     def composite(*args):

/opt/conda/lib/python3.7/functools.py in wrapper(*args, **kw)
    838                             '1 positional argument')
    839 
--> 840         return dispatch(args[0].__class__)(*args, **kw)
    841 
    842     funcname = getattr(func, '__name__', 'singledispatch function')

/opt/conda/lib/python3.7/site-packages/aesara/link/jax/dispatch.py in jax_funcify_FunctionGraph(fgraph, node, fgraph_name, **kwargs)
    682         type_conversion_fn=jax_typify,
    683         fgraph_name=fgraph_name,
--> 684         **kwargs,
    685     )
    686 

/opt/conda/lib/python3.7/site-packages/aesara/link/utils.py in fgraph_to_python(fgraph, op_conversion_fn, type_conversion_fn, order, input_storage, output_storage, storage_map, fgraph_name, global_env, local_env, get_name_for_object, squeeze_output, **kwargs)
    740     for node in order:
    741         compiled_func = op_conversion_fn(
--> 742             node.op, node=node, storage_map=storage_map, **kwargs
    743         )
    744 

/opt/conda/lib/python3.7/functools.py in wrapper(*args, **kw)
    838                             '1 positional argument')
    839 
--> 840         return dispatch(args[0].__class__)(*args, **kw)
    841 
    842     funcname = getattr(func, '__name__', 'singledispatch function')

/opt/conda/lib/python3.7/site-packages/aesara/link/jax/dispatch.py in jax_funcify_ScalarOp(op, **kwargs)
    158 
    159     if "." in func_name:
--> 160         jnp_func = reduce(getattr, [jax] + func_name.split("."))
    161     else:
    162         jnp_func = getattr(jnp, func_name)

AttributeError: module 'jax.scipy.special' has no attribute 'erfcx'
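
In the spirit of the issue title, one possible direction is to register a JAX dispatch for the missing scalar Op using TensorFlow Probability's JAX substrate. The following is only a hypothetical sketch: it assumes the scalar Op lives at pytensor.scalar.math.Erfcx and that tfp.math.erfcx is available in the JAX substrate; neither assumption has been verified here.

from pytensor.link.jax.dispatch import jax_funcify
from pytensor.scalar.math import Erfcx  # assumed location of the scalar Op
from tensorflow_probability.substrates import jax as tfp  # optional dependency


@jax_funcify.register(Erfcx)
def jax_funcify_Erfcx(op, **kwargs):
    def erfcx(x):
        # jax.scipy.special has no erfcx, so delegate to TFP-on-JAX (assumed API)
        return tfp.math.erfcx(x)

    return erfcx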

Versions and main components

  • PyMC/PyMC3 Version: 4.0.1
  • Aesara/Theano Version: 2.7.2
  • Python Version: 3.7.12
  • Operating system: Linux
  • How did you install PyMC/PyMC3: (conda/pip) conda

Move pandas converters to pytensor

Description

Does it make sense to move the converters into pytensor? It would slightly increase the import time, but the user experience would be better.

import pytensor.tensor as at
from pytensor.tensor import _as_tensor_variable  # singledispatch hook behind `as_tensor_variable`; import path may vary by version
from pytensor.tensor.var import TensorVariable

try:
    import pandas as pd

    # https://github.com/pymc-devs/pymc/blob/main/pymc/aesaraf.py#L152
    @_as_tensor_variable.register(pd.Series)
    @_as_tensor_variable.register(pd.DataFrame)
    def dataframe_to_tensor_variable(df: pd.DataFrame, *args, **kwargs) -> TensorVariable:
        return at.as_tensor_variable(df.to_numpy(), *args, **kwargs)
except ImportError:
    pass
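
For context, this is roughly the user-facing behaviour the move would enable (a sketch of the intended usage, not of any existing PyTensor API):

import numpy as np
import pandas as pd
import pytensor.tensor as pt

df = pd.DataFrame(np.random.normal(size=(3, 2)), columns=["a", "b"])
# With the converter registered inside pytensor, this would work out of the box,
# without users having to call `df.to_numpy()` themselves
x = pt.as_tensor_variable(df)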

ENH: Make (some) commonly used signatures accept items or collections

Link to a discussion

@ferrine and I discussed this in chat, related to #15.

PR #15 is about a function that took an item as an argument, but the PR changed the signature to take a collection.

Instead of switching from taking items to taking collections we considered options to take either:

  1. With def f(*items: T) the backwards compatibility is maintained, but multiple items/collections can be passed like f(a, b) or f(*abc).
  2. With def f(items: Union[T, Collection[T]]) the function can do an isinstance(items, T) check to convert to a collection internally while the calls can be f(a) or f(abc).

The *items signature is not common in the codebase, therefore we concluded to favor option 2.
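
A minimal sketch of what option 2 could look like as a shared helper (the name as_variable_list is hypothetical, not existing PyTensor API):

from typing import Collection, List, Union

from pytensor.graph.basic import Variable


def as_variable_list(items: Union[Variable, Collection[Variable]]) -> List[Variable]:
    """Normalize a single variable or a collection of variables to a list."""
    # A lone `Variable` gets wrapped; collections are passed through as a list
    if isinstance(items, Variable):
        return [items]
    return list(items)

Functions like FunctionGraph or the new is_in_ancestors could run their inputs/outputs arguments through such a helper, so both f(a) and f(abc) stay valid.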

Where this applies

  • The new is_in_ancestors from #15
  • pytensor.function is an example that is already flexible in this way:

inputs : list of either Variable or In instances.
Function parameters, these are not allowed to be shared variables.
outputs : list or dict of Variables or Out instances.
If it is a dict, the keys must be strings. Expressions to compute.

  • FunctionGraph currently takes only collections:

inputs: Optional[Sequence[Variable]] = None,
outputs: Optional[Sequence[Variable]] = None,

Before

x = pt.dscalar("x")
y = x + 2

fg = FunctionGraph([x], [y])

After

x = pt.dscalar("x")
y = x + 2

fg = FunctionGraph(x, y)

Context for the issue:

No response

Rewrite products of exponents as exponent of sum

Please describe the purpose of filing this issue

exp(x) * exp(y) => exp(x+y)
power(base, x) * power(base, y) => power(base, x + y)

Off the top of my head, I don't think such a rewrite would affect numerical stability, but it should be faster...

import numpy as np

%timeit np.exp(9) * np.exp(2)
# 2.27 µs ± 53.5 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
%timeit np.exp(9+2)
# 1.11 µs ± 37 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

There was a PR for this in Theano that went stale: Theano/Theano#5272

Maybe we can borrow something from it.
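
A rough sketch of what the exp case could look like as a node rewriter (untested; registration is omitted, and a real version would also need to handle constants and mixed multiplications where only some factors are exponentials):

from pytensor.graph.rewriting.basic import node_rewriter
from pytensor.scalar.basic import Exp, Mul
from pytensor.tensor.elemwise import Elemwise
from pytensor.tensor.math import add, exp


@node_rewriter([Elemwise])
def local_mul_exp_to_exp_add(fgraph, node):
    # Only consider elementwise multiplications
    if not isinstance(node.op.scalar_op, Mul):
        return None

    # Every factor must itself be an `exp(...)` node
    if not all(
        inp.owner
        and isinstance(inp.owner.op, Elemwise)
        and isinstance(inp.owner.op.scalar_op, Exp)
        for inp in node.inputs
    ):
        return None

    # exp(x) * exp(y) * ... -> exp(x + y + ...)
    exponents = [inp.owner.inputs[0] for inp in node.inputs]
    return [exp(add(*exponents))]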

`local_dimshuffle_rv_lift` fails in some cases

Please describe the purpose of filing this issue

import numpy as np
import pytensor
import pytensor.tensor as at
from pytensor.graph import FunctionGraph
from pytensor.tensor.random.opt import local_dimshuffle_rv_lift

x = at.random.normal(0, 1, size=(1, 2)).dimshuffle(1)
fg = FunctionGraph(outputs=[x])
print(local_dimshuffle_rv_lift.transform(fg, x.owner))  # False

x = at.random.normal([[0, 0]], 1, size=(1, 2)).dimshuffle(1)
fg = FunctionGraph(outputs=[x])
local_dimshuffle_rv_lift.transform(fg, x.owner)  # raises ValueError

x = at.random.normal(np.zeros((4, 3, 2)), 1, size=(4, 3, 2)).T
fg = FunctionGraph(outputs=[x])
assert local_dimshuffle_rv_lift.transform(fg, x.owner)  # Fine

x = at.random.normal(np.zeros((3, 2)), 1, size=(4, 3, 2)).T
fg = FunctionGraph(outputs=[x])
assert local_dimshuffle_rv_lift.transform(fg, x.owner)  # Fails

Implement equivalent numpy median and quantile / percentile

Please describe the purpose of filing this issue

Equivalent symbolic methods to those are missing.

The NumPy quantile and percentile functions have many options for the interpolation argument, which has been slated for deprecation for a while now (see numpy/numpy#10736). It should suffice to implement the default "linear" interpolation.

I am confident that these should not require any extra Ops.
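
A minimal sketch of a symbolic median built only from existing Ops (assuming a 1-D input and ignoring axis/NaN handling):

import numpy as np
import pytensor
import pytensor.tensor as pt


def median_1d(x):
    """Median of a 1-D tensor via sorting, mirroring NumPy's default behaviour."""
    x_sorted = pt.sort(x)
    n = x_sorted.shape[0]
    # Average the two middle elements; for odd `n` both indices coincide
    return (x_sorted[(n - 1) // 2] + x_sorted[n // 2]) / 2


x = pt.vector("x")
f = pytensor.function([x], median_1d(x))
assert np.isclose(f(np.array([3.0, 1.0, 2.0, 10.0])), np.median([3.0, 1.0, 2.0, 10.0]))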

BUG: <Please write a comprehensive title after the 'BUG: ' prefix>

Describe the issue:

The behaviour is not clear to me. Is it expected behaviour?

Reproducible code example:

import numpy as np
import pytensor
import pytensor.tensor as at

a = at.constant(np.array([1, 2.0]), name="a")
b = at.matrix("b")
c = b * 2

d = (a ** 2 + b ** 2 + c ** 2).flatten()

d_clone = pytensor.graph.basic.clone_get_equiv(
    [c], [d], memo={c: at.zeros(c.shape, dtype=c.dtype)},
    copy_orphans=True,
)[d]
assert a in pytensor.graph.basic.ancestors([d_clone])

d_clone = pytensor.graph.basic.clone_get_equiv(
    [c], [d], memo={c: at.zeros(c.shape, dtype=c.dtype)},
    copy_orphans=False,
)[d]
assert a in pytensor.graph.basic.ancestors([d_clone])

Error message:

No response

Pytensor version information:

running at this commit a210b51

Context for the issue:

No response

No broadcasting support in numba AdvancedIncSubtensor1

Description

import pytensor
import pytensor.tensor as pt
import numpy as np

x = pt.dvector("z")
idx = pt.ivector("idx")

out = pt.inc_subtensor(x[idx], 1)  # The one needs to be broadcasted

func = pytensor.function([x, idx], out, mode="NUMBA")
func(np.zeros(3), np.array([2], dtype=np.int32))
# -> TypingError in numba

This works with the C backend.

BUG: Scan inner graphs are not optimized in NUMBA / JAX backends

Reproducible code example:

import aesara
aesara.config.mode = "NUMBA"  # Otherwise it works
import aesara.tensor as at
from aesara.compile.builders import OpFromGraph

x = at.scalar("x")
out = at.log(x)
op = OpFromGraph([x], [out], inline=True)

xs = at.vector("xs")
seq, _ = aesara.scan(
    fn=lambda x: op(x),
    sequences=[xs],
)

f = aesara.function([xs], seq)
aesara.dprint(f)
for{cpu,scan_fn} [id A] 5
 |Shape_i{0} [id B] 0
 | |xs [id C]
 |Subtensor{int64:int64:int8} [id D] 4
 | |xs [id C]
 | |ScalarFromTensor [id E] 3
 | | |Elemwise{Composite{Switch(LE(i0, i1), i1, i2)}} [id F] 2
 | |   |Shape_i{0} [id B] 0
 | |   |TensorConstant{0} [id G]
 | |   |TensorConstant{0} [id H]
 | |ScalarFromTensor [id I] 1
 | | |Shape_i{0} [id B] 0
 | |ScalarConstant{1} [id J]
 |Shape_i{0} [id B] 0

Inner graphs:

for{cpu,scan_fn} [id A]
 >OpFromGraph{inline=True} [id K]
 > |*0-<TensorType(float64, ())> [id L] -> [id D]

OpFromGraph{inline=True} [id K]
 >Elemwise{log,no_inplace} [id M]
 > |*0-<TensorType(float64, ())> [id L]

Make sure all deprecations are properly removed

Description

Right now we've lost track of deprecated functions. See #111

import warnings
import functools

__version__ = "0.1.1"
def deprecate(*, start: str, removed: str):
    # Compare version tuples rather than raw strings (string comparison breaks for e.g. "0.10.0" vs "0.2.0")
    if tuple(map(int, __version__.split("."))) >= tuple(map(int, removed.split("."))):
        raise RuntimeError(f"After incrementing version {__version__} this function has to be removed")
    else:
        def wrapper(fn):
            @functools.wraps(fn)
            def wrapped_fn(*args, **kwargs):
                warnings.warn(f"This function is deprecated in version {start} and "
                              f"will be removed at version {removed}. Current version is {__version__}.", DeprecationWarning)
                return fn(*args, **kwargs)
            return wrapped_fn
        return wrapper

gives

@deprecate(start="0.0.1", removed="1.1.1")
def add(x, y):
    return x + y
add(1, 2)
/var/folders/rx/rk9gm4ln35z802s3p81wfz8r0000gp/T/ipykernel_10024/2831940282.py:9: DeprecationWarning: This function is deprecated in version 0.0.1 and will be removed at version 1.1.1. Current version is 0.1.1.
  warnings.warn(f"This function is deprecated in version {start} and "

And during increment version PR

@deprecate(start="0.0.1", removed="0.1.1")
def add(x, y):
    return x + y
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
Cell In [19], line 1
----> 1 @deprecate(start="0.0.1", removed="0.1.1")
      2 def add(x, y):
      3     return x + y

Cell In [16], line 4, in deprecate(start, removed)
      2 def deprecate(*, start: str, removed: str):
      3     if __version__ >= removed:
----> 4         raise RuntimeError(f"After incrementing version {__version__} this function has to be removed")
      5     else:
      6         def wrapper(fn):

RuntimeError: After incrementing version 0.1.1 this function has to be removed

For flexibility we can build a similar thing on top of
https://borda.github.io/pyDeprecate/

ENH: implement JAX modified Bessel function via exponentially modified version

The modified Bessel function of the first kind (``iv``) is not currently available in JAX, but there is an exponentially-scaled version, ``ive``, such that:

ive(v, z) = iv(v, z) * exp(-abs(z.real))

However ive is not available in PyTensor. Implementing it would allow us to have iv from this transformation.
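
For reference, the relationship is easy to check numerically with SciPy, which provides both functions:

import numpy as np
from scipy.special import iv, ive

v, z = 2.5, 3.0
# ive(v, z) = iv(v, z) * exp(-abs(z.real)), so iv can be recovered from ive
assert np.isclose(iv(v, z), ive(v, z) * np.exp(np.abs(z)))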

Context for the issue:

The Bessel functions are useful in places like the Hilbert space approximation kernels for GPs. It would be valuable to be able to sample these with the Jax backend for performance reasons.

Do not fail when dropping dimension of unknown length

Please describe the purpose of filing this issue

import pytensor.tensor as pt

x = pt.tensor("float64", shape=(None, None))
x.dimshuffle((0,))  # ValueError: Cannot drop a non-broadcastable dimension [False, False] , (0,)

Instead we could just add a SpecifyShape to the input whenever that's attempted

x = pt.specify_shape(x, (None, 1))
x.dimshuffle((0,))

Does anyone see a problem with this?

This is an issue in aesara-devs/aeppl#191

Accelerate the CI pipeline

Right now the CI is taking ~1.5 h.

However, we can split up jobs in the job matrix to parallelize more, and also investigate whether we're unnecessarily running things multiple times (e.g. different Python versions).

Maybe the scripts/check_all_tests_are_covered.py from PyMC can be adapted to quickly diagnose redundancies?

Numba backend does not deepcopy shared outputs

Please describe the purpose of filing this issue

import aesara
import aesara.tensor as at

x = aesara.shared(0, name="x")

f = aesara.function([], x, mode=None)
f().itemset(2)
assert x.get_value() == 0  # Fine

f = aesara.function([], x, mode="NUMBA")
f().itemset(2)
assert x.get_value() == 0  # AssertionError

Remove deprecated modules and functions

Description

These modules were moved and marked as deprecated. As part of maintenance and a library refresh, we can work on removing the placeholders and documentation references.

grep 'Deprecation'  -nR {tests,pytensor}/**/*.py
tests/tensor/nnet/test_conv.py:88:            with pytest.warns(DeprecationWarning):
tests/tensor/nnet/test_conv.py:602:                        with pytest.warns(DeprecationWarning):
tests/tensor/nnet/test_conv.py:640:        with pytest.warns(DeprecationWarning):
tests/tensor/nnet/test_conv.py:653:        with pytest.warns(DeprecationWarning):
tests/tensor/nnet/test_conv.py:666:        with pytest.warns(DeprecationWarning):
tests/tensor/nnet/test_conv.py:679:        with pytest.warns(DeprecationWarning):
tests/tensor/nnet/test_conv.py:692:        with pytest.warns(DeprecationWarning):
tests/tensor/nnet/test_conv.py:705:        with pytest.warns(DeprecationWarning):
tests/tensor/nnet/test_conv.py:718:        with pytest.warns(DeprecationWarning):
tests/tensor/nnet/test_conv.py:731:        with pytest.warns(DeprecationWarning):
tests/tensor/nnet/test_conv.py:744:        with pytest.warns(DeprecationWarning):
tests/tensor/nnet/test_conv.py:757:        with pytest.warns(DeprecationWarning):
tests/tensor/test_basic.py:1336:        with pytest.warns(DeprecationWarning):
tests/tensor/test_math_scipy.py:729:    with pytest.warns(DeprecationWarning):
tests/tensor/test_subtensor.py:1205:                        # with NumPy 1.10 will raise a Deprecation warning.
tests/tensor/test_type.py:333:    with pytest.warns(DeprecationWarning, match=".*broadcastable.*"):
tests/tensor/test_type.py:338:    with pytest.warns(DeprecationWarning, match=".*broadcastable.*"):
tests/tensor/test_type.py:359:        DeprecationWarning, match="The `broadcastable` keyword is deprecated"
tests/test_config.py:29:    with pytest.warns(DeprecationWarning, match="instead"):
tests/test_config.py:35:    with pytest.warns(DeprecationWarning, match="instead"):
tests/test_config.py:52:    with pytest.warns(DeprecationWarning):
tests/test_config.py:63:    with pytest.warns(DeprecationWarning):
pytensor/assert_op.py:7:    DeprecationWarning,
pytensor/configparser.py:84:            DeprecationWarning,
pytensor/configparser.py:193:        # The following code adds redirects that spill DeprecationWarnings
pytensor/configparser.py:572:    config settings, but raises DeprecationWarnings with instructions to use `pytensor.config`.
pytensor/configparser.py:583:            DeprecationWarning,
pytensor/configparser.py:593:            DeprecationWarning,
pytensor/configparser.py:637:            warn(msg, DeprecationWarning, stacklevel=2)
pytensor/gradient.py:2142:        category=DeprecationWarning,
pytensor/gradient.py:2376:            warn(msg, DeprecationWarning, stacklevel=2)
pytensor/graph/kanren.py:6:    DeprecationWarning,
pytensor/graph/optdb.py:6:    DeprecationWarning,
pytensor/graph/optdb.py:26:            warn(msg, DeprecationWarning, stacklevel=2)
pytensor/graph/opt.py:6:    DeprecationWarning,
pytensor/graph/opt.py:26:            warn(msg, DeprecationWarning, stacklevel=2)
pytensor/graph/opt_utils.py:6:    DeprecationWarning,
pytensor/graph/opt_utils.py:26:            warn(msg, DeprecationWarning, stacklevel=2)
pytensor/graph/rewriting/basic.py:116:            DeprecationWarning,
pytensor/graph/rewriting/basic.py:2312:            DeprecationWarning,
pytensor/graph/rewriting/basic.py:3261:            warn(msg, DeprecationWarning, stacklevel=2)
pytensor/graph/rewriting/db.py:577:            warn(msg, DeprecationWarning, stacklevel=2)
pytensor/graph/rewriting/utils.py:68:            DeprecationWarning,
pytensor/graph/rewriting/utils.py:272:            warn(msg, DeprecationWarning, stacklevel=2)
pytensor/graph/toolbox.py:7:    DeprecationWarning,
pytensor/graph/unify.py:6:    DeprecationWarning,
pytensor/__init__.py:196:            warn(msg, DeprecationWarning, stacklevel=2)
pytensor/link/jax/jax_dispatch.py:7:    DeprecationWarning,
pytensor/link/jax/jax_linker.py:7:    DeprecationWarning,
pytensor/printing.py:199:            DeprecationWarning,
pytensor/sandbox/rng_mrg.py:43:    DeprecationWarning,
pytensor/sandbox/rng_mrg.py:709:                DeprecationWarning,
pytensor/sandbox/rng_mrg.py:1121:            DeprecationWarning,
pytensor/scalar/basic.py:4471:            warn(msg, DeprecationWarning, stacklevel=2)
pytensor/scalar/basic_scipy.py:7:    DeprecationWarning,
pytensor/scan/basic.py:507:                    raise DeprecationWarning(
pytensor/scan/opt.py:6:    DeprecationWarning,
pytensor/sparse/opt.py:6:    DeprecationWarning,
pytensor/tensor/basic_opt.py:6:    DeprecationWarning,
pytensor/tensor/basic.py:2596:            DeprecationWarning,
pytensor/tensor/math_opt.py:6:    DeprecationWarning,
pytensor/tensor/math.py:3153:            warn(msg, DeprecationWarning, stacklevel=2)
pytensor/tensor/nnet/basic.py:2168:            warn(msg, DeprecationWarning, stacklevel=2)
pytensor/tensor/nnet/conv.py:112:        DeprecationWarning,
pytensor/tensor/nnet/conv.py:410:            DeprecationWarning,
pytensor/tensor/nnet/__init__.py:7:    DeprecationWarning,
pytensor/tensor/nnet/opt.py:6:    DeprecationWarning,
pytensor/tensor/opt_uncanonicalize.py:6:    DeprecationWarning,
pytensor/tensor/random/opt.py:6:    DeprecationWarning,
pytensor/tensor/rewriting/basic.py:1295:        warn(msg, DeprecationWarning, stacklevel=2)
pytensor/tensor/sharedvar.py:71:            DeprecationWarning,
pytensor/tensor/signal/conv.py:17:    DeprecationWarning,
pytensor/tensor/signal/pool.py:25:    DeprecationWarning,
pytensor/tensor/signal/pool.py:111:                category=DeprecationWarning,
pytensor/tensor/signal/pool.py:129:                category=DeprecationWarning,
pytensor/tensor/signal/pool.py:145:                category=DeprecationWarning,
pytensor/tensor/signal/pool.py:158:            category=DeprecationWarning,
pytensor/tensor/signal/pool.py:227:                category=DeprecationWarning,
pytensor/tensor/signal/pool.py:245:                category=DeprecationWarning,
pytensor/tensor/signal/pool.py:261:                category=DeprecationWarning,
pytensor/tensor/signal/pool.py:274:            category=DeprecationWarning,
pytensor/tensor/signal/pool.py:391:                    category=DeprecationWarning,
pytensor/tensor/signal/pool.py:409:                    category=DeprecationWarning,
pytensor/tensor/signal/pool.py:426:                    category=DeprecationWarning,
pytensor/tensor/signal/pool.py:1040:                    category=DeprecationWarning,
pytensor/tensor/signal/pool.py:1058:                    category=DeprecationWarning,
pytensor/tensor/signal/pool.py:1074:                    category=DeprecationWarning,
pytensor/tensor/slinalg.py:868:            warn(msg, DeprecationWarning, stacklevel=2)
pytensor/tensor/subtensor_opt.py:6:    DeprecationWarning,
pytensor/tensor/type.py:98:                DeprecationWarning,
pytensor/tensor/type.py:127:                DeprecationWarning,
pytensor/utils.py:153:    when the function is used first time and filter is set for show DeprecationWarning.
pytensor/utils.py:173:                    category=DeprecationWarning,

Consider implementing a scalar Scan Op

Description

We have a couple of derivatives that are implemented as iterative power series approximations:

class GammaIncDer(BinaryScalarOp):

class GammaIncCDer(BinaryScalarOp):

class BetaIncDer(ScalarOp):

And this in the near future: aesara-devs/aesara#1288

These are currently implemented in Python only. We can't use Scan for these, because Elemwise requires the gradients to be composed exclusively of other Elemwise Ops, so that it can be safely vectorized by just passing tensor inputs. See aesara-devs/aesara#512, aesara-devs/aesara#1178, aesara-devs/aesara#514

If we had a Scalar scan, which expects all inputs and outputs to be scalar, we could then turn it into an Elemwise and use it in the gradients of such Ops.

This would allow us to have a single implementation for our Python/Numba/JAX backends without having to manually rewrite the same code for each backend.

It may be also easier to optimize than the general-purpose Scan, so it could be used internally in other scenarios as well for better performance. For instance this Op wouldn't have to worry about SharedVariables, RNGs, Taps (better to not start down that path, and model carryover explicitly, as this is one of the biggest "unmanageable" complexities in Scan), etc.

This would go with the idea of creating multiple specialized scan Ops, instead of trying to refactor the general purpose one that already exists. As the latter is the approach being explored by Aesara, we may cover more ground across libraries by trying this new approach.
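
As a rough illustration of why scalar-only loops compose well with Elemwise, here is a sketch of a plain ScalarOp whose impl runs a Python loop (a toy stand-in for the iterative series above); wrapping it in Elemwise vectorizes it over tensors. A dedicated scalar Scan Op would essentially let such loops be written symbolically (and thus be compiled for the Numba/JAX backends) instead of living in Python:

import numpy as np

import pytensor
import pytensor.tensor as pt
from pytensor.scalar.basic import UnaryScalarOp, upgrade_to_float
from pytensor.tensor.elemwise import Elemwise


class ExpSeries(UnaryScalarOp):
    """Toy scalar op: exp(x) via its power series, computed with a Python loop."""

    def impl(self, x):
        term, total = 1.0, 1.0
        for n in range(1, 50):
            term *= x / n
            total += term
        return total


# `Elemwise` broadcasts the scalar loop over tensor inputs
exp_series = Elemwise(ExpSeries(upgrade_to_float, name="exp_series"))

x = pt.vector("x")
f = pytensor.function([x], exp_series(x))
print(f(np.array([0.0, 1.0, 2.0])))  # approximately [1., e, e**2]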
