Details:
Hours Spent: 16
Background
Optimal performance is required. Can be developed in tandem with #30
Overview
Add the concept of a complex expression. When an expression has repeated subexpressions, instead of evaluating the value of that subexpression repeatedly, substitute an ExpressionVariable
for that subexpression that can be evaluated along the side the overall expression.
Create a subexpression map that holds the expression that the variable is masking and the expression that it is used in ExpressionLet VariableName Expression Expression
which means let x = a in b
. Where a
is the subexpression and b
is the parent expression.
Prefix all sub-expression variables with $
and the Expression.Simplifier
module should be used before joining the common subexpressions.
Example1:
sin(sin(x)) + sin(x)
becomes
let $v1 = sin(x) in let $v2 = sin($v1) in let $v3 = $v2+$v1 in $v3
Example 2:
sin(sin (sin(x))) + sin (sin(x)) + sin(x)
becomes
let $v1 = sin(x) in let $v2 = sin($v1) in let $v3 = sin($v2) in let $v4 = $v3+$v2 in let $v5 = $v4+$v1 in $v5
Algorithm:
- Construct a set of sub-expressions (OR map from sub-expressions to occurrence count)
- Assign numbers to all these sub-expressions, ie convert the set to
indexToSubExpressionMap :: Map Int Expression
- Invert this mapping, ie create
subExpressionToIndexMap :: Map Expression Int
- Substitute in
indexToSubExpressionMap :: Map Int Expression
top-level sub-terms with variables, giving indexToSubstitutedExpressionMap :: Map Int Expression
, eg:
BinOp Add expr1 expr2 -> BinOp Add (Var "$v10") (Var "$v22")
BinOp Add (Var "x") expr2 -> BinOp Add (Var "x") (Var "$v22")
- Order the variables by dependency, creating
sortedIndexAndSubstitutedExpressions :: Array (Tuple Int Expression)
in multiple passes, starting with indexToSubstitutedExpressionMap
, removing expressions that only contain known variables until the map is empty
Acceptance Criteria
Closing Notes
Overview of changes can be found in PR #47
The algorithm implemented is as follows.
- Split the expression into
splitSubExpressions
Subexpressions are unary and binary operations. This means that the expression ExpressionLiteral 40
has no sub-expressions and is not a subexpression itself.
- Create the
subExpressionToVariableMap
In this map the subexpression as the key and the variable as the value. This map is biopic meaning that both the key and value in each entry are unique in the map.
- Substitute the variables into each of the sub-expressions
Each subexpression is either a unary or a binary operation. This means that they are expression trees that are only at maximum one deep. In this step, each entry in the map is treated over and the nested operation(s) in the unary and binary operations are looked up in the subExpressionToVariableMap
. If they are found they are replaced with their respective variable.
- Order the entries by their interdependencies
This works by recursively removing variables that are not dependent on any other sub-expressions in the subExpressionToVariableMap
. This results in an array of the entries in the order that they should appear in the nested ExpressionLet
chain.
- Rebuild the expression as an
ExpressionLet
chain
This simply folds the chain of sub-expressions into a single expression tree. If there are no sub-expressions in the subExpressionToVariable
array then the original expression is used.
There are some small differences between what the examples expect and what is produced by this algorithm. The only real difference is that there is no subexpression variable used for the top-most subexpression.
sin(sin(x)) + sin(x)
becomes
let $v1 = sin(x) in let $v2 = sin($v1) in $v2+$v1
sin(sin(sin(x))) + sin(sin(x)) + sin(x)
becomes
let $v1 = sin(x) in let $v2 = sin($v1) in let $v3 = sin($v2) in let $v4 = $v3+$v2 in $v4+$v1
I did have to update the Expression.Simplifier
as now that the definition for Expression
has 5 data constructors doing nested pattern matching causes a warning.