Comments (7)
For conditional types it does seem like we could evaluate each branch and then pick the minimum. So yes, I do think that in your example we should be able to conclude that the next field is 4-aligned. Would there be any downsides to this?
from fathom.
The problem is that we need to combine those like terms in step 3 in order not to lose information. Say we have a struct containing two u8 arrays of length x. Individually they are not aligned, so normally if we combine them in a struct we would conclude that the result is not aligned. But if we normalise the expression it becomes 2 * x, which we can conclude is 2-aligned, information we otherwise would have lost.
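As a minimal sketch of why combining like terms matters, the snippet below represents a size polynomial as a dictionary from variable name to coefficient. The representation and the helper names `add_terms` and `alignment` are assumptions made for illustration, not code from the project.

```python
from collections import Counter
from functools import reduce
from math import gcd

def add_terms(*polys):
    """Sum polynomials given as {variable: coefficient}, combining like terms."""
    total = Counter()
    for p in polys:
        for var, coeff in p.items():
            total[var] += coeff
    # Drop terms whose coefficients cancelled out.
    return {var: coeff for var, coeff in total.items() if coeff != 0}

def alignment(poly):
    """gcd of the coefficients of a normalised polynomial (0 if it is empty)."""
    return reduce(gcd, (abs(c) for c in poly.values()), 0)

u8_array = {"x": 1}                      # size of one u8 array of length x
struct = add_terms(u8_array, u8_array)   # two such arrays in a struct

print(alignment(u8_array))  # 1: a single array is not aligned
print(alignment(struct))    # 2: after normalising to 2 * x
```

Looking at the two arrays separately only ever yields alignment 1, while the normalised sum exposes the factor of 2.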
In the above case, both branches are 2-aligned, and the preceding u8s between them are 2-aligned, so we can't conclude that the sum is anything better than 2-aligned.
We can do better than this if we can detect that each branch has size 2 in mod 4, and so do the u8s. So the overall size is 2 + 2 in mod 4, which is equal to zero. That is, the total is 4-aligned. Thinking about how to do that in general...
from fathom.
So if both branches are 2-aligned, we represent each of them as 2x. The two preceding u8s contribute a constant 2, so we end up with 2x + 2. The gcd of the coefficients is still 2, not 4.
Instead we could consider two different versions of the struct by expanding the two branches. One version would be 2 + 2 and the other would be 2 + 6, so anything that follows either version would be 4-aligned.
But I think there are implications of doing it that way I don't fully understand.
from fathom.
> But I think there are implications of doing it that way
One of them is that if there is a sequence of conditionals then there is an exponential number of paths through them. So to avoid trying each path we need to be able to merge the information from alternative branches somehow.
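To make the blow-up concrete, here is a small sketch that enumerates every path explicitly. It assumes each field is described by a tuple of its possible constant sizes; the function name `path_alignment` is hypothetical.

```python
from functools import reduce
from itertools import product
from math import gcd

def path_alignment(fields):
    """Return (gcd of the total size over all paths, number of paths).

    `fields` is a list of tuples; each tuple lists the possible constant
    sizes of one field (a plain field has exactly one option).
    """
    totals = [sum(choice) for choice in product(*fields)]
    return reduce(gcd, totals, 0), len(totals)

# Two u8s (constant 2) followed by a conditional of size 2 or 6.
print(path_alignment([(2,), (2, 6)]))              # (4, 2)

# The same header followed by ten such conditionals: 2**10 paths.
print(path_alignment([(2,)] + [(2, 6)] * 10)[1])   # 1024
```

Expanding the branches does recover 4-alignment for the single conditional, but the number of paths doubles with every additional conditional.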
Hmm, rather than 2x, we should be able to say that each branch is 4x + 2 (that's the same as saying it is equivalent to 2 in mod 4). Then the whole struct is 2 + 4x + 2, which normalises to 4x + 4 and is hence 4-aligned.
We can do so by inserting the following between steps 1 and 2 above:
- ...
- For conditional types, recursively apply this algorithm (steps 1-4) for each branch, generating two polynomials.
- Let m be the greatest common divisor of the following numbers:
  - the non-constant coefficients of the polynomials.
  - the absolute difference between the two constant coefficients. (Note that every number is a divisor of zero, so gcd(0, x) = x for all positive x.)
- Introduce a fresh variable, say z, and let b be either of the constant coefficients. Use the expression mz + (b mod m) for the conditional type in step 1, above. Don't replace these expressions in step 2.
- ...
In our example, the two polynomials we get are just the constants 2 and 6, hence m is 4. So we get the expression 4z + 2 as required.
As another example, say from one branch we get 8x + 7 and from the other we get 4x + 8y + 1. Then m = gcd(8, 4, 8, 6) = 2 and the formula we get is 2z + 1. That is, the best we can say about this conditional is that its size is odd.
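The merging step for a conditional can be sketched as follows. This assumes each branch polynomial is linear and is given as a pair of a constant and a dictionary of non-constant coefficients; the name `merge_branches` and this representation are illustrative only.

```python
from functools import reduce
from math import gcd

def merge_branches(branch1, branch2, fresh="z"):
    """Merge two branch polynomials into m*z + (b mod m).

    Each branch is (constant, {variable: coefficient}).
    """
    (b1, coeffs1), (b2, coeffs2) = branch1, branch2
    numbers = [abs(c) for c in coeffs1.values()]
    numbers += [abs(c) for c in coeffs2.values()]
    numbers.append(abs(b1 - b2))
    m = reduce(gcd, numbers, 0)
    if m == 0:
        # Both branches are the same constant, so no fresh variable is needed.
        return (b1, {})
    return (b1 % m, {fresh: m})

# Branches of constant size 2 and 6: m = 4, giving 4z + 2.
print(merge_branches((2, {}), (6, {})))                        # (2, {'z': 4})

# Branches 8x + 7 and 4x + 8y + 1: m = gcd(8, 4, 8, 6) = 2, giving 2z + 1.
print(merge_branches((7, {"x": 8}), (1, {"x": 4, "y": 8})))    # (1, {'z': 2})
```

Because the two constants differ by a multiple of m, taking `b1 % m` or `b2 % m` gives the same result, which matches the remark that b can be either constant coefficient.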
from fathom.
Hmm. A counterexample to the above is a struct with a field of size x, and another which is a conditional that is either x or 3x. This is of even length for all x, irrespective of the branch taken, but the method above fails because the xs from the conditional are not combined with the x from the first field.
However, the difference between the two branches is 2x, and this is always even. We should be able to combine this information about the difference between the branches with the result we get by assuming the first one is taken. That will give the possible values for either branch.
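A quick numeric check of this counterexample, written as a sketch: branch sizes are dictionaries from variable to coefficient, and `gcd_all` is a hypothetical helper.

```python
from functools import reduce
from math import gcd

def gcd_all(numbers):
    """gcd of the absolute values of `numbers` (0 for an empty sequence)."""
    return reduce(gcd, (abs(n) for n in numbers), 0)

# Branch sizes x and 3x; neither has a constant term.
branch1, branch2 = {"x": 1}, {"x": 3}

# Earlier method: gcd of the non-constant coefficients and of the
# difference between the constants (0 here). The conditional becomes 1*z.
m_old = gcd_all([*branch1.values(), *branch2.values(), 0])

# Difference between the branches with like terms combined: 3x - x = 2x.
variables = branch1.keys() | branch2.keys()
diff = {v: branch2.get(v, 0) - branch1.get(v, 0) for v in variables}
m_diff = gcd_all(diff.values())

# Brute force: first field x plus either branch is even for every x tried.
always_even = all((x + b) % 2 == 0 for x in range(100) for b in (x, 3 * x))

print(m_old, m_diff, always_even)  # 1 2 True
```

The earlier method sees only alignment 1 for the conditional, while the difference of the branches carries the factor of 2 that the brute-force check confirms.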
We first define the function alignment, which takes a polynomial expression and returns the gcd of the coefficients of the polynomial after it has been normalised.
We can then define the function poly, which takes a type and returns a polynomial expression as follows.
poly( struct { f1 : T1; ... ; fn : Tn } ) = poly(T1) + ... + poly(Tn)
poly( [T; e] ) = poly(T) * conv(e)
poly( b ) = sizeof(b)
poly( T1 || T2 ) = poly(T1) + alignment(poly(T2) - poly(T1)) * z, fresh z
The function conv takes an expression and returns a polynomial expression:
conv(k) = k
conv(x) = x
conv(e1 + e2) = conv(e1) + conv(e2)
conv(e1 * e2) = conv(e1) * conv(e2)
conv(-e) = -conv(e)
conv(_) = z, fresh z
We can calculate the alignment of any type by applying alignment to the result of poly. Or to calculate the modulus, let m be the gcd of the non-constant coefficients of the normalised polynomial, and let b be the constant coefficient. Then the size will always be equal to mx + (b mod m) for some x.
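The modulus calculation can be sketched like this, assuming a normalised polynomial is given as a constant plus a dictionary of non-constant coefficients. The name `modulus` and the handling of the purely constant case are assumptions, not something stated in the thread.

```python
from functools import reduce
from math import gcd

def modulus(constant, coeffs):
    """Return (m, r) such that the size always has the form m*x + r.

    `coeffs` maps each non-constant term to its coefficient. If there are
    no non-constant terms, m is 0 and the size is exactly the constant.
    """
    m = reduce(gcd, (abs(c) for c in coeffs.values()), 0)
    if m == 0:
        return (0, constant)
    return (m, constant % m)

print(modulus(4, {"z": 4}))          # (4, 0): 4z + 4 is 4-aligned
print(modulus(7, {"x": 8}))          # (8, 7): size is 7 in mod 8
print(modulus(0, {"x": 2, "z": 2}))  # (2, 0): 2x + 2z is even
```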
For the example above, poly returns x + x + alignment(3x - x) * z. This reduces to 2x + 2z, and we conclude that the size must be even.
Note that if in the formula we assume the second branch was taken rather than the first (or if we swap the branches) then the polynomial reduces to 4x + 2z, and we again conclude 2-alignment. Hence we get an equivalent result for either branch order, as expected.
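The definitions of alignment, poly and conv can be sketched in Python as below. Everything about the encoding is an assumption for illustration: types and expressions are tagged tuples, a polynomial is a dictionary from a monomial (a sorted tuple of variable names, with `()` for the constant term) to its coefficient, and fresh variables are named `z0`, `z1`, and so on.

```python
from functools import reduce
from itertools import count
from math import gcd

_counter = count()

def fresh():
    """A polynomial consisting of a single fresh variable."""
    return {(f"z{next(_counter)}",): 1}

def const(k):
    return {(): k} if k else {}

def add(p, q):
    out = dict(p)
    for mono, c in q.items():
        out[mono] = out.get(mono, 0) + c
    return {m: c for m, c in out.items() if c != 0}   # normalise

def mul(p, q):
    out = {}
    for m1, c1 in p.items():
        for m2, c2 in q.items():
            mono = tuple(sorted(m1 + m2))
            out[mono] = out.get(mono, 0) + c1 * c2
    return {m: c for m, c in out.items() if c != 0}   # normalise

def neg(p):
    return {m: -c for m, c in p.items()}

def alignment(p):
    """gcd of the coefficients of a normalised polynomial (0 if empty)."""
    return reduce(gcd, (abs(c) for c in p.values()), 0)

def conv(e):
    tag = e[0]
    if tag == "const":
        return const(e[1])
    if tag == "var":
        return {(e[1],): 1}
    if tag == "add":
        return add(conv(e[1]), conv(e[2]))
    if tag == "mul":
        return mul(conv(e[1]), conv(e[2]))
    if tag == "neg":
        return neg(conv(e[1]))
    return fresh()                      # conv(_) = z, fresh z

def poly(t):
    tag = t[0]
    if tag == "struct":                 # sum of the field sizes
        return reduce(add, (poly(f) for f in t[1]), {})
    if tag == "array":                  # element size times length
        return mul(poly(t[1]), conv(t[2]))
    if tag == "base":                   # sizeof(b)
        return const(t[1])
    if tag == "cond":                   # poly(T1) + alignment(poly(T2) - poly(T1)) * z
        p1, p2 = poly(t[1]), poly(t[2])
        step = alignment(add(p2, neg(p1)))
        return add(p1, mul(const(step), fresh()))
    raise ValueError(f"unknown type tag: {tag}")

u8 = ("base", 1)
x = ("var", "x")
arr_x = ("array", u8, x)
arr_3x = ("array", u8, ("mul", ("const", 3), x))

first = ("struct", [arr_x, ("cond", arr_x, arr_3x)])
swapped = ("struct", [arr_x, ("cond", arr_3x, arr_x)])
print(alignment(poly(first)), alignment(poly(swapped)))  # 2 2
```

Both branch orders report 2-alignment, in line with the 2x + 2z and 4x + 2z results worked out by hand.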
from fathom.
Neat, modulo subtraction comes to the rescue once again!
from fathom.
@markbrown has started work on a post in 2789d1a. Looking nice so far!
from fathom.