The motivating example is how to type check this record: [snippet truncated in source]

Host type conversion and type checking about fathom HOT 11 CLOSED

yeslogic commented on April 28, 2024

Host type conversion and type checking

from fathom.

Comments (11)

mikeday commented on April 28, 2024

In that case I think we can drop the host type conversion function altogether and just have the subtype relation for binary integer types:

U8 ≤ {0..<2^8}
U16BE ≤ {0..<2^16}
U16LE ≤ {0..<2^16}

And the existing subtype relation for ranged integer types:

c ≤ a ∧ b ≤ d ⊢ {a..b} ≤ {c..d}

We can define the parseable types with a predicate matching the old host type function in structure:

parseable : Type → Prop

parseable U8
parseable U16BE
parseable U16LE
parseable (Array n t) if parseable t
parseable (Record {...}) if all of its fields are parseable
parseable (Cond e t1 t2) if parseable t1 ∧ parseable t2
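
As a rough illustration of the two relations above, here is a minimal Python sketch. It is not Fathom's implementation: the class names (Bin, Range, Array, Record, Cond) and the helpers range_of, is_subtype and parseable are hypothetical, and an upper bound of None is an assumed encoding for an unbounded range such as {0..}.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Range:
    """Ranged integer type {lo..hi}, inclusive; hi=None means unbounded above."""
    lo: int
    hi: Optional[int]

@dataclass(frozen=True)
class Bin:
    """Binary integer type such as U8 or U16BE (hypothetical encoding)."""
    name: str
    bits: int

@dataclass(frozen=True)
class Array:
    length: object  # length expression, left opaque in this sketch
    elem: object

@dataclass(frozen=True)
class Record:
    fields: tuple  # tuple of (field name, field type) pairs

@dataclass(frozen=True)
class Cond:
    cond: object
    then: object
    orelse: object

U8 = Bin("U8", 8)
U16BE = Bin("U16BE", 16)
U16LE = Bin("U16LE", 16)

def range_of(t: Bin) -> Range:
    """U8 ≤ {0..<2^8}, U16BE ≤ {0..<2^16}, and so on."""
    return Range(0, 2 ** t.bits - 1)

def range_subtype(a: Range, b: Range) -> bool:
    """c ≤ a ∧ b ≤ d ⊢ {a..b} ≤ {c..d}"""
    lower_ok = b.lo <= a.lo
    upper_ok = b.hi is None or (a.hi is not None and a.hi <= b.hi)
    return lower_ok and upper_ok

def is_subtype(s, t) -> bool:
    if s == t:
        return True
    if isinstance(s, Bin) and isinstance(t, Range):
        return range_subtype(range_of(s), t)
    if isinstance(s, Range) and isinstance(t, Range):
        return range_subtype(s, t)
    return False

def parseable(t) -> bool:
    """Mirrors the parseable predicate: ranged integers alone are not parseable."""
    if isinstance(t, Bin):
        return True
    if isinstance(t, Array):
        return parseable(t.elem)
    if isinstance(t, Record):
        return all(parseable(ft) for _, ft in t.fields)
    if isinstance(t, Cond):
        return parseable(t.then) and parseable(t.orelse)
    return False
```

With these definitions, is_subtype(U16BE, Range(0, None)) holds, while a record containing a bare Range field is rejected by parseable.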

mikeday commented on April 28, 2024

A different example which requires type conversion to go the other way:

Header = record {
  width: U16BE,
  height: U16BE,
  format: U16BE
}

Pixels(h: Header) = record {
  data: Array (h.width * h.height * h.format) U8
}

Image = record {
  header: Header,
  pixels: Pixels(header)
}

Although this looks reasonable, it will not type check if the header field has been converted to a host record type before being passed to Pixels, which expects the original binary type. How can we solve this?

1. Duplicate the type definition

Define a new type representing the parsed header record and use that instead:

parsed_header = record {
  width: {0..<2^16},
  height: {0..<2^16},
  format: {0..<2^16}  
}

Pixels(h: parsed_header) = record {

This will now type check, but the duplication is annoying. Worse still it's manually doing the work of the host type conversion function, so why not just use that instead?

2. Use the host conversion function

Pixels(h: host Header) = record {

This neatly solves the problem, until the user forgets to add host and it fails with a confusing error message. When will we ever want to pass a binary type without converting it to a host type? Is that even a meaningful operation, given that the binary type represents an unparsed value?

3. Implicitly apply the host conversion function

Implicitly applying the host conversion function to the type of all arguments solves this problem, but does it break anything else?

4. Subtyping of binary types

If we only apply the host function in the subtype relation then the original code will already type check, as the header value still has its original binary type.

5. Subtyping of host types

If host types are subtypes of the original binary type then we can pass in a host type where an equivalent binary type is expected, but this is a little weird.

markbrown commented on April 28, 2024

Some amount of subtyping is needed to get the first example to work in any of the cases, since the type of the array size is smaller than it has to be. Moreover, there doesn't appear to be any need to describe things as "host" or "binary" in order to explain the example. All that's needed is that U16BE is a subtype of {0..}, which is perfectly reasonable given the interpretation of the types as sets of numbers.

The second example is similar, but there is a slight bit of trickiness to the subtyping because it could happen in a number of different places, since the * function is presumably overloaded to work with various different integer types so it is ambiguous where conversions take place. Semantically, however, this shouldn't matter as we ought to get the same result whatever we choose. (This property of an overloaded function with respect to subtyping is called "coherence".)
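
To make the ambiguity concrete, here is a small Python sketch using interval arithmetic on non-negative ranges. The names Range, U16, NAT, mul_range and within are hypothetical, and None as an upper bound is an assumed encoding of {0..}. It compares converting late (multiply at the narrow type, then widen) with converting early (widen each operand, then multiply).

```python
from typing import Optional, Tuple

Range = Tuple[int, Optional[int]]  # inclusive (lo, hi); hi=None means unbounded

U16: Range = (0, 2 ** 16 - 1)  # the range that U16BE is a subtype of
NAT: Range = (0, None)         # {0..}

def mul_range(a: Range, b: Range) -> Range:
    """Bounds for x * y with x in a and y in b; non-negative ranges only."""
    assert a[0] >= 0 and b[0] >= 0
    hi = None if a[1] is None or b[1] is None else a[1] * b[1]
    return (a[0] * b[0], hi)

def within(a: Range, b: Range) -> bool:
    """Range subtyping: a ≤ b when b's bounds enclose a's."""
    return b[0] <= a[0] and (b[1] is None or (a[1] is not None and a[1] <= b[1]))

# Convert late: multiply at the narrow type, then widen the result.
late = mul_range(mul_range(U16, U16), U16)

# Convert early: widen each operand to {0..} first, then multiply.
early = mul_range(mul_range(NAT, NAT), NAT)
```

Both late and early fit within NAT, so either placement of the conversion is accepted where an array length of type {0..} is expected. The two choices differ only in how tight the static bound is, not in the value computed at run time.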

Where the "host" vs "binary" distinction comes up is in an example like this:

record {
    len: {0..100},
    data: Array len {0..255}
}

It should be perfectly ok to construct records of this type and pass them to functions, etc. But the type cannot be parsed because neither field has enough information (size and endianness) to do that, so this type should not be considered a binary type.

It might be helpful to think of "binary" as being a typeclass which might also be reasonably named Parseable.

brendanzab commented on April 28, 2024

Thanks a bunch for all these examples and thoughts, this is extremely helpful, and I am reading with interest! Feels like we are getting closer!

> It might be helpful to think of "binary" as being a typeclass which might also be reasonably named Parseable.

The Power of Pi paper uses this explanatory bridge in section 3.1 as well. It's a good one!

mikeday commented on April 28, 2024

Perhaps the parseable predicate could include sizes as (potentially infinite) sets of integers:

parseable : Type → Size → Prop

parseable U8 {1}
parseable U16BE {2}
parseable U16LE {2}
parseable (Array n t) (n * s) if parseable t s ∧ singleton s
parseable (Record {...}) (sum s) if all of its fields are parseable
parseable (Cond e t1 t2) (union s1 s2) if parseable t1 s1 ∧ parseable t2 s2

This checks that array elements can only have a single known size.
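
A minimal Python sketch of this size-annotated predicate, under some simplifying assumptions: types are encoded as hypothetical tagged tuples, array lengths are integer constants, size sets are finite frozensets, and a result of None stands for "not parseable".

```python
from itertools import product

# Sizes in bytes of the primitive binary types.
PRIM_SIZES = {
    "U8": frozenset({1}),
    "U16BE": frozenset({2}),
    "U16LE": frozenset({2}),
}

def sizes(t):
    """Return the set of possible byte sizes of t, or None if t is not parseable."""
    kind = t[0]
    if kind == "prim":    # ("prim", name)
        return PRIM_SIZES.get(t[1])
    if kind == "array":   # ("array", n, elem_type)
        s = sizes(t[2])
        if s is None or len(s) != 1:  # elements must have a singleton size
            return None
        (elem_size,) = s
        return frozenset({t[1] * elem_size})
    if kind == "record":  # ("record", [field_type, ...])
        total = frozenset({0})
        for field_type in t[1]:
            s = sizes(field_type)
            if s is None:
                return None
            # Every combination of field sizes is a possible record size.
            total = frozenset(a + b for a, b in product(total, s))
        return total
    if kind == "cond":    # ("cond", t1, t2)
        s1, s2 = sizes(t[1]), sizes(t[2])
        if s1 is None or s2 is None:
            return None
        return s1 | s2
    return None

U8 = ("prim", "U8")
U16BE = ("prim", "U16BE")
U16LE = ("prim", "U16LE")
```

For example, sizes(("record", [U8, ("cond", U8, U16BE)])) gives {2, 3}, and an array whose element type is that Cond is rejected because the element size is not a singleton.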

mikeday commented on April 28, 2024

Not to go off on a tangent, but sizes can be represented using the polynomial approach that Mark sketched out for determining field alignments, or for arrays a simpler method that just distinguishes between known fixed sizes and unknown/variable sizes:

size U8 = Fixed 1
size U16BE = Fixed 2
size U16LE = Fixed 2
size (Array n t) = if n == Fixed n' && size t == Fixed s then Fixed (n' * s) else Unknown
size (Record {...}) = sum of field sizes
size (Cond e t1 t2) = if size t1 == size t2 then size t1 else Unknown
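
The simpler fixed/unknown scheme could be sketched in Python as follows. This is an illustration only: types are hypothetical tagged tuples, an int result plays the role of Fixed, None plays the role of Unknown, and an array length of None is an assumed encoding for a length that is not statically known.

```python
from typing import Optional

def size(t) -> Optional[int]:
    """Return the fixed byte size of t, or None when the size is unknown."""
    kind = t[0]
    if kind == "U8":
        return 1
    if kind in ("U16BE", "U16LE"):
        return 2
    if kind == "array":   # ("array", n, elem_type); n is an int or None
        n, s = t[1], size(t[2])
        return n * s if n is not None and s is not None else None
    if kind == "record":  # ("record", [field_type, ...])
        total = 0
        for field_type in t[1]:
            s = size(field_type)
            if s is None:  # one unknown field makes the whole record unknown
                return None
            total += s
        return total
    if kind == "cond":    # ("cond", t1, t2)
        s1, s2 = size(t[1]), size(t[2])
        return s1 if s1 is not None and s1 == s2 else None
    return None

U8, U16BE, U16LE = ("U8",), ("U16BE",), ("U16LE",)
HEADER = ("record", [U16BE, U16BE, U16BE])
```

Here size(HEADER) is 6, while an array of U8 whose length depends on a parsed field has an unknown size.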

mikeday commented on April 28, 2024

How tricky will it be to integrate subtyping into the type checker? At least the subtype relation sketched out above does not have cycles, but it can require multiple steps to complete:

U16BE ≤ {0..<2^16} ≤ {0..}

Is it sufficient to just greedily apply subtyping repeatedly/speculatively whenever a type doesn't match?
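
One way to picture the repeated application is as a search over one-step subtype edges. The Python sketch below uses a small, hand-written edge table with hypothetical string labels for the types. Because the relation has no cycles, the recursion always terminates. A real checker would compare range bounds arithmetically rather than enumerate ranges as graph nodes.

```python
# One-step subtype edges: each key is directly a subtype of the listed types.
EDGES = {
    "U8": ["{0..<2^8}"],
    "U16BE": ["{0..<2^16}"],
    "U16LE": ["{0..<2^16}"],
    "{0..<2^8}": ["{0..<2^16}"],
    "{0..<2^16}": ["{0..}"],
}

def is_subtype(s, t, edges=EDGES):
    """Follow one-step edges from s until t is reached or no edges remain."""
    if s == t:
        return True
    return any(is_subtype(u, t, edges) for u in edges.get(s, ()))
```

With this table, is_subtype("U16BE", "{0..}") succeeds in two steps, and is_subtype("U16BE", "U16LE") fails because no chain of edges connects them.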

brendanzab commented on April 28, 2024

Yeah, my plan was to add subtyping as part of the CONV rule, instead of having it check only for alpha equivalence.

brendanzab commented on April 28, 2024

Alas, I have never added subtyping to a language before. I had always hoped to get to grips with Stephen Dolan's thesis, Algebraic Subtyping, beforehand, but perhaps I'll just blunder through! 😅

brendanzab commented on April 28, 2024

Based on my work in #215, I'm thinking it makes sense to think of 'formats' as 'descriptions of types' of type Format, rather than types in their own right. These can then be converted to their corresponding representation type by way of some built-in repr : Format -> Type function. We'd decouple the typing rules of format structures and host structures - so format structures can have their own typing rules, rather than trying to overload structures themselves.

This nicely sidesteps the issue of having silly cases where you might be able to construct elements of format types - because they have no constructors themselves this becomes an impossibility. It also means we don't need to use subtyping for this, which becomes rather complicated!
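
A minimal Python sketch of this separation, assuming a tiny set of formats. All class names (FU8, FU16BE, FArray, FRecord, HInt, HArray, HRecord) and the function repr_type are hypothetical. Mapping integer formats to ranged integer types follows the ranges used earlier in this thread and is an assumption, not a statement of what Fathom does.

```python
from dataclasses import dataclass

# Format descriptions: data describing binary layouts, with no values of their own.
@dataclass(frozen=True)
class FU8:
    pass

@dataclass(frozen=True)
class FU16BE:
    pass

@dataclass(frozen=True)
class FArray:
    length: int
    elem: object

@dataclass(frozen=True)
class FRecord:
    fields: tuple  # tuple of (field name, format) pairs

# Host (representation) types that parsed values inhabit.
@dataclass(frozen=True)
class HInt:
    lo: int
    hi: int

@dataclass(frozen=True)
class HArray:
    length: int
    elem: object

@dataclass(frozen=True)
class HRecord:
    fields: tuple  # tuple of (field name, host type) pairs

def repr_type(fmt):
    """Stand-in for repr : Format -> Type, mapping a format to its host type."""
    if isinstance(fmt, FU8):
        return HInt(0, 2 ** 8 - 1)
    if isinstance(fmt, FU16BE):
        return HInt(0, 2 ** 16 - 1)
    if isinstance(fmt, FArray):
        return HArray(fmt.length, repr_type(fmt.elem))
    if isinstance(fmt, FRecord):
        return HRecord(tuple((name, repr_type(f)) for name, f in fmt.fields))
    raise TypeError(f"not a format: {fmt!r}")
```

Since only formats are accepted by repr_type, and host types are produced only through it, a function such as Pixels can take an argument of type repr_type(header_format) without any subtyping between the two worlds.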

brendanzab commented on April 28, 2024

Funnily enough this means our language will look very much like the one described in the Power of Pi paper and implemented in Narcissus.
