rust-lang / chalk

An implementation and definition of the Rust trait system using a PROLOG-like logic solver
Home Page: https://rust-lang.github.io/chalk/book/
License: Other
In rustc we allow a universal binder to bind more than one name, such as `for<'a, 'b> fn(&'a u32, &'b u32)`, where both `'a` and `'b` have a DeBruijn index of 0. In chalk we also allow you to write that, but when moved into the environment, `'a` and `'b` will have distinct universes (see `fn instantiate_binders_universally` and `unify_forall_tys`).
Currently chalk represents a universal type in the environment directly by its universe index, so `forall<T> { Goal(T) }` becomes `Goal(Ui)` once the binder is moved into the environment. We might want to instead represent universals with an index into a table in the environment, like we do for existential variables today; this would be one way to allow us to associate the universal type not only with a universe but with a particular name within that universe. Or maybe we could inline it by carrying two indices `Uij`, where `j` is the relative index of the name within the universe.
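The `Uij` idea could be sketched as a pair type. This is a hypothetical illustration; the names `UniverseIndex` and `PlaceholderIndex` are assumptions, not chalk's current types:

```rust
/// The universe `i` in which a name was bound.
#[derive(Copy, Clone, Debug, PartialEq, Eq, PartialOrd, Ord)]
pub struct UniverseIndex(pub usize);

/// A universal ("placeholder") type identified by both its universe and
/// its relative index within that universe, so that `forall<T, U>` gives
/// distinguishable names (U1, 0) and (U1, 1) rather than two names that
/// are indistinguishable within U1.
#[derive(Copy, Clone, Debug, PartialEq, Eq)]
pub struct PlaceholderIndex {
    pub ui: UniverseIndex,
    pub idx: usize,
}

impl PlaceholderIndex {
    pub fn new(universe: usize, idx: usize) -> Self {
        PlaceholderIndex { ui: UniverseIndex(universe), idx }
    }
}
```

Having a canonical `(universe, index)` ordering is what would make the u-canonicalization described below straightforward.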
A motivation would be to u-canonicalize `forall<T, U> { Goal(T, U) }` and `forall<T, U> { Goal(U, T) }` into the same thing, since it should be easier to put the universals in a canonical order when you know they are in the same universe.

Or at least that's how I understood it, ping @nikomatsakis.
We are still wrestling to some extent with the meaning of "unique" and how to handle negative reasoning. In #38, @aturon made a number of changes, but I have some reservations about some of the details of that PR. There are also some scenarios that are still not handled very well in that PR that we may want to consider.
When @aturon and I last spoke, we had decided to shift the approach of the PR in two ways -- at least, if I remember correctly. First, to introduce a `CannotProve` variant, which is used to mean that the system cannot prove a particular goal, and never will be able to, but that the goal may hold for some instantiations of the universally quantified variables in scope (but it also may not hold for some instantiations). In particular, this would be returned when you attempt to unify (e.g.) `!1` and `!2`, or `!1` and `Vec<T>`. Currently, those unification attempts "succeed" with the "ambiguous" flag set to true, and hence `forall<T,U> { T = U }` yields an ambiguous result. Under this proposal it would yield `CannotProve`.
The meaning of the various return values for proving goal G is thus:
In general, the modality controls whether we consider the set of impls/environmental clauses/etc to be a closed world.
Just adding `CannotProve` alone doesn't quite give us enough resolution to always negate the way we might want to. In particular, `not { G }` where `G` yields `CannotProve` also yields `CannotProve`, and hence `not { forall<T,U> { T = U } }` is also going to yield up "cannot prove" (as would `forall<T,U> { not { T = U } }`). It is plausible that we could instead say that `not { forall }` is `Unique` and `forall { not }` is `Error`; but to handle it, we would have to track the "sense" of a goal more deeply (right now, the fulfillment context cannot distinguish between a `not` inside or outside the `forall` binder).
If we add `CannotProve`, we ought to be able to remove the "special treatment" that we give to the environment right now (which makes me uncomfortable in any case, at least in its current form). In particular, we have some logic that makes an ambiguous unification in the environment become a "hard error", which implies that `forall<T,U> if (T: Foo) { U: Foo }` will reject using `T: Foo`, even though `T = U` yields "ambiguous". Once we have `CannotProve`, then `T = U` can yield `CannotProve`, and we are universally justified in ignoring proof routes that yield `CannotProve`.
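The proposed set of outcomes and their behavior under negation could be sketched like this. This is an illustrative model, not chalk's actual solver types; the enum and `negate` function are assumptions made for the example:

```rust
/// Sketch of solver outcomes, assuming a `CannotProve` variant is added
/// alongside the existing ones (names illustrative).
#[derive(Copy, Clone, Debug, PartialEq, Eq)]
pub enum Outcome {
    /// Provable, with a unique substitution.
    Unique,
    /// Might be provable with more inference information.
    Ambiguous,
    /// Holds for some instantiations of the universally quantified
    /// variables in scope and not others; never provable outright.
    CannotProve,
    /// Provably false.
    Error,
}

/// Naive negation-as-failure over outcomes: note that `CannotProve`
/// negates to itself, which is exactly the lack of resolution the text
/// complains about for `not { forall }` vs `forall { not }`.
pub fn negate(g: Outcome) -> Outcome {
    match g {
        Outcome::Unique => Outcome::Error,
        Outcome::Error => Outcome::Unique,
        Outcome::Ambiguous => Outcome::Ambiguous,
        Outcome::CannotProve => Outcome::CannotProve,
    }
}
```

This makes the earlier point concrete: with only these four values, `not { forall<T,U> { T = U } }` and `forall<T,U> { not { T = U } }` both come out as `CannotProve`, even though we might prefer `Unique` and `Error` respectively.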
Some interesting benchmarks and examples to use to evaluate possible solutions:
Example 1: Maybe implements.

```
trait Foo {}
struct i32 {}
impl Foo for i32 {}
```

```
forall<T> { T: Foo }
forall<T> { not { T: Foo } }
```

It's vital here that we treat `T` as possibly implementing `Foo` due to the impls. So that's at least the minimum bar.
Example 2: negation and binders.

```
not { forall<T, U> { T = U } }    // provably true
// vs
forall<T, U> { not { T = U } }    // neither true nor false
```
Example 3: Normalization.

```
struct i32 { }
trait Iterator { type Item; }
impl Iterator for i32 { type Item = i32; }
```

Given the query `forall<T> { if (T: Iterator<Item = i32>) { T: Iterator } }`, current code can fail with ambiguity. The environment supplies `T: Iterator<Item = i32>` but the program clauses supply `i32: Iterator<Item = i32>`.
Implicit type parameters (like `Self` and others from the env) are put into a lowered item after the explicit type parameters, which is surprising.
For example, the following test fails:
```
program {
    trait Combine { type Item<T>; }
    struct u32 { }
    struct i32 { }
    struct Either<T, U> { }
    impl Combine for u32 { type Item<U> = Either<u32, U>; }
    impl Combine for i32 { type Item<U> = Either<i32, U>; }
}

goal {
    exists<T, U> {
        T: Combine<Item<U> = Either<u32, i32>>
    }
} yields {
    "Unique; substitution [?0 := u32, ?1 := i32]"
}
```
Instead of the expected result, we get `[?0 := i32, ?1 := u32]`, which is technically correct because our GAT parameter `U` is mapped to `?0` and `T` is mapped to `?1`. But there's no way to see this from the output of chalk.
See for example: https://github.com/rust-lang-nursery/chalk/blob/94a1941a021842a5fcb35cd043145c8faae59f08/src/ir/lowering.rs#L183-L185
This will probably have to be changed in a bunch of places.
The first-order hereditary harrop predicates that Chalk uses allow for a fairly flexible definition of a program clause:

```
Clause = DomainGoal | Clause && Clause | Goal => DomainGoal | ForAll { Clause }
```
The intention then is that implications in `Goal` can reference these clauses in their full generality:

```
Goal = ... | Clause => Goal | ...
```
However, Chalk currently uses a rather simpler definition:

https://github.com/rust-lang-nursery/chalk/blob/master/src/ir/mod.rs#L776

This is equivalent to the following:

```
Clause = DomainGoal | Clause && Clause
```

We should generalize this to the full form -- or at least include `ForAll` binders. This will require a few changes. For one thing, we'll have to change environments to store clauses, rather than just domain goals.
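As a rough sketch, the generalized clause type might look like the following. The type and variant names here are stand-ins for illustration, not chalk's actual IR:

```rust
/// Stand-in for chalk's real domain goals.
#[derive(Clone, Debug, PartialEq, Eq)]
pub enum DomainGoal {
    Holds(String),
}

/// Goal = ... | Clause => Goal | ... (only the relevant variants shown).
#[derive(Clone, Debug, PartialEq, Eq)]
pub enum Goal {
    Leaf(DomainGoal),
    Implies(Box<Clause>, Box<Goal>),
}

/// The full clause grammar from the text:
/// Clause = DomainGoal | Clause && Clause | Goal => DomainGoal | ForAll { Clause }
#[derive(Clone, Debug, PartialEq, Eq)]
pub enum Clause {
    Leaf(DomainGoal),
    And(Box<Clause>, Box<Clause>),
    Implies(Box<Goal>, DomainGoal),
    /// Binders simplified here to a count of bound variables.
    ForAll(usize, Box<Clause>),
}
```

Environments would then store `Clause` values rather than bare domain goals, which is the change the rest of this issue walks through.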
Then we have to update the `chalk-slg` `HhGoal` type in an analogous fashion, presumably extending the `Context` trait with the notion of a clause as well. Finally, we have to modify the `Context::program_clauses` implementation -- in particular, the code that finds hypotheses from the environment. Actually, that code is probably fine as is; we just have to (a) implement `CouldMatch` for `Clause` and (b) implement `into_program_clause` for `CouldMatch`.
(Note: It might be good to do #90 first, just to limit the amount of code affected.)
We don't currently do any WF checking. Specifically, it'd be nice to lower each struct/trait/impl into a predicate that is provable exactly when the declaration is well-formed. In most cases this is relatively straightforward, though depending how far we go we might need to extend the IR.
For example, structs don't currently list their fields, but the WF rules for structs require that the type of each field is WF. So if we had a struct:
```
struct Foo<T> {
    b: Bar<T>
}

struct Bar<T> where T: Eq { t: T }
```
Then the WF predicate for `Foo` might look like:

```
StructFooWF :-
    forall<T> {
        WF(Bar<T>)
    }.
```
This relies on us having the rules for when types are well-formed (which... I think we do? if not, we should open a bug on that), which would look like:

```
WF(Bar<T>) :- T: Eq.
```

In this case, that would be unprovable, and hence the struct `Foo` is not considered well-formed (not without a `T: Eq` where-clause, at least). If we added the `T: Eq` where-clause:

```
struct Foo<T> where T: Eq { b: Bar<T> }
```
then we would have:
```
StructFooWF :-
    forall<T> {
        if (T: Eq) {
            WF(Bar<T>)
        }
    }.
```
and everything is provable again.
Running `cargo install` installs a binary called "repl", which is not great. Let's rename it so the user can run the repl with the command `chalk` rather than `repl`.
The code here looks suspect. It appears to assume that the inference table keys generated will go from `0..q.binders.len()`, but this doesn't happen, since different kinds of variables (types, lifetimes, crates) are indexed separately.
If this is indeed a bug, you'd expect it to trigger panics due to not being able to find keys. I'm working toward a test case illustrating that behavior.
During `solve_one()`, fulfill invokes `self.infer.unify()` rather than `self.unify()`. This means that some obligations are dropped on the floor. Easy to fix, but I wanted to write a test that would expose it.
No good reason not to implement this that I can think of. But regardless should add a test first.
The following code passes lowering:

```
struct Foo { }
trait Bar { type Item<T>; }
impl Bar for Foo {
    type Item = Foo; // Should be a "wrong number of parameters for associated type" error
}
```

I believe the relevant code is here. Edit: although perhaps I've misunderstood. That code isn't even getting called in this case. It only seems to be called when associated types are used in `impl` headers, like so: `impl<X> Foo for <X as Iterator>::Item where X: Iterator { }`.
The following program panics with the error message "zipping things of mixed kind", after outputting `a = '?0, b = Array3<i32>`.

```
trait StreamingIterator {
    type Item<'a>;
}

trait IntoStreamingIterator where Self::IntoIter: StreamingIterator {
    type Item<'a>;
    type IntoIter;
}

impl<S> IntoStreamingIterator for S where S: StreamingIterator {
    type Item<'a> = S::Item<'a>;
    type IntoIter = S;
}

struct i32 {}
struct usize {}
struct Ref<'a, T> {}
struct Array3<T> {_1: T, _2: T, _3: T}

struct Array3IntoIter<T> {
    array: Array3<T>,
    index: usize
}

impl<T> StreamingIterator for Array3IntoIter<T> {
    type Item<'a> = Ref<'a, T>;
}

impl<T> IntoStreamingIterator for Array3<T> {
    type Item<'a> = Ref<'a, T>;
    type IntoIter = Array3IntoIter<T>;
}

// goal: Array3<i32>: IntoStreamingIterator
```
Output including backtrace:
$ RUST_BACKTRACE=1 cargo run -- --program=streaming_iterator.chalk --goal='Array3<i32>: IntoStreamingIterator'
Finished dev [unoptimized + debuginfo] target(s) in 0.0 secs
Running `target/debug/chalki --program=streaming_iterator.chalk '--goal=Array3<i32>: IntoStreamingIterator'`
a = '?0, b = Array3<i32>
thread 'main' panicked at 'zipping things of mixed kind', src/zip.rs:260:16
stack backtrace:
0: std::sys::imp::backtrace::tracing::imp::unwind_backtrace
at src/libstd/sys/unix/backtrace/tracing/gcc_s.rs:49
1: std::sys_common::backtrace::_print
at src/libstd/sys_common/backtrace.rs:69
2: std::panicking::default_hook::{{closure}}
at src/libstd/sys_common/backtrace.rs:58
at src/libstd/panicking.rs:381
3: std::panicking::default_hook
at src/libstd/panicking.rs:397
4: std::panicking::begin_panic
at src/libstd/panicking.rs:577
5: std::panicking::begin_panic
at /Users/travis/build/rust-lang/rust/src/libstd/panicking.rs:538
6: <chalk::ir::ParameterKind<T, L> as chalk::zip::Zip>::zip_with
at src/zip.rs:260
7: <[T] as chalk::zip::Zip>::zip_with
at src/zip.rs:83
8: <alloc::vec::Vec<T> as chalk::zip::Zip>::zip_with
at src/zip.rs:72
9: <chalk::ir::UnselectedProjectionTy as chalk::zip::Zip>::zip_with
at src/zip.rs:155
10: <chalk::ir::UnselectedNormalize as chalk::zip::Zip>::zip_with
at src/zip.rs:155
11: <chalk::ir::DomainGoal as chalk::zip::Zip>::zip_with
at src/zip.rs:189
12: <T as chalk::ir::could_match::CouldMatch<T>>::could_match
at src/ir/could_match.rs:12
13: <T as chalk::ir::could_match::CouldMatch<T>>::could_match
at src/ir/could_match.rs:53
14: chalk::solve::recursive::Solver::solve_new_subgoal::{{closure}}
at src/solve/recursive/mod.rs:232
15: <core::iter::Filter<I, P> as core::iter::iterator::Iterator>::next
at /Users/travis/build/rust-lang/rust/src/libcore/iter/mod.rs:1221
16: <core::iter::Cloned<I> as core::iter::iterator::Iterator>::next
at /Users/travis/build/rust-lang/rust/src/libcore/iter/mod.rs:443
17: <alloc::vec::Vec<T> as alloc::vec::SpecExtend<T, I>>::from_iter
at /Users/travis/build/rust-lang/rust/src/liballoc/vec.rs:1802
18: <alloc::vec::Vec<T> as core::iter::traits::FromIterator<T>>::from_iter
at /Users/travis/build/rust-lang/rust/src/liballoc/vec.rs:1715
19: core::iter::iterator::Iterator::collect
at /Users/travis/build/rust-lang/rust/src/libcore/iter/iterator.rs:1298
20: chalk::solve::recursive::Solver::solve_new_subgoal
at src/solve/recursive/mod.rs:228
21: chalk::solve::recursive::Solver::solve_reduced_goal
at src/solve/recursive/mod.rs:162
22: chalk::solve::recursive::fulfill::Fulfill::prove
at src/solve/recursive/fulfill.rs:195
23: chalk::solve::recursive::fulfill::Fulfill::fulfill
at src/solve/recursive/fulfill.rs:290
24: chalk::solve::recursive::fulfill::Fulfill::solve
at src/solve/recursive/fulfill.rs:337
25: chalk::solve::recursive::Solver::solve_via_implication
at src/solve/recursive/mod.rs:376
26: chalk::solve::recursive::Solver::solve_from_clauses
at src/solve/recursive/mod.rs:337
27: chalk::solve::recursive::Solver::solve_new_subgoal
at src/solve/recursive/mod.rs:236
28: chalk::solve::recursive::Solver::solve_reduced_goal
at src/solve/recursive/mod.rs:162
29: chalk::solve::recursive::fulfill::Fulfill::prove
at src/solve/recursive/fulfill.rs:195
30: chalk::solve::recursive::fulfill::Fulfill::fulfill
at src/solve/recursive/fulfill.rs:290
31: chalk::solve::recursive::fulfill::Fulfill::solve
at src/solve/recursive/fulfill.rs:337
32: chalk::solve::recursive::Solver::solve_via_implication
at src/solve/recursive/mod.rs:376
33: chalk::solve::recursive::Solver::solve_from_clauses
at src/solve/recursive/mod.rs:337
34: chalk::solve::recursive::Solver::solve_new_subgoal
at src/solve/recursive/mod.rs:236
35: chalk::solve::recursive::Solver::solve_reduced_goal
at src/solve/recursive/mod.rs:162
36: chalk::solve::recursive::fulfill::Fulfill::prove
at src/solve/recursive/fulfill.rs:195
37: chalk::solve::recursive::fulfill::Fulfill::fulfill
at src/solve/recursive/fulfill.rs:290
38: chalk::solve::recursive::fulfill::Fulfill::solve
at src/solve/recursive/fulfill.rs:337
39: chalk::solve::recursive::Solver::solve_canonical_goal
at src/solve/recursive/mod.rs:108
40: chalk::solve::infer::InferenceTable::universe_of_unbound_var
at src/solve/recursive/mod.rs:96
41: chalk::solve::infer::unify::OccursCheck::new
at src/solve/mod.rs:228
42: chalki::goal
at src/bin/chalki.rs:231
43: chalki::run::{{closure}}
at src/bin/chalki.rs:125
44: chalk::ir::tls::set_current_program::{{closure}}
at ./src/ir/tls.rs:23
45: <std::thread::local::LocalKey<T>>::try_with
at /Users/travis/build/rust-lang/rust/src/libstd/thread/local.rs:377
46: <std::thread::local::LocalKey<T>>::with
at /Users/travis/build/rust-lang/rust/src/libstd/thread/local.rs:288
47: chalk::ir::tls::set_current_program
at ./src/ir/tls.rs:21
48: chalki::run
at src/bin/chalki.rs:123
49: chalki::main
at ./<quick_main macros>:4
50: panic_unwind::dwarf::eh::read_encoded_pointer
at src/libpanic_unwind/lib.rs:99
51: <std::rand::reader::ReaderRng<R> as rand::Rng>::fill_bytes
at src/libstd/panicking.rs:459
at src/libstd/panic.rs:361
at src/libstd/rt.rs:59
52: chalki::main
We should add some tests for the scenario where there are two equally applicable impls (for marker traits). This ought to be supported by the codebase -- it should refuse to infer details from either one, but if there are multiple sets that match a single set of types, that's ok.
The following works in Rust, but not in Chalk:

```
trait DynSized {}
trait Sized where Self: DynSized {}
impl<T> DynSized for T where T: Sized {}

struct i32 {}
impl Sized for i32 {}
```

For the goal `i32: DynSized`, Chalk currently returns "No possible solution", but it should return "Unique".
Currently we're using a fairly naive selection process, but it'd be nice to prototype a decision tree -- eventually I'd like to compile everything down to bytecode.
Currently we check that when you instantiate a struct or trait you supply the right number of parameters, but we don't check that they have the right kind (i.e., type vs lifetime). I added an (ignored) test `check_struct_kinds` that demonstrates this. It'd be nice to fix this.

To fix it, one would go into `chalk-rust/src/lower/mod.rs` and look for places where we check the number of parameters, e.g.:

```rust
if args.len() != info.addl_parameter_kinds.len() {
    bail!("wrong number of parameters for associated type (expected {}, got {})",
          info.addl_parameter_kinds.len(), args.len())
}
```

and adapt those to also check that the kinds in `args` match the kinds of `info.addl_parameter_kinds` (in that particular case).
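A minimal sketch of the combined count-and-kind check, using simplified stand-in types (chalk's real lowering code works over its own `ParameterKind` and error machinery):

```rust
/// Simplified stand-in for the kind of a parameter.
#[derive(Copy, Clone, Debug, PartialEq, Eq)]
pub enum Kind {
    Ty,
    Lifetime,
}

/// Check both the number of parameters and, for each position, that the
/// supplied argument has the declared kind.
pub fn check_parameter_kinds(args: &[Kind], declared: &[Kind]) -> Result<(), String> {
    if args.len() != declared.len() {
        return Err(format!(
            "wrong number of parameters (expected {}, got {})",
            declared.len(),
            args.len()
        ));
    }
    for (i, (arg, decl)) in args.iter().zip(declared).enumerate() {
        if arg != decl {
            return Err(format!(
                "parameter {} has wrong kind (expected {:?}, got {:?})",
                i, decl, arg
            ));
        }
    }
    Ok(())
}
```

The point is simply that the existing length check grows a second loop comparing kinds position by position.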
With work on const generics underway in rustc, we'll want to make sure Chalk also supports them.
I've started looking into this here, starting off by extending `ParameterKind` and filling in things from there. I'll give an update when progress is a little further along.
A Prolog-ish interpreter written in Rust, intended perhaps for use in the compiler, but also for experimentation.
This makes it sound like a (relatively) general-purpose logic programming language, perhaps with a fairly Prolog-like syntax, that would be useful for the Rust compiler, and possibly unrelated purposes.
Looking at `libstd.chalk`, however, shows this is not the case, since its input is (a subset of) Rust syntax, and not some generic language. I'm not sure exactly how it should be described, but this should probably be clarified.
Related to #11, we don't really do anything clever with where clauses on traits/structs. We need to support "elaboration" -- ideally, a richer form than what the current compiler supports. In particular, we should be able to show that e.g. `T: Ord => T: PartialOrd` (supertrait) but also any other where-clause on traits, and in particular I'd like a better form of support around projections of associated types (e.g., `T: Foo<Bar = U>` should let us prove about `U` whatever we can prove about `<T as Foo>::Bar`).
Chalk lowering grew rather organically and is kind of a mess. It should be separated out into distinct modules and phases. There was some recent discussion from the WG-traits gitter channel laying out some of the problems and ideas.
The grammar needs to be extended to support bounds and (quantified) where clauses on associated types, like in:

```
trait Foo {
    type Item<'a, T>: Clone where Self: Sized;
}
```

The grammar is located in `parser.lalrpop`, and the AST in `ast.rs`.
When this is done, this item:
https://github.com/rust-lang-nursery/chalk/blob/eeb21829c1e847921261ad820b5b2ec35b649c76/chalk-parse/src/parser.lalrpop#L204-L207
won't be needed anymore, because we will be using `QuantifiedWhereClauses` everywhere.
We would also remove where clauses on associated type values because they are useless:
https://github.com/rust-lang-nursery/chalk/blob/eeb21829c1e847921261ad820b5b2ec35b649c76/chalk-parse/src/parser.lalrpop#L104-L111
Currently, a where clause of the form `where T: Foo<Item = U>` is translated to a single domain goal `ProjectionEq(<T as Foo>::Item = U)`. We would need this clause to be translated to two distinct domain goals:

- `ProjectionEq(<T as Foo>::Item = U)`, as before
- `Implemented(T: Foo)`
Currently, we cannot translate one AST where clause to multiple IR domain goals. This means that the following trait:
https://github.com/rust-lang-nursery/chalk/blob/eeb21829c1e847921261ad820b5b2ec35b649c76/src/lower/mod.rs#L437-L439
should be changed into something like:

```rust
trait LowerWhereClause<T> {
    fn lower(&self, env: &Env) -> Result<Vec<T>>;
}
```
Now we should be in a position where we can implement the various rules written in this comment (niko's comment just below can be ignored, it was basically merged into the main one). This includes:
Note: the organization of these two files will change after #114 is fixed, so the preceding text may become obsolete.
At that point, the domain goal `WellFormed(ProjectionEq)` will be unused. See #115 for the next steps.
cc @rust-lang-nursery/wg-traits
Of course, it may be best to just re-implement this code in the compiler, but it's also worth considering if we can abstract away the current IR using traits to allow for integration into the compiler.
So given:

```
trait Foo {
    type Item;
}
```
then the goal `forall<T> { if (T: Foo, InScope(Foo)) { T: Foo<Item = T::Item> } }`, or equivalently `forall<T> { if (T: Foo, InScope(Foo)) { <T as Foo>::Item = T::Item } }`, answers `No possible solution`.

It seems like a "fallback" rule (i.e. one which would normalize `<T as Foo>::Item` to `T::Item`) is missing.
The following items are on hold until I learn more about logic and types. This thing is still like alien technology for me.

`#![warn(missing_docs)]`.

It seems like it's time to get rid of Chalk's "recursive" solver and commit to the on-demand SLG solver. It's basically strictly better at this point.
The recursive solver code is found here:
https://github.com/rust-lang-nursery/chalk/tree/master/src/solve/recursive
We'll also want to remove the `SolverChoice` variant, which will then entail modifying the unit tests. They mostly run on both solvers, but sometimes the tests have distinct results for the recursive vs SLG solvers. And of course we'll have to modify `chalki`, the command-line interpreter, too. We can just remove the relevant arm, though we may just want to remove the whole `--solver` option.
Right now the `ir::Program` struct contains a lot of fields that encode the lowered Rust program:
```rust
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct Program {
    /// From type-name to item-id. Used during lowering only.
    pub type_ids: HashMap<Identifier, ItemId>,

    /// For each struct/trait:
    pub type_kinds: HashMap<ItemId, TypeKind>,

    /// For each impl:
    pub impl_data: HashMap<ItemId, ImplDatum>,

    /// For each trait:
    pub trait_data: HashMap<ItemId, TraitDatum>,

    /// For each trait:
    pub associated_ty_data: HashMap<ItemId, AssociatedTyDatum>,

    /// Compiled forms of the above:
    pub program_clauses: Vec<ProgramClause>,
}
```
Since the trait solving routines in `solve` get access to an `Arc<Program>`, this suggests that they need all of this information. But in fact they do not. The goal is that they only need `program_clauses`, although it may be useful to have a bit more (basically, it depends a bit on how much code we lower to explicit program clauses and how much we choose to keep in Rust code).

However, these fields are used during lowering. And in particular we lower the input program and then, in the unit tests, save that state and come back to lower various goals.
I think we should introduce a new struct, let's call it `SharedEnvironment` or `ProgramEnvironment`, which is used by the solving routines and contains only those fields needed by solving. Looking through the code, that would be two fields:

- `program_clauses`
- `trait_data` -- this one is needed because of our current approach to #12, which I am not happy with

So we could refactor `Program` like so:
```rust
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct Program {
    /// From type-name to item-id. Used during lowering only.
    pub type_ids: HashMap<Identifier, ItemId>,

    /// For each struct/trait:
    pub type_kinds: HashMap<ItemId, TypeKind>,

    /// For each impl:
    pub impl_data: HashMap<ItemId, ImplDatum>,

    /// For each trait:
    pub associated_ty_data: HashMap<ItemId, AssociatedTyDatum>,

    /// Data used by the solver routines:
    pub program_env: Arc<ProgramEnvironment>,
}

#[derive(Clone, Debug, PartialEq, Eq)]
pub struct ProgramEnvironment {
    /// For each trait:
    pub trait_data: HashMap<ItemId, TraitDatum>,

    /// Compiled forms of the above:
    pub program_clauses: Vec<ProgramClause>,
}
```
Here I moved the `trait_data` field into the program environment. Another option would be to duplicate it. In that case, I suspect that `Program` would not need a link to the program environment at all; instead, lowering could return a pair of `(Program, ProgramEnvironment)`.
We would then also change the `solve` routines to take an `Arc<ProgramEnvironment>` instead of an `Arc<Program>`.
I'm tagging this as help-wanted since it seems like a relatively good "intro bug".
After GATs are implemented, and if everything goes well, we won't need the `WellFormed(WhereClauseAtom::ProjectionEq)` abstraction, defeating the initial purpose of the `WhereClauseAtom` abstraction.

I propose that we instead re-use `WhereClauseAtom` to include every where clause that can effectively be written by a Rust programmer, e.g. it should include:

- `Implemented(T: Foo)`
- `ProjectionEq(<T as Foo>::Item = U)`
- `T: 'a` (not implemented for the moment)
- `'a: 'b`
Then, the various IR constructs which represent real Rust code, like:

https://github.com/rust-lang-nursery/chalk/blob/eeb21829c1e847921261ad820b5b2ec35b649c76/src/ir/mod.rs#L202-L207

should carry where clauses of the type `Binders<WhereClauseAtom>` (which could be aliased to `QuantifiedWhereClauseAtom`) instead of the current type `QuantifiedDomainGoal`.
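A sketch of what the proposed types could look like. Everything here is illustrative (simplified to strings); the real chalk types would carry proper type/trait references:

```rust
/// Sketch of a `WhereClauseAtom` covering exactly the where clauses a
/// Rust programmer can write; variant names illustrative.
#[derive(Clone, Debug, PartialEq, Eq)]
pub enum WhereClauseAtom {
    /// Implemented(T: Foo)
    Implemented { ty: String, trait_name: String },
    /// ProjectionEq(<T as Foo>::Item = U)
    ProjectionEq { projection: String, ty: String },
    /// T: 'a
    TyOutlives { ty: String, lifetime: String },
    /// 'a: 'b
    LifetimeOutlives { a: String, b: String },
}

/// `Binders<T>`, simplified to a count of bound variables plus a value.
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct Binders<T> {
    pub num_binders: usize,
    pub value: T,
}

/// The proposed alias.
pub type QuantifiedWhereClauseAtom = Binders<WhereClauseAtom>;
```

Because the enum now contains only programmer-writable clauses, matches over it no longer need catch-all arms for `WellFormed` and friends.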
This will enable rewriting some `match` blocks where we used to do things like:

```rust
WellFormed(..) | ... => panic!("unexpected"),
```

or

```rust
WellFormed(..) | ... => (),
```

e.g. like in:

https://github.com/rust-lang-nursery/chalk/blob/eeb21829c1e847921261ad820b5b2ec35b649c76/src/lower/wf.rs#L131-L147

Bikeshedding: we could also just use the name `WhereClause` instead of `WhereClauseAtom`, like in the old days.
Another question left open: should we merge the current `WellFormed` / `FromEnv` predicates -- which would then deal only with trait references -- with `WellFormedTy` / `FromEnvTy`?
cc @rust-lang-nursery/wg-traits
Extend the code with the ability to do a coherence check. I'm not exactly sure what this should look like. To cover the orphan check, we'd have to include a knowledge of crates, but I think I'm more interested in examining the overlap check. That seems to imply a certain amount of negative reasoning (though that also includes some knowledge of crates).
Given the following traits:

```
trait Bar { }
trait Foo { type Item<T>: Bar; }
```

and the following goal:

```
forall<T, U> {
    if (T: Foo) {
        <T as Foo>::Item<U>: Bar
    }
}
```

we give `No solution`, whereas this should be `Unique`. Note that if we remove the generic parameter from `Item`, this works fine.
Basically, at some point the projection type `<T as Foo>::Item<U>` fails to unify with the skolemized type, as we can see in the debug log:
```
| resolvent_clause(
:   goal=ProjectionEq(<!1 as Foo>::Item<!2> = (Foo::Item)<?0, !1>),
:   clause=ForAll(for<type, type> ProgramClauseImplication { consequence: ProjectionEq(<?1 as Foo>::Item<?0> = (Foo::Item)<?0, ?1>), conditions: [] })) {
:   | instantiate(arg=ProgramClauseImplication { consequence: ProjectionEq(<?1 as Foo>::Item<?0> = (Foo::Item)<?0, ?1>), conditions: [] })
:   | new_variable: var=?1 ui=U2
:   | new_variable: var=?2 ui=U2
:   | instantiate: vars=[?1, ?2]
:   | consequence = ProjectionEq(<?2 as Foo>::Item<?1> = (Foo::Item)<?1, ?2>)
:   | conditions = []
:   | unify(a=ProjectionEq(<!1 as Foo>::Item<!2> = (Foo::Item)<?0, !1>),
:   :       b=ProjectionEq(<?2 as Foo>::Item<?1> = (Foo::Item)<?1, ?2>)) {
:   :   | unify_ty_ty(a=!2,
:   :   :             b=?1) {
:   :   :   | unify_var_ty(var=?1, ty=!2)
:   :   :   | unify_var_ty: var ?1 set to !2
:   :   | }
:   :   | unify_ty_ty(a=!1,
:   :   :             b=?2) {
:   :   :   | unify_var_ty(var=?2, ty=!1)
:   :   :   | unify_var_ty: var ?2 set to !1
:   :   | }
:   :   | unify_ty_ty(a=(Foo::Item)<?0, !1>,
:   :   :             b=(Foo::Item)<?1, ?2>) {
:   :   :   | unify_ty_ty(a=?0,
:   :   :   :             b=!2) {
:   :   :   :   | unify_var_ty(var=?0, ty=!2)
:   :   :   | }
:   :   | }
:   | }
| }
```
Note that the following answers `Unique` as expected:

```
forall<T, U, V> {
    if (T: Foo<Item<U> = V>) {
        V: Bar
    }
}
```
Introduce support for specialization between impls. Probably best to try and implement the current RFC before considering variations.
One interesting question is how the occurs check interacts with lazy normalization. Consider the inference unit test `projection_eq`. It attempts to solve an equation like `exists(A -> A = Item0<<A as Item1>::foo>)`, which is currently rejected. In some sense, this is obviously an infinitely sized type, and indeed `<A as Item1>::foo` might normalize to `A`. But it could also normalize to `u32`, in which case normalization would make the type finite in size.

Perhaps the best answer is to substitute a new variable (and a new normalization obligation), so that after processing that result, `A` would be bound to `A = Item0<B>` where `<A as Item1>::foo == B`. If `B` wound up including `A`, then the occurs check would fail at that point.

(We just have to be wary of this recurring infinitely...)
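The key idea can be illustrated with a toy occurs check over a simplified type grammar. This is a sketch, not chalk's actual unifier: the point is that projections are not recursed into eagerly, since they might normalize the would-be cycle away; the caller would instead substitute a fresh variable plus a normalization obligation:

```rust
/// Toy type grammar: inference variables, type applications, projections.
#[derive(Clone, Debug, PartialEq, Eq)]
pub enum Ty {
    Var(usize),                  // inference variable ?N
    Apply(String, Vec<Ty>),      // e.g. Item0<...>
    Projection(String, Box<Ty>), // e.g. <A as Item1>::foo
}

/// Returns true if variable `var` occurs in `ty` *outside* of any
/// projection. Projections are deferred: they may normalize to a type
/// that does not mention `var`, so they do not count as occurrences yet.
pub fn occurs(var: usize, ty: &Ty) -> bool {
    match ty {
        Ty::Var(v) => *v == var,
        Ty::Apply(_, args) => args.iter().any(|t| occurs(var, t)),
        Ty::Projection(..) => false,
    }
}
```

On `A = Item0<<A as Item1>::foo>`, this check would not trip; only once the normalization obligation resolved the projection to something containing `A` would the cycle be detected.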
Given the traits/types found in this gist:
https://gist.github.com/nikomatsakis/58717b0588393ca14f2d6168e7c86188
Why do we get ambiguity here?
```
> target/debug/chalki --program=$HOME/tmp/futures-model.chalk --goal='forall<T> { if (T: FutureResult) { exists<I, E> { T: Future<Output = Result<I, E>> } } }'
Ambiguous; no inference guidance
```

Note that if we (hackily) don't force the solver to give us back a result for `I` and `E`, we get `Unique` as expected:

```
> target/debug/chalki --program=$HOME/tmp/futures-model.chalk --goal='forall<T> { if (T: FutureResult) { T: Future, exists<I, E> { T: Future<Output = Result<I, E>> } } }'
Unique; substitution [], lifetime constraints []
```

cc @scalexm, who wanted to investigate this
The current region constraints aren't quite expressive enough and have some flaws. After chatting some with @aturon, the current plan is to shoot for the following structure. First off, chalk is implementing a function `Solve` that has roughly this structure:

```
Solve(Env, Goal) = LifetimeGoal
```

A successful result is meant to imply that (in purely logical terms):

```
Env, LifetimeGoal |- Goal
```

This should always be achievable, since in the limit we can make `LifetimeGoal` be `false`. Note that we have to be careful around "negation-as-failure" style negative reasoning here, but factoring out the imprecision into `LifetimeGoal` actually helps.
What, then, is the grammar of our `LifetimeGoal`? I want to first lay out the complete grammar. I think we can then imagine an approximation function `approx(LG) = LG'` that e.g. eliminates disjunctions:

```
LG = for<'a...'z> LG
   | exists<'a...'z> LG
   | if(LC) LG
   | LG, LG
   | LG; LG
   | LA

LC = LC, LC
   | for<'a...'z> LC
   | if(LG) LC
   | LA

LA = 'a: 'b
   | 'a = 'b // clearly not a fundamental goal but convenient
```
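The grammar above transcribes directly into mutually recursive data types; a hypothetical rendering, with lifetime names as strings and binders simplified to a count:

```rust
/// LG: lifetime goals.
#[derive(Clone, Debug, PartialEq, Eq)]
pub enum LifetimeGoal {
    ForAll(usize, Box<LifetimeGoal>),           // for<'a...'z> LG
    Exists(usize, Box<LifetimeGoal>),           // exists<'a...'z> LG
    If(Box<LifetimeClause>, Box<LifetimeGoal>), // if(LC) LG
    And(Box<LifetimeGoal>, Box<LifetimeGoal>),  // LG, LG
    Or(Box<LifetimeGoal>, Box<LifetimeGoal>),   // LG; LG
    Atom(LifetimeAtom),                         // LA
}

/// LC: lifetime clauses (note: no disjunction, no exists).
#[derive(Clone, Debug, PartialEq, Eq)]
pub enum LifetimeClause {
    And(Box<LifetimeClause>, Box<LifetimeClause>), // LC, LC
    ForAll(usize, Box<LifetimeClause>),            // for<'a...'z> LC
    If(Box<LifetimeGoal>, Box<LifetimeClause>),    // if(LG) LC
    Atom(LifetimeAtom),                            // LA
}

/// LA: atomic lifetime relations.
#[derive(Clone, Debug, PartialEq, Eq)]
pub enum LifetimeAtom {
    Outlives(String, String), // 'a: 'b
    Eq(String, String),       // 'a = 'b
}
```

An `approx` function would then be a fold over `LifetimeGoal` that, for example, conservatively replaces `Or` nodes.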
In some cases, when doing unification, we'll need to generate new region variables at outer scopes. Some examples:

```
exists<A> for<'l> A = &'l T
```

The idea was to transform this by first introducing a variable `'x` in the same universe as `A`:

```
exists<'x> exists<A> for<'l> A = &'l T
```

then saying that `A = &'x T`, with the resulting goal of:

```
for<'l> 'x = 'l
```

More generally, we could capture whatever environment is "in scope" and insert it into the goal. This seems to imply that we need (for the lifetime clauses) as rich a grammar as our environment supports.
Next example:

```
for<'a> exists<T> for<'b> if('a = 'b) T = &'b i32
```

To resolve this, we would introduce an existential in the same universe as `T`:

```
for<'a> exists<'x> exists<T> for<'b> if('a = 'b) T = &'b i32
```

which then allows us to say that `T = &'x i32`, with the (ultimate) lifetime goal:

```
for<'a> exists<'x> for<'b> if('a = 'b) 'b = 'x
```
The current coherence rules do not have any special cases for "marker traits" -- that is, traits with zero items. As discussed in #8, at least for now it'd be nice to have "marker-ness" be declared explicitly with `#[marker]`.

Here are some steps to do this. We want to model this on the support for `#[auto]` trait declarations. This means we would:

- extend the parser to accept the `#[marker]` attribute (like `auto`)
- add a boolean field to the `TraitDatumBound` struct (again, like `auto`)
- in the `visit_specializations` function, we can basically just ignore marker traits, I believe

It might be nice, rather than just adding another boolean field, to introduce some sort of `TraitFlags` type (e.g., `struct TraitFlags { auto: bool, marker: bool }`) into the AST and copy that over into `ir::TraitDatumBound`. Or at least I don't love having a bunch of random boolean variables floating around.
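A minimal sketch of the `TraitFlags` idea, with an illustrative stand-in for `ir::TraitDatumBound` (the method name is an assumption made for the example):

```rust
/// Grouped trait attributes, instead of loose boolean fields.
#[derive(Copy, Clone, Debug, Default, PartialEq, Eq)]
pub struct TraitFlags {
    pub auto: bool,
    pub marker: bool,
}

/// Stand-in for `ir::TraitDatumBound` carrying the flags struct.
#[derive(Clone, Debug)]
pub struct TraitDatumBound {
    pub name: String,
    pub flags: TraitFlags,
}

impl TraitDatumBound {
    /// Per the steps above, marker traits would simply be skipped when
    /// visiting specializations.
    pub fn skip_in_specialization_check(&self) -> bool {
        self.flags.marker
    }
}
```

Adding a future attribute then means adding one field to `TraitFlags` rather than threading another boolean through the AST and IR.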
In working on #73, I encountered a problem with the concept of a "fallback clause", which is currently a key part of how we handle normalization. The idea of a fallback clause is that it is a clause that we use if no other clauses apply. That seems fine but it's actually too weak: we wind up coming up with "unique" solutions that are not in fact unique.
Consider this example:
trait Iterator { type Item; }
struct Foo { }
struct Vec<T> { }
and this goal:
forall<T> {
    if (T: Iterator) {
        exists<U> {
            exists<V> {
                U: Iterator<Item = V>
            }
        }
    }
}
Here, there are two values for U and V:
exists<W> { U = Vec<W>, V = W }
U = T, V = T::Item
However, our current system will select the first one and be satisfied. This is because the second one is considered a "fallback" option, and hence since the first one is satisfied, it never gets considered. This is not true of the SLG solver, since I never could figure out how to integrate fallback into that solver -- for good reasons, I think.
I have a branch de-fallback that addresses this problem by removing the notion of fallback clauses. Instead, we have a new domain goal, "projection equality", which replaces normalization in a way (though normalization, as we will see, plays a role). When we attempt to unify two types T and U where at least one of those types is a projection, we "rewrite" to projection equality (really, we could rewrite all of unification into clauses, but I chose the more limited path here, since otherwise we'd have to handle higher-ranked types and substitution in the logic code).
Projection equality is defined by two clauses per associated item, which are defined when lowering the trait (well, when lowering the declaration of the associated item found in the trait). The first clause is what we used to call the "fallback" rule, basically covering the "skolemized" case:
forall<T> {
ProjectionEq(<T as Iterator>::Item = (Iterator::Item)<T>)
}
The second clause uses a revised concept of normalization. Normalization in this setup is limited to applying an impl to rewrite a projection to the type found in the impl (whereas before it included the fallback case):
forall<T, U> {
ProjectionEq(<T as Iterator>::Item = U) :-
Normalizes(<T as Iterator>::Item -> U)
}
Both of these rules are created when lowering the trait. When lowering the impl, we would make a rule like:
forall<T> {
Normalizes(<Vec<T> as Iterator>::Item -> T) :-
(Vec<T>: Iterator).
}
This all seems to work pretty well. Note that ProjectionEq
can never "guess" the right-hand side unless normalization is impossible: that is, exists<X> { ProjectionEq(<Vec<i32> as Iterator>::Item = X) }
is still ambiguous. But if you want to force normalize, you can use the Normalizes
relation (which would only be defined, in that example, when X = i32
).
However, the tests are currently failing because we are running into problems with the implied bounds elaborations and the limitations of the recursive solver. (The SLG solver handles it fine.) In particular, there is a rule that looks something like this:
forall<T, U> {
(T: Iterator) :-
ProjectionEq(<T as Iterator>::Item = U)
}
This is basically saying, if we know (somehow) that T::Item
is equal to U
, then we know that T
must implement Iterator
. It's reasonable, but it winds up causing us a problem. This is because, in the recursive solver, when we are doing normalization, we apply the normalization clause cited above, and then must prove that (Vec<T>: Iterator)
. To do that, we turn to our full set of clauses, which includes a clause from the impl (which is the right one) and the clause above. The reverse elaboration clause always yields ambiguous -- this is because there is no unique answer to ProjectionEq
, and U
is unconstrained.
I'm not 100% sure how to resolve this problem right now, so I thought I'd open the issue for a bit of discussion.
Deref is not well integrated into typeck; all derefs must be eagerly checked, causing problems even in a simple program such as:
use std::sync::Arc;
fn main() {
let mut a = Default::default();
// Works if the lines are swapped.
let b = *a;
a = Arc::new(0);
}
The type of a
will not be inferred because derefs error if they find a type variable. There are probably good reasons why this is difficult to support, but hopefully chalk has a better chance at supporting derefs as obligations. I'll start by implementing Deref as a domain goal.
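As a very rough sketch of what such a domain goal might look like (the Deref(Source -> Target) predicate and its syntax are invented here, not something chalk implements), the lowering of a Deref impl might produce a clause along these lines:

```
// Hypothetical clause generated from `impl<T> Deref for Arc<T> { type Target = T; }`
forall<T> { Deref(Arc<T> -> T) }

// Built-in derefs of references could be expressed the same way:
forall<'a, T> { Deref(&'a T -> T) }
```

A goal like Deref(?A -> ?B) could then simply remain an ambiguous pending obligation until ?A is resolved, rather than erroring eagerly as today's typeck does.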
The on-demand solver, when faced with an X-clause, currently always selects the "last" literal in the list. This is pretty silly. It would be smarter to look for literals that are closer to being solved, and maybe make some heuristics that look at the traits involved.
I think it might be useful to have a rough idea of the "inputs" and "outputs" of given domain goals (many Prologs have mode annotations of this kind; I've not looked deeply at how they work). So for example, given this set of pending goals:
NormalizeTo(<T as Bar>::Item -> ?U)
?U: Foo
It would be better to start with the normalization, since otherwise we would wind up enumerating all type variables that implement Foo
.
In fact, targeting this scenario is the primary rule we need, I think, given Rust's limitations around constrained type parameters on impls. (Rustc has some similar logic already, actually.)
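As a toy sketch of such a selection heuristic (all names here are invented; chalk's X-clauses are rendered as plain strings for illustration): prefer the pending literal that is most determined, approximated below by the density of inference variables ('?') in its rendering.

```rust
// Pick the pending literal closest to being solved, instead of always
// taking the last one in the list.
fn pick_literal<'a>(pending: &[&'a str]) -> &'a str {
    pending
        .iter()
        .copied()
        // lower score = fewer '?' per character = more determined
        .min_by_key(|lit| lit.matches('?').count() * 100 / lit.len())
        .expect("at least one pending literal")
}

fn main() {
    let pending = ["?U: Foo", "NormalizeTo(<T as Bar>::Item -> ?U)"];
    // The normalization goal is mostly concrete, so it is selected first,
    // avoiding an open-ended enumeration of all types implementing Foo.
    assert_eq!(pick_literal(&pending), "NormalizeTo(<T as Bar>::Item -> ?U)");
}
```

A real implementation would of course inspect the literal structurally (counting unresolved inference variables against the traits involved) rather than scanning a string, but the ordering principle is the same.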
A description of what coercions do can be found in the reference and the nomicon.
In many places Rust only requires that one type coerces to another, but there is no way to explain that to the inference engine, which only understands type equality. So if there are multiple types being coerced to a single target type, the current rustc implementation will sometimes greedily unify the first type it sees.
First, a trivial example: in let mut a = &&0; a = &0; the type of a should be inferred as &i32, not &&i32. It's the same with generic functions:
fn foo<T>(_: T, _: T) {}
fn main() {
let x: u8 = 0;
foo(&&x, &x); // Fails, `T = &&u8`.
foo(&x, &&x); // Works, `T = &u8`.
}
And closures, here with &mut
to &
coercions:
fn main() {
let mut a = 0;
let b = 0;
let c = |x| {};
// Fails because c is inferred to Fn(&mut i32),
// but works if lines below swapped and c is inferred to Fn(&i32).
c(&mut a);
c(&b);
}
For some cases rustc already improves on this with special treatment for coercing many expressions to a single target type (see CoerceMany
in coercion.rs
for details), so for example both if true { &0 } else { &&0 }
and if true { &&0 } else { &0 }
will work with resulting type &i32
.
If we can model a "T
coerces to U
" relationship in chalk, it would be a major step towards improving the situation and integrating coercions within the inference engine. Before diving into formulations for Chalk, I'll try to put together a description of how the current rustc code handles this in coercion.rs.
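As an entirely hypothetical sketch of what such a relation could look like as chalk clauses (the Coerce(T -> U) predicate is invented here, not part of chalk):

```
forall<T> { Coerce(T -> T) }                  // coercion is reflexive
forall<'a, T> { Coerce(&'a mut T -> &'a T) }  // &mut T coerces to &T
```

With something like this, the inference engine could accumulate Coerce obligations instead of eagerly unifying, deferring the choice between, say, &&u8 and &u8 until all the relevant coercion sites have been seen.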
I've read that chalk will be used internally by rustc for many things, but I wonder if there is any plan for adding to Rust (not only MIR) full or partial support for the logic programming paradigm?
https://en.wikipedia.org/wiki/Logic_programming?wprov=sfla1
I feel like this paradigm offers many possibilities (AI, etc.), but it is not widely used because there is no great, fast language that supports it.
Disclaimer: I may be talking nonsense.
In @aturon's epic blog post Negative reasoning in chalk, he described using modal logic to model coherence. This seems like a great idea, but we've not done it in chalk. Let's get it done!
I guess the first thing to narrow down is just what we want. The overall goal is that certain kinds of negative queries fail. This is roughly corresponding to the logic described in RFC 1023, though we should also consider how the code works, as there've been a few bug fixes and things in the meantime (and the definition of "orphans" is a bit tricky too). But -- intuitively -- the goal is to limit reasoning about types/traits external to your crate. So e.g. if we ask String: Iterator
-- and we are not compiling libstd :) -- this would come back as "maybe", reflecting the idea that maybe we'll add an impl like that in the future.
I'll leave it here and put the rest in comments.
Currently, when asked to solve a goal, the on-demand solver requests answers until the aggregate result becomes "trivial" (i.e., no information is gained). This occurs in make_aggregate:
Notably right here:
This is the key to e.g. why the only_demand_so_many
test only, well, demands "so many" answers. But the current check is not that smart.
For example, the goal exists<T> { T: Foo }
on this input will still enumerate a ton of answers:
trait Sized { }
trait Foo { }
struct Vec<T> { }
impl<T> Sized for Vec<T> where T: Sized { }
impl<T> Foo for Vec<T> where T: Sized { }
struct i32 { }
impl Sized for i32 { }
struct Slice<T> { }
impl<T> Sized for Slice<T> where T: Sized { }
The problem here is that the stream of answers looks like:
Vec<i32>
Vec<Vec<i32>>
Vec<Slice<i32>>
Vec<Vec<Vec<i32>>>
Hence our aggregate suggestion will quickly reach Vec<_>
and stay there. It will never become truly trivial.
But we can easily test this! We could for example, before requesting the next solution, check whether there exist any strands that could violate the current aggregate (which would be Vec<_>
). If not, we don't need to request the next answer, it won't give us anything new we've not already seen.
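The aggregation step itself is essentially anti-unification. A toy sketch of the idea (the Term type and helper names are invented stand-ins, not chalk's representation): once the aggregate stabilizes at e.g. Vec<_>, any further answer of that shape cannot change it, so there is no point requesting more.

```rust
// Most specific generalization of two answers: keep structure where they
// agree, generalize to `_` where they differ.
#[derive(Clone, Debug, PartialEq)]
enum Term {
    Wildcard,               // the `_` in Vec<_>
    App(String, Vec<Term>), // e.g. Vec<i32> = App("Vec", [App("i32", [])])
}

fn app(name: &str, args: Vec<Term>) -> Term {
    Term::App(name.to_string(), args)
}

fn aggregate(a: &Term, b: &Term) -> Term {
    match (a, b) {
        (Term::App(f, xs), Term::App(g, ys)) if f == g && xs.len() == ys.len() => {
            let args = xs.iter().zip(ys).map(|(x, y)| aggregate(x, y)).collect();
            Term::App(f.clone(), args)
        }
        _ => Term::Wildcard, // different heads: generalize away
    }
}

fn main() {
    let vec_i32 = app("Vec", vec![app("i32", vec![])]);
    let vec_vec_i32 = app("Vec", vec![app("Vec", vec![app("i32", vec![])])]);
    // Aggregating the first two answers already yields Vec<_>...
    let agg = aggregate(&vec_i32, &vec_vec_i32);
    assert_eq!(agg, app("Vec", vec![Term::Wildcard]));
    // ...and any further Vec<..> answer leaves it unchanged, so the solver
    // could stop requesting answers here.
    assert_eq!(aggregate(&agg, &vec_vec_i32), agg);
}
```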
One thing that the on-demand solver does not support yet is simplification of delayed literals. In the case of negative loops we sometimes have to delay processing a given literal, and simplification is what comes back in the end to clean up the mess.
It's not clear how important this is: it can only arise with negative cycles, which are not yet possible in Rust since negative clauses never appear in goals, and may never be possible (the fact that we would have to deal with negative cycles is a good argument against adding such goals).
My strategy for doing it was going to be sort of crude: when we encounter an answer with a delayed literal (other than CannotProve, which can never be simplified), we have to first "force complete" the table. This means that we run until all strands are done, and then we further go over all the answers, find all the delayed literals we may need to simplify, and "force complete" their tables (this process may recurse, of course). Once this is done, we can run the simplification procedure (here is the simplification code from the eager SLG solver, for reference, which includes a writeup).
This is roughly where we would want to make this change.
The occurs check code in src/solve/infer/unify.rs
(specifically, the methods defined on OccursCheck
) follows a very folder-like pattern. Unfortunately, it can't quite use the Folder
trait (as defined in fold.rs
) because that trait only contains "callback methods" for processing free lifetime/type/krate variables. The occurs check needs to be able to intercept all types, at least, as well as names that live in a universe and a few other things. We could add callback methods for those things to Folder
.
I imagine the resulting trait would look like:
pub trait Folder {
    // Methods today:
    fn fold_free_var(&mut self, depth: usize, binders: usize) -> Result<Ty>;
    fn fold_free_lifetime_var(&mut self, depth: usize, binders: usize) -> Result<Lifetime>;
    fn fold_free_krate_var(&mut self, depth: usize, binders: usize) -> Result<Krate>;

    // Methods we need:
    fn fold_ty(&mut self, ty: &Ty, binders: usize) -> Result<Ty>;
    fn fold_lifetime(&mut self, lifetime: &Lifetime, binders: usize) -> Result<Lifetime>;
}
One thing that I'm not crazy about is that fold_ty
invokes fold_free_var
etc, at least in its default incarnation, so putting them into the same trait might mean that if you override fold_ty
you would not need the other methods. So it might be better to have Folder
just include the higher-level methods (e.g., fold_ty
) and then have a new trait VarFolder
where we define a bridge impl like impl<T: VarFolder> Folder for T
. Or something like that.
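A compilable sketch of that split, with a simplified stand-in Ty (the real chalk types and the Result-returning signatures are elided for brevity): Folder keeps only the high-level method, VarFolder keeps the per-variable callback, and a blanket impl bridges the two.

```rust
#[derive(Clone, Debug, PartialEq)]
enum Ty {
    Var(usize),   // free variable, identified by depth
    Ref(Box<Ty>), // a type constructor, so there is structure to traverse
}

trait Folder {
    fn fold_ty(&mut self, ty: &Ty, binders: usize) -> Ty;
}

trait VarFolder {
    fn fold_free_var(&mut self, depth: usize, binders: usize) -> Ty;
}

// The bridge impl: every VarFolder gets the structural traversal for free,
// so implementors only write the variable callback.
impl<T: VarFolder> Folder for T {
    fn fold_ty(&mut self, ty: &Ty, binders: usize) -> Ty {
        match ty {
            Ty::Var(depth) => self.fold_free_var(*depth, binders),
            Ty::Ref(inner) => Ty::Ref(Box::new(self.fold_ty(inner, binders))),
        }
    }
}

// Example VarFolder: shift every free variable up by one.
struct Shifter;
impl VarFolder for Shifter {
    fn fold_free_var(&mut self, depth: usize, _binders: usize) -> Ty {
        Ty::Var(depth + 1)
    }
}

fn main() {
    let ty = Ty::Ref(Box::new(Ty::Var(0)));
    let mut shifter = Shifter;
    assert_eq!(shifter.fold_ty(&ty, 0), Ty::Ref(Box::new(Ty::Var(1))));
}
```

Something like the occurs check, which needs to intercept whole types, would instead implement Folder directly and bypass the bridge.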
I know this is supposed to be more-or-less a typesystem for rust.
I know this is a Prolog-like language.
I know how to use Prolog decently well.
I want to help.
I especially want to help write documentation.
But:
- The --help option mentions a .chalk file, but there's no explanation anywhere of how that file is formatted, what it contains, or anything.
- There are no example .chalk files anywhere.
- There's nothing about generating .chalk files from Rust code or anything like that.
- There's no documentation on .chalk files.

Could someone give me some pointers, maybe a quick summary of what this project does and how to use it? I can totally start documenting this stuff, it looks really cool, I just need a place to start.
We can implement support for generic associated types / ATC in a relatively straightforward fashion.
Currently we treat cycles (i.e. seeing the same clause twice) as Ambiguous
. However, this is not desirable as in the presence of tautologies like (u8: Foo) :- (u8: Foo)
(which may appear while elaborating clauses à la #12), Chalk will answer Ambiguous
to the query u8: Foo
when u8
does not implement Foo
, whereas we would prefer an error.
We can't treat cycles as errors or else the following query:
trait Foo { }
struct S<T> { }
struct i32 { }
impl<T> Foo for S<T> where T: Foo { }
impl Foo for i32 { }
exists<T> { T: Foo }
would return Unique; substitution [?0 := i32]
whereas it should be ambiguous because there is an infinite family of solutions.
A possible strategy for handling cycles would be to use tabling in order to feed back the cycles and possibly produce new answers until we reach a fixed point, in which case we know that we have all the answers.
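As a toy illustration of the fixed-point idea (the fact i32: Foo and the rule S<T>: Foo :- T: Foo mirror the program above; the string encoding and the depth bound are artifacts of the sketch, not of real tabling):

```rust
use std::collections::HashSet;

// Feed derived answers back into the rules until no new answers appear.
// The depth bound makes this toy enumeration terminate; a real tabling
// solver would instead produce answers on demand.
fn enumerate_foo(depth_bound: usize) -> HashSet<String> {
    let mut answers: HashSet<String> = HashSet::new();
    answers.insert("i32".to_string()); // fact: i32 implements Foo
    loop {
        // rule: S<T>: Foo :- T: Foo
        let new: Vec<String> = answers
            .iter()
            .filter(|t| t.matches("S<").count() < depth_bound)
            .map(|t| format!("S<{}>", t))
            .filter(|t| !answers.contains(t))
            .collect();
        if new.is_empty() {
            break; // fixed point: no rule produces a new answer
        }
        answers.extend(new);
    }
    answers
}

fn main() {
    let answers = enumerate_foo(3);
    assert_eq!(answers.len(), 4); // i32, S<i32>, S<S<i32>>, S<S<S<i32>>>
    assert!(answers.contains("S<S<S<i32>>>"));
}
```

The key property is the termination condition: we only stop once an iteration derives nothing new, which is exactly when we know the answer set is complete (and hence when a query like exists<T> { T: Foo } should report ambiguity rather than a unique answer).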
Right now in Chalk we don't have support for types like T::Item
; we always require fully explicit types like <T as Iterator>::Item
. In particular, the trait is always known. So for example in the AST node, we know the trait. In the lowered IR, a projection references the id of a particular trait item, and hence implicitly specifies a particular trait.
The rules in Rust are a bit hackily implemented but the idea is that we should permit T::Item
only if there is a unique trait that could work, selected from the set of in-scope traits. I think the first step for this issue therefore is to model in-scope traits somehow. I'd probably want to do this with a new kind of DomainGoal
, something like InScope(ItemId)
, where ItemId
is the id of a trait. We would have to give this some syntax too (e.g., InScope(Iterator)
). Then we can model which traits are in scope by adding those facts into the environment (e.g., if (InScope(Iterator)) { .. }).
Here are the steps to add InScope:
- Extend the WhereClause struct to add something for InScope(...).
- Extend DomainGoal to include the new case InScope(ItemId), as described before.
- Extend the lowering code for DomainGoal to handle that case, as well as the lowering code that converts into a LeafGoal (which can just invoke the other case, like this arm does).
- Add tests showing that InScope(Foo) cannot be proven on its own, but if (InScope(Foo)) { InScope(Foo) } can.

Once that work is done, we can do the next half of using the in-scope facts to do something useful.
The results of Solver::solve() requests are intended to be cached, but we're not doing that yet.
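A hypothetical sketch of such a cache, keyed by the canonicalized goal (CanonicalGoal and Solution are invented stand-ins; caching is only sound if the key really is canonical, i.e. independent of inference-variable names):

```rust
use std::collections::HashMap;

type CanonicalGoal = String;

#[derive(Clone, Debug, PartialEq)]
struct Solution(String);

struct Solver {
    cache: HashMap<CanonicalGoal, Solution>,
}

impl Solver {
    fn new() -> Self {
        Solver { cache: HashMap::new() }
    }

    fn solve(&mut self, goal: &CanonicalGoal) -> Solution {
        if let Some(hit) = self.cache.get(goal) {
            return hit.clone(); // cache hit: reuse the earlier answer
        }
        let solution = self.solve_uncached(goal);
        self.cache.insert(goal.clone(), solution.clone());
        solution
    }

    // Placeholder for the real solving machinery.
    fn solve_uncached(&mut self, goal: &CanonicalGoal) -> Solution {
        Solution(format!("solved({})", goal))
    }
}

fn main() {
    let mut solver = Solver::new();
    let goal = "exists<T> { T: Foo }".to_string();
    let first = solver.solve(&goal);
    let second = solver.solve(&goal); // served from the cache
    assert_eq!(first, second);
    assert_eq!(solver.cache.len(), 1);
}
```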
Much of the plumbing for lifetime inference is there, but there is still work to do. In particular, we have to move the constraints
vector from infer
into fulfill
and then populate it during unification.