I noticed that the paper says:
We train our models to work with both PSM (Prefix-Suffix-Middle) and SPM (Suffix-Prefix-Middle) modes, with relevant formatting control tokens,
Have you noticed any difference in accuracy between PSM and SPM completion? I have seen PSM used before, but I like the idea of SPM. Since the model would be generating the middle as a direct continuation of the prefix, I wonder if it would produce higher quality completions than PSM. If there is no difference, that would be cool to know as well.
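For anyone else following along, here's a rough sketch of how the two orderings differ at the prompt level. The sentinel token names and exact layout below are just illustrative placeholders (not necessarily what this model's tokenizer actually uses), but they show why in SPM the generated middle follows straight on from the end of the prefix:

```python
def build_fim_prompt(prefix: str, suffix: str, mode: str = "PSM") -> str:
    """Assemble a fill-in-the-middle prompt in PSM or SPM order.

    NOTE: the sentinel tokens here are placeholders for illustration only;
    real models define their own control tokens and exact ordering.
    """
    PRE, SUF, MID = "<|fim_prefix|>", "<|fim_suffix|>", "<|fim_middle|>"

    if mode == "PSM":
        # Prefix-Suffix-Middle: model sees prefix, then suffix,
        # then generates the middle after the middle sentinel.
        return f"{PRE}{prefix}{SUF}{suffix}{MID}"
    if mode == "SPM":
        # Suffix-Prefix-Middle: suffix comes first, prefix last,
        # so the middle is generated as a direct continuation of the prefix.
        return f"{SUF}{suffix}{PRE}{prefix}{MID}"
    raise ValueError(f"unknown FIM mode: {mode}")


prefix = "def add(a, b):\n    "
suffix = "\n    return result\n"
print(build_fim_prompt(prefix, suffix, mode="SPM"))
```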
I'm still waiting on llama.cpp support to fully materialize before I can try out these models, but they look really nice!