Comments (36)
@neongreen, You have proposed many useful things, without these propositions Megaparsec would be different. I consider it significant contribution to the project. I would like to include you into the list of contributors, should I include you as "Artyom"?
from megaparsec.
@mrkkrp, yep (and thanks). Just “Artyom” would be ambiguous, but since we're on Github anyway, you could just add a link to my Github nick:
Artyom (@neongreen)
from megaparsec.
@neongreen, Can you point out where (<|>)
is defined for StateT
as:
m <|> n = StateT $ \ s -> runStateT m s <|> runStateT n s
?
from megaparsec.
@mrkkrp, here <|>
is defined as mplus
, and here mplus
is defined as follows:
m `mplus` n = StateT $ \ s -> runStateT m s `mplus` runStateT n s
So, yeah, the original definition uses mplus
instead of <|>
, but I didn't want to introduce extra baggage when pasting, so I rewrote it with <|>
.
from megaparsec.
@neongreen, I see. I will try to define mtl-style MonadParsec
(although in Megaparsec we have more low-level functions than in Parsec (parsers
is modeled after Parsec
I guess) because I decided that we could define some of them clearer on lower level).
I'll try to eliminate user state, so user will need to combine monads to get backtracking state. This should work. Changes will live in monad-transformer
branch until it's clear that everything is all right.
from megaparsec.
I will also make combinators more general, many of them can work with Applicative
and Alternative
instances, not necessarily with MonadParsec
instances.
from megaparsec.
I had to make other minor changes to make it work smoothly. Also, it seems we could put functions to get current state (parser state, position in input stream, etc) into the class too. This way most part of the library will be defined for any instance of MonadParsec
. Tests will need to be corrected too. This requires more work then it seemed.
from megaparsec.
OK, this is almost done. I want to re-work some tests for Text.Megaparsec.Prim
though. I think it's good idea to add tests that check backtracking abilities of StateT MyStateType Parser a
monad combination. Once it's done I'll organize and push my changes.
@neongreen, I'll ask you to try the improvements and see if Megaparsec is flexible enough now.
from megaparsec.
@neongreen, please try out code in monad-transformer
branch. I also ask you to write one or two tests that you deem realistic and practical to test backtracking state achieved by the method you mention. I suggest you first test this locally and tell me about results, then I merge this into master. Then you can open a pull request adding the tests.
from megaparsec.
Okay, here go some observations.
-
<?>
should be moved out of the class, I think. Alternatively, there should be aMINIMAL
pragma saying that only one of {label
,<?>
} is needed. Justification: I got bit by this when I was defining an instance ofMonadParsec
, forgot about bothlabel
and<?>
, GHC didn't warn me (because they both have default definitions), and as the result I spent 10m trying to find out why my parser was going in an endless loop. -
We need instances for
MonadParsec s (WriterT w m) c
and so on. Here's a sample instance forWriterT
:instance (Monoid w, MonadParsec s m c) => MonadParsec s (WriterT w m) c where unexpected x = lift $ unexpected x eof = lift $ eof token x y = lift $ token x y tokens x y z = lift $ tokens x y z getParserState = lift $ getParserState updateParserState f = lift $ updateParserState f try (WriterT m) = WriterT $ try m label x (WriterT m) = WriterT $ label x m lookAhead (WriterT m) = WriterT $ lookAhead m notFollowedBy (WriterT m) = WriterT $ notFollowedBy (fst <$> m) >>= \x -> return (x, mempty)
The same instance in
parsers
has{-# INLINE ... #-}
for all parsers, so maybe we should inline all the things too. I'll try to benchmark it. -
It's more annoying now to write generic parsers. Before:
whitespace :: Parser () whitespace = void (some (char ' '))
After:
whitespace :: MonadParsec s m Char => m () whitespace = void (some (char ' '))
Small as the difference may be, it might push people towards reusing less code than they used to, and it's bad. I'd like both
Text.Megaparsec.String
andText.Megaparsec.Text
to export aParsing
that would be respectivelyMonadParsec String m Char
andMonadParsec Text m Char
under the hood (after all, they already export differentParser
s that are named the same). This would make the code look like this:whitespace :: Parsing m => m () whitespace = ... inParens :: Parsing m => m a -> m a inParens = ...
from megaparsec.
Oh, and another observation: I had to write
warnParse :: WarnParser a -> SourceName -> Text -> Either ParseError (a, [String])
warnParse p src s = parse (runWriterT p) src s
and it feels like it's something that should've been done automatically, but I have no idea how to actually accomplish this.
from megaparsec.
I agree with points 1 and 2. As for 3, I think end user can define Parsing
type or something like that if he thinks it's necessary. Mostly I don't see any problem in MonadParsec s m Char => m ()
stuff.
I currently don't think it should be done automatically. You're trying to run code inside monad and usually if you have stack of monads you need to nest functions like parse
and runWriterT
accordingly. Some functions like runStateT
will need to accept additional arguments like initial state, so I don't think all these cases can be handled in uniform (non-hackish) way.
from megaparsec.
Some functions like
runStateT
will need to accept additional arguments like initial state, so I don't think all these cases can be handled in uniform (non-hackish) way.
Okay, you're right.
As for 3, I think end user can define
Parsing
type or something like that if he thinks it's necessary.
Sure, but at least in my case I'm probably going to be defining Parsing
literally every time I use Megaparsec with a custom stack, and unlike parse
it can be done generically in a non-hackish way. I guess GenParser
was included into original parsec
for the same reason. (By the way, am I right that GenParser
shouldn't be in the monad-transformer branch at all?)
from megaparsec.
@neongreen, OK, I'll add Parsing
and GenParser
is indeed should disappear, just overlooked that.
from megaparsec.
@neongreen, To write something like this:
type Parsing m = MonadParsec C.ByteString m Char
You need to enable ConstraintKinds
extension. Because it's synonym for a constraint, not type. This once more makes me think that this is not entirely necessary.
I think we will do it this way: for now Parsing
won't be added. Then in future releases this might be reconsidered. After all it saves not much typing.
from megaparsec.
Okay.
from megaparsec.
@neongreen, There will be two instances for WriterT
: Lazy.WriterT
and Strict.WriterT
, by the way.
from megaparsec.
@neongreen, about inlining: honestly, I think GHC will inline all this stuff anyway. So, you probably won't be able to find any performance difference.
from megaparsec.
@neongreen, I've committed all the changes, so you can try again. Is there anything else you want to tell me?
from megaparsec.
I've committed all the changes, so you can try again.
Okay, thanks. I'll start writing tests now.
from megaparsec.
@mrkkrp, I have just realised that Parsec's lookAhead
and Megaparsec's lookAhead
have different semantics – Parsec's lookAhead
reverts changes made by the parser to backtracking state, Megaparsec's lookAhead
doesn't. It won't matter in most cases when lookAhead
is used; however, it might be useful to have a version of lookAhead
that adheres to the intuitive definition of “run the parser but don't let it change anything whatsoever”. Would it be easy to write? (I don't know Parsec's internals well enough to answer this question on my own. It may even be impossible, I don't know.)
from megaparsec.
@neongreen, parsers
seems to be brainy about the stuff. How does lookAhead
work there?
from megaparsec.
@mrkkrp, I already checked – it isn't being brainy in this particular case, unfortunately.
import Text.Parser.LookAhead
import Text.Parser.Combinators
import Text.Parsec.String
import Text.Parsec (parse, runParser, Parsec, setState, getState)
import qualified Control.Monad.State as State
tParsers = parse (State.evalStateT p 0) "" ""
where p :: State.StateT Integer Parser Integer
p = do
State.put 0
_ <- lookAhead (State.put 1 >> eof)
State.get
tParsec = runParser p 0 "" ""
where p = do
setState 0
_ <- lookAhead (setState 1 >> eof)
getState
> tParsers
Right 1
> tParsec
Right 0
from megaparsec.
@neongreen, ah I see they just use Parsec's version without reimplementing it.
instance (Parsec.Stream s m t, Show t) => LookAheadParsing (Parsec.ParsecT s u m) where
lookAhead = Parsec.lookAhead
from megaparsec.
@neongreen, I think Megaparec should revert backtracking state, you're right. I'll see how to fix it. Make sure you include something in your tests to check this feature in future.
from megaparsec.
@neongreen, Have you tried with simple alternatives à la p <|> n
? Does it work as expected?
from megaparsec.
@mrkkrp, yep, it works alright. E.g. the modified prop_user_backtrack
test (that already was there) passes:
prop_user_backtrack :: Integer -> Integer -> Property
prop_user_backtrack n m = runParser (State.evalStateT p 0) "" "" === Right n
where p = do
State.put n
(State.put m >> fail "failed") <|> return ()
State.get
from megaparsec.
OK, first of all, Megaparsec's and Parsec's lookAhead
are essentially identical:
-- Megaparsec
pLookAhead :: ParsecT s m a -> ParsecT s m a
pLookAhead p = ParsecT $ \s _ cerr eok eerr ->
let eok' a _ _ = eok a s mempty
in unParser p s eok' cerr eok' eerr
-- Parsec
lookAhead :: (Stream s m t) => ParsecT s u m a -> ParsecT s u m a
lookAhead p =
ParsecT $ \s _ cerr eok eerr -> do
let eok' a _ _ = eok a s (newErrorUnknown (statePos s))
unParser p s eok' cerr eok' eerr
Another thing is Megaparsec currently has no built-in user state. But this should not matter. As your little experiment demonstrates, while Parsec's lookAhead
generally works OK, that version with StateT
transformer doesn't work so great. So I think the problem is in definition of lookAhead
for StateT
.
from megaparsec.
@neongreen, This means that all these definitions of WriterT
, StateT
, etc. in parsers are not necessarily correct with respect this sort of corner case. We should perhaps correct them in Megaparsec instead of copying them.
Maybe we should also open an issue on GitHub page of parsers
library.
from megaparsec.
@neongreen, We also should tests all these definitions too, I think. Don't worry about that, I will fix it and later add new tests myself.
from megaparsec.
@neongreen, Try with this definition (with TupleSections
extension):
instance (MonadPlus m, MonadParsec s m t) =>
MonadParsec s (L.StateT e m) t where
label n (L.StateT m) = L.StateT $ \s -> label n (m s)
try (L.StateT m) = L.StateT $ try . m
lookAhead (L.StateT m) = L.StateT $ \s ->
(,s) . fst <$> lookAhead (m s)
notFollowedBy (L.StateT m) = L.StateT $ \s ->
notFollowedBy (fst <$> m s) >> return ((),s)
unexpected = lift . unexpected
eof = lift eof
token f e = lift $ token f e
tokens f e ts = lift $ tokens f e ts
getParserState = lift getParserState
updateParserState f = lift $ updateParserState f
If this passes the test I will commit necessary corrections.
from megaparsec.
I don't have anything to add, but it looks like you two are doing pretty good work - thanks!
from megaparsec.
@neongreen, I checked it and it seems to work OK. I've committed the changes so this bug should be eliminated by now.
from megaparsec.
@mrkkrp, I have the following (passing) tests that use lookAhead
, notFollowedBy
, and <|>
:
-- Monad transformers + Parser
prop_StateT :: Integer -> Integer -> Property
prop_StateT n m = checkParser (State.evalStateT p 0) (Right (n+m)) ""
where p = do
State.put n
State.modify (+ m)
State.get
prop_StateT_backtrack :: Integer -> Integer -> Property
prop_StateT_backtrack n m = checkParser (State.evalStateT p 0) (Right n) ""
where p = do
State.put n
(State.put m >> fail "failed") <|> return ()
State.get
-- See <https://github.com/mrkkrp/megaparsec/issues/27#issuecomment-141785141>.
prop_StateT_lookAhead :: Integer -> Integer -> Property
prop_StateT_lookAhead n m = checkParser (State.evalStateT p 0) (Right n) ""
where p = do
State.put n
lookAhead (State.put m >> eof)
State.get
prop_WriterT :: String -> String -> Property
prop_WriterT pre post = checkParser (Writer.runWriterT p) result "abx:"
where logged_letter = do
x <- letterChar
Writer.tell [x]
return x
logged_colon = do
x <- char ':'
Writer.tell [x]
return x
p = do
Writer.tell pre
cs <- Writer.censor (map toUpper) $
many (try (logged_letter <* notFollowedBy logged_colon))
Writer.tell post
_ <- logged_letter
_ <- logged_colon
return cs
result = Right ("ab", pre ++ "AB" ++ post ++ "x:")
Not claiming the last one is particularly realistic :)
I have a question: how should label
be tested, if at all? I.e. can Megaparsec's new error reporting system somehow interfere with monad transformers?
from megaparsec.
@neongreen, There are two cases when label has effect:
- When parser
p
succeeds without consuming input and next parsern
fails immediately afterp
without consuming input. “Label” ofp
will be displayed in the error message. - When labeled parser fails without consuming input.
See existing prop_label
for example. Although it's a bit dense in that it tests many behaviors at once, so you should not necessarily take it as base for new tests.
Also, prop_StateT
has nothing to do with MonadParsec
, it tests features of StateT
, which we can safely assume work OK. Two other tests for StateT
are OK. Please add one more to test notFollowedBy
, I think it's easy. Other functions all should be tested too, preferably…
I advise you enable coverage statistics and see what else you can test. It's a good idea to test everything, although it's more than I can ask from you. Try at least test all functions in instance declaration of StateT
. Also, you can have two copies of the same tests, one for lazy version of state monad and another one for strict version.
Coverage dropped because of these new changes and I think we should restore it at least on 80 % level. Initially I naively thought that these “boilerplate” definitions are not necessary to test, but after this issue with lookAhead
I think everything should be tested.
from megaparsec.
I suppose this may be closed now.
from megaparsec.
Related Issues (20)
- [proposed labels: question, feature request] best practices for stateful matching of simple patterns HOT 4
- Question: mergeError HOT 2
- Greedy combinators HOT 11
- Problematic `IsString` instance HOT 4
- Processing input prior to parsing while retaining source positions HOT 1
- Mention in documentation that `parse` is an alias for `runParser` HOT 1
- MonadAccum instance for ParsecT HOT 5
- Tabs are not handled correctly when errors are rendered HOT 4
- Indentation error lost in alternative HOT 3
- `local` clears all hints HOT 5
- 9.4 migration: getSourcePos now requires `Monad m =>` HOT 6
- Remove/upgrade version bounds of executable `test-debug` from megaparsec-tests HOT 2
- Add `drop` to `Text.Megaparsec` HOT 3
- Wrong source locations on `unexpected end of input` with custom tokens HOT 2
- get col, row (as Int's) and error message of the first TrivialError HOT 2
- deepseq-1.5? HOT 1
- Error context with additional source positions HOT 4
- Parsing hexadecimal floats HOT 1
- Rewrite rules for parser primitives HOT 1
- How to use `Operator` from tutorial HOT 2
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from megaparsec.