Comments (4)
Hi, you should really just use a regular expression for that.
Something like:
import re
re.findall(r'parsley\s+\d+', some_string)
Lark is meant for recognizing well-defined structure in text.
from lark.
Hello,
I gave the tiny language as an example to show what happens with the LALR parser. I am indeed parsing time expressions (context-free and more complicated, of course :) ), but I can't disclose the grammar here. The input looks something like this:
Can we meet next year, late March / early April?
or
We can meet tomorrow 12.00-14.00 or Wednesday 15.00-16.00.
My grammar looks like
sentence: (time_expr|rest)+
time_expr: .....
rest: REST
REST: /[^\s]+/
So, I want to both extract the strings and parse the expressions at the same time (and then do entity resolution). My question was: is there a way to extract the parsed strings from the outer string?
from lark.
Yes, it's possible, but not recommended in terms of performance.
I recommend that you use a two stage approach:
- Use a regular expression to capture every phrase that might be relevant (as a whole string).
- Parse each matched string (with Earley, not LALR), and throw away those that fail to parse.
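The two stages above could be sketched roughly as follows. This is only an illustration under stated assumptions: the candidate pattern `CANDIDATE_RE`, the toy grammar `TIME_GRAMMAR`, and the helper names `make_earley_parser` and `extract_and_parse` are all hypothetical and stand in for the real (undisclosed) time-expression grammar.

```python
import re
from typing import Callable, List, Tuple

# Stage 1 (hypothetical pattern): an optional leading word followed by
# an "HH.MM-HH.MM" range. Deliberately loose, so it over-matches.
CANDIDATE_RE = re.compile(r"(?:\w+\s+)?\d{1,2}\.\d{2}-\d{1,2}\.\d{2}")

# Stage 2 (hypothetical toy grammar): only a known day word may precede
# the time range. The real grammar would be far richer.
TIME_GRAMMAR = r"""
    start: DAY? TIME "-" TIME
    DAY: "tomorrow" | "Wednesday"
    TIME: /\d{1,2}\.\d{2}/
    %import common.WS
    %ignore WS
"""


def make_earley_parser() -> Callable[[str], object]:
    """Build an Earley parser for the toy grammar (needs the lark package)."""
    from lark import Lark  # imported lazily so the rest works without lark

    return Lark(TIME_GRAMMAR, parser="earley").parse


def extract_and_parse(
    text: str,
    parse: Callable[[str], object],
    candidate_re: "re.Pattern[str]" = CANDIDATE_RE,
) -> List[Tuple[str, object]]:
    """Return (matched_string, parse_result) for every candidate that parses."""
    results = []
    for match in candidate_re.finditer(text):
        candidate = match.group(0)
        try:
            tree = parse(candidate)
        except Exception:
            # Lark signals parse failures with LarkError subclasses;
            # candidates that fail to parse are simply thrown away.
            continue
        results.append((candidate, tree))
    return results
```

Calling `extract_and_parse(text, make_earley_parser())` on a sentence then yields only the candidates the grammar accepts, each paired with its parse tree, while over-matched candidates such as a hypothetical "room 12.00-14.00" are dropped in the second stage.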
Another suggestion is to use PyParsing for this. While Lark is generally better, PyParsing has the "scanString()" method, which seems to do what you want (maybe?).
from lark.
Thanks for the prompt answer!
- I'm unhappy with the performance of PyParsing in general. Input strings to this parser might be "long", so I'm not really sure, but I'll give it a try anyway.
- Why am I not surprised that I'll fall back on good old regex at the end of the day? 🥇
Thanks for the suggestions; the issue can be closed.
from lark.