Comments (4)

erezsh commented on August 18, 2024

Hi, you should really just use a regular expression for that.
Something like

import re
# Raw string, so \s and \d reach the regex engine instead of being read as string escapes.
re.findall(r'parsley\s+\d+', some_string)

Lark is meant for recognizing well-defined structure in text.

from lark.

DuyguA commented on August 18, 2024

Hello,

I gave the tiny language only as an example to show what happens with the LALR parser. I am in fact parsing time expressions (context-free, and more complicated of course :) ), but I can't disclose the grammar here. The inputs look something like this:

Can we meet next year, late March / early April?

or

We can meet tomorrow 12.00-14.00 or Wednesday 15.00-16.00.

My grammar looks like

sentence: (time_expr|rest)+
time_expr: .....
rest: REST
REST: /[^\s]+/

So, I want to both extract the strings and parse the expressions at the same time (and then do entity resolution). My question was whether there is a way to extract the parsed strings from the surrounding text.

erezsh commented on August 18, 2024

Yes, it's possible, but not recommended in terms of performance.

I recommend that you use a two stage approach:

  1. Use a regular expression to capture every phrase that might be relevant (as a whole string).

  2. Parse each matched string (with Earley, not LALR), and throw away those that fail to parse.
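
The two stages above can be sketched roughly as follows. This is only an illustration under stated assumptions: `CANDIDATE_RE`, `parse_time_range`, and `extract` are hypothetical names, the pattern is invented for the example, and the toy parser is a stand-in for a real Earley parser so that the snippet runs with the standard library alone.

```python
import re

# Stage 1: a deliberately loose pattern (an assumption for this sketch) that
# over-matches anything shaped like "12.00-14.00".
CANDIDATE_RE = re.compile(r"\d{1,2}\.\d{2}-\d{1,2}\.\d{2}")


def parse_time_range(text):
    """Hypothetical stand-in for a real parser (e.g. a Lark Earley parser).

    Returns ((h1, m1), (h2, m2)), or raises ValueError for an invalid range.
    """
    start, end = text.split("-")
    result = []
    for part in (start, end):
        hour, minute = (int(p) for p in part.split("."))
        if not (0 <= hour <= 23 and 0 <= minute <= 59):
            raise ValueError(f"not a valid time: {part!r}")
        result.append((hour, minute))
    return tuple(result)


def extract(text, candidate_re=CANDIDATE_RE, parse=parse_time_range,
            errors=(ValueError,)):
    """Return (start, end, matched_string, parse_result) for each candidate that parses."""
    found = []
    for match in candidate_re.finditer(text):
        try:
            tree = parse(match.group())
        except errors:
            continue  # Stage 2: throw away candidates that fail to parse.
        found.append((match.start(), match.end(), match.group(), tree))
    return found


sentence = "We can meet tomorrow 12.00-14.00 or Wednesday 15.00-16.00, not 25.00-99.99."
results = extract(sentence)
for start, end, matched, tree in results:
    print(start, end, matched, tree)
```

With Lark, the same shape should work by passing `Lark(grammar, parser="earley").parse` as `parse` and `lark.exceptions.LarkError` as `errors`; the match offsets then give the position of each parsed phrase in the outer string.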

Another suggestion is to use PyParsing for this. While Lark is generally better, PyParsing has the "scanString()" method (named "scan_string()" in PyParsing 3), which seems to do what you want (maybe?).

DuyguA commented on August 18, 2024

Thanks for the prompt answer!

  • I'm unhappy with the performance of PyParsing in general. The input strings to this parser might be "long", so I'm not really sure, but I'll give it a try anyway.

  • Why am I not surprised that I'll fall back onto good, old regex at the end of the day? 🥇

Thanks for the suggestions; the issue can be closed.