
Comments (2)

dlclark commented on May 31, 2024

Thanks for the request!

I've attempted a basic implementation of this. Check out tag v1.0.1 (519dd65), which includes a basic smoke test. It turns out the longer names (like "Katakana") aren't technically Unicode categories; they are "Scripts." Because of this, Go's standard library stores them in a separate map, so they were unavailable to the \p{} syntax. I've updated the regexp2 code to join the unicode.Categories and unicode.Scripts maps, so all the known Unicode scripts should now work with \p as expected.
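For readers who want to see the idea in isolation, here is a minimal sketch using only the standard library's unicode package. It is not the actual regexp2 change: the helper names mergedTables and inClass are made up for illustration, and the sketch only shows how joining the two maps lets one name lookup serve both categories (such as "Lu") and scripts (such as "Katakana").

```go
package main

import (
	"fmt"
	"unicode"
)

// mergedTables is a hypothetical helper: it copies unicode.Categories and
// unicode.Scripts into a single map so one lookup covers both kinds of name.
func mergedTables() map[string]*unicode.RangeTable {
	merged := make(map[string]*unicode.RangeTable,
		len(unicode.Categories)+len(unicode.Scripts))
	for name, table := range unicode.Categories {
		merged[name] = table
	}
	for name, table := range unicode.Scripts {
		merged[name] = table
	}
	return merged
}

// inClass (also hypothetical) reports whether r belongs to the named
// category or script. Unknown names simply return false.
func inClass(tables map[string]*unicode.RangeTable, name string, r rune) bool {
	table, ok := tables[name]
	return ok && unicode.Is(table, r)
}

func main() {
	tables := mergedTables()
	fmt.Println(inClass(tables, "Katakana", 'カ')) // true: a script name
	fmt.Println(inClass(tables, "Hiragana", 'カ')) // false: wrong script
	fmt.Println(inClass(tables, "Lu", 'A'))       // true: a category name
}
```

A regex engine that resolves \p{Name} through a map like this would accept script names and category names interchangeably, which is the behavior described above.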

Unfortunately, I don't know enough about your example pattern or the characters involved to know what behavior you're expecting. If this new version doesn't work for you, please give me a simple failing test similar to TestUnicodeScriptSets and I'll fix it up.

Once again, thanks for the heads-up. My day-to-day is all ASCII, so Unicode matching is an area where I need all the help I can get!

from regexp2.

hachi8833 commented on May 31, 2024

I tried v1.0.1 and it worked fine in my small test. Thank you very much for your kindness and the quick update!
Now I can use most of the .NET regex syntax I need. (Yes, the names are actually 'Scripts', as you said :-)


FYI
I've been looking for a regex engine for Go that meets the following requirements:

  • 'Scripts' such as 'Katakana' or 'Hiragana' are available as character classes
  • Look-ahead/behind (both positive and negative) with quantifiers, such as (?<=[a-zA-Z])blahblar(?=[a-zA-Z]) or (?<![a-zA-Z])blahblar(?![a-zA-Z]), is fully supported (I recognize that such expressions can be inefficient in some cases, but they have been very convenient for me in the .NET environment)

I'm familiar with .NET regex and its advanced features, which are missing from almost all other regex engines except 'onigmo'. While onigmo is very fast and almost equivalent to the .NET Framework engine, it is currently available only for Ruby.

So for now I have adopted the rubex library for my Go app, but the engine behind rubex is oniguruma, the predecessor of onigmo. Full look-ahead/behind with quantifiers is missing in oniguruma.

I once tried to port onigmo to rubex, but rubex is cgo-based (bridging C and Go) and too hacky for me to work on. By contrast, your regexp2 is based on the C# .NET code and is much cleaner than rubex.

Your port of regexp2 from the .NET source is good news from heaven for me. I really appreciate your work. :-)

