
remram44 / regex-cheatsheet

Cheatsheet for different regex syntaxes

Home Page: https://remram44.github.io/regex-cheatsheet/regex.html

regex-cheatsheet's Introduction

Cheatsheet for regex syntaxes

Many programs use regular expressions to find & replace text. However, they tend to come with their own different flavors.

You can probably expect most modern software and programming languages to use some variation of the Perl flavor, "PCRE"; however, command-line tools (grep, less, ...) often use the POSIX flavor (sometimes with an extended variant, e.g. egrep or sed -r). ViM also comes with its own syntax (a superset of what Vi accepts).

This cheatsheet lists the respective syntax they each use.
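
As a small illustration of how the flavors diverge, here is a minimal sketch using Python's `re` module. It is assumed here as a stand-in for the Perl-style family and is not PCRE itself. In this family a bare `(` opens a group and `\(` is a literal parenthesis, which is the reverse of POSIX BRE.

```python
import re

# Perl-style syntax: bare parentheses group, backslash-escaped ones are literal.
grouped = re.search(r"(ab)+", "ababab")
literal = re.search(r"\(ab\)", "see (ab) here")

print(grouped.group(0))  # the whole repeated run
print(literal.group(0))  # the parenthesised text itself
```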

If you spot errors or missing data, or just want to make this prettier/more accurate, don't hesitate to open an issue or a pull request.

The rendered cheatsheet is available here: https://remram44.github.io/regex-cheatsheet/regex.html

Note that this is still a work in progress; a lot of entries need some details in some kind of tooltip.

regex-cheatsheet's People

Contributors

fishilico, leoluyi, moskytw, remram44

regex-cheatsheet's Issues

Oniguruma

Oniguruma (and its fork Onigmo) is a regex library used by multiple software/languages such as Ruby (Onigmo since 2.0) and PHP.

I'm assuming the syntax matches PCRE but supported features may vary.

sed ERE switch is -r

The last line in the last table on the website is incorrect.
The sed switch for extended regular expressions (ERE) is -r or --regexp-extended.

man sed says:

   -r, --regexp-extended
         use extended regular expressions in the script.

Make it pretty

This needs to be prettier and printable, I definitely want to post this on my wall somewhere.

My CSS-fu is weak, so help welcome 😄

missing data - "\ special in class?", "Ranges"

"\ special in class?", and "Ranges" are empty for "POSIX extended (ERE)".
I assume that they're the same as for "POSIX (BRE)", but this should be explicitly specified.

In addition, "Ranges" for "ViM" seems to imply that '\' is special in character classes, but I'm not sure if that's intended.
If it is, "'\' is special in character classes" should be stated explicitly in the appropriate cell of the table.

Rust

Rust's regex crate provides native regular expression matching.

At first glance, this is yet another Perl-inspired dialect, with far fewer features.

BSD sed does not seem to work with `\+` (1 or more) in BRE mode

Hi there, thank you for your brilliant work.

Here is the problem I found on macOS:

# BSD sed prints nothing
echo '<tag>foobar</tag>' | /usr/bin/sed -n 's|<tag>\(.\+\)</tag>|\1|p'
# GNU sed is okay
echo '<tag>foobar</tag>' | /usr/local/opt/gnu-sed/libexec/gnubin/sed -n 's|<tag>\(.\+\)</tag>|\1|p'
foobar

Does the BSD version not support \+? If not, maybe a note about this should be added to the cheatsheet.

Correct me if I'm wrong.

Please enable GitHub pages

Hi,

Thanks for the work you did! As http://htmlpreview.github.io/?https://github.com/remram44/regex-cheatsheet/blob/master/regex.html does not work well with the links browser, I've cloned your repo and set up GitHub Pages by following the instructions at https://pages.github.com/ for a Project site. The result is http://fishilico.github.io/regex-cheatsheet/regex.html. As this is quite quick to set up (you only need to rename the master branch to gh-pages), I'm now opening this issue as a "feature request/improvement wish" to ask whether you'd like to make your project available through GitHub Pages.

While at it, I've made the HTML page pass the W3C validator (http://validator.w3.org/check?uri=http%3A%2F%2Ffishilico.github.io%2Fregex-cheatsheet%2Fregex.html) and fixed a small typo in the content ([^[:space;]] -> [^[:space:]]). You can see my commits at https://github.com/fishilico/regex-cheatsheet/commits/gh-pages (I can do a small PR if you want)

Thanks

Oracle Database

https://docs.oracle.com/cd/B19306_01/B14251_01/adfns_regexp.htm
https://docs.oracle.com/cd/B13789_01/appdev.101/b10795/adfns_re.htm

Oracle Database follows the exact syntax and matching semantics for these operators as defined in the POSIX standard for matching ASCII (English language) data. You can find the POSIX standard draft at the following URL:

http://www.opengroup.org/onlinepubs/007908799/xbd/re.html

Oracle Database enhances regular expression support in the following ways:

Extends the matching capabilities for multilingual data beyond what is specified in the POSIX standard.

Adds support for the common Perl regular expression extensions that are not included in the POSIX standard but do not conflict with it. Oracle Database provides built-in support for some of the most heavily used Perl regular expression operators, for example, character class shortcuts, the non-greedy modifier, and so on.

It says it's POSIX compliant, but it doesn't, for example, use \( \) for grouping (which is POSIX afaik).

A short example of an Oracle regexp is:

REGEXP_REPLACE(text1,
                 '^([[:alpha:]]+): ([[:alpha:]]+)$',
                 '\2 \1')

which replaces lastname: firstname with firstname lastname
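
For comparison, the same swap can be sketched in Python's `re`. This is an illustration only and not Oracle's engine. The helper name `swap_names` and the sample input are hypothetical, and `[A-Za-z]` is assumed as an ASCII stand-in for `[[:alpha:]]`, which Python's `re` does not support.

```python
import re

def swap_names(text: str) -> str:
    """Rewrite 'lastname: firstname' as 'firstname lastname'."""
    # [A-Za-z] stands in for the POSIX class [[:alpha:]] (ASCII letters only).
    # \2 and \1 in the replacement refer to the second and first groups.
    return re.sub(r"^([A-Za-z]+): ([A-Za-z]+)$", r"\2 \1", text)

print(swap_names("Doe: John"))
```

Input that does not fit the `lastname: firstname` shape is returned unchanged.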

Multiplicity

How is "0 or 1" different from "0 or 1, non-greedy"?
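
One way to see the difference is a minimal sketch in Python's `re`, assumed here as a representative Perl-style engine. The greedy `?` takes the optional character whenever it can, while the non-greedy `??` prefers to match nothing unless the rest of the pattern forces it to match.

```python
import re

text = "aa"

# Greedy: the optional "a" is taken when available.
greedy = re.match(r"a?", text).group(0)

# Non-greedy: the empty match is preferred.
lazy = re.match(r"a??", text).group(0)

# The lazy form still matches one "a" when the following "b" requires it.
forced = re.match(r"a??b", "ab").group(0)

print(repr(greedy), repr(lazy), repr(forced))
```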

Add default magic for vim.

Vim uses a property called magic that changes which special characters need a preceding backslash (for more information: https://vimdoc.sourceforge.net/htmldoc/pattern.html#/magic ). The levels go from something like BRE to something like ERE, in this order: \V, \M, \m, \v.
You can change the magic level within the regex itself (similar to how you can use (?i) in PCRE to change the case sensitivity).

The default one is \m (so a regex like /^.*$ is actually /\m^.*$ if vim has the default options).
It could be useful to specify that the values given in the vim row are about \m mode (and potentially to add a link to the magic documentation for people who want to look up the difference).

If you also want to add a simple description on the difference between the different magic levels:

- \V treats every character in the regex as a literal character & requires a
backslash escape to use any of the regex special characters.
- \v allows every regex character (excluding any letter regex character like \W,
\s, etc) to be used without a backslash escape & requires an escape to
access the literal characters that coincide with regex special characters
( \( to refer to the literal left bracket ).
- \M & \m are in between \V & \v.
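
The `(?i)` analogy mentioned above can be tried with a minimal sketch in Python's `re`, assumed here as a stand-in for a Perl-style engine. It only shows a mode being switched from inside the pattern and does not model Vim's magic levels.

```python
import re

# Without an inline flag, matching is case-sensitive.
plain = re.search(r"hello", "HELLO")

# (?i) at the start of the pattern switches to case-insensitive matching.
inline = re.search(r"(?i)hello", "HELLO")

print(plain, inline.group(0))
```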

Vim: non-word boundary (\B)

While vim does not have a native \B, would it be worth mentioning that combining the \< or \> anchors with a negative lookahead \@! might be used?

For example, \<\@! and \>\@! can be used to accomplish the same goal as \B.
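
The same idea, negating a boundary assertion with a negative lookahead, can be sketched in Python's `re`, which has both `\B` and `(?!\b)`. This is an analogy only: it does not run Vim's `\<\@!` syntax, and the sample text is made up for illustration.

```python
import re

text = "at cat"

# Native non-word-boundary: "at" must not start at a word boundary.
with_B = re.sub(r"\Bat", "AT", text)

# Same effect with a negated boundary assertion inside a negative lookahead.
with_lookahead = re.sub(r"(?!\b)at", "AT", text)

print(with_B, "|", with_lookahead)
```

Only the `at` inside `cat` is replaced, because the standalone `at` begins at a word boundary.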

Vim \zs and \ze

I apologize in advance; I am a very infrequent vim user, so some of this may need to be double-checked.

  • \zs sets the start of a match.
  • \ze sets the end of a match.

Vim's \zs appears to work like PCRE's \K (which discards matched text up to that point):

  • PCRE pattern: s/foo\Kfoo/bar/g
  • Vim pattern: :%s/foo\zsfoo/bar/g

Text: foofoofoofoo
Result: foobarfoobar

On the other hand, vim's \ze appears to work more like a lookahead:

  • PCRE pattern: s/foo(?=foo)/bar/g
  • Vim pattern: :%s/foo\zefoo/bar/g
  • Vim's lookahead pattern: :%s/foo\(foo\)\@=/bar/g

Text: foofoofoofoo
Result: barbarbarfoo

I thought it might be useful to include the \K similarities in your Vim cheatsheet.
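
Both results above can be reproduced with a minimal sketch in Python's `re`. Python's `re` has no `\K`, so a capture group that is put back in the replacement is used here as an assumed stand-in for `\K` / `\zs`. The lookahead form is written the same way as in PCRE.

```python
import re

text = "foofoofoofoo"

# Stand-in for \K / \zs: capture the prefix and re-insert it with \1.
keep_prefix = re.sub(r"(foo)foo", r"\1bar", text)

# Lookahead, comparable to \ze: the trailing "foo" is checked but not consumed.
lookahead = re.sub(r"foo(?=foo)", "bar", text)

print(keep_prefix)
print(lookahead)
```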

PCRE non-greedy block is incorrect

Inside the PCRE column:

  • 0 or 1, non-greedy: ?+ has to be changed to ??

  • 0 or more, non-greedy: *+ has to be changed to *?

  • 1 or more, non-greedy: ++ has to be changed to +?

  • Specific number, non-greedy: {n,m}+ {n,}+ has to be changed to {n,m}? {n,}?

(you may verify this yourself by testing it on the console - e.g.:

echo 'abc00defg00hi00jkl7mn' | pcregrep --color '00.*?00'
-> abc00defg00hi00jkl7mn
)
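
The same check can be sketched in Python's `re`, assumed here as another Perl-style engine that uses the `*?` spelling for non-greedy repetition. The greedy form is included for contrast.

```python
import re

text = "abc00defg00hi00jkl7mn"

# Non-greedy: stops at the first closing "00".
lazy = re.search(r"00.*?00", text).group(0)

# Greedy: runs on to the last "00" in the string.
greedy = re.search(r"00.*00", text).group(0)

print(lazy)
print(greedy)
```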

JavaScript

JavaScript has built-in regular expressions (with unquoted /.../ syntax).

What does Ruby use?

Looks like a variation of Perl's, but as usual with Ruby, it's hard to find any reference documentation of anything.

Mikrotik

For regular expressions, Mikrotik uses a slightly modified ERE scheme.

This note should make it easier to use regexps in Mikrotik:

  • The special characters \$ and \? require a backslash
  • A space can be written as \_ and a double quote as \"
  • The special characters \\^ \\. \\{ \\[ \\( \\) \\| \\* \\+ \\? require double backslashes
  • The Separator as Tab may be skipped, as scripts service characters $ (Var) \ (Hex)
  • Capturing groups* are supported
  • Backreferences* and non-capturing* "atomic" groups are not supported
  • \w, [[:word:]]** and, accordingly, \b are not supported

There is no information about this in the wiki (which is rarely updated): https://wiki.mikrotik.com/wiki/Manual:Regular_Expressions

A working example, determining the type of a date from a log: https://regex101.com/r/5lpRGc/2
