regex_generate's Introduction

regex_generate

Use regular expressions to generate text. This crate is very new and raw. It's a work in progress, but feel free to open issues or PRs, or to use it for your own ideas if you find it interesting. No guarantees or warranties are implied; use this code at your own risk.

Thanks to the amazing folks who work on rust-lang/regex, which is the heart of this crate. Using regex_syntax made this crate 1000x easier to produce.

Documentation

Magically generated and graciously hosted by Docs.rs.

The documentation is not good right now.

Usage

Add this to your Cargo.toml:

[dependencies]
regex_generate = "0.2"

and this to your crate root:

extern crate regex_generate;

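This example generates a date-shaped string in YYYY-MM-DD format and prints it. The pattern only constrains the number of digits in each field, so the output is not guaranteed to be a valid calendar date. Adapted from the example for rust-lang/regex.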

extern crate regex_generate;
extern crate rand;

use regex_generate::{DEFAULT_MAX_REPEAT, Generator};

fn main() {
    let mut gen = Generator::new(r"(?x)
(?P<year>[0-9]{4})  # the year
-
(?P<month>[0-9]{2}) # the month
-
(?P<day>[0-9]{2})   # the day
", rand::thread_rng(), DEFAULT_MAX_REPEAT).unwrap();
    let mut buffer = vec![];
    gen.generate(&mut buffer).unwrap();
    let output = String::from_utf8(buffer).unwrap();

    println!("Random Date: {}", output);
}
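Because each field is just [0-9] repeated, a result such as 2017-99-42 is possible. If real dates are needed, the output can be filtered afterwards. The following is only a sketch using the standard library; parse_date_shape and is_plausible_date are hypothetical helpers, not part of regex_generate, and the check deliberately ignores month lengths and leap years.

```rust
/// Hypothetical helper: splits a `YYYY-MM-DD`-shaped string into numbers.
fn parse_date_shape(s: &str) -> Option<(u32, u32, u32)> {
    let mut parts = s.split('-');
    let year = parts.next()?.parse::<u32>().ok()?;
    let month = parts.next()?.parse::<u32>().ok()?;
    let day = parts.next()?.parse::<u32>().ok()?;
    // Reject anything with more than three fields.
    if parts.next().is_some() {
        return None;
    }
    Some((year, month, day))
}

/// Hypothetical helper: rough plausibility check only.
/// It does not know about month lengths or leap years.
fn is_plausible_date(s: &str) -> bool {
    matches!(
        parse_date_shape(s),
        Some((_, m, d)) if (1..=12).contains(&m) && (1..=31).contains(&d)
    )
}

fn main() {
    for candidate in ["2017-03-14", "2017-99-42"] {
        println!("{} plausible: {}", candidate, is_plausible_date(candidate));
    }
}
```

A tighter pattern, for example one that restricts the month to 01 through 12, would avoid the need for filtering at all.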

Tests

Run tests with cargo test -- --nocapture

Benches

Run benchmarks with rustup run nightly cargo bench

Tips

  • Be explicit in your character classes or you will get unexpected results.
  • . really means any, as in any valid Unicode character.
  • Likewise, \d means any Unicode decimal digit, not just [0-9].
  • The default maximum for repetitions (like .*) is 100. You can set it yourself with generate_with_max_repeat or, as in the example above, through the third argument to Generator::new.
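To see why \d can surprise you, here is a small standard-library sketch. It does not call regex_generate; it only shows that a character can be a digit in Unicode without being one of 0 through 9. The helper all_ascii_digits is a hypothetical name. Note that char::is_numeric is somewhat broader than \d (it also accepts letter-like and other numeric characters), so it is used here only as an approximation.

```rust
/// Hypothetical helper: true if `s` is non-empty and made only of `0`..=`9`,
/// which is the set that the explicit class `[0-9]` names.
fn all_ascii_digits(s: &str) -> bool {
    !s.is_empty() && s.chars().all(|c| c.is_ascii_digit())
}

fn main() {
    // U+0663 ARABIC-INDIC DIGIT THREE: a decimal digit in Unicode.
    let arabic_three = '\u{0663}';

    // Numeric according to Unicode, but not an ASCII digit.
    println!("is_numeric: {}", arabic_three.is_numeric());
    println!("is_ascii_digit: {}", arabic_three.is_ascii_digit());

    // It also takes more than one byte in a UTF-8 output buffer.
    println!("utf8 bytes: {}", arabic_three.len_utf8());

    // A string that mixes in such a digit fails an ASCII-only check.
    let mixed: String = ['2', '0', arabic_three, '4'].iter().collect();
    println!("all ASCII digits: {}", all_ascii_digits(&mixed));
}
```

If downstream code parses the generated text as a number, writing [0-9] in the pattern keeps the output within what str::parse accepts.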

TODO

  • Add convenience method for directly generating complete strings
  • Implement Iter for making lots of strings?
  • Add tests for regex bytes feature
  • Account for case insensitivity in Literal
  • Do something with group numbers or names? (No back referencing in the syntax, so maybe nothing can be done.)

License

regex_generate is primarily distributed under the terms of both the MIT license and the Apache License (Version 2.0), with portions covered by various BSD-like licenses.

See LICENSE-APACHE and LICENSE-MIT for details.

regex_generate's People

Contributors

azdle, cryptarchy, eipi1, jsinger67, kennytm, vks

regex_generate's Issues

Trait bound `ThreadRng: rand_core::RngCore` is not satisfied

Hi,

I'm trying to implement the example given in the ReadMe. However, I cannot use rand::thread_rng() as I get the following error:

the trait bound `ThreadRng: rand_core::RngCore` is not satisfied
required because of the requirements on the impl of `rand::Rng` for `ThreadRng`

I first used rand version 0.8.3 and then went back to 0.8 to have the same version as in the example. I also use regex_generate version 0.2 and run the rust compiler stable version 1.49.0.

This is my code (basically the same as the example with a different regex and max repeat):

extern crate regex_generate;
extern crate rand;

use regex_generate::Generator;

fn main() {
    let mut gen = Generator::new(r"(a|b|c)*", rand::thread_rng(), 3).unwrap();
    let mut buffer = vec![];
    gen.generate(&mut buffer).unwrap();
    let output = String::from_utf8(buffer).unwrap();

    println!("Random Date: {}", output);
}

I hope you can somehow reproduce this problem.

Valid regex throws error "Could not parse expression"

Regex

(|(20\/([^0179]0|[1234]00|25|125|160|250)|6\/([69]|1[258]|[369]0|[34]8|7\.5|24|75|120)|(1\.0|0\.(0[58]|[12458]0|067|67|16|125|25|33))))

is supported by both the regex crate and Qt, but this crate fails with the error

could not parse expression

Test crate
```
use rand::Rng;
use regex::Regex;

const LIMIT : usize = 1000;

fn main() {
    let mut rng = rand::thread_rng();
    let re = r"(|(20\/([^0179]0|[1234]00|25|125|160|250)|6\/([69]|1[258]|[369]0|[34]8|7\.5|24|75|120)|(1\.0|0\.(0[58]|[12458]0|067|67|16|125|25|33))))";
    let tt = Regex::new(re).unwrap();
    dbg!(tt);
    let mut generator = regex_generate::Generator::new(&re, &mut rng, 1000).unwrap();
    for _ in 0..LIMIT {
        let mut buffer : Vec<u8> = Vec::new();
        generator.generate(&mut buffer).unwrap();
        println!("{}", String::from_utf8(buffer).unwrap());
    }
}

```

Panic if the `max_repeat` is lower than the minimum repetition in a pattern

If you create a Generator with a max_repeat lower than the minimum repetition count the pattern requires, construction succeeds without an error, but you get a panic later when actually generating output.

Example input pattern: x{150,} (repeat x at least 150 times).

panic:

thread <redacted> panicked at 'Uniform::new_inclusive called with `low > high`',<redacted>/.cargo/registry/src/index.crates.io-6f17d22bba15001f/rand-0.8.5/src/distributions/uniform.rs:567:1
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
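The message suggests that the repetition count is sampled from a range whose lower bound (150, from the pattern) ends up above its upper bound (max_repeat). One possible guard is sketched below using only the standard library. repeat_bounds is a hypothetical function, not part of regex_generate, and capping a bounded {min,max} at max_repeat is an assumption about the intended behavior rather than something the crate documents.

```rust
/// Hypothetical helper, not part of regex_generate: computes the inclusive
/// range to sample a repetition count from, or an error if it is empty.
///
/// `min` and `max` come from the pattern (`x{min,}` has `max == None`);
/// `max_repeat` is the caller-supplied cap.
fn repeat_bounds(min: u32, max: Option<u32>, max_repeat: u32) -> Result<(u32, u32), String> {
    // Assumption: open-ended repetitions are capped at `max_repeat`,
    // and bounded ones are capped as well.
    let high = max.map_or(max_repeat, |m| m.min(max_repeat));
    if min > high {
        // Report the conflict up front instead of sampling from an
        // empty range later.
        return Err(format!(
            "pattern requires at least {} repetitions, but max_repeat is {}",
            min, max_repeat
        ));
    }
    Ok((min, high))
}

fn main() {
    // The case from this report: x{150,} with a cap of 100.
    println!("{:?}", repeat_bounds(150, None, 100));
    // The same pattern with a cap that is large enough.
    println!("{:?}", repeat_bounds(150, None, 200));
}
```

Running a check like this when the Generator is constructed would turn the panic into an ordinary error returned alongside the existing parse errors.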
