
yashgawankar / simple-regex-engine-in-c


This is a simple regex engine built in C, which returns the start and end index of the match in case of match, else 0.

Python 38.42% C 61.58%


Simple-Regex-Engine-in-C

This code prints the starting and ending index of the match in the text if there is a match, and 0 otherwise.

The following matches required for this assignment are supported by this code:-

  1. Matching of individual characters and numbers
  2. Ranges: a-z,A-Z,0-9 or anything in between
  3. Character classes
  4. Macros like +,*,?
  5. \w,\d
  6. Greedy and non-greedy matching in case of *

Certain additional matches are also supported, since they were not very difficult to implement:

  1. Matches '.' - Everything except \n
  2. Greedy and non-greedy matching in case of +
  3. Startswith: ^ and Endswith: $
  4. Non inclusion matches in character class: [^...]
  5. \W,\D,\s,\S
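
To make the supported classes concrete, here is a minimal sketch of how the single-character tests for '.', \d, \w, \s and a bracketed class (with ranges and a leading ^) might look. The helper names below are hypothetical and are not taken from the repository; the uppercase forms \D, \W and \S are assumed to be plain negations of their lowercase counterparts.

```c
#include <ctype.h>
#include <stdbool.h>

/* Hypothetical helpers; the names are illustrative, not from the repository. */

/* '.' matches everything except a newline. */
static bool match_dot(char c) { return c != '\n'; }

/* \d, \w and \s; \D, \W and \S would simply negate these results. */
static bool match_digit(char c) { return isdigit((unsigned char)c) != 0; }
static bool match_word(char c)  { return isalnum((unsigned char)c) || c == '_'; }
static bool match_space(char c) { return isspace((unsigned char)c) != 0; }

/* Match c against the body of a character class, i.e. the text between
 * '[' and ']'. Handles ranges such as a-z and a leading '^' for negation. */
static bool match_class(const char *body, char c)
{
    bool negate = (*body == '^');
    if (negate)
        body++;

    bool found = false;
    unsigned char uc = (unsigned char)c;
    for (const char *p = body; *p != '\0'; p++) {
        if (p[1] == '-' && p[2] != '\0') {
            /* Range lo-hi: compare as unsigned character codes. */
            if ((unsigned char)p[0] <= uc && uc <= (unsigned char)p[2])
                found = true;
            p += 2; /* skip '-' and the upper bound */
        } else if (*p == c) {
            found = true;
        }
    }
    return found != negate;
}
```

With these helpers, `match_class("a-z0-9", '5')` is true and `match_class("^0-9", '7')` is false.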

Data Structures used:- The input pattern is parsed and stored in the form of a structure whose definition is given below:-

    typedef struct regex_t {
        int type;
        union {
            char ch;
            char *char_class;
        };
    } regex;
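
As an illustration of how a pattern could be turned into an array of these structures, here is a rough sketch. The enum names, the `parse_pattern` function and its parameters are assumptions made for this example, and only a subset of the supported syntax is handled. The anonymous union requires C11.

```c
/* Type tags. The README says the types are enums; these names are assumed. */
enum { END, CHAR, DOT, STAR, PLUS, QUESTION, DIGIT, WORD, CHAR_CLASS };

typedef struct regex_t {
    int type;
    union {
        char ch;
        char *char_class;
    };
} regex;

/* Hypothetical parser: fills out[] and returns the number of elements
 * written (not counting the END marker), or -1 on error. Class bodies
 * are copied into class_buf as NUL-terminated strings. */
static int parse_pattern(const char *pat, regex *out, int max_out,
                         char *class_buf, int buf_len)
{
    int n = 0;    /* elements written so far */
    int used = 0; /* bytes of class_buf in use */

    for (int i = 0; pat[i] != '\0'; i++) {
        if (n + 1 >= max_out)
            return -1; /* keep room for the END marker */
        switch (pat[i]) {
        case '.': out[n++].type = DOT; break;
        case '*': out[n++].type = STAR; break;
        case '+': out[n++].type = PLUS; break;
        case '?': out[n++].type = QUESTION; break;
        case '\\':
            if (pat[i + 1] == '\0')
                return -1; /* dangling backslash */
            i++;
            if (pat[i] == 'd') {
                out[n++].type = DIGIT;
            } else if (pat[i] == 'w') {
                out[n++].type = WORD;
            } else { /* any other escaped character is a literal */
                out[n].type = CHAR;
                out[n++].ch = pat[i];
            }
            break;
        case '[': {
            int start = used;
            i++;
            while (pat[i] != '\0' && pat[i] != ']') {
                if (used + 1 >= buf_len)
                    return -1; /* class buffer full */
                class_buf[used++] = pat[i++];
            }
            if (pat[i] != ']' || used >= buf_len)
                return -1; /* unterminated class or no room for NUL */
            class_buf[used++] = '\0';
            out[n].type = CHAR_CLASS;
            out[n++].char_class = &class_buf[start];
            break;
        }
        default:
            out[n].type = CHAR;
            out[n++].ch = pat[i];
        }
    }
    out[n].type = END;
    return n;
}
```

Under these assumptions, parsing `a[0-9]+\d` yields four elements: a literal `a`, a class with body `0-9`, a `+`, and a digit marker, followed by END.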

Logic used:-

  1. The text and the number of patterns are read from standard input.
  2. Each input pattern is parsed, converted, and stored in the structure defined above.
  3. The match_here function, as defined by the instructor in class, has been converted to an iterative version called match_util, which iterates through the pattern and the text and returns the starting index if a match is found, and -1 otherwise.
  4. The match length is a global variable that holds the number of characters matched. It is used to calculate the ending index as follows: End index = Start index + match length - 1
  5. The parsed pattern and the character class buffers are also global variables, for convenience of scope (although this is bad programming practice).
  6. The macros and metacharacters are represented by their types, which are enums, for ease of use, easier debugging, and readability.
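
The start-index search and the end-index formula can be sketched as follows. Note the differences from the repository: this example uses a recursive match_here in the style of the well-known Kernighan and Pike matcher rather than the iterative match_util, it works on the raw pattern string instead of the parsed structure, it returns the match length instead of storing it in a global, and it supports only literals, '.', and a greedy '*'. The names `char_matches` and `find_match` are hypothetical.

```c
/* Does pattern character p match text character c? '.' excludes '\n'. */
static int char_matches(char p, char c)
{
    if (c == '\0')
        return 0;
    return p == '.' ? c != '\n' : p == c;
}

/* Returns the number of text characters consumed if pat matches at the
 * start of text, or -1 if it does not match there. */
static int match_here(const char *pat, const char *text)
{
    if (pat[0] == '\0')
        return 0;
    if (pat[1] == '*') {
        /* Greedy: consume the longest run first, then back off. */
        int run = 0;
        while (char_matches(pat[0], text[run]))
            run++;
        for (int k = run; k >= 0; k--) {
            int rest = match_here(pat + 2, text + k);
            if (rest >= 0)
                return k + rest;
        }
        return -1;
    }
    if (char_matches(pat[0], text[0])) {
        int rest = match_here(pat + 1, text + 1);
        return rest >= 0 ? 1 + rest : -1;
    }
    return -1;
}

/* Try each start position from left to right. On success, store the
 * indices and return 1; otherwise return 0. An empty match gives
 * end == start - 1 under the README's formula. */
static int find_match(const char *pat, const char *text, int *start, int *end)
{
    for (int i = 0; ; i++) {
        int len = match_here(pat, text + i);
        if (len >= 0) {
            *start = i;
            *end = i + len - 1; /* End index = Start index + match length - 1 */
            return 1;
        }
        if (text[i] == '\0')
            return 0;
    }
}
```

For the pattern `ab*c` and the text `xxabbbcyy`, this reports start index 2 and end index 6. A non-greedy '*' would try the back-off loop in the opposite direction, from k = 0 upward.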

Input:- From stdin, as follows

text

no_of_patterns

pattern0

pattern1

.

.

.

Constraints:-

Max length of text = 4000

Max length of patterns = 1000
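
A minimal sketch of reading this input with buffers sized from the constraints is shown below. It assumes that the text fits on a single line, that there are no blank lines between the inputs, and that each buffer needs two extra bytes for the newline and the terminating NUL. The macro and function names are hypothetical.

```c
#include <stdio.h>
#include <string.h>

#define MAX_TEXT    4000 /* from the constraints above */
#define MAX_PATTERN 1000

/* Read one line into buf and strip the line ending. Returns 1 on success,
 * 0 on end of input or read error. */
static int read_line(FILE *in, char *buf, size_t size)
{
    if (fgets(buf, (int)size, in) == NULL)
        return 0;
    buf[strcspn(buf, "\r\n")] = '\0';
    return 1;
}

/* Read the text line and the pattern count. Returns 1 on success. */
static int read_header(FILE *in, char *text, size_t text_size, int *count)
{
    char line[32];
    if (!read_line(in, text, text_size))
        return 0;
    if (!read_line(in, line, sizeof line))
        return 0;
    return sscanf(line, "%d", count) == 1;
}
```

A caller would declare `char text[MAX_TEXT + 2]` and `char pattern[MAX_PATTERN + 2]`, call `read_header(stdin, ...)` once, and then call `read_line(stdin, pattern, sizeof pattern)` once per pattern.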

Output:- As required by the assignment, this code outputs:-

0 - if there is no match

1 start_index end_index - if match is found

for each pattern input
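
The output line for one pattern could be produced along these lines. The function name is hypothetical, and the single-space separators and trailing newline are assumptions about the exact format.

```c
#include <stdio.h>

/* Write "0" for no match, or "1 start end" for a match, into out.
 * Returns the value of snprintf (characters that would be written). */
static int format_result(char *out, size_t size, int found, int start, int end)
{
    if (!found)
        return snprintf(out, size, "0\n");
    return snprintf(out, size, "1 %d %d\n", start, end);
}
```

For a match spanning indices 2 through 6, this yields the line `1 2 6`.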
