This project forked from vinzenz/libpypa

libpypa is a Python parser implemented in pure C++

License: Apache License 2.0

libpypa - A Python Parser Library in C++

## Introduction

**libpypa** is a Python parser implemented in pure *C++*. It neither uses tools such as [flex][1], [yacc][2], or [bison][3], nor does it use a parser framework such as [Boost.Spirit][4]. Its implementation is pure C++ code.

### Motivation

I got involved in the [pyston project][5], whose getting-involved list had an entry for implementing a parser for Python. Never having properly tackled the problem of creating a parser library for any language, I decided it might be worth a try, since most of the libraries I found were basically just using the builtin Python parser or were implemented in Python itself.

### Goal

The first goal of the library is to support Python 2.7 syntax; 3.x syntax might be added later on.

## Example

An example file:

$ cat hello_world.py
#! /usr/bin/env python
# -*- coding: utf-8 -*-
#

"""
    A "Hello World" example for the pypa parser
"""
import sys

print >> sys.stdout, "Hello", "World!"

And here is the output of the test parser:

$ ./parser-test hello_world.py
Parsing successfull

[Module]
  - body:
    [Suite]
      - items: [
            [DocString]
              - doc:
    A "Hello World" example for the pypa parser


            [Import]
              - names:
                [Alias]
                  - as_name: <NULL>
                  - name:
                    [Name]
                      - context: Load
                      - dotted: False
                      - id: sys

            [Print]
              - destination:
                [Attribute]
                  - attribute:
                    [Name]
                      - context: Load
                      - dotted: False
                      - id: stdout
                  - context: Load
                  - value:
                    [Name]
                      - context: Load
                      - dotted: False
                      - id: sys
              - newline: True
              - values: [
                    [Str]
                      - value: Hello

                    [Str]
                      - value: World!
                    ]
            ]
  - kind: Module

And here is the parse tree produced by Python itself (astdump.py can be found in tools):

[Module]
    - body: [

        [Expr]
            - value:
            [Str]
                - s:
    A "Hello World" example for the pypa parser


        [Import]
            - names: [

                [alias]
                    - asname: None
                    - name: sys
            ]

        [Print]
            - dest:
            [Attribute]
                - attr: stdout
                - ctx: Load
                - value:
                [Name]
                    - ctx: Load
                    - id: sys
            - nl: True
            - values: [

                [Str]
                    - s: Hello

                [Str]
                    - s: World!
            ]
    ]

## Error Reporting

The parser also supports SyntaxError and IndentationError reporting.

Let's take a look at this file syntax_error.py which clearly has a syntax error:

#! /usr/bin/env python
# -*- coding: utf-8 -*-
"""
    Syntax error example
"""

print x y z

This is the output of the test parser:

$ ./parser-test syntax_error.py
  File "syntax_error.py", line 7
    print x y z
            ^
SyntaxError: Expected new line after statement
-> Reported @pypa/parser/parser.cc:944 in bool pypa::simple_stmt(pypa::{anonymous}::State&, pypa::AstStmt&)

Parsing failed

And this is the output of CPython 2.7:

$ python syntax_error.py
  File "syntax_error.py", line 7
    print x y z
            ^
SyntaxError: invalid syntax

libpypa uses different error messages than Python, in the hope that they are clearer.
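
As a rough illustration of how a report in this layout can be assembled, here is a minimal sketch. It is not libpypa's actual reporting code: the `ErrorInfo` struct and the `format_error` function are hypothetical names invented for this example, and the sketch assumes a 0-based column and a source line without tabs.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical error record; libpypa's real error type may look different.
struct ErrorInfo {
    std::string file;     // file name as passed to the parser
    std::size_t line;     // 1-based line number
    std::size_t column;   // 0-based column of the offending token (assumption)
    std::string source;   // text of the offending line, without the newline
    std::string kind;     // e.g. "SyntaxError" or "IndentationError"
    std::string message;  // human-readable description
};

// Render the record in the CPython-like layout used in the output above.
std::string format_error(const ErrorInfo& e) {
    // The caret must point into the line or directly behind its last character.
    assert(e.column <= e.source.size());
    std::ostringstream out;
    out << "  File \"" << e.file << "\", line " << e.line << "\n"
        << "    " << e.source << "\n"
        << "    " << std::string(e.column, ' ') << "^\n"
        << e.kind << ": " << e.message << "\n";
    return out.str();
}
```

Keeping the message text separate from the layout is what lets a parser use its own wording while staying visually close to CPython's report.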

## Requirements

To use **libpypa**, you need a *C++11* compiler. **libpypa** was developed with *g++ 4.8.2* and makes heavy use of *C++11* features where they fit.

libpypa currently does not depend on any libraries other than the C++11 standard library. The one exception is the class FileBuf, which currently uses system libraries but might be changed to use only fopen/fread/fclose.
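
To make the fopen/fread/fclose idea concrete, here is a minimal sketch of a buffered file reader built only on `<cstdio>`. It is an assumption about what such a replacement could look like, not libpypa's FileBuf: the class name `StdioFileBuf`, its `next()` method, and the `read_all` helper are all hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical stand-in for a file buffer that relies only on <cstdio>.
class StdioFileBuf {
public:
    explicit StdioFileBuf(const char* path) : file_(std::fopen(path, "rb")) {}
    ~StdioFileBuf() { if (file_) std::fclose(file_); }

    // The class owns the FILE handle, so copying is disabled.
    StdioFileBuf(const StdioFileBuf&) = delete;
    StdioFileBuf& operator=(const StdioFileBuf&) = delete;

    bool is_open() const { return file_ != nullptr; }

    // Returns the next byte, or EOF once the input is exhausted.
    int next() {
        if (pos_ == len_ && !fill()) return EOF;
        assert(pos_ < len_);
        return static_cast<unsigned char>(buf_[pos_++]);
    }

private:
    // Refill the internal buffer; returns false at end of file or on error.
    bool fill() {
        if (!file_) return false;
        len_ = std::fread(buf_, 1, sizeof buf_, file_);
        pos_ = 0;
        return len_ > 0;
    }

    std::FILE* file_;
    char buf_[4096];
    std::size_t pos_ = 0;
    std::size_t len_ = 0;
};

// Convenience helper: read a whole file into a string.
std::string read_all(const char* path) {
    StdioFileBuf in(path);
    std::string text;
    for (int c = in.next(); c != EOF; c = in.next())
        text.push_back(static_cast<char>(c));
    return text;
}
```

A lexer could pull characters one at a time through `next()`, which keeps the dependency surface down to the C standard I/O functions.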

## Structure

**libpypa** currently consists of 3 major parts:

  1. Lexer
  2. Parser
  3. AST

### Lexer

The `Lexer` portion of the library tokenizes the input and distinguishes the different types of tokens for the `Parser`.

### Parser

The `Parser` uses the `Lexer` to parse the input and generates a preliminary `AST` from it.

### AST

The AST contains the definition of all syntax elements in the code. The main parts of the definition are in `pypa/ast/ast.hh`, which makes heavy use of preprocessor macros to define typedefs, mappings for compile-time type lookups by AstType (enum class), and an implementation of a switch-based visitor.

The AST types do not implement any methods; they are just structures with data. The only exception is that some of the base types have a constructor, which sets the type id value and initializes the line and column values.
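
The following sketch illustrates the general technique described here: a macro-generated `enum class`, data-only node structs, a compile-time lookup from enum value to struct, and a switch-based visitor. It is a simplified assumption, not the contents of `pypa/ast/ast.hh`: the macro `SKETCH_AST_TYPES`, the two node kinds, their fields, and the `Describe` functor are chosen for illustration only.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <type_traits>

// Illustrative subset of node kinds (assumption; the real list is much longer).
#define SKETCH_AST_TYPES(X) \
    X(Name)                 \
    X(Str)

enum class AstType {
#define X(T) T,
    SKETCH_AST_TYPES(X)
#undef X
};

// Base node: data only, plus a constructor that sets the type id
// and initializes the line and column values.
struct Ast {
    explicit Ast(AstType t) : type(t), line(0), column(0) {}
    AstType type;
    std::uint32_t line;
    std::uint32_t column;
};

struct AstName : Ast {
    AstName() : Ast(AstType::Name) {}
    std::string id;
};

struct AstStr : Ast {
    AstStr() : Ast(AstType::Str) {}
    std::string value;
};

// Compile-time lookup from an AstType value to the matching struct.
template <AstType T> struct AstTypeByID;
#define X(T) template <> struct AstTypeByID<AstType::T> { using Type = Ast##T; };
SKETCH_AST_TYPES(X)
#undef X

static_assert(std::is_same<AstTypeByID<AstType::Str>::Type, AstStr>::value,
              "lookup maps AstType::Str to AstStr");

// Switch-based visitor: downcast according to the stored type id and call f.
template <typename F>
void visit(F&& f, Ast& node) {
    switch (node.type) {
#define X(T) \
    case AstType::T: f(static_cast<AstTypeByID<AstType::T>::Type&>(node)); return;
        SKETCH_AST_TYPES(X)
#undef X
    }
    assert(false && "unknown AstType");
}

// Example visitor: one overload per node kind (C++11 has no generic lambdas).
struct Describe {
    std::string out;
    void operator()(AstName& n) { out = "Name: " + n.id; }
    void operator()(AstStr& s) { out = "Str: " + s.value; }
};
```

Calling `visit(describe, node)` with a `Describe` instance dispatches on `node.type` without any virtual functions, which fits node types that carry only data.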

## License

Copyright 2014 Vinzenz Feenstra
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

License for src/double-conversion

Copyright 2006-2011, the V8 project authors. All rights reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are
met:

    * Redistributions of source code must retain the above copyright
      notice, this list of conditions and the following disclaimer.
    * Redistributions in binary form must reproduce the above
      copyright notice, this list of conditions and the following
      disclaimer in the documentation and/or other materials provided
      with the distribution.
    * Neither the name of Google Inc. nor the names of its
      contributors may be used to endorse or promote products derived
      from this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
