wgtcc's Introduction

wgtcc

A small C11 compiler in C++11.

Environment

  1. x86-64
  2. linux 4.4.0
  3. clang 3.8.0 (or any version that supports C++11)

Build

cmake . -Bbuild
cd build && make && make test

Install

make install

Then you can play with the examples:

wgtcc example/heart.c
wgtcc example/chinese.c

Without Install

Try this:

./wgtcc -no-pie -I../include ../example/heart.c

Notice

As wgtcc doesn't support PIC/PIE, if you are using gcc >= 6.2.0, specify -no-pie explicitly:

wgtcc -no-pie example/heart.c
wgtcc -no-pie example/chinese.c

Goal

wgtcc aims to implement the full C11 standard, with some exceptions:

  1. Some features are supported only at the grammar level (like the keyword register).
  2. Features that disgust me are removed (like the default int type when no type specifier is given).
  3. Some non-standard GNU extensions are supported, but you should not rely on wgtcc to support them fully.

Front End

A basic recursive descent parser.

Back End

wgtcc generates code from the AST directly. The algorithm is TOSCA (top of stack caching). The generated code is far from efficient, but at least it works, and code generation itself is fast.
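
To make the idea of top-of-stack caching concrete, here is a toy sketch, not wgtcc's actual generator. It walks a tiny expression tree and emits AT&T-style x86-64 text, keeping the top of the evaluation stack in %rax and spilling it with push only when another value is loaded. The names Expr, Num, Bin and ToyGen are all made up for this illustration.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Toy expression AST: op == 'n' is a number leaf, otherwise a binary operator.
struct Expr {
  char op;
  long value;
  std::unique_ptr<Expr> lhs;
  std::unique_ptr<Expr> rhs;
};

std::unique_ptr<Expr> Num(long v) {
  return std::unique_ptr<Expr>(new Expr{'n', v, nullptr, nullptr});
}

std::unique_ptr<Expr> Bin(char op, std::unique_ptr<Expr> l,
                          std::unique_ptr<Expr> r) {
  return std::unique_ptr<Expr>(new Expr{op, 0, std::move(l), std::move(r)});
}

// Emits code for '+' and '*' only (both commutative, so operand order
// after the pop does not matter in this sketch).
class ToyGen {
 public:
  void Visit(const Expr& e) {
    if (e.op == 'n') {
      // A value is already cached in %rax: spill it before loading a new one.
      if (depth_ > 0) out_.push_back("push %rax");
      out_.push_back("mov $" + std::to_string(e.value) + ", %rax");
      ++depth_;
      return;
    }
    Visit(*e.lhs);
    Visit(*e.rhs);  // spills the lhs result and leaves the rhs result in %rax
    out_.push_back("pop %rcx");  // reload the spilled lhs result
    out_.push_back(e.op == '+' ? "add %rcx, %rax" : "imul %rcx, %rax");
    --depth_;  // two operands were replaced by one result
  }

  const std::vector<std::string>& Code() const { return out_; }

 private:
  std::vector<std::string> out_;
  int depth_ = 0;  // number of values on the virtual evaluation stack
};
```

For the tree Bin('+', Num(1), Num(2)) this emits a mov, a push, a second mov, a pop and an add. The result of every subexpression ends up in %rax, which is what makes generation a single simple pass at the cost of redundant pushes and pops.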

Memory Management

Though wgtcc is written in C++, I put no effort into memory management apart from a simple memory pool to accelerate allocations. Objects are simply allocated with new and never freed, because wgtcc runs fast and exits immediately after it finishes parsing and generating code.

Reference

  1. Compilers Principles, Techniques and Tools. Second Edition.
  2. N1548, C11 standard draft
  3. 64-ia-32-architectures-software-developer-manual-325462
  4. 8cc
  5. 9c
  6. macro expansion algorithm

Todo

  • support GNU extensions (e.g. keyword __attribute__)
  • support variable-length arrays
  • optimization (e.g. register allocation)
  • support type qualifiers

wgtcc's People

Contributors

ace17, choleraehyq, luolent, page4, pallharaldsson, wgtdkp

wgtcc's Issues

some syntax problem

Hello, I have some questions about this code:
Token* Token::New(int tag) {
  return new (TokenPool.Alloc()) Token(tag);
}
Why do you add (TokenPool.Alloc()) before Token? What will this function return? I think it will return void**, am I right?
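
The syntax in question is placement new. A minimal sketch of how it behaves is below; SimplePool is a hypothetical stand-in allocator written for this example, and wgtcc's real TokenPool may work differently. The point is that new (address) Token(tag) constructs a Token in storage that already exists, and the expression has type Token*, not void**.

```cpp
#include <new>
#include <vector>

struct Token {
  explicit Token(int tag) : tag_(tag) {}
  int tag_;
};

// Hypothetical stand-in for a pool: it only hands out raw storage and
// releases all of it at once when the pool itself is destroyed.
template <typename T>
class SimplePool {
 public:
  SimplePool() = default;
  SimplePool(const SimplePool&) = delete;
  SimplePool& operator=(const SimplePool&) = delete;

  // Returns raw, uninitialized storage large enough for one T.
  void* Alloc() {
    blocks_.push_back(::operator new(sizeof(T)));
    return blocks_.back();
  }

  // Frees the storage without running destructors, which is acceptable
  // here only because Token is trivially destructible.
  ~SimplePool() {
    for (void* p : blocks_) ::operator delete(p);
  }

 private:
  std::vector<void*> blocks_;
};

Token* NewToken(SimplePool<Token>& pool, int tag) {
  // Placement new: run Token's constructor in the storage from Alloc().
  return new (pool.Alloc()) Token(tag);
}
```

Alloc() returns void* (untyped memory), and placement new turns that memory into a live Token and yields a pointer to it.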

No error recovery: it exits at the first error

I hope this can be improved~

For example, 1.c:

int a;
void foo(...) {
    int b = p;
}

wgtcc -S 1.c

1.c:2:10: error: expect type specifier
      void foo(...) {
               ^

7cc -S 1.c

1.c:2:10: error: ISO C requires a named parameter before '...'
1.c:3:10: error: use of undeclared identifier 'p'
1.c:3:6: warning: unused variable 'b'

That said, your errors are pointed out with a marker under the offending code, much like clang, which is nice.

Core dump again

struct S {
    int a:1;
};

void foo() {
    struct S s;
    sizeof(s.a);
}

Testing preprocessing of sqlite3.c: core dumped

(gdb) bt
#0  0x0000563dcd985514 in __gnu_cxx::new_allocator<std::_List_node<Token const*> >::max_size (
    this=0x0) at /usr/include/c++/6/ext/new_allocator.h:113
#1  0x0000563dcd984f7e in __gnu_cxx::new_allocator<std::_List_node<Token const*> >::allocate (
    this=0x563dd85fa270, __n=1) at /usr/include/c++/6/ext/new_allocator.h:101
#2  0x0000563dcd9844cb in std::allocator_traits<std::allocator<std::_List_node<Token const*> > >::allocate (__a=..., __n=1) at /usr/include/c++/6/bits/alloc_traits.h:416
#3  0x0000563dcd982635 in std::__cxx11::_List_base<Token const*, std::allocator<Token const*> >::_M_get_node (this=0x563dd85fa270) at /usr/include/c++/6/bits/stl_list.h:383
#4  0x0000563dcd984d3a in std::__cxx11::list<Token const*, std::allocator<Token const*> >::_M_create_node<Token const*&> (this=0x563dd85fa270, __args#0=@0x563dcf09b830: 0x563dcf09ad00)
    at /usr/include/c++/6/bits/stl_list.h:568
#5  0x0000563dcd983dee in std::__cxx11::list<Token const*, std::allocator<Token const*> >::_M_insert<Token const*&> (this=0x563dd85fa270, __position=, __args#0=@0x563dcf09b830: 0x563dcf09ad00)
    at /usr/include/c++/6/bits/stl_list.h:1770
#6  0x0000563dcd98213b in std::__cxx11::list<Token const*, std::allocator<Token const*> >::emplace_back<Token const*&> (this=0x563dd85fa270, __args#0=@0x563dcf09b830: 0x563dcf09ad00)
    at /usr/include/c++/6/bits/stl_list.h:1108
#7  0x0000563dcd980bc3 in std::__cxx11::list<Token const*, std::allocator<Token const*> >::_M_initialize_dispatch<std::_List_iterator<Token const*> > (this=0x563dd85fa270, __first=, __last=)
    at /usr/include/c++/6/bits/stl_list.h:1699
#8  0x0000563dcd97fce2 in std::__cxx11::list<Token const*, std::allocator<Token const*> >::list<std::_List_iterator<Token const*>, void> (this=0x563dd85fa270, __first=, __last=, __a=...)
    at /usr/include/c++/6/bits/stl_list.h:708
#9  0x0000563dcd97e8a1 in TokenSequence::Copy (this=0x7ffc81e37350, other=...) at token.h:339
#10 0x0000563dcd97e060 in Macro::RepSeq (this=0x563dd5b728b0, 
    fileName=0x563dcdbf9340 <inFileName[abi:cxx11]>, line=19815) at cpp.cc:864
#11 0x0000563dcd97949a in Preprocessor::Expand (this=0x7ffc82635e40, os=..., is=..., inCond=false)
    at cpp.cc:64
#12 0x0000563dcd979d90 in Preprocessor::Subst (this=0x7ffc82635e40, os=..., is=..., leadingWS=true, 
    hs=std::set with 2 elements = {...}, params=std::map with 2 elements = {...}) at cpp.cc:149
#13 0x0000563dcd97972a in Preprocessor::Expand (this=0x7ffc82635e40, os=..., is=..., inCond=false)
    at cpp.cc:88
#14 0x0000563dcd979d90 in Preprocessor::Subst (this=0x7ffc82635e40, os=..., is=..., leadingWS=true, 
    hs=std::set with 2 elements = {...}, params=std::map with 2 elements = {...}) at cpp.cc:149
#15 0x0000563dcd97972a in Preprocessor::Expand (this=0x7ffc82635e40, os=..., is=..., inCond=false)
    at cpp.cc:88
#16 0x0000563dcd979d90 in Preprocessor::Subst (this=0x7ffc82635e40, os=..., is=..., leadingWS=true, 
    hs=std::set with 2 elements = {...}, params=std::map with 2 elements = {...}) at cpp.cc:149
#17 0x0000563dcd97972a in Preprocessor::Expand (this=0x7ffc82635e40, os=..., is=..., inCond=false)
    at cpp.cc:88
#18 0x0000563dcd979d90 in Preprocessor::Subst (this=0x7ffc82635e40, os=..., is=..., leadingWS=true, 
    hs=std::set with 2 elements = {...}, params=std::map with 2 elements = {...}) at cpp.cc:149
#19 0x0000563dcd97972a in Preprocessor::Expand (this=0x7ffc82635e40, os=..., is=..., inCond=false)
    at cpp.cc:88
#20 0x0000563dcd979d90 in Preprocessor::Subst (this=0x7ffc82635e40, os=..., is=..., leadingWS=true, 
    hs=std::set with 2 elements = {...}, params=std::map with 2 elements = {...}) at cpp.cc:149
#21 0x0000563dcd97972a in Preprocessor::Expand (this=0x7ffc82635e40, os=..., is=..., inCond=false)
    at cpp.cc:88
#22 0x0000563dcd979d90 in Preprocessor::Subst (this=0x7ffc82635e40, os=..., is=..., leadingWS=true, 
    hs=std::set with 2 elements = {...}, params=std::map with 2 elements = {...}) at cpp.cc:149
#23 0x0000563dcd97972a in Preprocessor::Expand (this=0x7ffc82635e40, os=..., is=..., inCond=false)
    at cpp.cc:88
#24 0x0000563dcd979d90 in Preprocessor::Subst (this=0x7ffc82635e40, os=..., is=..., leadingWS=true, 
    hs=std::set with 2 elements = {...}, params=std::map with 2 elements = {...}) at cpp.cc:149
#25 0x0000563dcd97972a in Preprocessor::Expand (this=0x7ffc82635e40, os=..., is=..., inCond=false)
    at cpp.cc:88
#26 0x0000563dcd979d90 in Preprocessor::Subst (this=0x7ffc82635e40, os=..., is=..., leadingWS=true, 
    hs=std::set with 2 elements = {...}, params=std::map with 2 elements = {...}) at cpp.cc:149
#27 0x0000563dcd97972a in Preprocessor::Expand (this=0x7ffc82635e40, os=..., is=..., inCond=false)
    at cpp.cc:88
#28 0x0000563dcd979d90 in Preprocessor::Subst (this=0x7ffc82635e40, os=..., is=..., leadingWS=true, 
    hs=std::set with 2 elements = {...}, params=std::map with 2 elements = {...}) at cpp.cc:149
#29 0x0000563dcd97972a in Preprocessor::Expand (this=0x7ffc82635e40, os=..., is=..., inCond=false)
    at cpp.cc:88
#30 0x0000563dcd979d90 in Preprocessor::Subst (this=0x7ffc82635e40, os=..., is=..., leadingWS=true, 
    hs=std::set with 2 elements = {...}, params=std::map with 2 elements = {...}) at cpp.cc:149
#31 0x0000563dcd97972a in Preprocessor::Expand (this=0x7ffc82635e40, os=..., is=..., inCond=false)
    at cpp.cc:88

wgtcc -E sqlite3.c

I'm not familiar with Linux, can you help me?

This may seem strange, but I'm not familiar with Linux yet. I don't know what a makefile or bash is, so I don't know how to compile your code. Can you help me? I'm new here, so please give me some help. Thanks a lot!

Core dump

struct S {
    int a;
};
struct S bar();
void foo() {
    &bar().a;
}
wgtcc: ./code_gen.h:262: virtual void LValGenerator::VisitFuncCall(FuncCall *): Assertion `false' failed.
Aborted (core dumped)

subtle bug

When a function returns a structure, the address for the return value is passed in %rdi. But this address can't be stored in %rdi until all the arguments have been visited! Otherwise, visiting the arguments (e.g. a function call with an integer parameter) will overwrite %rdi, which results in disaster (e.g. a segmentation fault).

Anyway, it has been fixed:)
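
A small program with the shape described above could serve as a regression check. This is only a sketch: Big, Twice, MakeBig and SumOfBig are names made up here, and the struct is deliberately larger than 16 bytes so that, under the System V AMD64 ABI, it is returned through a hidden pointer in %rdi while the arguments are themselves calls that use %rdi.

```cpp
// 24 bytes: too large for registers, so it is returned through a hidden
// pointer that the caller passes in %rdi.
struct Big {
  long a, b, c;
};

// Takes its only argument in %rdi, so evaluating a call to it clobbers %rdi.
int Twice(int x) { return 2 * x; }

Big MakeBig(int x, int y) {
  Big r = {x, x + y, x * y};
  return r;
}

long SumOfBig() {
  // Both arguments are function calls; the hidden return pointer must be
  // loaded into %rdi only after they have been evaluated.
  Big r = MakeBig(Twice(3), Twice(4));
  return r.a + r.b + r.c;
}
```

With a correct code generator SumOfBig() returns 68 (6 + 14 + 48); with the bug described above, the call would write through a clobbered pointer.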

error: complex not supported yet

struct S {
    unsigned a:3;
    signed b:3;
    int c:3;
};

error: complex not supported yet

Changing signed b:3 to signed int b:3 makes it compile.

macOS support

Has this project been tested on macOS? Or are there any plans to support macOS?
By the way, is this project still maintained?

compile error appears in ubuntu 16.10

When I run make test, the following error messages come up:

/usr/bin/ld: /tmp/cc3cDNDM.o: relocation R_X86_64_32S against `.rodata' can not be used when making a shared object; recompile with -fPIC
/usr/bin/ld: final link failed: Nonrepresentable section on output
collect2: error: ld returned 1 exit status

The gcc version on my system is 6.2.0. Hoping for your reply!

The macro __DATE__ is wrong

There is a small mistake in the function Date in the file cpp.cc.
It should probably be: strftime(buf, 14, "%b %d %Y", tm);
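
For comparison, here is a sketch of a formatter that follows the form C11 gives for __DATE__ ("Mmm dd yyyy", where the first character of dd is a space when the day is below 10). It therefore uses %e (space-padded day) rather than %d (zero-padded day). FormatDateMacro is a name invented for this example, and the sketch assumes the default "C" locale and a four-digit year.

```cpp
#include <cstddef>
#include <ctime>
#include <string>

// Formats a date as "Mmm dd yyyy" with a space-padded day of month.
std::string FormatDateMacro(const std::tm& tm) {
  char buf[12];  // 11 characters plus the terminating NUL
  // %b: abbreviated month name, %e: day of month padded with a space,
  // %Y: year with century. strftime returns 0 if the result does not fit.
  std::size_t n = std::strftime(buf, sizeof(buf), "%b %e %Y", &tm);
  return std::string(buf, n);
}
```

A date of 5 January 2017 comes out with two spaces between the month and the day, whereas %d would produce "Jan 05 2017".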

'defined' not supported

A simple example:

# if defined(FOO)
#warning "xxx"
#else
#warning "yyy"
#endif

Another example:

#define X defined(FOO)

# if X
#warning "xxx"
#else
#warning "yyy"
#endif

ABI problem

struct S {
    int a, b;
};
void bar(struct S);
void foo() {
    struct S s;
    bar(s);
}

According to the AMD64 ABI, the whole of s is passed in the single register rdi. 😒
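
To illustrate why one register is enough: the struct is 8 bytes, so both fields fit in a single 64-bit "eightbyte". The sketch below copies the struct's bytes into a 64-bit integer to show the value that would travel in rdi. AsEightbyte is a name invented here, and the result shown in the note assumes a little-endian x86-64 target with 4-byte int.

```cpp
#include <cstdint>
#include <cstring>

struct S {
  int a, b;
};

static_assert(sizeof(S) == 8, "S is assumed to occupy exactly one eightbyte");

// Returns the struct's 8 bytes reinterpreted as one 64-bit value.
std::uint64_t AsEightbyte(const S& s) {
  std::uint64_t bits = 0;
  std::memcpy(&bits, &s, sizeof(s));  // byte copy avoids aliasing problems
  return bits;
}
```

On such a target, S{1, 2} becomes 0x0000000200000001: a sits in the low 32 bits and b in the high 32 bits.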

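Is #include_next handled incorrectly?

The SearchFile function in cpp.cc handles next with this snippet:

    if (next) {
      assert(curPath);
      if (path != *curPath)
        continue;
      else
        next = false;
    }
...
}

This means the search starts from the path that follows the current file's path.

But according to the description at https://gcc.gnu.org/onlinedocs/cpp/Wrapper-Headers.html: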

Suppose you specify -I /usr/local/include, and the list of directories to search also includes /usr/include;
and suppose both directories contain signal.h. Ordinary #include <signal.h> finds the file under
/usr/local/include. If that file contains #include_next <signal.h>, it starts searching after that directory, and finds the file in /usr/include.

This says the search starts from the path that follows the first path in which the included file was found.
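
The behaviour quoted from the GCC manual can be sketched as follows. This is an illustration only, not wgtcc's SearchFile: the function name, its parameters and the exists callback (which replaces real file-system access so the sketch stays testable) are all assumptions made for this example.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Returns the full path of `name`, or "" if it is not found.
// foundInDir is the search directory in which the including file was found.
std::string SearchInclude(
    const std::vector<std::string>& dirs, const std::string& name,
    const std::string& foundInDir, bool next,
    const std::function<bool(const std::string&)>& exists) {
  std::size_t start = 0;
  if (next) {
    // #include_next: resume after the directory of the including file.
    // If that directory is not in the list, search from the beginning.
    for (std::size_t i = 0; i < dirs.size(); ++i) {
      if (dirs[i] == foundInDir) {
        start = i + 1;
        break;
      }
    }
  }
  for (std::size_t i = start; i < dirs.size(); ++i) {
    const std::string path = dirs[i] + "/" + name;
    if (exists(path)) return path;
  }
  return "";
}
```

With the directories /usr/local/include and /usr/include both containing signal.h, an ordinary include resolves to the first, and an #include_next issued from the file found in /usr/local/include resolves to the second.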

I tried to build it on Ubuntu 14.04 and it failed.

I ran "make install" in the code's root folder and got errors like this:
m100@m100-desktop:~/gitRep/wgtcc$ make install
make[1]: Entering directory `/home/m100/gitRep/wgtcc'
make[2]: Entering directory `/home/m100/gitRep/wgtcc'
g++ -g -std=c++11 -Wall -o build/main.o -c main.cc
In file included from ast.h:5:0,
from code_gen.h:4,
from main.cc:1:
token.h: In member function ‘void TokenSequence::InsertBack(TokenSequence&)’:
token.h:454:57: error: ‘void pos’ has incomplete type
auto pos = tokList_->insert(end_, ts.begin_, ts.end_);
^
token.h: In member function ‘void TokenSequence::InsertFront(TokenSequence&)’:
token.h:470:12: error: no match for ‘operator=’ (operand types are ‘std::list<const Token*>::iterator {aka std::_List_iterator<const Token*>}’ and ‘void’)
begin_ = tokList_->insert(pos, ts.begin_, ts.end_);
^
token.h:470:12: note: candidates are:
In file included from /usr/include/c++/4.8/list:63:0,
from token.h:10,
from ast.h:5,
from code_gen.h:4,
from main.cc:1:
/usr/include/c++/4.8/bits/stl_list.h:125:12: note: std::_List_iterator<const Token*>& std::_List_iterator<const Token*>::operator=(const std::_List_iterator<const Token*>&)
struct _List_iterator
^
/usr/include/c++/4.8/bits/stl_list.h:125:12: note: no known conversion for argument 1 from ‘void’ to ‘const std::_List_iterator<const Token*>&’
/usr/include/c++/4.8/bits/stl_list.h:125:12: note: std::_List_iterator<const Token*>& std::_List_iterator<const Token*>::operator=(std::_List_iterator<const Token*>&&)
/usr/include/c++/4.8/bits/stl_list.h:125:12: note: no known conversion for argument 1 from ‘void’ to ‘std::_List_iterator<const Token*>&&’
make[2]: *** [build/main.o] Error 1
make[2]: Leaving directory `/home/m100/gitRep/wgtcc'
make[1]: *** [all] Error 2
make[1]: Leaving directory `/home/m100/gitRep/wgtcc'
make: *** [install] Error 2

Did I do something wrong when building it?
Thank you!

Token spacing in preprocessor

Macro expansion is a tricky operation, fraught with nasty corner cases. I tried the code snippet below with several compilers (gcc, clang, lcc, tcc, 9cc, wgtcc, 8cc). Unfortunately, only gcc, clang and lcc got it right.

#define PLUS +
#define EMPTY
#define f(x) =x=
+PLUS -EMPTY- PLUS+ f(=)

The right output is

 + + - - + + = = =

not

++ -- ++ ===
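
One way to avoid the wrong output is to print a separating space whenever two adjacent tokens would otherwise re-lex as a different token. The sketch below is a simplified illustration of that idea, not the algorithm of any of the compilers named above: OutTok, NeedsSpace and Render are invented names, and the punctuator list is not exhaustive.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

struct OutTok {
  std::string text;
  bool leadingWs;  // true if whitespace preceded the token in the source
};

bool IsWordChar(char c) {
  return std::isalnum(static_cast<unsigned char>(c)) != 0 || c == '_';
}

// True if writing prev and cur back to back could be lexed differently.
bool NeedsSpace(const std::string& prev, const std::string& cur) {
  if (prev.empty() || cur.empty()) return false;
  const char a = prev.back();
  const char b = cur.front();
  if (IsWordChar(a) && IsWordChar(b)) return true;  // "int" "x" -> "intx"
  static const char* const kPairs[] = {
      "++", "--", "==", "+=", "-=", "*=", "/=", "%=", "&=", "|=", "^=",
      "<=", ">=", "!=", "<<", ">>", "&&", "||", "->", "##", "//", "/*"};
  std::string glued(1, a);
  glued += b;
  for (const char* p : kPairs) {
    if (glued == p) return true;
  }
  return false;
}

std::string Render(const std::vector<OutTok>& toks) {
  std::string out;
  for (std::size_t i = 0; i < toks.size(); ++i) {
    const bool paste = i > 0 && NeedsSpace(toks[i - 1].text, toks[i].text);
    if (i > 0 && (toks[i].leadingWs || paste)) out += ' ';
    out += toks[i].text;
  }
  return out;
}
```

Fed the nine tokens that result from expanding the snippet, with leadingWs set where the source had whitespace, Render produces + + - - + + = = = rather than the pasted form.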

user supplied header file accidentally named 'wgtcc.h'

create a header file named wgtcc.h:

fuck

foo.c:

#include <wgtcc.h>
void foo() {}
$ wgtcc -E foo.c -I.

fuckfuck
void foo() {}

The output is strange.

If we don't create wgtcc.h, the internal wgtcc.h is included instead, which is surely not what the user intended.

In short, the internal wgtcc.h must NOT be placed in the standard include path.

';' expected, but got '__asm'

/usr/include/stdio.h:153:30: error: ';' expected, but got '__asm'
#define __DARWIN_ALIAS(sym) __asm("_" __STRING(sym) __DARWIN_SUF_UNIX03)

string initializer truncate

struct S {
    char c[10];
};

struct S s = {"abc", .c[0] = 89};

gcc:

	.globl	s1
	.data
	.align 8
	.type	s1, @object
	.size	s1, 10
s1:
	.byte	89
	.byte	98
	.byte	99
	.byte	0
	.zero	6

wgtcc:

	.data
	.globl	s1
	.align	1
	.type	s1, @object
	.size	s1, 10
s1:
	.byte	89
	.zero	9

weird not to pass ref type in UpdateFirstTokenLine

in cpp.h:

void Preprocessor::UpdateFirstTokenLine(TokenSequence ts);

It looks as if ts cannot be affected, because a copy of ts is passed. But the copy behaves like a reference, and in the end the loc_ of the caller's ts is changed.

That seems weird to me. Why not just pass a reference?
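
The effect can be reproduced with a small handle class. This is a sketch under the assumption that a copy of the sequence shares its underlying list through a pointer (the tokList_-> accesses in the build log elsewhere on this page suggest something like that); SharedSeq and both Update functions are invented for this example.

```cpp
#include <list>
#include <memory>

// Hypothetical stand-in for a sequence whose copies share one list.
class SharedSeq {
 public:
  SharedSeq() : lines_(std::make_shared<std::list<int>>()) {}

  void Add(int line) { lines_->push_back(line); }

  void SetFirstLine(int line) {
    if (!lines_->empty()) lines_->front() = line;
  }

  // Precondition: at least one element has been added.
  int FirstLine() const { return lines_->front(); }

 private:
  // Copying a SharedSeq copies this pointer, not the list it points to.
  std::shared_ptr<std::list<int>> lines_;
};

// Takes a copy, yet the caller still observes the change, because the
// copy and the original point at the same list.
void UpdateByValue(SharedSeq seq, int line) { seq.SetFirstLine(line); }

// Taking a reference makes the mutation visible in the signature.
void UpdateByRef(SharedSeq& seq, int line) { seq.SetFirstLine(line); }
```

Both functions change what the caller sees; only the second one says so in its signature, which is the readability point the question raises.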

wrong error report 'incompatible type of initializer'

struct S {
    int a;
    int b;
};

int i;
struct S s = {1, (struct S){i}.a};

wgtcc: error: incompatible type of initializer

But there is no type error here; the actual error is that the initializer is not a compile-time constant.

Can't compile with clang

For example, at line 11 in code_gen.h, template<> should be added before class Evaluator<Addr>. And in cpp.cc, the variable cond is of type int, but ppCondStack_.push() needs a bool parameter. Casting an int to bool is not valid.

Is clang so different from gcc?
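
Both points can be shown in a few lines. This is a sketch, not the code from code_gen.h or cpp.cc: the members of Evaluator, the PPCond struct and PushCond are invented. It also assumes the int-to-bool problem arises in a brace-enclosed initializer, where C++11 treats a non-constant int to bool conversion as narrowing; whether that is the situation in cpp.cc is a guess.

```cpp
#include <stack>

struct Addr {
  int offset;
};

// Primary template.
template <typename T>
class Evaluator {
 public:
  static int Kind() { return 0; }
};

// An explicit specialization must be introduced by `template <>`.
template <>
class Evaluator<Addr> {
 public:
  static int Kind() { return 1; }
};

// Hypothetical element type of a preprocessor condition stack.
struct PPCond {
  int tag;
  bool enabled;
  bool cond;
};

void PushCond(std::stack<PPCond>& st, int tag, bool enabled, int cond) {
  // Inside braces an int variable may not narrow to bool implicitly,
  // so the conversion is spelled out.
  st.push({tag, enabled, cond != 0});
}
```

Outside a braced initializer, an int converts to bool implicitly, so the explicit cond != 0 is needed only where narrowing rules apply.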

A compile error in code_gen.cc

void Generator::AllocObjects(Scope* scope, const FuncDef::ParamList& params)
     if (obj->Type()->ToArray()) {
       // The alignment of an array is at least the alignment of a pointer
       // (as it is always cast to a pointer)
       align = std::max(align, );    // <<<<====== here
     }
     offset = Type::MakeAlign(offset, align);
     obj->SetOffset(offset);
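
Going by the comment in the snippet, the missing second argument is presumably the alignment of a pointer, which is 8 on x86-64. The sketch below shows the computation under that assumption. MakeAlign here is a simplified stand-in that handles only non-negative offsets and power-of-two alignments, and kPointerAlign and PlaceObject are names invented for this example.

```cpp
#include <algorithm>

// Rounds a non-negative offset up to the next multiple of align.
// align must be a power of two.
int MakeAlign(int offset, int align) {
  return (offset + align - 1) & ~(align - 1);
}

// Assumption: the argument missing from std::max is the pointer alignment.
const int kPointerAlign = 8;

// Returns the aligned offset at which an object would be placed.
int PlaceObject(int offset, int align, bool isArray) {
  if (isArray) {
    // An array is aligned at least as strictly as a pointer.
    align = std::max(align, kPointerAlign);
  }
  return MakeAlign(offset, align);
}
```

So a char array that would otherwise start at offset 5 is moved to offset 8, while a plain char at offset 5 stays where it is.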

compile julia failed

Compiling and running julia results in a segmentation fault. The struct event structure is packed in the kernel. But as wgtcc does not support this extension, it compiles struct event to 16 bytes (while gcc packs it to 12 bytes). This causes a segmentation fault when trying to get the fd after a request has been accepted.
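
The size difference is easy to reproduce. The sketch below assumes a layout of one 32-bit field followed by one 64-bit field, similar to epoll_event on x86-64 Linux; the struct names are invented, and __attribute__((packed)) is the GNU extension as accepted by gcc and clang.

```cpp
#include <cstddef>
#include <cstdint>

// Natural layout: 4 bytes of padding are inserted so data is 8-byte aligned.
struct EventDefault {
  std::uint32_t events;
  std::uint64_t data;
};

// Packed layout: no padding, data starts right after events.
struct EventPacked {
  std::uint32_t events;
  std::uint64_t data;
} __attribute__((packed));

static_assert(sizeof(EventDefault) == 16, "padded layout assumed on x86-64");
static_assert(sizeof(EventPacked) == 12, "packed layout has no padding");
static_assert(offsetof(EventDefault, data) == 8, "data after padding");
static_assert(offsetof(EventPacked, data) == 4, "data directly after events");
```

A compiler that ignores the attribute reads data at offset 8 while the kernel wrote it at offset 4, which is consistent with the wrong fd described above.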
