tree-sitter / tree-sitter-cpp


C++ grammar for tree-sitter

License: MIT License

Languages: JavaScript 83.15%, C 11.26%, Scheme 3.96%, C++ 1.63%
Topics: tree-sitter, cplusplus, parser

tree-sitter-cpp's People

Contributors

alois31, amaanq, aryx, asutherland, bekavalentine, bigredeye, brandonspark, calixteman, chewygumball, cynix, ecnerwala, elbeno, jdrouhard, khiner, lukepistrol, luni-4, mathieunls, maxbrunsfeld, mliszcz, msftenhanceprovenance, observeroftime, sam-mccall, sebastiansturm, squadrick, thehamsta, v1nh1shungry, vladh, xvilka

tree-sitter-cpp's Issues

I give up

Sorry for bothering you and for taking up your time. I was really excited about your project; I watched the whole 43 minutes about this project and wanted to get it working, but I could not find help anywhere, and I hoped that you could help me.
I am sorry. I hope you have a great time.
Good luck.

tree-sitter-cpp changes in AST breaks nimterop

This commit just broke nimterop parsing of tree-sitter AST.

bitwise_expression is now binary_expression. I'm not sure what else has changed or why but am sticking with v0.15.0 for now.

Please let me know what other breaking changes were made.

typedef enum {
  enum4 = 3,
  enum5,
  enum6,
  enum6a = enum5 & enum6,
  enum6b = enum5 | enum6
} ENUM2;

Was:

(translation_unit 1 1 113
 (type_definition 1 1 111
  (enum_specifier 1 9 96
   (enumerator_list 1 14 91
    (enumerator 2 3 9
     (identifier 2 3 5)
     (number_literal 2 11 1)
    )
    (enumerator 3 3 5
     (identifier 3 3 5)
    )
    (enumerator 4 3 5
     (identifier 4 3 5)
    )
    (enumerator 5 3 22
     (identifier 5 3 6)
     (bitwise_expression 5 12 13
      (identifier 5 12 5)
      (identifier 5 20 5)
     )
    )
    (enumerator 6 3 22
     (identifier 6 3 6)
     (bitwise_expression 6 12 13
      (identifier 6 12 5)
      (identifier 6 20 5)
     )
    )
   )
  )
  (type_identifier 7 3 5)
 )
)

Now:

(translation_unit 1 1 113
 (type_definition 1 1 111
  (enum_specifier 1 9 96
   (enumerator_list 1 14 91
    (enumerator 2 3 9
     (identifier 2 3 5)
     (number_literal 2 11 1)
    )
    (enumerator 3 3 5
     (identifier 3 3 5)
    )
    (enumerator 4 3 5
     (identifier 4 3 5)
    )
    (enumerator 5 3 22
     (identifier 5 3 6)
     (binary_expression 5 12 13
      (identifier 5 12 5)
      (identifier 5 20 5)
     )
    )
    (enumerator 6 3 22
     (identifier 6 3 6)
     (binary_expression 6 12 13
      (identifier 6 12 5)
      (identifier 6 20 5)
     )
    )
   )
  )
  (type_identifier 7 3 5)
 )
)
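
For consumers that walk the tree by node-type name, one way to tolerate the rename shown in the two trees above is to accept both spellings. The following is only a sketch: the helper name is hypothetical, and it makes no claim about what else changed in the grammar.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: returns true for either node-type name that wraps
// `enum5 & enum6` and `enum5 | enum6` in the two trees shown above.
bool is_binary_like(const std::string& node_type) {
  return node_type == "binary_expression" ||  // name in the "Now" tree
         node_type == "bitwise_expression";   // name in the "Was" tree
}
```

Note that binary_expression may cover more operators than the old bitwise_expression did, so a real consumer would probably also need to inspect the operator token.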

Errors detected for valid C++

From the soloud.h file:

handle play(AudioSource &aSound, float aVolume = -1.0f, float aPan = 0.0f, bool aPaused = 0, unsigned int aBus = 0);

An error is detected at the f suffix of -1.0f (and likewise at the f suffix of 0.0f):

(translation_unit 1 1 116
 (declaration 1 1 116
  (type_identifier 1 1 6)
  (function_declarator 1 8 108
   (identifier 1 8 4)
   (parameter_list 1 12 104
    (parameter_declaration 1 13 19
     (type_identifier 1 13 11)
     (reference_declarator 1 25 7
      (identifier 1 26 6)
     )
    )
    (optional_parameter_declaration 1 34 20
     (primitive_type 1 34 5)
     (identifier 1 40 7)
     (math_expression 1 50 4
      (number_literal 1 51 3)
     )
    )
    (ERROR 1 54 1                                     <==
     (identifier 1 54 1)
    )
    (optional_parameter_declaration 1 57 16
     (primitive_type 1 57 5)
     (identifier 1 63 4)
     (number_literal 1 70 3)
    )
    (ERROR 1 73 1                                     <==
     (identifier 1 73 1)
    )
    (optional_parameter_declaration 1 76 16
     (primitive_type 1 76 4)
     (identifier 1 81 7)
     (number_literal 1 91 1)
    )
    (optional_parameter_declaration 1 94 21
     (sized_type_specifier 1 94 12
      (primitive_type 1 103 3)
     )
     (identifier 1 107 4)
     (number_literal 1 114 1)
    )
   )
  )
 )
)

The numbers in the AST are the row, column and length.
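
As a sanity check that the reported declaration is valid C++, the sketch below shows that 1.0f is a single float literal and that -1.0f is unary minus applied to it. The function name default_volume is made up for illustration; only the default value comes from the report.

```cpp
#include <cassert>
#include <type_traits>

// The f suffix is part of the literal token and makes its type float.
static_assert(std::is_same<decltype(1.0f), float>::value, "1.0f is a float literal");
// Unary minus applied to a float yields a float.
static_assert(std::is_same<decltype(-1.0f), float>::value, "-1.0f has type float");

// Hypothetical function using the same default argument as the report.
float default_volume(float aVolume = -1.0f) { return aVolume; }
```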

Macro annotations

In firefox code, we use some macros to make annotations (for clang static analysis for example).
For example:
https://searchfox.org/mozilla-central/source/dom/animation/AnimationTarget.h#54
or
https://searchfox.org/mozilla-central/source/mfbt/CheckedInt.h#510
Such annotations lead to parse errors, and I understand why: the code is only valid C/C++ once the right macros are defined.
So is this something tree-sitter-c/cpp could handle, or should I maintain my own grammar for this code?
I would of course prefer the former.
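
To illustrate the kind of annotation being described, here is a minimal sketch with a made-up macro, EXAMPLE_MUST_USE. It is not one of the Firefox macros; it only shows why a parser that does not run the preprocessor sees an unexpected identifier in front of the declaration.

```cpp
#include <cassert>

// EXAMPLE_MUST_USE is a hypothetical annotation macro. On GCC/Clang it
// expands to an attribute; elsewhere it expands to nothing.
#if defined(__clang__) || defined(__GNUC__)
#  define EXAMPLE_MUST_USE __attribute__((warn_unused_result))
#else
#  define EXAMPLE_MUST_USE
#endif

class Example {
 public:
  // Without preprocessing, this line starts with an unknown identifier.
  EXAMPLE_MUST_USE bool Init() {
    initialized_ = true;
    return initialized_;
  }

 private:
  bool initialized_ = false;
};

bool example_init_succeeds() {
  Example e;
  return e.Init();
}
```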

Weird behaviour with Structs

I believe the following is an issue because the full code compiles, which indicates that it is a valid construction. I don't know much about structs in C++, though.

Anyway, the following is handled strangely (it is easier to see in Atom).

struct Scanner {

  vector<CategoryDescription> category_descriptions = {
    {FOO, FOO, FOO},
    {FOO, FOO, FOO}
  };

  map<string, TokenType> tokens = {
    {"(", BEGIN_INLINE_MATH},
    {")", END_INLINE_MATH},
  };

  map<char, Category> catcodes = {
    {'\\',   ESCAPE_CATEGORY},
    {'{',    BEGIN_CATEGORY}
  };

};

The first item is scoped differently to the others, and reordering them or adding fields above does not affect this (whatever the first is is still scoped differently).

It also folds incorrectly; folding line 1 folds up until the semicolon of the vector, when it should be folding the entire struct.

value field of declaration node is required in node-types.json but does not exist in parse tree

node-types.json says that the value field of the declaration node is required. The following is the relevant part of node-types.json.

  {
    "type": "declaration",
    "named": true,
    "fields": {
      .....
      "value": {
        "multiple": false,
        "required": true,
        "types": [
          {
            "type": "_expression",
            "named": true
          },
          {
            "type": "initializer_list",
            "named": true
          }
        ]
      },
      .....
    },

However, the value field of the declaration node does not exist in the parse tree generated by tree-sitter. The parse tree of the source code int func(); is:

declaration:
  children: [
    primitive_type,
    function_declarator,
    ;
  ]
  ...

The field name of the primitive_type child node is type, and the field name of the function_declarator child node is declarator. However, there is no child node with the field name value.

Method with trailing semicolon

This code will parse without error:

class Test
{
  char *name() { return ""; }
};

However this code generates a parse error:

class Test
{
  char *name() { return ""; };
};

Template Member Initialization

A templated constructor that uses member initialization breaks syntax highlighting in the class's definition.

[screenshot: template_member_init]

In this example the types of the following functions and the first member type are missing their highlighting.

Error on multiple names for same struct in definition

Snippet is adapted from FreeImage.h.

typedef struct tagBITMAPINFOHEADER{
  int biClrImportant;
} BITMAPINFOHEADER, *PBITMAPINFOHEADER;

This compiles fine in gcc but fails in tree-sitter.

(translation_unit 1 1 97
 (type_definition 1 1 97
  (struct_specifier 1 9 51
   (type_identifier 1 16 19)
   (field_declaration_list 1 35 25
    (field_declaration 2 3 19
     (primitive_type 2 3 3)
     (field_identifier 2 7 14)
    )
   )
  )
  (ERROR 3 3 17
   (type_identifier 3 3 16)
  )
  (pointer_declarator 3 21 18
   (type_identifier 3 22 17)
  )
 )
)
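
For reference, the typedef above introduces two names in one declaration: a name for the struct itself and a name for a pointer to it. The sketch below reuses the snippet from the report; the function read_through_pointer is a hypothetical addition.

```cpp
#include <cassert>
#include <type_traits>

typedef struct tagBITMAPINFOHEADER {
  int biClrImportant;
} BITMAPINFOHEADER, *PBITMAPINFOHEADER;

// The second declarator names a pointer to the same struct.
static_assert(std::is_same<PBITMAPINFOHEADER, BITMAPINFOHEADER*>::value,
              "PBITMAPINFOHEADER is BITMAPINFOHEADER*");

// Hypothetical usage of both typedef names.
int read_through_pointer() {
  BITMAPINFOHEADER header = {42};
  PBITMAPINFOHEADER p = &header;
  return p->biClrImportant;
}
```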

Instructions

I don't know how to hook tree-sitter-cpp up to Atom's built-in tree-sitter. How do I install tree-sitter-cpp? It is not shipped with the Atom editor, and there are no instructions on how to build a grammar, not even on https://tree-sitter.github.io/tree-sitter/
I would like clear, step-by-step instructions.
I hope you can help me, please.
Windows 7, 64-bit

Error when defining multiple structs following typedef

typedef struct foo {
    int i;
} a, b;

Causes the bug, as seen in this AST output:

(translation_unit 1 1 39
 (type_definition 1 1 39
  (struct_specifier 1 9 25
   (type_identifier 1 16 3)
   (field_declaration_list 1 20 14
    (field_declaration 2 5 6
     (primitive_type 2 5 3)
     (field_identifier 2 9 1)
    )
   )
  )
  (ERROR 3 3 2
   (type_identifier 3 3 1)
  )
  (type_identifier 3 6 1)
 )
)

When only one name is defined (i.e. } a; instead of } a, b;):

(translation_unit 1 1 36
 (type_definition 1 1 36
  (struct_specifier 1 9 25
   (type_identifier 1 16 3)
   (field_declaration_list 1 20 14
    (field_declaration 2 5 6
     (primitive_type 2 5 3)
     (field_identifier 2 9 1)
    )
   )
  )
  (type_identifier 3 3 1)
 )
)

No distinction between line/block comments

There appears to be one type of comment node, and it's called comment. Both line comments and block comments are assigned to this node type. As a result, the grammar in atom/language-c can't distinguish between the two types and scopes both as comment rather than comment.line or comment.block.

This means that line comments can't be styled differently from block comments. It also means that a package like DocBlockr can't function properly because it relies on the scope name to tell it how to behave within different types of comments.

I'm not a C++ expert, to put it mildly; so if there's a reason for this that I'm unfamiliar with, please let me know.

Make the varargs ellipsis a named node

No information is provided about the ellipsis in the AST.

extern void eggx_gsetinitialparsegeometry( const char *, ... ) ;

AST is:

(translation_unit 1 1 64
 (declaration 1 1 64
  (storage_class_specifier 1 1 6)
  (primitive_type 1 8 4)
  (function_declarator 1 13 50
   (identifier 1 13 29)
   (parameter_list 1 42 21
    (parameter_declaration 1 44 12
     (type_qualifier 1 44 5)
     (primitive_type 1 50 4)
     (abstract_pointer_declarator 1 55 1)
    )
   )
  )
 )
)
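
To show why the ellipsis carries information worth exposing in the tree, here is a minimal sketch of a variadic function that reads its extra arguments through <cstdarg>. The function sum_ints is hypothetical and unrelated to eggx_gsetinitialparsegeometry.

```cpp
#include <cassert>
#include <cstdarg>

// Hypothetical variadic function: `count` says how many int arguments
// follow. Without the `...` the extra arguments would not be accepted.
int sum_ints(int count, ...) {
  va_list args;
  va_start(args, count);
  int total = 0;
  for (int i = 0; i < count; ++i) {
    total += va_arg(args, int);
  }
  va_end(args);
  return total;
}
```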

Support GCC's __thread

Due to __thread, the following code is parsed as a function node instead of a struct node.

static __thread struct
{
    const void **tab;
    size_t count;
} list = { NULL, 0 };
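
For comparison, the sketch below writes the same declaration with the standard C++11 thread_local keyword in place of GCC's __thread extension. Whether the grammar handles this spelling is not something the report states; the accessor list_count is a hypothetical addition.

```cpp
#include <cassert>
#include <cstddef>

// Same shape as the snippet above, using the standard storage-class
// specifier instead of the GCC-specific __thread.
static thread_local struct {
  const void **tab;
  std::size_t count;
} list = { nullptr, 0 };

// Hypothetical accessor, so that the variable is actually used.
std::size_t list_count() { return list.count; }
```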

Struct with __attribute__

Structures with attributes are parsed incorrectly. For example:

struct __attribute__((packed)) struct1_t {
    static constexpr int A = 1;
    static constexpr int B = 2;
};

It seems to me that they are somehow taken for a function_definition.
This will help to resolve this issue.
Thank you very much.
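
As background on what the attribute does, here is a small sketch that assumes GCC or Clang. The struct names are made up; the packed variant drops the padding that the plain variant normally has between its members.

```cpp
#include <cassert>
#include <cstddef>

#if defined(__GNUC__) || defined(__clang__)
// The attribute sits between the `struct` keyword and the name,
// exactly as in the report.
struct __attribute__((packed)) packed_example_t {
  char tag;
  int value;
};
static_assert(sizeof(packed_example_t) == sizeof(char) + sizeof(int),
              "packed layout has no padding");
#endif

// Same members without the attribute; the compiler may insert padding.
struct plain_example_t {
  char tag;
  int value;
};

std::size_t plain_size() { return sizeof(plain_example_t); }
```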

The C++ grammar is ambiguous

I just came across the Strange Loop talk about tree-sitter, which claimed that tree-sitter aims to enable syntax highlighting that both:

  1. does not require building a full semantic model of the code (the way, say, an LSP language server would); and
  2. highlights different language elements differently, e.g. types differently from variables

It seems to me that these two goals are in fundamental conflict for a language with an ambiguous grammar such as C++.

To take the classic example: in a statement context, the following statement in C++:

a * b;

is ambiguous between an expression-statement that multiplies the two variables a and b, and a declaration-statement that declares a variable b of type pointer-to-a. The answer depends on how a was previously declared: as a variable or as a type (and note that the declaration could occur in an included header file).
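
Both readings can be demonstrated with a compiler. In the sketch below the same tokens a * b; appear twice, and the namespaces and function names are hypothetical scaffolding to keep the two meanings of a apart.

```cpp
#include <cassert>
#include <type_traits>

namespace a_is_type {
struct a {};
// Here `a * b;` declares a variable b of type pointer-to-a.
inline bool declares_pointer() {
  a * b;
  return std::is_same<decltype(b), a*>::value;
}
}  // namespace a_is_type

namespace a_is_variable {
// Here `a * b;` is an expression statement whose result is discarded.
inline int product() {
  int a = 6, b = 7;
  a * b;
  return a * b;
}
}  // namespace a_is_variable
```

A compiler may warn about the unused pointer or the discarded product, but both functions are well-formed.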

A quick check in the tree-sitter playground suggests that tree-sitter is always parsing this as a declaration-statement. For example, given the input code:

int a, b;
void foo() {
  a * b;
}

the resulting syntax tree is:

translation_unit [0, 0] - [4, 0])
  declaration [0, 0] - [0, 9])
    type: primitive_type [0, 0] - [0, 3])
    declarator: identifier [0, 4] - [0, 5])
    declarator: identifier [0, 7] - [0, 8])
  function_definition [1, 0] - [3, 1])
    type: primitive_type [1, 0] - [1, 4])
    declarator: function_declarator [1, 5] - [1, 10])
      declarator: identifier [1, 5] - [1, 8])
      parameters: parameter_list [1, 8] - [1, 10])
    body: compound_statement [1, 11] - [3, 1])
      declaration [2, 2] - [2, 8])
        type: type_identifier [2, 2] - [2, 3])
        declarator: pointer_declarator [2, 4] - [2, 7])
          declarator: identifier [2, 6] - [2, 7])

Notably, syntax highlighting built on top of this would incorrectly color the a token in a * b as a type, rather than as a variable.

How does tree-sitter plan to deal with such ambiguities?

Template Class Constructors parsed incorrectly

@rsese - copied over @BeefaloKing's report from atom/language-c#332

Description

Both the name and initializer list of template class constructors have incorrect syntax highlighting when defined outside the class declaration.
[screenshot]
Edit: Added clarity in screenshot

Edit by @rsese - copying over a comment

The foo() and data(nullptr) on line 17. These should match the highlighting for the constructor bar() on line 20.

Additionally, if foo() were not a constructor and had a return type, it would be highlighted correctly (e.g. template<typename T> int foo<T>::foobar()).

Steps to Reproduce

  1. Define a constructor for a template class outside the class declaration
template<typename T> class foo
{
public:
	foo();
private:
	T *data;
};

class bar
{
public:
	bar();
private:
	int *data;
};

template<typename T> foo<T>::foo() : data(nullptr)
{}

bar::bar() : data(nullptr)
{}

Versions

Atom : 1.38.2
Electron: 2.0.18
Chrome : 61.0.3163.100
Node : 8.9.3

Possible Related Issues

atom/language-c#90

Default operator=

It seems that default_method_clause is missing for assignment operators (operator=).
I'm not sure, but it may be worth adding default_method_clause to inline_method_definition alongside delete_method_clause, as in constructor_or_destructor_definition.

class SomeClass : public BaseClass {
    SomeClass(const SomeClass&) = delete;            // OK
    SomeClass(SomeClass&&) = default;                // OK
    SomeClass& operator=(const SomeClass&) = delete; // OK
    SomeClass& operator=(SomeClass&&) = default;     // FAILS
};

This will help to improve this extension. Thank you.

Math Operators

The only math operator that is highlighted properly is *.

[screenshot: math_operators2]

Final Keyword

The final modifier for virtual functions is treated as syntax--variable.syntax--other.syntax--member while the override keyword is syntax--support.syntax--storage.syntax--type, causing them to be highlighted differently.

[screenshot: override_final]

They should probably both be syntax--support.syntax--storage.syntax--type

`virtual` keyword before member functions causes parsing errors

Thanks for creating tree-sitter and the corresponding grammars!

Parsing the following code causes errors in the syntax tree:

#include <string>

class Point
{
public:
    Point(int x, int y) { this.x = x; this.y = y; }
    ~Point();
    virtual ~Point();
    virtual void dance();
    void dance();

    Foo create();
    
private:
    int x, y;
    std::string name;
};

int main( [[maybe_unused]] int argc, const char** argv )
{
    return 0;
}

Here are the corresponding s-expressions:

(translation_unit (preproc_include (system_lib_string)) (class_specifier (type_identifier) (field_declaration_list (access_specifier) (function_definition (function_declarator (identifier) (parameter_list (parameter_declaration (primitive_type) (identifier)) (parameter_declaration (primitive_type) (identifier)))) (compound_statement (expression_statement (assignment_expression (field_expression (identifier) (field_identifier)) (identifier))) (expression_statement (assignment_expression (field_expression (identifier) (field_identifier)) (identifier))))) (declaration (function_declarator (destructor_name (identifier)) (parameter_list))) (field_declaration (type_identifier) (ERROR) (function_declarator (field_identifier) (parameter_list))) (field_declaration (type_identifier) (ERROR (identifier)) (function_declarator (field_identifier) (parameter_list))) (field_declaration (primitive_type) (function_declarator (field_identifier) (parameter_list))) (field_declaration (type_identifier) (function_declarator (field_identifier) (parameter_list))) (access_specifier) (field_declaration (primitive_type) (field_identifier) (field_identifier)) (field_declaration (scoped_type_identifier (namespace_identifier) (type_identifier)) (field_identifier)))) (function_definition (primitive_type) (function_declarator (identifier) (parameter_list (ERROR (lambda_capture_specifier (identifier))) (parameter_declaration (primitive_type) (identifier)) (parameter_declaration (type_qualifier) (primitive_type) (pointer_declarator (pointer_declarator (identifier)))))) (compound_statement (return_statement (number_literal)))))

In particular, attributes and the keyword virtual in front of member functions are not recognized.

field_declaration:
         virtual ~Point();

type_identifier:
         virtual

ERROR:
         ~

I used the tree-sitter Rust crate and the most up-to-date revision of the C++ grammar (57dd274de60d36645b8445ce808816835cfd2fb9).

Bad AST for the following code

# define SHA256_CBLOCK   (SHA_LBLOCK*4)/* SHA-256 treats input data as a
                                        * contiguous array of 32 bit wide
                                        * big-endian values. */

typedef struct SHA256state_st {
    SHA_LONG h[8];
    SHA_LONG Nl, Nh;
    SHA_LONG data[SHA_LBLOCK];
    unsigned int num, md_len;
} SHA256_CTX;

This results in a bad AST with errors. The comment style (a block comment that starts on the #define line and continues on the following lines) confuses tree-sitter.

(translation_unit 1 1 359
 (preproc_def 1 1 73
  (identifier 1 10 13)
  (preproc_arg 1 23 50)
 )
 (expression_statement 2 41 12
  (pointer_expression 2 41 12
   (identifier 2 43 10)
  )
 )
 (declaration 2 54 15
  (type_identifier 2 54 5)
  (ERROR 2 60 5
   (identifier 2 60 2)
   (number_literal 2 63 2)
  )
  (identifier 2 66 3)
 )
 (expression_statement 2 70 77
  (binary_expression 2 70 77
   (binary_expression 2 70 50
    (identifier 2 70 4)
    (identifier 3 43 3)
   )
   (ERROR 3 47 14
    (identifier 3 47 6)
    (identifier 3 54 6)
   )
   (binary_expression 3 62 11
    (pointer_expression 3 62 1
     (identifier 3 63 0)
    )
    (identifier 5 1 7)
   )
  )
 )
 (declaration 5 9 138
  (struct_specifier 5 9 126
   (type_identifier 5 16 14)
   (field_declaration_list 5 31 104
    (field_declaration 6 5 14
     (type_identifier 6 5 8)
     (array_declarator 6 14 4
      (field_identifier 6 14 1)
      (number_literal 6 16 1)
     )
    )
    (field_declaration 7 5 16
     (type_identifier 7 5 8)
     (field_identifier 7 14 2)
     (field_identifier 7 18 2)
    )
    (field_declaration 8 5 26
     (type_identifier 8 5 8)
     (array_declarator 8 14 16
      (field_identifier 8 14 4)
      (identifier 8 19 10)
     )
    )
    (field_declaration 9 5 25
     (sized_type_specifier 9 5 12
      (primitive_type 9 14 3)
     )
     (field_identifier 9 18 3)
     (field_identifier 9 23 6)
    )
   )
  )
  (identifier 10 3 10)
 )
)
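
The construct is valid because comments are replaced by a single space before preprocessing directives are processed, so a block comment may start on a #define line and end several lines later. The following sketch uses hypothetical macro names rather than the SHA ones from the report.

```cpp
#include <cassert>

#define EXAMPLE_LBLOCK 16
#define EXAMPLE_CBLOCK (EXAMPLE_LBLOCK*4)/* this comment starts on the
                                          * directive line and ends
                                          * two lines later */

// The macro body is just (EXAMPLE_LBLOCK*4); the comment is not part of it.
static_assert(EXAMPLE_CBLOCK == 64, "comment does not affect the macro");

int example_cblock() { return EXAMPLE_CBLOCK; }
```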

Default vs Delete

When a member function is marked delete, it treats the delete as syntax--keyword.syntax--operator, while the default is treated as syntax--keyword.syntax--control.

[screenshot: default_delete]

It would make sense to have delete in this case to be treated as syntax--keyword.syntax--control

Errors detected for valid C

From the ImageMagick codebase - image-view.h:

typedef MagickBooleanType
  (*DuplexTransferImageViewMethod)(const ImageView *,const ImageView *,
    ImageView *,const ssize_t,const int,void *),
  (*GetImageViewMethod)(const ImageView *,const ssize_t,const int,void *),
  (*SetImageViewMethod)(ImageView *,const ssize_t,const int,void *),
  (*TransferImageViewMethod)(const ImageView *,ImageView *,const ssize_t,
    const int,void *),
(*UpdateImageViewMethod)(ImageView *,const ssize_t,const int,void *);

This is a multi-line typedef that declares several function-pointer types with the same return type, separated by commas. tree-sitter errors on the ,.

AST output is here. See line 65.
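
A reduced version of the same construct, with hypothetical names in place of the ImageMagick ones, may be easier to reason about. A single typedef with two comma-separated declarators defines two function-pointer types that share a return type.

```cpp
#include <cassert>
#include <type_traits>

// One typedef, two declarators, both returning int.
typedef int (*ExampleUnaryFn)(int), (*ExampleBinaryFn)(int, int);

static_assert(std::is_same<ExampleUnaryFn, int (*)(int)>::value,
              "first declarator");
static_assert(std::is_same<ExampleBinaryFn, int (*)(int, int)>::value,
              "second declarator");

int example_negate(int a) { return -a; }
int example_add(int a, int b) { return a + b; }

// Hypothetical usage of both typedef names.
int apply_both(int x, int y) {
  ExampleUnaryFn u = example_negate;
  ExampleBinaryFn b = example_add;
  return u(b(x, y));
}
```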

Can not parse operator new/delete

While trying to parse some C++ sources, I found that the current parser does not handle operator new/delete:

class One {
 public:
	char *s;

	void *operator new(size_t);
	void operator delete(void *);
	void dcl(void);

	One();
};

Output:

translation_unit [0, 0] - [11, 0])
  class_specifier [0, 0] - [9, 1])
    name: type_identifier [0, 6] - [0, 9])
    body: field_declaration_list [0, 10] - [9, 1])
      access_specifier [1, 1] - [1, 8])
      field_declaration [2, 1] - [2, 9])
        type: primitive_type [2, 1] - [2, 5])
        declarator: pointer_declarator [2, 6] - [2, 8])
          declarator: field_identifier [2, 7] - [2, 8])
      field_declaration [4, 1] - [4, 15])
        type: primitive_type [4, 1] - [4, 5])
        declarator: pointer_declarator [4, 6] - [4, 15])
          declarator: field_identifier [4, 7] - [4, 15])
        MISSING ; [4, 15] - [4, 15])
      declaration [4, 16] - [4, 28])
        declarator: function_declarator [4, 16] - [4, 27])
          declarator: identifier [4, 16] - [4, 19])
          parameters: parameter_list [4, 19] - [4, 27])
            parameter_declaration [4, 20] - [4, 26])
              type: primitive_type [4, 20] - [4, 26])
      field_declaration [5, 1] - [5, 14])
        type: primitive_type [5, 1] - [5, 5])
        declarator: field_identifier [5, 6] - [5, 14])
        MISSING ; [5, 14] - [5, 14])
      declaration [5, 15] - [5, 30])
        declarator: function_declarator [5, 15] - [5, 29])
          declarator: identifier [5, 15] - [5, 21])
          parameters: parameter_list [5, 21] - [5, 29])
            parameter_declaration [5, 22] - [5, 28])
              type: primitive_type [5, 22] - [5, 26])
              declarator: abstract_pointer_declarator [5, 27] - [5, 28])
      field_declaration [6, 1] - [6, 16])
        type: primitive_type [6, 1] - [6, 5])
        declarator: function_declarator [6, 6] - [6, 15])
          declarator: field_identifier [6, 6] - [6, 9])
          parameters: parameter_list [6, 9] - [6, 15])
            parameter_declaration [6, 10] - [6, 14])
              type: primitive_type [6, 10] - [6, 14])
      declaration [8, 1] - [8, 7])
        declarator: function_declarator [8, 1] - [8, 6])
          declarator: identifier [8, 1] - [8, 4])
          parameters: parameter_list [8, 4] - [8, 6])

Macro prevents correct parsing of class

An imported macro (I simplified the source file) prevented the correct parsing of the following class.

#include <GLFW/glfw3.h>

PXR_NAMESPACE_USING_DIRECTIVE

class Scene
{

};

I know that in the presence of macros a parser has almost no chance of understanding C++, but I hope that this failure case may be useful for improving the parser.



translation_unit [3, 0] - [15, 0])
  preproc_include [3, 0] - [6, 0])
    path: system_lib_string [3, 9] - [3, 23])
  declaration [6, 0] - [11, 2])
    type: type_identifier [6, 0] - [6, 29])
    ERROR [8, 0] - [8, 5])
      identifier [8, 0] - [8, 5])
    declarator: init_declarator [8, 6] - [11, 1])
      declarator: identifier [8, 6] - [8, 11])
      value: initializer_list [9, 0] - [11, 1])
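
A plausible reason for this tree is that the macro expands to a complete statement, so the use site has no trailing semicolon and the parser glues the bare identifier to the class that follows. The sketch below assumes such a macro; EXAMPLE_NAMESPACE_USING_DIRECTIVE and example_ns are made-up names, and the report does not show the definition of PXR_NAMESPACE_USING_DIRECTIVE.

```cpp
#include <cassert>

namespace example_ns {
inline int answer() { return 42; }
}  // namespace example_ns

// Hypothetical macro that expands to a full using-directive,
// including its own semicolon.
#define EXAMPLE_NAMESPACE_USING_DIRECTIVE using namespace example_ns;

// At the use site there is only a bare identifier on its own line.
EXAMPLE_NAMESPACE_USING_DIRECTIVE

class Scene
{

};

int scene_answer() { return answer(); }
```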

Structured Bindings

Structured bindings cause the entire statement to be highlighted the same color. I have tried it across several themes, and the bug is consistent.

This screenshot shows the error in action and an example of how to reproduce it.

int a = std::get<0>(t);

Statements like int a = std::get<0>(t); are parsed incorrectly as a relational_expression.
At the same time, the following expressions are fine: std::get<0>(t);, int a = std::get<0>();, int a = get<0>(t);. So the issue occurs only when an initializer, a namespace qualifier, and an argument are all present.
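
For context, the sketch below puts the reported statement next to a structured binding that reads the same tuple element. The function names are hypothetical, and the structured binding is guarded because it needs C++17.

```cpp
#include <cassert>
#include <tuple>

// The statement from the report: `<0>` is a template argument list,
// not a pair of relational operators.
int first_via_get() {
  std::tuple<int, double> t(7, 2.5);
  int a = std::get<0>(t);
  return a;
}

// The same element read through a structured binding (C++17),
// with a fallback for older language modes.
int first_via_binding() {
  std::tuple<int, double> t(7, 2.5);
#if __cplusplus >= 201703L
  auto [a, b] = t;
  (void)b;
  return a;
#else
  return std::get<0>(t);
#endif
}
```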

C++11 template type alias declarations

Not sure what these are, I just got told that there's an error in the parse tree. The code itself is from cquery, so it should be valid.

#pragma once

class ClangCursor {
 public:
  template <typename TClientData>
  using Visitor = VisitResult (*)(ClangCursor cursor,
                                  ClangCursor parent,
                                  TClientData* client_data);
};

The errors are in #pragma once (which I understand is not standardised, but is widely used), and VisitResult (*). Looks like the template is the cause?
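
The construct in question is a member alias template whose target is a function-pointer type. Here is a reduced sketch with hypothetical names (ExampleCursor, count_visit, and an int return type in place of VisitResult) that shows how such an alias is declared and used.

```cpp
#include <cassert>

struct ExampleCursor {
  // Alias template: Visitor<T> is a pointer to a function that takes
  // two cursors and a T* and returns int.
  template <typename TClientData>
  using Visitor = int (*)(ExampleCursor cursor,
                          ExampleCursor parent,
                          TClientData* client_data);
};

// A function whose type matches ExampleCursor::Visitor<int>.
int count_visit(ExampleCursor, ExampleCursor, int* counter) {
  ++*counter;
  return 0;
}

int run_visitor() {
  ExampleCursor::Visitor<int> visit = count_visit;
  int counter = 0;
  visit(ExampleCursor{}, ExampleCursor{}, &counter);
  return counter;
}
```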

Fold expressions are not parsed explicitly

Fold expressions, which were introduced in C++17, are currently not explicitly parsed. There do not seem to be any scenarios that produce parser errors, so this can probably be considered a low-priority enhancement. However, since these expressions are not parsed correctly, it is currently not possible to target specific tokens, e.g. the ellipsis (...).

Full details on fold expressions can be found at https://en.cppreference.com/w/cpp/language/fold

Below are some examples of valid fold expressions, stolen from cppreference.com, which might be useful when attempting to resolve this issue.

template<typename... Args>
bool all(Args... args) { return (... && args); }
template<typename ...Args>
void printer(Args&&... args) {
    (std::cout << ... << args) << '\n';
}
template<typename T, typename... Args>
void push_back_vec(std::vector<T>& v, Args&&... args)
{
    static_assert((std::is_constructible_v<T, Args&> && ...));
    (v.push_back(args), ...);
}
// compile-time endianness swap based on http://stackoverflow.com/a/36937049 
template<class T, std::size_t... N>
constexpr T bswap_impl(T i, std::index_sequence<N...>) {
  return (((i >> N*CHAR_BIT & std::uint8_t(-1)) << (sizeof(T)-1-N)*CHAR_BIT) | ...);
}

template<class T, class U = std::make_unsigned_t<T>>
constexpr U bswap(T i) {
  return bswap_impl<U>(i, std::make_index_sequence<sizeof(T)>{});
}

Handle throw & noexcept specifiers

For example:

void foo() noexcept;
void foo() noexcept(true);
template<class T> T foo() noexcept(sizeof(T) < 4);
void foo() throw();
void foo() throw(int);
void foo() throw(std::string, char *);
void foo() throw(float) { }
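
The effect of the noexcept forms above can be checked with the noexcept operator, as in the sketch below. The function names are hypothetical. The throw(...) forms are left out because dynamic exception specifications were removed from the language in C++17, so they only compile in older language modes.

```cpp
#include <cassert>

void may_throw();
void never_throws() noexcept;
template <class T> T small_only() noexcept(sizeof(T) < 4);

// The noexcept operator is unevaluated, so no definitions are needed.
static_assert(!noexcept(may_throw()), "no specifier means may throw");
static_assert(noexcept(never_throws()), "plain noexcept");
static_assert(noexcept(small_only<char>()), "sizeof(char) < 4");
static_assert(!noexcept(small_only<long long>()), "sizeof(long long) >= 4");

bool noexcept_checks_hold() {
  return noexcept(never_throws()) && !noexcept(may_throw());
}
```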

parsing errors with virtual keyword

class A {
  virtual void foo() {
      if (x) {
      }
  }
};

With the virtual keyword, it seems that the function body is seen as an initializer list (??), so the if leads to a parsing error.

Casting Operators

All 3 casting operators are recognized as syntax--entity.syntax--name.syntax--function, which causes them to be highlighted the same as regular functions.

[screenshot: casting_operators]

It might make sense to treat them as operators, like the TextMate version does

Error detected with #define within struct

Looks like gcc doesn't mind it but it breaks tree-sitter-cpp. Example in jpeglib.h

struct jpeg_error_mgr {
  /* Error exit handler: does not return to caller */
  JMETHOD(void, error_exit, (j_common_ptr cinfo));
  /* Conditionally emit a trace or warning message */
  JMETHOD(void, emit_message, (j_common_ptr cinfo, int msg_level));
  /* Routine that actually outputs a trace or error message */
  JMETHOD(void, output_message, (j_common_ptr cinfo));
  /* Format a message string for the most recent JPEG error or message */
  JMETHOD(void, format_message, (j_common_ptr cinfo, char * buffer));
#define JMSG_LENGTH_MAX  200	/* recommended size of format_message buffer */
  /* Reset error state variables at start of a new image */
  JMETHOD(void, reset_error_mgr, (j_common_ptr cinfo));
  
  /* The message ID code and any parameters are saved here.
   * A message can have one string parameter or up to 8 int parameters.
   */
  int msg_code;
#define JMSG_STR_PARM_MAX  80
  union {
    int i[8];
    char s[JMSG_STR_PARM_MAX];
} msg_parm;

tree-sitter adds an ERROR node.

(translation_unit 1 1 624
 (struct_specifier 1 1 623
  (type_identifier 1 8 14)
  (field_declaration_list 1 23 601
   ...
   (field_declaration 5 3 59
    (primitive_type 5 3 4)
    (function_declarator 5 8 53
     (pointer_declarator 5 9 15
      (field_identifier 5 10 14)
     )
     (parameter_list 5 26 35
      (parameter_declaration 5 27 18
       (type_identifier 5 27 12)
       (identifier 5 40 5)
      )
      (parameter_declaration 5 47 13
       (primitive_type 5 47 4)
       (pointer_declarator 5 52 8
        (identifier 5 54 6)
       )
      )
     )
    )
   )
   (ERROR 6 1 27
    (identifier 6 9 15)
    (number_literal 6 25 3)
   )
   (field_declaration 7 3 45
    (primitive_type 7 3 4)
    (function_declarator 7 8 39
     (pointer_declarator 7 9 16
      (field_identifier 7 10 15)
     )
     (parameter_list 7 27 20
      (parameter_declaration 7 28 18
       (type_identifier 7 28 12)
       (identifier 7 41 5)
      )
     )
    )
   )
   (field_declaration 8 3 13
    (primitive_type 8 3 3)
    (field_identifier 8 7 8)
   )
   (ERROR 9 1 28
    (identifier 9 9 17)
    (number_literal 9 27 2)
   )
   ...

Help me

How do I install this on Windows 7 (64-bit) in Atom?

Nested classes are not parsed correctly

Example code:

class A {
public:
  class B {

  }
private:
  int a;
}

Scopes at cursor:
[screenshot]

class A {
private:
  class B { };
  B *z;

  class C : private B {
  private:
      B y;
//      A::B y2;
      C *x;
//      A::C *x2;
    };
};

[screenshot]

Brace vs Parenthesis Initialization

Brace initialization treats the class name as a syntax--entity.syntax--name.syntax--type, while parenthesis initialization treats the class name as a syntax--support.syntax--storage.syntax--function. This causes them to be highlighted differently.

[screenshot: class_highlight_ts_cpp]

class ClassA {

};

ClassA a1 = ClassA();
ClassA a2 = ClassA{};

Variadic Templates

Syntax Highlighting for Variadic Template functions causes some strange errors.

[screenshot: variadic_templates]

  1. The type name Ts in the func2 declaration isn't highlighted
  2. The variable name args in the func2 declaration isn't highlighted
  3. It causes the void in the func3 declaration to be not highlighted
  4. Error 2 only happens if the && symbol is used
  5. Error 3 only occurs if there is a ; in the body of func2 and they are in the same class declaration
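
The code from the screenshot is not available here, so the following is only a guess at the kind of declarations the list refers to: a variadic member template func2 that takes Ts&&... args and has a ; in its body, followed by func3 in the same class. Everything except the names func2, func3, Ts and args is an assumption.

```cpp
#include <cassert>

class Example {
 public:
  // Variadic member template with forwarding-reference parameters.
  template <typename... Ts>
  int func2(Ts&&... args) {
    return static_cast<int>(sizeof...(args));
  }

  // A following member in the same class declaration.
  void func3() {}
};

int count_args() {
  Example e;
  return e.func2(1, 'x', 2.0);
}
```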

Try Catch

The catch keyword in try/catch blocks is not highlighted.

[screenshot: try_catch]
