diff --git a/java2python/lang/Java.g b/java2python/lang/Java.g
index 32f70b6..00166a2 100644
--- a/java2python/lang/Java.g
+++ b/java2python/lang/Java.g
@@ -1,1092 +1,2273 @@
-/**
- * An ANTLRv3 capable Java 1.5 grammar for building ASTs.
- * BSD licence
+
+/*
+ [The "BSD licence"]
+ Copyright (c) 2007-2008 Terence Parr
+ All rights reserved.
+
+ Redistribution and use in source and binary forms, with or without
+ modification, are permitted provided that the following conditions
+ are met:
+ 1. Redistributions of source code must retain the above copyright
+ notice, this list of conditions and the following disclaimer.
+ 2. Redistributions in binary form must reproduce the above copyright
+ notice, this list of conditions and the following disclaimer in the
+ documentation and/or other materials provided with the distribution.
+ 3. The name of the author may not be used to endorse or promote products
+ derived from this software without specific prior written permission.
+
+ THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
+ IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
+ OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
+ IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
+ INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
+ NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
+ THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+*/
+
+/*
+ * This file is modified by Yang Jiang (yang.jiang.z@gmail.com), taken from the original
+ * java grammar in www.antlr.org, with the goal to provide a standard ANTLR grammar
+ * for java, as well as an implementation to construct the same AST trees as javac does.
+ *
+ * The major changes of this version as compared to the original version include:
+ * 1) Top level rules are changed to include all of their sub-components.
+ * For example, the rule
+ *
+ * classOrInterfaceDeclaration
+ * : classOrInterfaceModifiers (classDeclaration | interfaceDeclaration)
+ * ;
+ *
+ * is changed to
+ *
+ * classOrInterfaceDeclaration
+ * : classDeclaration | interfaceDeclaration
+ * ;
+ *
+ * with classOrInterfaceModifiers been moved inside classDeclaration and
+ * interfaceDeclaration.
+ *
+ * 2) The original version is not quite clear on certain rules like memberDecl,
+ * where it mixed the styles of listing of top level rules and listing of sub rules.
+ *
+ * memberDecl
+ * : genericMethodOrConstructorDecl
+ * | memberDeclaration
+ * | 'void' Identifier voidMethodDeclaratorRest
+ * | Identifier constructorDeclaratorRest
+ * | interfaceDeclaration
+ * | classDeclaration
+ * ;
 *
- * Copyright (c) 2007-2008 by HABELITZ Software Developments
+ * This is changed to a
 *
- * All rights reserved.
+ * memberDecl
+ * : fieldDeclaration
+ * | methodDeclaration
+ * | classDeclaration
+ * | interfaceDeclaration
+ * ;
+ * by folding similar rules into single rule.
 *
- * http://www.habelitz.com
+ * 3) Some syntactical predicates are added for efficiency, although this is not necessary
+ * for correctness.
 *
+ * 4) Lexer part is rewritten completely to construct tokens needed for the parser.
+ *
+ * 5) This grammar adds more source level support
 *
- * Redistribution and use in source and binary forms, with or without
- * modification, are permitted provided that the following conditions
- * are met:
 *
- * 1. Redistributions of source code must retain the above copyright
- * notice, this list of conditions and the following disclaimer.
- * 2. Redistributions in binary form must reproduce the above copyright
- * notice, this list of conditions and the following disclaimer in the
- * documentation and/or other materials provided with the distribution.
- * 3. The name of the author may not be used to endorse or promote products
- * derived from this software without specific prior written permission.
+ * This grammar also adds bug fixes.
 *
- * THIS SOFTWARE IS PROVIDED BY HABELITZ SOFTWARE DEVELOPMENTS ('HSD') ``AS IS''
- * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
- * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
- * ARE DISCLAIMED. IN NO EVENT SHALL 'HSD' BE LIABLE FOR ANY DIRECT, INDIRECT,
- * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
- * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA,
- * OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
- * LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
- * NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE,
- * EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ * 1) Adding typeArguments to superSuffix to alHexSignificandlow input like
+ * super.method()
+ *
+ * 2) Adding typeArguments to innerCreator to allow input like
+ * new Type1().new Type2()
+ *
+ * 3) conditionalExpression is changed to
+ * conditionalExpression
+ * : conditionalOrExpression ( '?' expression ':' conditionalExpression )?
+ * ;
+ * to accept input like
+ * true?1:2=3
+ *
+ * Note: note this is by no means a valid input, by the grammar should be able to parse
+ * this as
+ * (true?1:2)=3
+ * rather than
+ * true?1:(2=3)
 *
+ *
+ * Know problems:
+ * Won't pass input containing unicode sequence like this
+ * char c = '\uffff'
+ * String s = "\uffff";
+ * Because Antlr does not treat '\uffff' as an valid char. This will be fixed in the next Antlr
+ * release. [Fixed in Antlr-3.1.1]
+ *
+ * Things to do:
+ * More effort to make this grammar faster.
+ * Error reporting/recovering.
+ *
+ *
+ * NOTE: If you try to compile this file from command line and Antlr gives an exception
+ * like error message while compiling, add option
+ * -Xconversiontimeout 100000
+ * to the command line.
+ * If it still doesn't work or the compilation process
+ * takes too long, try to comment out the following two lines:
+ * | {isValidSurrogateIdentifierStart((char)input.LT(1), (char)input.LT(2))}?=>('\ud800'..'\udbff') ('\udc00'..'\udfff')
+ * | {isValidSurrogateIdentifierPart((char)input.LT(1), (char)input.LT(2))}?=>('\ud800'..'\udbff') ('\udc00'..'\udfff')
+ *
+ *
+ * Below are comments found in the original version.
 */
-grammar Java;
+/** A Java 1.5 grammar for ANTLR v3 derived from the spec
+ *
+ * This is a very close representation of the spec; the changes
+ * are comestic (remove left recursion) and also fixes (the spec
+ * isn't exactly perfect). I have run this on the 1.4.2 source
+ * and some nasty looking enums from 1.5, but have not really
+ * tested for 1.5 compatibility.
+ *
+ * I built this with: java -Xmx100M org.antlr.Tool java.g
+ * and got two errors that are ok (for now):
+ * java.g:691:9: Decision can match input such as
+ * "'0'..'9'{'E', 'e'}{'+', '-'}'0'..'9'{'D', 'F', 'd', 'f'}"
+ * using multiple alternatives: 3, 4
+ * As a result, alternative(s) 4 were disabled for that input
+ * java.g:734:35: Decision can match input such as "{'$', 'A'..'Z',
+ * '_', 'a'..'z', '\u00C0'..'\u00D6', '\u00D8'..'\u00F6',
+ * '\u00F8'..'\u1FFF', '\u3040'..'\u318F', '\u3300'..'\u337F',
+ * '\u3400'..'\u3D2D', '\u4E00'..'\u9FFF', '\uF900'..'\uFAFF'}"
+ * using multiple alternatives: 1, 2
+ * As a result, alternative(s) 2 were disabled for that input
+ *
+ * You can turn enum on/off as a keyword :)
+ *
+ * Version 1.0 -- initial release July 5, 2006 (requires 3.0b2 or higher)
+ *
+ * Primary author: Terence Parr, July 2006
+ *
+ * Version 1.0.1 -- corrections by Koen Vanderkimpen & Marko van Dooren,
+ * October 25, 2006;
+ * fixed normalInterfaceDeclaration: now uses typeParameters instead
+ * of typeParameter (according to JLS, 3rd edition)
+ * fixed castExpression: no longer allows expression next to type
+ * (according to semantics in JLS, in contrast with syntax in JLS)
+ *
+ * Version 1.0.2 -- Terence Parr, Nov 27, 2006
+ * java spec I built this from had some bizarre for-loop control.
+ * Looked weird and so I looked elsewhere...Yep, it's messed up.
+ * simplified.
+ *
+ * Version 1.0.3 -- Chris Hogue, Feb 26, 2007
+ * Factored out an annotationName rule and used it in the annotation rule.
+ * Not sure why, but typeName wasn't recognizing references to inner
+ * annotations (e.g. @InterfaceName.InnerAnnotation())
+ * Factored out the elementValue section of an annotation reference. Created
+ * elementValuePair and elementValuePairs rules, then used them in the
+ * annotation rule. Allows it to recognize annotation references with
+ * multiple, comma separated attributes.
+ * Updated elementValueArrayInitializer so that it allows multiple elements.
+ * (It was only allowing 0 or 1 element).
+ * Updated localVariableDeclaration to allow annotations. Interestingly the JLS
+ * doesn't appear to indicate this is legal, but it does work as of at least
+ * JDK 1.5.0_06.
+ * Moved the Identifier portion of annotationTypeElementRest to annotationMethodRest.
+ * Because annotationConstantRest already references variableDeclarator which
+ * has the Identifier portion in it, the parser would fail on constants in
+ * annotation definitions because it expected two identifiers.
+ * Added optional trailing ';' to the alternatives in annotationTypeElementRest.
+ * Wouldn't handle an inner interface that has a trailing ';'.
+ * Swapped the expression and type rule reference order in castExpression to
+ * make it check for genericized casts first. It was failing to recognize a
+ * statement like "Class TYPE = (Class)...;" because it was seeing
+ * 'Class'.
+ * Changed createdName to use typeArguments instead of nonWildcardTypeArguments.
+ *
+ * Changed the 'this' alternative in primary to allow 'identifierSuffix' rather than
+ * just 'arguments'. The case it couldn't handle was a call to an explicit
+ * generic method invocation (e.g. this.doSomething()). Using identifierSuffix
+ * may be overly aggressive--perhaps should create a more constrained thisSuffix rule?
+ *
+ * Version 1.0.4 -- Hiroaki Nakamura, May 3, 2007
+ *
+ * Fixed formalParameterDecls, localVariableDeclaration, forInit,
+ * and forVarControl to use variableModifier* not 'final'? (annotation)?
+ *
+ * Version 1.0.5 -- Terence, June 21, 2007
+ * --a[i].foo didn't work. Fixed unaryExpression
+ *
+ * Version 1.0.6 -- John Ridgway, March 17, 2008
+ * Made "assert" a switchable keyword like "enum".
+ * Fixed compilationUnit to disallow "annotation importDeclaration ...".
+ * Changed "Identifier ('.' Identifier)*" to "qualifiedName" in more
+ * places.
+ * Changed modifier* and/or variableModifier* to classOrInterfaceModifiers,
+ * modifiers or variableModifiers, as appropriate.
+ * Renamed "bound" to "typeBound" to better match language in the JLS.
+ * Added "memberDeclaration" which rewrites to methodDeclaration or
+ * fieldDeclaration and pulled type into memberDeclaration. So we parse
+ * type and then move on to decide whether we're dealing with a field
+ * or a method.
+ * Modified "constructorDeclaration" to use "constructorBody" instead of
+ * "methodBody". constructorBody starts with explicitConstructorInvocation,
+ * then goes on to blockStatement*. Pulling explicitConstructorInvocation
+ * out of expressions allowed me to simplify "primary".
+ * Changed variableDeclarator to simplify it.
+ * Changed type to use classOrInterfaceType, thus simplifying it; of course
+ * I then had to add classOrInterfaceType, but it is used in several
+ * places.
+ * Fixed annotations, old version allowed "@X(y,z)", which is illegal.
+ * Added optional comma to end of "elementValueArrayInitializer"; as per JLS.
+ * Changed annotationTypeElementRest to use normalClassDeclaration and
+ * normalInterfaceDeclaration rather than classDeclaration and
+ * interfaceDeclaration, thus getting rid of a couple of grammar ambiguities.
+ * Split localVariableDeclaration into localVariableDeclarationStatement
+ * (includes the terminating semi-colon) and localVariableDeclaration.
+ * This allowed me to use localVariableDeclaration in "forInit" clauses,
+ * simplifying them.
+ * Changed switchBlockStatementGroup to use multiple labels. This adds an
+ * ambiguity, but if one uses appropriately greedy parsing it yields the
+ * parse that is closest to the meaning of the switch statement.
+ * Renamed "forVarControl" to "enhancedForControl" -- JLS language.
+ * Added semantic predicates to test for shift operations rather than other
+ * things. Thus, for instance, the string "< <" will never be treated
+ * as a left-shift operator.
+ * In "creator" we rule out "nonWildcardTypeArguments" on arrayCreation,
+ * which are illegal.
+ * Moved "nonWildcardTypeArguments into innerCreator.
+ * Removed 'super' superSuffix from explicitGenericInvocation, since that
+ * is only used in explicitConstructorInvocation at the beginning of a
+ * constructorBody. (This is part of the simplification of expressions
+ * mentioned earlier.)
+ * Simplified primary (got rid of those things that are only used in
+ * explicitConstructorInvocation).
+ * Lexer -- removed "Exponent?" from FloatingPointLiteral choice 4, since it
+ * led to an ambiguity.
+ *
+ * This grammar successfully parses every .java file in the JDK 1.5 source
+ * tree (excluding those whose file names include '-', which are not
+ * valid Java compilation units).
+ *
+ * Known remaining problems:
+ * "Letter" and "JavaIDDigit" are wrong. The actual specification of
+ * "Letter" should be "a character for which the method
+ * Character.isJavaIdentifierStart(int) returns true." A "Java
+ * letter-or-digit is a character for which the method
+ * Character.isJavaIdentifierPart(int) returns true."
+ */
+
+
+ /*
+ This is a merged file, containing two versions of the Java.g grammar.
+ To extract a version from the file, run the ver.jar with the command provided below.
+
+ Version 1 - tree building version, with all source level support, error recovery etc.
+ This is the version for compiler grammar workspace.
+ This version can be extracted by invoking:
+ java -cp ver.jar Main 1 true true true true true Java.g
+
+ Version 2 - clean version, with no source leve support, no error recovery, no predicts,
+ assumes 1.6 level, works in Antlrworks.
+ This is the version for Alex.
+ This version can be extracted by invoking:
+ java -cp ver.jar Main 2 false false false false false Java.g
+*/
+
+grammar Java;
 options {
 backtrack=true;
 memoize=true;
- language=Python;
- output=AST;
- ASTLabelType=CommonTree;
-}
-
-
-
-tokens {
-
- // operators and other special chars
-
- AND = '&' ;
- AND_ASSIGN = '&=' ;
- ASSIGN = '=' ;
- AT = '@' ;
- BIT_SHIFT_RIGHT = '>>>' ;
- BIT_SHIFT_RIGHT_ASSIGN = '>>>=' ;
- COLON = ':' ;
- COMMA = ',' ;
- DEC = '--' ;
- DIV = '/' ;
- DIV_ASSIGN = '/=' ;
- DOT = '.' ;
- DOTSTAR = '.*' ;
- ELLIPSIS = '...' ;
- EQUAL = '==' ;
- GREATER_OR_EQUAL = '>=' ;
- GREATER_THAN = '>' ;
- INC = '++' ;
- LBRACK = '[' ;
- LCURLY = '{' ;
- LESS_OR_EQUAL = '<=' ;
- LESS_THAN = '<' ;
- LOGICAL_AND = '&&' ;
- LOGICAL_NOT = '!' ;
- LOGICAL_OR = '||' ;
- LPAREN = '(' ;
- MINUS = '-' ;
- MINUS_ASSIGN = '-=' ;
- MOD = '%' ;
- MOD_ASSIGN = '%=' ;
- NOT = '~' ;
- NOT_EQUAL = '!=' ;
- OR = '|' ;
- OR_ASSIGN = '|=' ;
- PLUS = '+' ;
- PLUS_ASSIGN = '+=' ;
- QUESTION = '?'
;
- RBRACK = ']' ;
- RCURLY = '}' ;
- RPAREN = ')' ;
- SEMI = ';' ;
- SHIFT_LEFT = '<<' ;
- SHIFT_LEFT_ASSIGN = '<<=' ;
- SHIFT_RIGHT = '>>' ;
- SHIFT_RIGHT_ASSIGN = '>>=' ;
- STAR = '*' ;
- STAR_ASSIGN = '*=' ;
- XOR = '^' ;
- XOR_ASSIGN = '^=' ;
-
- // keywords
-
- ABSTRACT = 'abstract' ;
- ASSERT = 'assert' ;
- BOOLEAN = 'boolean' ;
- BREAK = 'break' ;
- BYTE = 'byte' ;
- CASE = 'case' ;
- CATCH = 'catch' ;
- CHAR = 'char' ;
- CLASS = 'class' ;
- CONTINUE = 'continue' ;
- DEFAULT = 'default' ;
- DO = 'do' ;
- DOUBLE = 'double' ;
- ELSE = 'else' ;
- ENUM = 'enum' ;
- EXTENDS = 'extends' ;
- FALSE = 'false' ;
- FINAL = 'final' ;
- FINALLY = 'finally' ;
- FLOAT = 'float' ;
- FOR = 'for' ;
- IF = 'if' ;
- IMPLEMENTS = 'implements' ;
- INSTANCEOF = 'instanceof' ;
- INTERFACE = 'interface' ;
- IMPORT = 'import' ;
- INT = 'int' ;
- LONG = 'long' ;
- NATIVE = 'native' ;
- NEW = 'new' ;
- NULL = 'null' ;
- PACKAGE = 'package' ;
- PRIVATE = 'private' ;
- PROTECTED = 'protected' ;
- PUBLIC = 'public' ;
- RETURN = 'return' ;
- SHORT = 'short' ;
- STATIC = 'static' ;
- STRICTFP = 'strictfp' ;
- SUPER = 'super' ;
- SWITCH = 'switch' ;
- SYNCHRONIZED = 'synchronized' ;
- THIS = 'this' ;
- THROW = 'throw' ;
- THROWS = 'throws' ;
- TRANSIENT = 'transient' ;
- TRUE = 'true' ;
- TRY = 'try' ;
- VOID = 'void' ;
- VOLATILE = 'volatile' ;
- WHILE = 'while' ;
-
- // tokens for imaginary nodes
-
- ANNOTATION_INIT_ARRAY_ELEMENT;
- ANNOTATION_INIT_BLOCK;
- ANNOTATION_INIT_DEFAULT_KEY;
- ANNOTATION_INIT_KEY_LIST;
- ANNOTATION_LIST;
- ANNOTATION_METHOD_DECL;
- ANNOTATION_SCOPE;
- ANNOTATION_TOP_LEVEL_SCOPE;
- ARGUMENT_LIST;
- ARRAY_DECLARATOR;
- ARRAY_DECLARATOR_LIST;
- ARRAY_ELEMENT_ACCESS;
- ARRAY_INITIALIZER;
- BLOCK_SCOPE;
- CAST_EXPR;
- CATCH_CLAUSE_LIST;
- CLASS_CONSTRUCTOR_CALL;
- CLASS_INSTANCE_INITIALIZER;
- CLASS_STATIC_INITIALIZER;
- CLASS_TOP_LEVEL_SCOPE;
- CONSTRUCTOR_DECL;
- ENUM_TOP_LEVEL_SCOPE;
- EXPR;
- EXTENDS_BOUND_LIST;
- EXTENDS_CLAUSE;
- FOR_CONDITION;
- FOR_EACH;
-
FOR_INIT;
- FOR_UPDATE;
- FORMAL_PARAM_LIST;
- FORMAL_PARAM_STD_DECL;
- FORMAL_PARAM_VARARG_DECL;
- FUNCTION_METHOD_DECL;
- GENERIC_TYPE_ARG_LIST;
- GENERIC_TYPE_PARAM_LIST;
- INTERFACE_TOP_LEVEL_SCOPE;
- IMPLEMENTS_CLAUSE;
- LABELED_STATEMENT;
- LOCAL_MODIFIER_LIST;
- JAVA_SOURCE;
- METHOD_CALL;
- MODIFIER_LIST;
- PARENTESIZED_EXPR;
- POST_DEC;
- POST_INC;
- PRE_DEC;
- PRE_INC;
- QUALIFIED_TYPE_IDENT;
- STATIC_ARRAY_CREATOR;
- SUPER_CONSTRUCTOR_CALL;
- SWITCH_BLOCK_LABEL_LIST;
- THIS_CONSTRUCTOR_CALL;
- THROWS_CLAUSE;
- TYPE;
- UNARY_MINUS;
- UNARY_PLUS;
- VAR_DECLARATION;
- VAR_DECLARATOR;
- VAR_DECLARATOR_LIST;
- VOID_METHOD_DECL;
 }
-
-
-javaSource
- : compilationUnit
- -> ^(JAVA_SOURCE compilationUnit)
+/********************************************************************************************
+ Parser section
+*********************************************************************************************/
+
+compilationUnit
+ : ( (annotations
+ )?
+ packageDeclaration
+ )?
+ (importDeclaration
+ )*
+ (typeDeclaration
+ )*
 ;
-
-compilationUnit
- : annotationList
- packageDeclaration?
- importDeclaration*
- typeDecls*
+packageDeclaration
+ : 'package' qualifiedName
+ ';'
 ;
-
-typeDecls
- : typeDeclaration
- | SEMI!
+importDeclaration
+ : 'import'
+ ('static'
+ )?
+ IDENTIFIER '.' '*'
+ ';'
+ | 'import'
+ ('static'
+ )?
+ IDENTIFIER
+ ('.' IDENTIFIER
+ )+
+ ('.' '*'
+ )?
+ ';'
 ;
-
-packageDeclaration
- : PACKAGE^ qualifiedIdentifier SEMI!
+qualifiedImportName
+ : IDENTIFIER
+ ('.' IDENTIFIER
+ )*
 ;
-
-importDeclaration
- : IMPORT^ STATIC? qualifiedIdentifier DOTSTAR? SEMI!
+typeDeclaration
+ : classOrInterfaceDeclaration
+ | ';'
+ ;
+classOrInterfaceDeclaration
+ : classDeclaration
+ | interfaceDeclaration
+ ;
+
+
+modifiers
+ :
+ ( annotation
+ | 'public'
+ | 'protected'
+ | 'private'
+ | 'static'
+ | 'abstract'
+ | 'final'
+ | 'native'
+ | 'synchronized'
+ | 'transient'
+ | 'volatile'
+ | 'strictfp'
+ )*
+ ;
+variableModifiers
+ : ( 'final'
+ | annotation
+ )*
 ;
-
-typeDeclaration
- : modifierList!
- ( classTypeDeclaration[$modifierList.tree]
- | interfaceTypeDeclaration[$modifierList.tree]
- | enumTypeDeclaration[$modifierList.tree]
- | annotationTypeDeclaration[$modifierList.tree]
+
+classDeclaration
+ : normalClassDeclaration
+ | enumDeclaration
+ ;
+normalClassDeclaration
+ : modifiers 'class' IDENTIFIER
+ (typeParameters
+ )?
+ ('extends' type
+ )?
+ ('implements' typeList
+ )?
+ classBody
+ ;
+typeParameters
+ : '<'
+ typeParameter
+ (',' typeParameter
+ )*
+ '>'
+ ;
+typeParameter
+ : IDENTIFIER
+ ('extends' typeBound
+ )?
+ ;
+typeBound
+ : type
+ ('&' type
+ )*
+ ;
+enumDeclaration
+ : modifiers
+ ('enum'
+ )
+ IDENTIFIER
+ ('implements' typeList
+ )?
+ enumBody
+ ;
+
+enumBody
+ : '{'
+ (enumConstants
+ )?
+ ','?
+ (enumBodyDeclarations
+ )?
+ '}'
+ ;
+enumConstants
+ : enumConstant
+ (',' enumConstant
+ )*
+ ;
+/**
+ * NOTE: here differs from the javac grammar, missing TypeArguments.
+ * EnumeratorDeclaration = AnnotationsOpt [TypeArguments] IDENTIFIER [ Arguments ] [ "{" ClassBody "}" ]
+ */
+enumConstant
+ : (annotations
+ )?
+ IDENTIFIER
+ (arguments
+ )?
+ (classBody
+ )?
+ /* TODO: $GScope::name = names.empty. enum constant body is actually
+ an anonymous class, where constructor isn't allowed, have to add this check*/
+ ;
+enumBodyDeclarations
+ : ';'
+ (classBodyDeclaration
+ )*
+ ;
+interfaceDeclaration
+ : normalInterfaceDeclaration
+ | annotationTypeDeclaration
+ ;
+
+normalInterfaceDeclaration
+ : modifiers 'interface' IDENTIFIER
+ (typeParameters
+ )?
+ ('extends' typeList
+ )?
+ interfaceBody
+ ;
+typeList
+ : type
+ (',' type
+ )*
+ ;
+classBody
+ : '{'
+ (classBodyDeclaration
+ )*
+ '}'
+ ;
+interfaceBody
+ : '{'
+ (interfaceBodyDeclaration
+ )*
+ '}'
+ ;
+classBodyDeclaration
+ : ';'
+ | ('static'
+ )?
+ block
+ | memberDecl
+ ;
+memberDecl
+ : fieldDeclaration
+ | methodDeclaration
+ | classDeclaration
+ | interfaceDeclaration
+ ;
+methodDeclaration
+ :
+ /* For constructor, return type is null, name is 'init' */
+ modifiers
+ (typeParameters
+ )?
+ IDENTIFIER
+ formalParameters
+ ('throws' qualifiedNameList
+ )?
+ '{'
+ (explicitConstructorInvocation
+ )?
+ (blockStatement
+ )*
+ '}'
+ | modifiers
+ (typeParameters
+ )?
+ (type
+ | 'void'
+ )
+ IDENTIFIER
+ formalParameters
+ ('[' ']'
+ )*
+ ('throws' qualifiedNameList
+ )?
+ (
+ block
+ | ';'
 )
 ;
-
-classTypeDeclaration[modifiers]
- : CLASS IDENT genericTypeParameterList? classExtendsClause? implementsClause? classBody
- -> ^(CLASS {$modifiers} IDENT genericTypeParameterList? classExtendsClause?
- implementsClause? classBody)
+fieldDeclaration
+ : modifiers
+ type
+ variableDeclarator
+ (',' variableDeclarator
+ )*
+ ';'
 ;
-
-classExtendsClause
- : EXTENDS type
- -> ^(EXTENDS_CLAUSE[$EXTENDS, "EXTENDS_CLAUSE"] type)
+variableDeclarator
+ : IDENTIFIER
+ ('[' ']'
+ )*
+ ('=' variableInitializer
+ )?
 ;
-
-interfaceExtendsClause
- : EXTENDS typeList
- -> ^(EXTENDS_CLAUSE[$EXTENDS, "EXTENDS_CLAUSE"] typeList)
+/**
+ *TODO: add predicates
+ */
+interfaceBodyDeclaration
+ :
+ interfaceFieldDeclaration
+ | interfaceMethodDeclaration
+ | interfaceDeclaration
+ | classDeclaration
+ | ';'
+ ;
+interfaceMethodDeclaration
+ : modifiers
+ (typeParameters
+ )?
+ (type
+ |'void'
+ )
+ IDENTIFIER
+ formalParameters
+ ('[' ']'
+ )*
+ ('throws' qualifiedNameList
+ )?
';'
 ;
-
-implementsClause
- : IMPLEMENTS typeList
- -> ^(IMPLEMENTS_CLAUSE[$IMPLEMENTS, "IMPLEMENTS_CLAUSE"] typeList)
+/**
+ * NOTE, should not use variableDeclarator here, as it doesn't necessary require
+ * an initializer, while an interface field does, or judge by the returned value.
+ * But this gives better diagnostic message, or antlr won't predict this rule.
+ */
+interfaceFieldDeclaration
+ : modifiers type variableDeclarator
+ (',' variableDeclarator
+ )*
+ ';'
 ;
-
-genericTypeParameterList
- : LESS_THAN genericTypeParameter (COMMA genericTypeParameter)* genericTypeListClosing
- -> ^(GENERIC_TYPE_PARAM_LIST[$LESS_THAN, "GENERIC_TYPE_PARAM_LIST"] genericTypeParameter+)
+type
+ : classOrInterfaceType
+ ('[' ']'
+ )*
+ | primitiveType
+ ('[' ']'
+ )*
 ;
-
-genericTypeListClosing // This 'trick' is fairly dirty - if there's some time a better solution should
- // be found to resolve the problem with nested generic type parameter lists
- // (i.e. > for generic type parameters or > for
- // generic type arguments etc).
- : GREATER_THAN
- | SHIFT_RIGHT
- | BIT_SHIFT_RIGHT
- | // nothing
+classOrInterfaceType
+ : IDENTIFIER
+ (typeArguments
+ )?
+ ('.' IDENTIFIER
+ (typeArguments
+ )?
+ )*
 ;
-
-genericTypeParameter
- : IDENT bound?
- -> ^(IDENT bound?)
+primitiveType
+ : 'boolean'
+ | 'char'
+ | 'byte'
+ | 'short'
+ | 'int'
+ | 'long'
+ | 'float'
+ | 'double'
+ ;
+typeArguments
+ : '<' typeArgument
+ (',' typeArgument
+ )*
+ '>'
+ ;
+typeArgument
+ : type
+ | '?'
+ (
+ ('extends'
+ |'super'
+ )
+ type
+ )?
 ;
-
-bound
- : EXTENDS type (AND type)*
- -> ^(EXTENDS_BOUND_LIST[$EXTENDS, "EXTENDS_BOUND_LIST"] type+)
+qualifiedNameList
+ : qualifiedName
+ (',' qualifiedName
+ )*
 ;
-
-enumTypeDeclaration[modifiers]
- : ENUM IDENT implementsClause? enumBody
- -> ^(ENUM {$modifiers} IDENT implementsClause? enumBody)
+formalParameters
+ : '('
+ (formalParameterDecls
+ )?
+ ')'
 ;
-
-enumBody
- : LCURLY enumScopeDeclarations RCURLY
- -> ^(ENUM_TOP_LEVEL_SCOPE[$LCURLY, "ENUM_TOP_LEVEL_SCOPE"] enumScopeDeclarations)
+formalParameterDecls
+ : ellipsisParameterDecl
+ | normalParameterDecl
+ (',' normalParameterDecl
+ )*
+ | (normalParameterDecl
+ ','
+ )+
+ ellipsisParameterDecl
+ ;
+normalParameterDecl
+ : variableModifiers type IDENTIFIER
+ ('[' ']'
+ )*
 ;
-
-enumScopeDeclarations
- : enumConstants (COMMA!)? enumClassScopeDeclarations?
+ellipsisParameterDecl
+ : variableModifiers
+ type '...'
+ IDENTIFIER
 ;
-
-enumClassScopeDeclarations
- : SEMI classScopeDeclarations*
- -> ^(CLASS_TOP_LEVEL_SCOPE[$SEMI, "CLASS_TOP_LEVEL_SCOPE"] classScopeDeclarations*)
+explicitConstructorInvocation
+ : (nonWildcardTypeArguments
+ )? //NOTE: the position of Identifier 'super' is set to the type args position here
+ ('this'
+ |'super'
+ )
+ arguments ';'
+ | primary
+ '.'
+ (nonWildcardTypeArguments
+ )?
+ 'super'
+ arguments ';'
 ;
-
-enumConstants
- : enumConstant (COMMA! enumConstant)*
+qualifiedName
+ : IDENTIFIER
+ ('.' IDENTIFIER
+ )*
 ;
-
-enumConstant
- : annotationList IDENT^ arguments? classBody?
+annotations
+ : (annotation
+ )+
 ;
-
-interfaceTypeDeclaration[modifiers]
- : INTERFACE IDENT genericTypeParameterList? interfaceExtendsClause? interfaceBody
- -> ^(INTERFACE {$modifiers} IDENT genericTypeParameterList? interfaceExtendsClause? interfaceBody)
+/**
+ * Using an annotation.
+ * '@' is flaged in modifier
+ */
+annotation
+ : '@' qualifiedName
+ ( '('
+ ( elementValuePairs
+ | elementValue
+ )?
+ ')'
+ )?
 ;
-
-typeList
- : type (COMMA!
type)*
+elementValuePairs
+ : elementValuePair
+ (',' elementValuePair
+ )*
 ;
-
-classBody
- : LCURLY classScopeDeclarations* RCURLY
- -> ^(CLASS_TOP_LEVEL_SCOPE[$LCURLY, "CLASS_TOP_LEVEL_SCOPE"] classScopeDeclarations*)
+elementValuePair
+ : IDENTIFIER '=' elementValue
 ;
-
-interfaceBody
- : LCURLY interfaceScopeDeclarations* RCURLY
- -> ^(INTERFACE_TOP_LEVEL_SCOPE[$LCURLY, "CLASS_TOP_LEVEL_SCOPE"] interfaceScopeDeclarations*)
+elementValue
+ : conditionalExpression
+ | annotation
+ | elementValueArrayInitializer
 ;
-
-classScopeDeclarations
- : block -> ^(CLASS_INSTANCE_INITIALIZER block)
- | STATIC block -> ^(CLASS_STATIC_INITIALIZER[$STATIC, "CLASS_STATIC_INITIALIZER"] block)
- | modifierList
- ( genericTypeParameterList?
- ( type IDENT formalParameterList arrayDeclaratorList? throwsClause? (block | SEMI)
- -> ^(FUNCTION_METHOD_DECL modifierList genericTypeParameterList? type IDENT formalParameterList arrayDeclaratorList? throwsClause? block?)
- | VOID IDENT formalParameterList throwsClause? (block | SEMI)
- -> ^(VOID_METHOD_DECL modifierList genericTypeParameterList? IDENT formalParameterList throwsClause? block?)
- | ident=IDENT formalParameterList throwsClause? block
- -> ^(CONSTRUCTOR_DECL[$ident, "CONSTRUCTOR_DECL"] modifierList genericTypeParameterList? formalParameterList throwsClause? block)
- )
- | type classFieldDeclaratorList SEMI
- -> ^(VAR_DECLARATION modifierList type classFieldDeclaratorList)
+elementValueArrayInitializer
+ : '{'
+ (elementValue
+ (',' elementValue
+ )*
+ )? (',')? '}'
+ ;
+/**
+ * Annotation declaration.
+ */
+annotationTypeDeclaration
+ : modifiers '@'
+ 'interface'
+ IDENTIFIER
+ annotationTypeBody
+ ;
+annotationTypeBody
+ : '{'
+ (annotationTypeElementDeclaration
+ )*
+ '}'
+ ;
+/**
+ * NOTE: here use interfaceFieldDeclaration for field declared inside annotation. they are sytactically the same.
+ */
+annotationTypeElementDeclaration
+ : annotationMethodDeclaration
+ | interfaceFieldDeclaration
+ | normalClassDeclaration
+ | normalInterfaceDeclaration
+ | enumDeclaration
+ | annotationTypeDeclaration
+ | ';'
+ ;
+annotationMethodDeclaration
+ : modifiers type IDENTIFIER
+ '(' ')' ('default' elementValue
+ )?
+ ';'
+ ;
+block
+ : '{'
+ (blockStatement
+ )*
+ '}'
+ ;
+/*
+staticBlock returns [JCBlock tree]
+ @init {
+ ListBuffer stats = new ListBuffer();
+ int pos = ((AntlrJavacToken) $start).getStartIndex();
+ }
+ @after {
+ $tree = T.at(pos).Block(Flags.STATIC, stats.toList());
+ pu.storeEnd($tree, $stop);
+ // construct a dummy static modifiers for end position
+ pu.storeEnd(T.at(pos).Modifiers(Flags.STATIC, com.sun.tools.javac.util.List.nil()),$st);
+ }
+ : st_1='static' '{'
+ (blockStatement
+ {
+ if ($blockStatement.tree == null) {
+ stats.appendList($blockStatement.list);
+ } else {
+ stats.append($blockStatement.tree);
+ }
+ }
+ )* '}'
+ ;
+*/
+blockStatement
+ : localVariableDeclarationStatement
+ | classOrInterfaceDeclaration
+ | statement
+ ;
+localVariableDeclarationStatement
+ : localVariableDeclaration
+ ';'
+ ;
+localVariableDeclaration
+ : variableModifiers type
+ variableDeclarator
+ (',' variableDeclarator
+ )*
+ ;
+statement
+ : block
+
+ | ('assert'
 )
- | typeDeclaration
- | SEMI!
+ expression (':' expression)? ';'
+ | 'assert' expression (':' expression)? ';'
+ | 'if' parExpression statement ('else' statement)?
+ | forstatement
+ | 'while' parExpression statement
+ | 'do' statement 'while' parExpression ';'
+ | trystatement
+ | 'switch' parExpression '{' switchBlockStatementGroups '}'
+ | 'synchronized' parExpression block
+ | 'return' (expression )? ';'
+ | 'throw' expression ';'
+ | 'break'
+ (IDENTIFIER
+ )? ';'
+ | 'continue'
+ (IDENTIFIER
+ )?
';'
+ | expression ';'
+ | IDENTIFIER ':' statement
+ | ';'
+ ;
+switchBlockStatementGroups
+ : (switchBlockStatementGroup )*
+ ;
+switchBlockStatementGroup
+ :
+ switchLabel
+ (blockStatement
+ )*
 ;
-
-interfaceScopeDeclarations
- : modifierList
- ( genericTypeParameterList?
- ( type IDENT formalParameterList arrayDeclaratorList? throwsClause? SEMI
- -> ^(FUNCTION_METHOD_DECL modifierList genericTypeParameterList? type IDENT formalParameterList arrayDeclaratorList? throwsClause?)
- | VOID IDENT formalParameterList throwsClause? SEMI
- -> ^(VOID_METHOD_DECL modifierList genericTypeParameterList? IDENT formalParameterList throwsClause?)
- )
- | type interfaceFieldDeclaratorList SEMI
- -> ^(VAR_DECLARATION modifierList type interfaceFieldDeclaratorList)
+switchLabel
+ : 'case' expression ':'
+ | 'default' ':'
+ ;
+trystatement
+ : 'try' block
+ ( catches 'finally' block
+ | catches
+ | 'finally' block
 )
- | typeDeclaration
- | SEMI!
+ ;
+catches
+ : catchClause
+ (catchClause
+ )*
 ;
-
-classFieldDeclaratorList
- : classFieldDeclarator (COMMA classFieldDeclarator)*
- -> ^(VAR_DECLARATOR_LIST classFieldDeclarator+)
+catchClause
+ : 'catch' '(' formalParameter
+ ')' block
 ;
-
-classFieldDeclarator
- : variableDeclaratorId (ASSIGN variableInitializer)?
- -> ^(VAR_DECLARATOR variableDeclaratorId variableInitializer?)
+formalParameter
+ : variableModifiers type IDENTIFIER
+ ('[' ']'
+ )*
 ;
-
-interfaceFieldDeclaratorList
- : interfaceFieldDeclarator (COMMA interfaceFieldDeclarator)*
- -> ^(VAR_DECLARATOR_LIST interfaceFieldDeclarator+)
+forstatement
+ :
+ // enhanced for loop
+ 'for' '(' variableModifiers type IDENTIFIER ':'
+ expression ')' statement
+
+ // normal for loop
+ | 'for' '('
+ (forInit
+ )? ';'
+ (expression
+ )? ';'
+ (expressionList
+ )?
')' statement
+ ;
+forInit
+ : localVariableDeclaration
+ | expressionList
+ ;
+parExpression
+ : '(' expression ')'
+ ;
+expressionList
+ : expression
+ (',' expression
+ )*
 ;
-
-interfaceFieldDeclarator
- : variableDeclaratorId ASSIGN variableInitializer
- -> ^(VAR_DECLARATOR variableDeclaratorId variableInitializer)
+expression
+ : conditionalExpression
+ (assignmentOperator expression
+ )?
 ;
-
-variableDeclaratorId
- : IDENT^ arrayDeclaratorList?
+assignmentOperator
+ : '='
+ | '+='
+ | '-='
+ | '*='
+ | '/='
+ | '&='
+ | '|='
+ | '^='
+ | '%='
+ | '<' '<' '='
+ | '>' '>' '>' '='
+ | '>' '>' '='
+ ;
+conditionalExpression
+ : conditionalOrExpression
+ ('?' expression ':' conditionalExpression
+ )?
 ;
-
-variableInitializer
+conditionalOrExpression
+ : conditionalAndExpression
+ ('||' conditionalAndExpression
+ )*
+ ;
+conditionalAndExpression
+ : inclusiveOrExpression
+ ('&&' inclusiveOrExpression
+ )*
+ ;
+inclusiveOrExpression
+ : exclusiveOrExpression
+ ('|' exclusiveOrExpression
+ )*
+ ;
+exclusiveOrExpression
+ : andExpression
+ ('^' andExpression
+ )*
+ ;
+andExpression
+ : equalityExpression
+ ('&' equalityExpression
+ )*
+ ;
+equalityExpression
+ : instanceOfExpression
+ (
+ ( '=='
+ | '!='
+ )
+ instanceOfExpression
+ )*
+ ;
+instanceOfExpression
+ : relationalExpression
+ ('instanceof' type
+ )?
+ ;
+relationalExpression
+ : shiftExpression
+ (relationalOp shiftExpression
+ )*
+ ;
+relationalOp
+ : '<' '='
+ | '>' '='
+ | '<'
+ | '>'
+ ;
+shiftExpression
+ : additiveExpression
+ (shiftOp additiveExpression
+ )*
+ ;
+shiftOp
+ : '<' '<'
+ | '>' '>' '>'
+ | '>' '>'
+ ;
+additiveExpression
+ : multiplicativeExpression
+ (
+ ( '+'
+ | '-'
+ )
+ multiplicativeExpression
+ )*
+ ;
+multiplicativeExpression
+ :
+ unaryExpression
+ (
+ ( '*'
+ | '/'
+ | '%'
+ )
+ unaryExpression
+ )*
+ ;
+/**
+ * NOTE: for '+' and '-', if the next token is int or long interal, then it's not a unary expression.
+ * it's a literal with signed value.
INTLITERAL and LONGLITERAL are added here for this. + */ +unaryExpression + : '+' unaryExpression + | '-' unaryExpression + | '++' unaryExpression + | '--' unaryExpression + | unaryExpressionNotPlusMinus + ; +unaryExpressionNotPlusMinus + : '~' unaryExpression + | '!' unaryExpression + | castExpression + | primary + (selector + )* + ( '++' + | '--' + )? + ; +castExpression + : '(' primitiveType ')' unaryExpression + | '(' type ')' unaryExpressionNotPlusMinus + ; +/** + * have to use scope here, parameter passing isn't well supported in antlr. + */ +primary + : parExpression + | 'this' + ('.' IDENTIFIER + )* + (identifierSuffix + )? + | IDENTIFIER + ('.' IDENTIFIER + )* + (identifierSuffix + )? + | 'super' + superSuffix + | literal + | creator + | primitiveType + ('[' ']' + )* + '.' 'class' + | 'void' '.' 'class' + ; + +superSuffix + : arguments + | '.' (typeArguments + )? + IDENTIFIER + (arguments + )? + ; +identifierSuffix + : ('[' ']' + )+ + '.' 'class' + | ('[' expression ']' + )+ + | arguments + | '.' 'class' + | '.' nonWildcardTypeArguments IDENTIFIER arguments + | '.' 'this' + | '.' 'super' arguments + | innerCreator + ; +selector + : '.' IDENTIFIER + (arguments + )? + | '.' 'this' + | '.' 'super' + superSuffix + | innerCreator + | '[' expression ']' + ; +creator + : 'new' nonWildcardTypeArguments classOrInterfaceType classCreatorRest + | 'new' classOrInterfaceType classCreatorRest + | arrayCreator + ; +arrayCreator + : 'new' createdName + '[' ']' + ('[' ']' + )* + arrayInitializer + | 'new' createdName + '[' expression + ']' + ( '[' expression + ']' + )* + ('[' ']' + )* + ; +variableInitializer : arrayInitializer | expression ; - -arrayDeclarator - : LBRACK RBRACK - -> ^(ARRAY_DECLARATOR) +arrayInitializer + : '{' + (variableInitializer + (',' variableInitializer + )* + )? + (',')? + '}' //Yang's fix, position change. + ; +createdName + : classOrInterfaceType + | primitiveType + ; +innerCreator + : '.' 'new' + (nonWildcardTypeArguments + )?
+ IDENTIFIER + (typeArguments + )? + classCreatorRest ; - -arrayDeclaratorList - : arrayDeclarator+ - -> ^(ARRAY_DECLARATOR_LIST arrayDeclarator+) +classCreatorRest + : arguments + (classBody + )? ; - -arrayInitializer - : LCURLY (variableInitializer (COMMA variableInitializer)* COMMA?)? RCURLY - -> ^(ARRAY_INITIALIZER[$LCURLY, "ARRAY_INITIALIZER"] variableInitializer*) +nonWildcardTypeArguments + : '<' typeList + '>' + ; +arguments + : '(' (expressionList + )? ')' + ; +literal + : INTLITERAL + | LONGLITERAL + | FLOATLITERAL + | DOUBLELITERAL + | CHARLITERAL + | STRINGLITERAL + | TRUE + | FALSE + | NULL ; - -throwsClause - : THROWS qualifiedIdentList - -> ^(THROWS_CLAUSE[$THROWS, "THROWS_CLAUSE"] qualifiedIdentList) +/** + * These headers help to build syntactic predicates; they are not necessary, but they make the grammar faster. + */ + +classHeader + : modifiers 'class' IDENTIFIER ; - -modifierList - : modifier* - -> ^(MODIFIER_LIST modifier*) +enumHeader + : modifiers ('enum'|IDENTIFIER) IDENTIFIER ; - -modifier - : PUBLIC - | PROTECTED - | PRIVATE - | STATIC - | ABSTRACT - | NATIVE - | SYNCHRONIZED - | TRANSIENT - | VOLATILE - | STRICTFP - | localModifier +interfaceHeader + : modifiers 'interface' IDENTIFIER ; - -localModifierList - : localModifier* - -> ^(LOCAL_MODIFIER_LIST localModifier*) +annotationHeader + : modifiers '@' 'interface' IDENTIFIER ; - -localModifier - : FINAL - | annotation +typeHeader + : modifiers ('class'|'enum'|('@' ? 'interface')) IDENTIFIER ; - -type - : simpleType - | objectType +methodHeader + : modifiers typeParameters? (type|'void')? IDENTIFIER '(' ; - -simpleType // including static arrays of simple type elements - : primitiveType arrayDeclaratorList? - -> ^(TYPE primitiveType arrayDeclaratorList?) +fieldHeader + : modifiers type IDENTIFIER ('['']')* ('='|','|';') ; - -objectType // including static arrays of object type reference elements - : qualifiedTypeIdent arrayDeclaratorList? - -> ^(TYPE qualifiedTypeIdent arrayDeclaratorList?)
+localVariableHeader + : variableModifiers type IDENTIFIER ('['']')* ('='|','|';') ; - -objectTypeSimplified - : qualifiedTypeIdentSimplified arrayDeclaratorList? - -> ^(TYPE qualifiedTypeIdentSimplified arrayDeclaratorList?) +/******************************************************************************************** + Lexer section +*********************************************************************************************/ +LONGLITERAL + : IntegerNumber LongSuffix ; - -qualifiedTypeIdent - : typeIdent (DOT typeIdent)* - -> ^(QUALIFIED_TYPE_IDENT typeIdent+) + +INTLITERAL + : IntegerNumber ; - -qualifiedTypeIdentSimplified - : typeIdentSimplified (DOT typeIdentSimplified)* - -> ^(QUALIFIED_TYPE_IDENT typeIdentSimplified+) + +fragment +IntegerNumber + : '0' + | '1'..'9' ('0'..'9')* + | '0' ('0'..'7')+ + | HexPrefix HexDigit+ ; - -typeIdent - : IDENT^ genericTypeArgumentList? +fragment +HexPrefix + : '0x' | '0X' ; - -typeIdentSimplified - : IDENT^ genericTypeArgumentListSimplified? + +fragment +HexDigit + : ('0'..'9'|'a'..'f'|'A'..'F') ; - -primitiveType - : BOOLEAN - | CHAR - | BYTE - | SHORT - | INT - | LONG - | FLOAT - | DOUBLE +fragment +LongSuffix + : 'l' | 'L' ; - -genericTypeArgumentList - : LESS_THAN genericTypeArgument (COMMA genericTypeArgument)* genericTypeListClosing - -> ^(GENERIC_TYPE_ARG_LIST[$LESS_THAN, "GENERIC_TYPE_ARG_LIST"] genericTypeArgument+) +fragment +NonIntegerNumber + : ('0' .. '9')+ '.' ('0' .. '9')* Exponent? + | '.' ( '0' .. '9' )+ Exponent? + | ('0' .. '9')+ Exponent + | ('0' .. '9')+ + | + HexPrefix (HexDigit )* + ( () + | ('.' (HexDigit )* ) + ) + ( 'p' | 'P' ) + ( '+' | '-' )? + ( '0' .. '9' )+ + ; + +fragment +Exponent + : ( 'e' | 'E' ) ( '+' | '-' )? ( '0' .. '9' )+ + ; + +fragment +FloatSuffix + : 'f' | 'F' + ; +fragment +DoubleSuffix + : 'd' | 'D' + ; + +FLOATLITERAL + : NonIntegerNumber FloatSuffix + ; + +DOUBLELITERAL + : NonIntegerNumber DoubleSuffix? 
+ ; +CHARLITERAL + : '\'' + ( EscapeSequence + | ~( '\'' | '\\' | '\r' | '\n' ) + ) + '\'' + ; +STRINGLITERAL + : '"' + ( EscapeSequence + | ~( '\\' | '"' | '\r' | '\n' ) + )* + '"' ; - -genericTypeArgument - : type - | QUESTION genericWildcardBoundType? - -> ^(QUESTION genericWildcardBoundType?) +fragment +EscapeSequence + : '\\' ( + 'b' + | 't' + | 'n' + | 'f' + | 'r' + | '\"' + | '\'' + | '\\' + | + ('0'..'3') ('0'..'7') ('0'..'7') + | + ('0'..'7') ('0'..'7') + | + ('0'..'7') + ) +; +WS + : ( + ' ' + | '\r' + | '\t' + | '\u000C' + | '\n' + ) + { + skip(); + } + ; + +COMMENT + @init{ + boolean isJavaDoc = false; + } + : '/*' + { + if((char)input.LA(1) == '*'){ + isJavaDoc = true; + } + } + (options {greedy=false;} : . )* + '*/' + { + if(isJavaDoc==true){ + $channel=HIDDEN; + }else{ + skip(); + } + } ; - -genericWildcardBoundType - : (EXTENDS | SUPER)^ type +LINE_COMMENT + : '//' ~('\n'|'\r')* ('\r\n' | '\r' | '\n') + { + skip(); + } + | '//' ~('\n'|'\r')* // a line comment could appear at the end of the file without CR/LF + { + skip(); + } + ; + +ABSTRACT + : 'abstract' ; - -genericTypeArgumentListSimplified - : LESS_THAN genericTypeArgumentSimplified (COMMA genericTypeArgumentSimplified)* genericTypeListClosing - -> ^(GENERIC_TYPE_ARG_LIST[$LESS_THAN, "GENERIC_TYPE_ARG_LIST"] genericTypeArgumentSimplified+) + +ASSERT + : 'assert' ; - -genericTypeArgumentSimplified - : type - | QUESTION + +BOOLEAN + : 'boolean' ; - -qualifiedIdentList - : qualifiedIdentifier (COMMA! qualifiedIdentifier)* + +BREAK + : 'break' ; - -formalParameterList - : LPAREN - ( // Contains at least one standard argument declaration and optionally a variable argument declaration. - formalParameterStandardDecl (COMMA formalParameterStandardDecl)* (COMMA formalParameterVarArgDecl)? - -> ^(FORMAL_PARAM_LIST[$LPAREN, "FORMAL_PARAM_LIST"] formalParameterStandardDecl+ formalParameterVarArgDecl?) - // Contains a variable argument declaration only. 
- | formalParameterVarArgDecl - -> ^(FORMAL_PARAM_LIST[$LPAREN, "FORMAL_PARAM_LIST"] formalParameterVarArgDecl) - // Contains nothing. - | -> ^(FORMAL_PARAM_LIST[$LPAREN, "FORMAL_PARAM_LIST"]) - ) - RPAREN + +BYTE + : 'byte' ; - -formalParameterStandardDecl - : localModifierList type variableDeclaratorId - -> ^(FORMAL_PARAM_STD_DECL localModifierList type variableDeclaratorId) + +CASE + : 'case' ; - -formalParameterVarArgDecl - : localModifierList type ELLIPSIS variableDeclaratorId - -> ^(FORMAL_PARAM_VARARG_DECL localModifierList type variableDeclaratorId) + +CATCH + : 'catch' ; - -qualifiedIdentifier - : ( IDENT -> IDENT - ) - ( DOT ident=IDENT -> ^(DOT $qualifiedIdentifier $ident) - )* + +CHAR + : 'char' ; - -// ANNOTATIONS - -annotationList - : annotation* - -> ^(ANNOTATION_LIST annotation*) + +CLASS + : 'class' ; - -annotation - : AT^ qualifiedIdentifier annotationInit? + +CONST + : 'const' ; - -annotationInit - : LPAREN annotationInitializers RPAREN - -> ^(ANNOTATION_INIT_BLOCK[$LPAREN, "ANNOTATION_INIT_BLOCK"] annotationInitializers) +CONTINUE + : 'continue' ; - -annotationInitializers - : annotationInitializer (COMMA annotationInitializer)* - -> ^(ANNOTATION_INIT_KEY_LIST annotationInitializer+) - | annotationElementValue // implicite initialization of the annotation field 'value' - -> ^(ANNOTATION_INIT_DEFAULT_KEY annotationElementValue) +DEFAULT + : 'default' ; - -annotationInitializer - : IDENT^ ASSIGN! annotationElementValue +DO + : 'do' ; - -annotationElementValue - : annotationElementValueExpression - | annotation - | annotationElementValueArrayInitializer +DOUBLE + : 'double' ; - -annotationElementValueExpression - : conditionalExpression - -> ^(EXPR conditionalExpression) +ELSE + : 'else' ; - -annotationElementValueArrayInitializer - : LCURLY (annotationElementValue (COMMA annotationElementValue)*)? (COMMA)? 
RCURLY - -> ^(ANNOTATION_INIT_ARRAY_ELEMENT[$LCURLY, "ANNOTATION_ELEM_VALUE_ARRAY_INIT"] annotationElementValue*) +ENUM + : 'enum' + ; +EXTENDS + : 'extends' ; - -annotationTypeDeclaration[modifiers] - : AT INTERFACE IDENT annotationBody - -> ^(AT {$modifiers} IDENT annotationBody) +FINAL + : 'final' ; - -annotationBody - : LCURLY annotationScopeDeclarations* RCURLY - -> ^(ANNOTATION_TOP_LEVEL_SCOPE[$LCURLY, "CLASS_TOP_LEVEL_SCOPE"] annotationScopeDeclarations*) +FINALLY + : 'finally' ; - -annotationScopeDeclarations - : modifierList type - ( IDENT LPAREN RPAREN annotationDefaultValue? SEMI - -> ^(ANNOTATION_METHOD_DECL modifierList type IDENT annotationDefaultValue?) - | classFieldDeclaratorList SEMI - -> ^(VAR_DECLARATION modifierList type classFieldDeclaratorList) - ) - | typeDeclaration +FLOAT + : 'float' ; - -annotationDefaultValue - : DEFAULT^ annotationElementValue +FOR + : 'for' ; - -// STATEMENTS / BLOCKS - -block - : LCURLY blockStatement* RCURLY - -> ^(BLOCK_SCOPE[$LCURLY, "BLOCK_SCOPE"] blockStatement*) +GOTO + : 'goto' ; - -blockStatement - : localVariableDeclaration SEMI! 
- | typeDeclaration - | statement +IF + : 'if' ; - -localVariableDeclaration - : localModifierList type classFieldDeclaratorList - -> ^(VAR_DECLARATION localModifierList type classFieldDeclaratorList) +IMPLEMENTS + : 'implements' ; - - -statement - : block - | ASSERT expr1=expression - ( COLON expr2=expression SEMI -> ^(ASSERT $expr1 $expr2) - | SEMI -> ^(ASSERT $expr1) - ) - | IF parenthesizedExpression ifStat=statement - ( ELSE elseStat=statement -> ^(IF parenthesizedExpression $ifStat $elseStat) - | -> ^(IF parenthesizedExpression $ifStat) - ) - | FOR LPAREN - ( forInit SEMI forCondition SEMI forUpdater RPAREN statement -> ^(FOR forInit forCondition forUpdater statement) - | localModifierList type IDENT COLON expression RPAREN statement - -> ^(FOR_EACH[$FOR, "FOR_EACH"] localModifierList type IDENT expression statement) - ) - | WHILE parenthesizedExpression statement -> ^(WHILE parenthesizedExpression statement) - | DO statement WHILE parenthesizedExpression SEMI -> ^(DO statement parenthesizedExpression) - | TRY block (catches finallyClause? | finallyClause) -> ^(TRY block catches? finallyClause?) - | SWITCH parenthesizedExpression LCURLY switchBlockLabels? RCURLY -> ^(SWITCH parenthesizedExpression switchBlockLabels?) - | SYNCHRONIZED parenthesizedExpression block -> ^(SYNCHRONIZED parenthesizedExpression block) - | RETURN expression? SEMI -> ^(RETURN expression?) - | THROW expression SEMI -> ^(THROW expression) - | BREAK IDENT? SEMI -> ^(BREAK IDENT?) - | CONTINUE IDENT? SEMI -> ^(CONTINUE IDENT?) - | IDENT COLON statement -> ^(LABELED_STATEMENT IDENT statement) - | expression SEMI! - | SEMI // Preserve empty statements. +IMPORT + : 'import' ; - -catches - : catchClause+ - -> ^(CATCH_CLAUSE_LIST catchClause+) +INSTANCEOF + : 'instanceof' ; - -catchClause - : CATCH^ LPAREN! formalParameterStandardDecl RPAREN! 
block +INT + : 'int' ; - -finallyClause - : FINALLY block - -> block +INTERFACE + : 'interface' ; - -switchBlockLabels - // local modification: changed "switchCaseLabels" to - // "varname=switchCaseLabels?" to match language spec and support empty - // switch statements. - : c0=switchCaseLabels? switchDefaultLabel? c1=switchCaseLabels? - -> ^(SWITCH_BLOCK_LABEL_LIST $c0? switchDefaultLabel? $c1?) +LONG + : 'long' ; - -switchCaseLabels - : switchCaseLabel* +NATIVE + : 'native' ; - -switchCaseLabel - : CASE^ expression COLON! blockStatement* +NEW + : 'new' ; - -switchDefaultLabel - : DEFAULT^ COLON! blockStatement* +PACKAGE + : 'package' ; - -forInit - : localVariableDeclaration -> ^(FOR_INIT localVariableDeclaration) - | expressionList -> ^(FOR_INIT expressionList) - | -> ^(FOR_INIT) +PRIVATE + : 'private' ; - -forCondition - : expression? - -> ^(FOR_CONDITION expression?) +PROTECTED + : 'protected' ; - -forUpdater - : expressionList? - -> ^(FOR_UPDATE expressionList?) +PUBLIC + : 'public' ; - -// EXPRESSIONS - -parenthesizedExpression - : LPAREN expression RPAREN - -> ^(PARENTESIZED_EXPR[$LPAREN, "PARENTESIZED_EXPR"] expression) +RETURN + : 'return' ; - -expressionList - : expression (COMMA! expression)* +SHORT + : 'short' ; - -expression - : assignmentExpression - -> ^(EXPR assignmentExpression) +STATIC + : 'static' ; - -assignmentExpression - : conditionalExpression - ( ( ASSIGN^ - | PLUS_ASSIGN^ - | MINUS_ASSIGN^ - | STAR_ASSIGN^ - | DIV_ASSIGN^ - | AND_ASSIGN^ - | OR_ASSIGN^ - | XOR_ASSIGN^ - | MOD_ASSIGN^ - | SHIFT_LEFT_ASSIGN^ - | SHIFT_RIGHT_ASSIGN^ - | BIT_SHIFT_RIGHT_ASSIGN^ - ) - assignmentExpression)? +STRICTFP + : 'strictfp' ; - -conditionalExpression - : logicalOrExpression (QUESTION^ assignmentExpression COLON! conditionalExpression)? 
+SUPER + : 'super' ; - -logicalOrExpression - : logicalAndExpression (LOGICAL_OR^ logicalAndExpression)* +SWITCH + : 'switch' ; - -logicalAndExpression - : inclusiveOrExpression (LOGICAL_AND^ inclusiveOrExpression)* +SYNCHRONIZED + : 'synchronized' ; - -inclusiveOrExpression - : exclusiveOrExpression (OR^ exclusiveOrExpression)* +THIS + : 'this' ; - -exclusiveOrExpression - : andExpression (XOR^ andExpression)* +THROW + : 'throw' ; - -andExpression - : equalityExpression (AND^ equalityExpression)* +THROWS + : 'throws' ; - -equalityExpression - : instanceOfExpression - ( ( EQUAL^ - | NOT_EQUAL^ - ) - instanceOfExpression - )* +TRANSIENT + : 'transient' ; - -instanceOfExpression - : relationalExpression (INSTANCEOF^ type)? +TRY + : 'try' ; - -relationalExpression - : shiftExpression - ( ( LESS_OR_EQUAL^ - | GREATER_OR_EQUAL^ - | LESS_THAN^ - | GREATER_THAN^ - ) - shiftExpression - )* +VOID + : 'void' ; - -shiftExpression - : additiveExpression - ( ( BIT_SHIFT_RIGHT^ - | SHIFT_RIGHT^ - | SHIFT_LEFT^ - ) - additiveExpression - )* +VOLATILE + : 'volatile' ; - -additiveExpression - : multiplicativeExpression - ( ( PLUS^ - | MINUS^ - ) - multiplicativeExpression - )* +WHILE + : 'while' ; - -multiplicativeExpression - : unaryExpression - ( ( STAR^ - | DIV^ - | MOD^ - ) - unaryExpression - )* +TRUE + : 'true' ; - -unaryExpression - : PLUS unaryExpression -> ^(UNARY_PLUS[$PLUS, "UNARY_PLUS"] unaryExpression) - | MINUS unaryExpression -> ^(UNARY_MINUS[$MINUS, "UNARY_MINUS"] unaryExpression) - | INC postfixedExpression -> ^(PRE_INC[$INC, "PRE_INC"] postfixedExpression) - | DEC postfixedExpression -> ^(PRE_DEC[$DEC, "PRE_DEC"] postfixedExpression) - | unaryExpressionNotPlusMinus +FALSE + : 'false' ; - -unaryExpressionNotPlusMinus - : NOT unaryExpression -> ^(NOT unaryExpression) - | LOGICAL_NOT unaryExpression -> ^(LOGICAL_NOT unaryExpression) - | LPAREN type RPAREN unaryExpression -> ^(CAST_EXPR[$LPAREN, "CAST_EXPR"] type unaryExpression) - | postfixedExpression +NULL + : 
'null' ; - -postfixedExpression - // At first resolve the primary expression ... - : ( primaryExpression -> primaryExpression - ) - // ... and than the optional things that may follow a primary expression 0 or more times. - ( outerDot=DOT - ( ( genericTypeArgumentListSimplified? // Note: generic type arguments are only valid for method calls, i.e. if there - // is an argument list. - IDENT -> ^(DOT $postfixedExpression IDENT) - ) - ( arguments -> ^(METHOD_CALL $postfixedExpression genericTypeArgumentListSimplified? arguments) - )? - | THIS -> ^(DOT $postfixedExpression THIS) - | Super=SUPER arguments -> ^(SUPER_CONSTRUCTOR_CALL[$Super, "SUPER_CONSTRUCTOR_CALL"] $postfixedExpression arguments) - | ( SUPER innerDot=DOT IDENT -> ^($innerDot ^($outerDot $postfixedExpression SUPER) IDENT) - ) - ( arguments -> ^(METHOD_CALL $postfixedExpression arguments) - )? - | innerNewExpression -> ^(DOT $postfixedExpression innerNewExpression) - ) - | LBRACK expression RBRACK -> ^(ARRAY_ELEMENT_ACCESS $postfixedExpression expression) - )* - // At the end there may follow a post increment/decrement. - ( INC -> ^(POST_INC[$INC, "POST_INC"] $postfixedExpression) - | DEC -> ^(POST_DEC[$DEC, "POST_DEC"] $postfixedExpression) - )? +LPAREN + : '(' ; - -primaryExpression - : parenthesizedExpression - | literal - | newExpression - | qualifiedIdentExpression - | genericTypeArgumentListSimplified - ( SUPER - ( arguments -> ^(SUPER_CONSTRUCTOR_CALL[$SUPER, "SUPER_CONSTRUCTOR_CALL"] genericTypeArgumentListSimplified arguments) - | DOT IDENT arguments -> ^(METHOD_CALL ^(DOT SUPER IDENT) genericTypeArgumentListSimplified arguments) - ) - | IDENT arguments -> ^(METHOD_CALL IDENT genericTypeArgumentListSimplified arguments) - | THIS arguments -> ^(THIS_CONSTRUCTOR_CALL[$THIS, "THIS_CONSTRUCTOR_CALL"] genericTypeArgumentListSimplified arguments) - ) - | ( THIS -> THIS - ) - ( arguments -> ^(THIS_CONSTRUCTOR_CALL[$THIS, "THIS_CONSTRUCTOR_CALL"] arguments) - )? 
- | SUPER arguments -> ^(SUPER_CONSTRUCTOR_CALL[$SUPER, "SUPER_CONSTRUCTOR_CALL"] arguments) - | ( SUPER DOT IDENT - ) - ( arguments -> ^(METHOD_CALL ^(DOT SUPER IDENT) arguments) - | -> ^(DOT SUPER IDENT) - ) - | ( primitiveType -> primitiveType - ) - ( arrayDeclarator -> ^(arrayDeclarator $primaryExpression) - )* - DOT CLASS -> ^(DOT $primaryExpression CLASS) - | VOID DOT CLASS -> ^(DOT VOID CLASS) +RPAREN + : ')' ; - -qualifiedIdentExpression - // The qualified identifier itself is the starting point for this rule. - : ( qualifiedIdentifier -> qualifiedIdentifier - ) - // And now comes the stuff that may follow the qualified identifier. - ( ( arrayDeclarator -> ^(arrayDeclarator $qualifiedIdentExpression) - )+ - ( DOT CLASS -> ^(DOT $qualifiedIdentExpression CLASS) - ) - | arguments -> ^(METHOD_CALL qualifiedIdentifier arguments) - | outerDot=DOT - ( CLASS -> ^(DOT qualifiedIdentifier CLASS) - | genericTypeArgumentListSimplified - ( Super=SUPER arguments -> ^(SUPER_CONSTRUCTOR_CALL[$Super, "SUPER_CONSTRUCTOR_CALL"] qualifiedIdentifier genericTypeArgumentListSimplified arguments) - | SUPER innerDot=DOT IDENT arguments -> ^(METHOD_CALL ^($innerDot ^($outerDot qualifiedIdentifier SUPER) IDENT) genericTypeArgumentListSimplified arguments) - | IDENT arguments -> ^(METHOD_CALL ^(DOT qualifiedIdentifier IDENT) genericTypeArgumentListSimplified arguments) - ) - | THIS -> ^(DOT qualifiedIdentifier THIS) - | Super=SUPER arguments -> ^(SUPER_CONSTRUCTOR_CALL[$Super, "SUPER_CONSTRUCTOR_CALL"] qualifiedIdentifier arguments) - | innerNewExpression -> ^(DOT qualifiedIdentifier innerNewExpression) - ) - )? +LBRACE + : '{' ; - -newExpression - : NEW - ( primitiveType newArrayConstruction // new static array of primitive type elements - -> ^(STATIC_ARRAY_CREATOR[$NEW, "STATIC_ARRAY_CREATOR"] primitiveType newArrayConstruction) - | genericTypeArgumentListSimplified? 
qualifiedTypeIdentSimplified - ( newArrayConstruction // new static array of object type reference elements - -> ^(STATIC_ARRAY_CREATOR[$NEW, "STATIC_ARRAY_CREATOR"] genericTypeArgumentListSimplified? qualifiedTypeIdentSimplified newArrayConstruction) - | arguments classBody? // new object type via constructor invocation - -> ^(CLASS_CONSTRUCTOR_CALL[$NEW, "STATIC_ARRAY_CREATOR"] genericTypeArgumentListSimplified? qualifiedTypeIdentSimplified arguments classBody?) - ) - ) +RBRACE + : '}' ; - -innerNewExpression // something like 'InnerType innerType = outer.new InnerType();' - : NEW genericTypeArgumentListSimplified? IDENT arguments classBody? - -> ^(CLASS_CONSTRUCTOR_CALL[$NEW, "STATIC_ARRAY_CREATOR"] genericTypeArgumentListSimplified? IDENT arguments classBody?) +LBRACKET + : '[' ; - -newArrayConstruction - : arrayDeclaratorList arrayInitializer - | LBRACK! expression RBRACK! (LBRACK! expression RBRACK!)* arrayDeclaratorList? +RBRACKET + : ']' ; - -arguments - : LPAREN expressionList? RPAREN - -> ^(ARGUMENT_LIST[$LPAREN, "ARGUMENT_LIST"] expressionList?) +SEMI + : ';' ; - -literal - : HEX_LITERAL - | OCTAL_LITERAL - | DECIMAL_LITERAL - | FLOATING_POINT_LITERAL - | CHARACTER_LITERAL - | STRING_LITERAL - | TRUE - | FALSE - | NULL +COMMA + : ',' ; - -// LEXER - -HEX_LITERAL : '0' ('x'|'X') HEX_DIGIT+ INTEGER_TYPE_SUFFIX? ; - -DECIMAL_LITERAL : ('0' | '1'..'9' '0'..'9'*) INTEGER_TYPE_SUFFIX? ; - -OCTAL_LITERAL : '0' ('0'..'7')+ INTEGER_TYPE_SUFFIX? ; - -fragment -HEX_DIGIT : ('0'..'9'|'a'..'f'|'A'..'F') ; - -fragment -INTEGER_TYPE_SUFFIX : ('l'|'L') ; - -FLOATING_POINT_LITERAL - : ('0'..'9')+ - ( - DOT ('0'..'9')* EXPONENT? FLOAT_TYPE_SUFFIX? - | EXPONENT FLOAT_TYPE_SUFFIX? - | FLOAT_TYPE_SUFFIX - ) - | DOT ('0'..'9')+ EXPONENT? FLOAT_TYPE_SUFFIX? +DOT + : '.' ; - -fragment -EXPONENT : ('e'|'E') ('+'|'-')? ('0'..'9')+ ; - -fragment -FLOAT_TYPE_SUFFIX : ('f'|'F'|'d'|'D') ; - -CHARACTER_LITERAL - : '\'' ( ESCAPE_SEQUENCE | ~('\''|'\\') ) '\'' +ELLIPSIS + : '...' 
; - -STRING_LITERAL - : '"' ( ESCAPE_SEQUENCE | ~('\\'|'"') )* '"' +EQ + : '=' ; - -fragment -ESCAPE_SEQUENCE - : '\\' ('b'|'t'|'n'|'f'|'r'|'\"'|'\''|'\\') - | UNICODE_ESCAPE - | OCTAL_ESCAPE +BANG + : '!' ; - -fragment -OCTAL_ESCAPE - : '\\' ('0'..'3') ('0'..'7') ('0'..'7') - | '\\' ('0'..'7') ('0'..'7') - | '\\' ('0'..'7') +TILDE + : '~' ; - -fragment -UNICODE_ESCAPE - : '\\' 'u' HEX_DIGIT HEX_DIGIT HEX_DIGIT HEX_DIGIT +QUES + : '?' ; - -IDENT - : JAVA_ID_START (JAVA_ID_PART)* +COLON + : ':' ; - -fragment -JAVA_ID_START - : '\u0024' - | '\u0041'..'\u005a' - | '\u005f' - | '\u0061'..'\u007a' - | '\u00c0'..'\u00d6' - | '\u00d8'..'\u00f6' - | '\u00f8'..'\u00ff' - | '\u0100'..'\u1fff' - | '\u3040'..'\u318f' - | '\u3300'..'\u337f' - | '\u3400'..'\u3d2d' - | '\u4e00'..'\u9fff' - | '\uf900'..'\ufaff' +EQEQ + : '==' ; - -fragment -JAVA_ID_PART - : JAVA_ID_START - | '\u0030'..'\u0039' +AMPAMP + : '&&' ; - -WS : (' '|'\r'|'\t'|'\u000C'|'\n') - { - $channel = HIDDEN - } +BARBAR + : '||' ; - -COMMENT - : '/*' ~('*') ( options {greedy=false;} : . )* '*/' - { - $channel = HIDDEN - } +PLUSPLUS + : '++' ; - -LINE_COMMENT - : '//' ~('\n'|'\r')* '\r'? '\n' - { - $channel = HIDDEN - } +SUBSUB + : '--' ; - -JAVADOC_COMMENT - : '/**' ( options {greedy=false;} : . 
)* '*/' - { - $channel = HIDDEN - } +PLUS + : '+' + ; +SUB + : '-' + ; +STAR + : '*' + ; +SLASH + : '/' + ; +AMP + : '&' + ; +BAR + : '|' + ; +CARET + : '^' + ; +PERCENT + : '%' ; +PLUSEQ + : '+=' + ; + +SUBEQ + : '-=' + ; +STAREQ + : '*=' + ; +SLASHEQ + : '/=' + ; +AMPEQ + : '&=' + ; +BAREQ + : '|=' + ; +CARETEQ + : '^=' + ; +PERCENTEQ + : '%=' + ; +MONKEYS_AT + : '@' + ; +BANGEQ + : '!=' + ; +GT + : '>' + ; +LT + : '<' + ; + +IDENTIFIER + : IdentifierStart IdentifierPart* + ; +fragment +SurrogateIdentifer + : ('\ud800'..'\udbff') ('\udc00'..'\udfff') + ; +fragment +IdentifierStart + : '\u0024' + | '\u0041'..'\u005a' + | '\u005f' + | '\u0061'..'\u007a' + | '\u00a2'..'\u00a5' + | '\u00aa' + | '\u00b5' + | '\u00ba' + | '\u00c0'..'\u00d6' + | '\u00d8'..'\u00f6' + | '\u00f8'..'\u0236' + | '\u0250'..'\u02c1' + | '\u02c6'..'\u02d1' + | '\u02e0'..'\u02e4' + | '\u02ee' + | '\u037a' + | '\u0386' + | '\u0388'..'\u038a' + | '\u038c' + | '\u038e'..'\u03a1' + | '\u03a3'..'\u03ce' + | '\u03d0'..'\u03f5' + | '\u03f7'..'\u03fb' + | '\u0400'..'\u0481' + | '\u048a'..'\u04ce' + | '\u04d0'..'\u04f5' + | '\u04f8'..'\u04f9' + | '\u0500'..'\u050f' + | '\u0531'..'\u0556' + | '\u0559' + | '\u0561'..'\u0587' + | '\u05d0'..'\u05ea' + | '\u05f0'..'\u05f2' + | '\u0621'..'\u063a' + | '\u0640'..'\u064a' + | '\u066e'..'\u066f' + | '\u0671'..'\u06d3' + | '\u06d5' + | '\u06e5'..'\u06e6' + | '\u06ee'..'\u06ef' + | '\u06fa'..'\u06fc' + | '\u06ff' + | '\u0710' + | '\u0712'..'\u072f' + | '\u074d'..'\u074f' + | '\u0780'..'\u07a5' + | '\u07b1' + | '\u0904'..'\u0939' + | '\u093d' + | '\u0950' + | '\u0958'..'\u0961' + | '\u0985'..'\u098c' + | '\u098f'..'\u0990' + | '\u0993'..'\u09a8' + | '\u09aa'..'\u09b0' + | '\u09b2' + | '\u09b6'..'\u09b9' + | '\u09bd' + | '\u09dc'..'\u09dd' + | '\u09df'..'\u09e1' + | '\u09f0'..'\u09f3' + | '\u0a05'..'\u0a0a' + | '\u0a0f'..'\u0a10' + | '\u0a13'..'\u0a28' + | '\u0a2a'..'\u0a30' + | '\u0a32'..'\u0a33' + | '\u0a35'..'\u0a36' + | '\u0a38'..'\u0a39' + | '\u0a59'..'\u0a5c' + 
| '\u0a5e' + | '\u0a72'..'\u0a74' + | '\u0a85'..'\u0a8d' + | '\u0a8f'..'\u0a91' + | '\u0a93'..'\u0aa8' + | '\u0aaa'..'\u0ab0' + | '\u0ab2'..'\u0ab3' + | '\u0ab5'..'\u0ab9' + | '\u0abd' + | '\u0ad0' + | '\u0ae0'..'\u0ae1' + | '\u0af1' + | '\u0b05'..'\u0b0c' + | '\u0b0f'..'\u0b10' + | '\u0b13'..'\u0b28' + | '\u0b2a'..'\u0b30' + | '\u0b32'..'\u0b33' + | '\u0b35'..'\u0b39' + | '\u0b3d' + | '\u0b5c'..'\u0b5d' + | '\u0b5f'..'\u0b61' + | '\u0b71' + | '\u0b83' + | '\u0b85'..'\u0b8a' + | '\u0b8e'..'\u0b90' + | '\u0b92'..'\u0b95' + | '\u0b99'..'\u0b9a' + | '\u0b9c' + | '\u0b9e'..'\u0b9f' + | '\u0ba3'..'\u0ba4' + | '\u0ba8'..'\u0baa' + | '\u0bae'..'\u0bb5' + | '\u0bb7'..'\u0bb9' + | '\u0bf9' + | '\u0c05'..'\u0c0c' + | '\u0c0e'..'\u0c10' + | '\u0c12'..'\u0c28' + | '\u0c2a'..'\u0c33' + | '\u0c35'..'\u0c39' + | '\u0c60'..'\u0c61' + | '\u0c85'..'\u0c8c' + | '\u0c8e'..'\u0c90' + | '\u0c92'..'\u0ca8' + | '\u0caa'..'\u0cb3' + | '\u0cb5'..'\u0cb9' + | '\u0cbd' + | '\u0cde' + | '\u0ce0'..'\u0ce1' + | '\u0d05'..'\u0d0c' + | '\u0d0e'..'\u0d10' + | '\u0d12'..'\u0d28' + | '\u0d2a'..'\u0d39' + | '\u0d60'..'\u0d61' + | '\u0d85'..'\u0d96' + | '\u0d9a'..'\u0db1' + | '\u0db3'..'\u0dbb' + | '\u0dbd' + | '\u0dc0'..'\u0dc6' + | '\u0e01'..'\u0e30' + | '\u0e32'..'\u0e33' + | '\u0e3f'..'\u0e46' + | '\u0e81'..'\u0e82' + | '\u0e84' + | '\u0e87'..'\u0e88' + | '\u0e8a' + | '\u0e8d' + | '\u0e94'..'\u0e97' + | '\u0e99'..'\u0e9f' + | '\u0ea1'..'\u0ea3' + | '\u0ea5' + | '\u0ea7' + | '\u0eaa'..'\u0eab' + | '\u0ead'..'\u0eb0' + | '\u0eb2'..'\u0eb3' + | '\u0ebd' + | '\u0ec0'..'\u0ec4' + | '\u0ec6' + | '\u0edc'..'\u0edd' + | '\u0f00' + | '\u0f40'..'\u0f47' + | '\u0f49'..'\u0f6a' + | '\u0f88'..'\u0f8b' + | '\u1000'..'\u1021' + | '\u1023'..'\u1027' + | '\u1029'..'\u102a' + | '\u1050'..'\u1055' + | '\u10a0'..'\u10c5' + | '\u10d0'..'\u10f8' + | '\u1100'..'\u1159' + | '\u115f'..'\u11a2' + | '\u11a8'..'\u11f9' + | '\u1200'..'\u1206' + | '\u1208'..'\u1246' + | '\u1248' + | '\u124a'..'\u124d' + | '\u1250'..'\u1256' + | 
'\u1258' + | '\u125a'..'\u125d' + | '\u1260'..'\u1286' + | '\u1288' + | '\u128a'..'\u128d' + | '\u1290'..'\u12ae' + | '\u12b0' + | '\u12b2'..'\u12b5' + | '\u12b8'..'\u12be' + | '\u12c0' + | '\u12c2'..'\u12c5' + | '\u12c8'..'\u12ce' + | '\u12d0'..'\u12d6' + | '\u12d8'..'\u12ee' + | '\u12f0'..'\u130e' + | '\u1310' + | '\u1312'..'\u1315' + | '\u1318'..'\u131e' + | '\u1320'..'\u1346' + | '\u1348'..'\u135a' + | '\u13a0'..'\u13f4' + | '\u1401'..'\u166c' + | '\u166f'..'\u1676' + | '\u1681'..'\u169a' + | '\u16a0'..'\u16ea' + | '\u16ee'..'\u16f0' + | '\u1700'..'\u170c' + | '\u170e'..'\u1711' + | '\u1720'..'\u1731' + | '\u1740'..'\u1751' + | '\u1760'..'\u176c' + | '\u176e'..'\u1770' + | '\u1780'..'\u17b3' + | '\u17d7' + | '\u17db'..'\u17dc' + | '\u1820'..'\u1877' + | '\u1880'..'\u18a8' + | '\u1900'..'\u191c' + | '\u1950'..'\u196d' + | '\u1970'..'\u1974' + | '\u1d00'..'\u1d6b' + | '\u1e00'..'\u1e9b' + | '\u1ea0'..'\u1ef9' + | '\u1f00'..'\u1f15' + | '\u1f18'..'\u1f1d' + | '\u1f20'..'\u1f45' + | '\u1f48'..'\u1f4d' + | '\u1f50'..'\u1f57' + | '\u1f59' + | '\u1f5b' + | '\u1f5d' + | '\u1f5f'..'\u1f7d' + | '\u1f80'..'\u1fb4' + | '\u1fb6'..'\u1fbc' + | '\u1fbe' + | '\u1fc2'..'\u1fc4' + | '\u1fc6'..'\u1fcc' + | '\u1fd0'..'\u1fd3' + | '\u1fd6'..'\u1fdb' + | '\u1fe0'..'\u1fec' + | '\u1ff2'..'\u1ff4' + | '\u1ff6'..'\u1ffc' + | '\u203f'..'\u2040' + | '\u2054' + | '\u2071' + | '\u207f' + | '\u20a0'..'\u20b1' + | '\u2102' + | '\u2107' + | '\u210a'..'\u2113' + | '\u2115' + | '\u2119'..'\u211d' + | '\u2124' + | '\u2126' + | '\u2128' + | '\u212a'..'\u212d' + | '\u212f'..'\u2131' + | '\u2133'..'\u2139' + | '\u213d'..'\u213f' + | '\u2145'..'\u2149' + | '\u2160'..'\u2183' + | '\u3005'..'\u3007' + | '\u3021'..'\u3029' + | '\u3031'..'\u3035' + | '\u3038'..'\u303c' + | '\u3041'..'\u3096' + | '\u309d'..'\u309f' + | '\u30a1'..'\u30ff' + | '\u3105'..'\u312c' + | '\u3131'..'\u318e' + | '\u31a0'..'\u31b7' + | '\u31f0'..'\u31ff' + | '\u3400'..'\u4db5' + | '\u4e00'..'\u9fa5' + | '\ua000'..'\ua48c' + | 
'\uac00'..'\ud7a3' + | '\uf900'..'\ufa2d' + | '\ufa30'..'\ufa6a' + | '\ufb00'..'\ufb06' + | '\ufb13'..'\ufb17' + | '\ufb1d' + | '\ufb1f'..'\ufb28' + | '\ufb2a'..'\ufb36' + | '\ufb38'..'\ufb3c' + | '\ufb3e' + | '\ufb40'..'\ufb41' + | '\ufb43'..'\ufb44' + | '\ufb46'..'\ufbb1' + | '\ufbd3'..'\ufd3d' + | '\ufd50'..'\ufd8f' + | '\ufd92'..'\ufdc7' + | '\ufdf0'..'\ufdfc' + | '\ufe33'..'\ufe34' + | '\ufe4d'..'\ufe4f' + | '\ufe69' + | '\ufe70'..'\ufe74' + | '\ufe76'..'\ufefc' + | '\uff04' + | '\uff21'..'\uff3a' + | '\uff3f' + | '\uff41'..'\uff5a' + | '\uff65'..'\uffbe' + | '\uffc2'..'\uffc7' + | '\uffca'..'\uffcf' + | '\uffd2'..'\uffd7' + | '\uffda'..'\uffdc' + | '\uffe0'..'\uffe1' + | '\uffe5'..'\uffe6' + | ('\ud800'..'\udbff') ('\udc00'..'\udfff') + ; + +fragment +IdentifierPart + : '\u0000'..'\u0008' + | '\u000e'..'\u001b' + | '\u0024' + | '\u0030'..'\u0039' + | '\u0041'..'\u005a' + | '\u005f' + | '\u0061'..'\u007a' + | '\u007f'..'\u009f' + | '\u00a2'..'\u00a5' + | '\u00aa' + | '\u00ad' + | '\u00b5' + | '\u00ba' + | '\u00c0'..'\u00d6' + | '\u00d8'..'\u00f6' + | '\u00f8'..'\u0236' + | '\u0250'..'\u02c1' + | '\u02c6'..'\u02d1' + | '\u02e0'..'\u02e4' + | '\u02ee' + | '\u0300'..'\u0357' + | '\u035d'..'\u036f' + | '\u037a' + | '\u0386' + | '\u0388'..'\u038a' + | '\u038c' + | '\u038e'..'\u03a1' + | '\u03a3'..'\u03ce' + | '\u03d0'..'\u03f5' + | '\u03f7'..'\u03fb' + | '\u0400'..'\u0481' + | '\u0483'..'\u0486' + | '\u048a'..'\u04ce' + | '\u04d0'..'\u04f5' + | '\u04f8'..'\u04f9' + | '\u0500'..'\u050f' + | '\u0531'..'\u0556' + | '\u0559' + | '\u0561'..'\u0587' + | '\u0591'..'\u05a1' + | '\u05a3'..'\u05b9' + | '\u05bb'..'\u05bd' + | '\u05bf' + | '\u05c1'..'\u05c2' + | '\u05c4' + | '\u05d0'..'\u05ea' + | '\u05f0'..'\u05f2' + | '\u0600'..'\u0603' + | '\u0610'..'\u0615' + | '\u0621'..'\u063a' + | '\u0640'..'\u0658' + | '\u0660'..'\u0669' + | '\u066e'..'\u06d3' + | '\u06d5'..'\u06dd' + | '\u06df'..'\u06e8' + | '\u06ea'..'\u06fc' + | '\u06ff' + | '\u070f'..'\u074a' + | '\u074d'..'\u074f' 
+ | '\u0780'..'\u07b1' + | '\u0901'..'\u0939' + | '\u093c'..'\u094d' + | '\u0950'..'\u0954' + | '\u0958'..'\u0963' + | '\u0966'..'\u096f' + | '\u0981'..'\u0983' + | '\u0985'..'\u098c' + | '\u098f'..'\u0990' + | '\u0993'..'\u09a8' + | '\u09aa'..'\u09b0' + | '\u09b2' + | '\u09b6'..'\u09b9' + | '\u09bc'..'\u09c4' + | '\u09c7'..'\u09c8' + | '\u09cb'..'\u09cd' + | '\u09d7' + | '\u09dc'..'\u09dd' + | '\u09df'..'\u09e3' + | '\u09e6'..'\u09f3' + | '\u0a01'..'\u0a03' + | '\u0a05'..'\u0a0a' + | '\u0a0f'..'\u0a10' + | '\u0a13'..'\u0a28' + | '\u0a2a'..'\u0a30' + | '\u0a32'..'\u0a33' + | '\u0a35'..'\u0a36' + | '\u0a38'..'\u0a39' + | '\u0a3c' + | '\u0a3e'..'\u0a42' + | '\u0a47'..'\u0a48' + | '\u0a4b'..'\u0a4d' + | '\u0a59'..'\u0a5c' + | '\u0a5e' + | '\u0a66'..'\u0a74' + | '\u0a81'..'\u0a83' + | '\u0a85'..'\u0a8d' + | '\u0a8f'..'\u0a91' + | '\u0a93'..'\u0aa8' + | '\u0aaa'..'\u0ab0' + | '\u0ab2'..'\u0ab3' + | '\u0ab5'..'\u0ab9' + | '\u0abc'..'\u0ac5' + | '\u0ac7'..'\u0ac9' + | '\u0acb'..'\u0acd' + | '\u0ad0' + | '\u0ae0'..'\u0ae3' + | '\u0ae6'..'\u0aef' + | '\u0af1' + | '\u0b01'..'\u0b03' + | '\u0b05'..'\u0b0c' + | '\u0b0f'..'\u0b10' + | '\u0b13'..'\u0b28' + | '\u0b2a'..'\u0b30' + | '\u0b32'..'\u0b33' + | '\u0b35'..'\u0b39' + | '\u0b3c'..'\u0b43' + | '\u0b47'..'\u0b48' + | '\u0b4b'..'\u0b4d' + | '\u0b56'..'\u0b57' + | '\u0b5c'..'\u0b5d' + | '\u0b5f'..'\u0b61' + | '\u0b66'..'\u0b6f' + | '\u0b71' + | '\u0b82'..'\u0b83' + | '\u0b85'..'\u0b8a' + | '\u0b8e'..'\u0b90' + | '\u0b92'..'\u0b95' + | '\u0b99'..'\u0b9a' + | '\u0b9c' + | '\u0b9e'..'\u0b9f' + | '\u0ba3'..'\u0ba4' + | '\u0ba8'..'\u0baa' + | '\u0bae'..'\u0bb5' + | '\u0bb7'..'\u0bb9' + | '\u0bbe'..'\u0bc2' + | '\u0bc6'..'\u0bc8' + | '\u0bca'..'\u0bcd' + | '\u0bd7' + | '\u0be7'..'\u0bef' + | '\u0bf9' + | '\u0c01'..'\u0c03' + | '\u0c05'..'\u0c0c' + | '\u0c0e'..'\u0c10' + | '\u0c12'..'\u0c28' + | '\u0c2a'..'\u0c33' + | '\u0c35'..'\u0c39' + | '\u0c3e'..'\u0c44' + | '\u0c46'..'\u0c48' + | '\u0c4a'..'\u0c4d' + | '\u0c55'..'\u0c56' + | 
'\u0c60'..'\u0c61' + | '\u0c66'..'\u0c6f' + | '\u0c82'..'\u0c83' + | '\u0c85'..'\u0c8c' + | '\u0c8e'..'\u0c90' + | '\u0c92'..'\u0ca8' + | '\u0caa'..'\u0cb3' + | '\u0cb5'..'\u0cb9' + | '\u0cbc'..'\u0cc4' + | '\u0cc6'..'\u0cc8' + | '\u0cca'..'\u0ccd' + | '\u0cd5'..'\u0cd6' + | '\u0cde' + | '\u0ce0'..'\u0ce1' + | '\u0ce6'..'\u0cef' + | '\u0d02'..'\u0d03' + | '\u0d05'..'\u0d0c' + | '\u0d0e'..'\u0d10' + | '\u0d12'..'\u0d28' + | '\u0d2a'..'\u0d39' + | '\u0d3e'..'\u0d43' + | '\u0d46'..'\u0d48' + | '\u0d4a'..'\u0d4d' + | '\u0d57' + | '\u0d60'..'\u0d61' + | '\u0d66'..'\u0d6f' + | '\u0d82'..'\u0d83' + | '\u0d85'..'\u0d96' + | '\u0d9a'..'\u0db1' + | '\u0db3'..'\u0dbb' + | '\u0dbd' + | '\u0dc0'..'\u0dc6' + | '\u0dca' + | '\u0dcf'..'\u0dd4' + | '\u0dd6' + | '\u0dd8'..'\u0ddf' + | '\u0df2'..'\u0df3' + | '\u0e01'..'\u0e3a' + | '\u0e3f'..'\u0e4e' + | '\u0e50'..'\u0e59' + | '\u0e81'..'\u0e82' + | '\u0e84' + | '\u0e87'..'\u0e88' + | '\u0e8a' + | '\u0e8d' + | '\u0e94'..'\u0e97' + | '\u0e99'..'\u0e9f' + | '\u0ea1'..'\u0ea3' + | '\u0ea5' + | '\u0ea7' + | '\u0eaa'..'\u0eab' + | '\u0ead'..'\u0eb9' + | '\u0ebb'..'\u0ebd' + | '\u0ec0'..'\u0ec4' + | '\u0ec6' + | '\u0ec8'..'\u0ecd' + | '\u0ed0'..'\u0ed9' + | '\u0edc'..'\u0edd' + | '\u0f00' + | '\u0f18'..'\u0f19' + | '\u0f20'..'\u0f29' + | '\u0f35' + | '\u0f37' + | '\u0f39' + | '\u0f3e'..'\u0f47' + | '\u0f49'..'\u0f6a' + | '\u0f71'..'\u0f84' + | '\u0f86'..'\u0f8b' + | '\u0f90'..'\u0f97' + | '\u0f99'..'\u0fbc' + | '\u0fc6' + | '\u1000'..'\u1021' + | '\u1023'..'\u1027' + | '\u1029'..'\u102a' + | '\u102c'..'\u1032' + | '\u1036'..'\u1039' + | '\u1040'..'\u1049' + | '\u1050'..'\u1059' + | '\u10a0'..'\u10c5' + | '\u10d0'..'\u10f8' + | '\u1100'..'\u1159' + | '\u115f'..'\u11a2' + | '\u11a8'..'\u11f9' + | '\u1200'..'\u1206' + | '\u1208'..'\u1246' + | '\u1248' + | '\u124a'..'\u124d' + | '\u1250'..'\u1256' + | '\u1258' + | '\u125a'..'\u125d' + | '\u1260'..'\u1286' + | '\u1288' + | '\u128a'..'\u128d' + | '\u1290'..'\u12ae' + | '\u12b0' + | 
'\u12b2'..'\u12b5' + | '\u12b8'..'\u12be' + | '\u12c0' + | '\u12c2'..'\u12c5' + | '\u12c8'..'\u12ce' + | '\u12d0'..'\u12d6' + | '\u12d8'..'\u12ee' + | '\u12f0'..'\u130e' + | '\u1310' + | '\u1312'..'\u1315' + | '\u1318'..'\u131e' + | '\u1320'..'\u1346' + | '\u1348'..'\u135a' + | '\u1369'..'\u1371' + | '\u13a0'..'\u13f4' + | '\u1401'..'\u166c' + | '\u166f'..'\u1676' + | '\u1681'..'\u169a' + | '\u16a0'..'\u16ea' + | '\u16ee'..'\u16f0' + | '\u1700'..'\u170c' + | '\u170e'..'\u1714' + | '\u1720'..'\u1734' + | '\u1740'..'\u1753' + | '\u1760'..'\u176c' + | '\u176e'..'\u1770' + | '\u1772'..'\u1773' + | '\u1780'..'\u17d3' + | '\u17d7' + | '\u17db'..'\u17dd' + | '\u17e0'..'\u17e9' + | '\u180b'..'\u180d' + | '\u1810'..'\u1819' + | '\u1820'..'\u1877' + | '\u1880'..'\u18a9' + | '\u1900'..'\u191c' + | '\u1920'..'\u192b' + | '\u1930'..'\u193b' + | '\u1946'..'\u196d' + | '\u1970'..'\u1974' + | '\u1d00'..'\u1d6b' + | '\u1e00'..'\u1e9b' + | '\u1ea0'..'\u1ef9' + | '\u1f00'..'\u1f15' + | '\u1f18'..'\u1f1d' + | '\u1f20'..'\u1f45' + | '\u1f48'..'\u1f4d' + | '\u1f50'..'\u1f57' + | '\u1f59' + | '\u1f5b' + | '\u1f5d' + | '\u1f5f'..'\u1f7d' + | '\u1f80'..'\u1fb4' + | '\u1fb6'..'\u1fbc' + | '\u1fbe' + | '\u1fc2'..'\u1fc4' + | '\u1fc6'..'\u1fcc' + | '\u1fd0'..'\u1fd3' + | '\u1fd6'..'\u1fdb' + | '\u1fe0'..'\u1fec' + | '\u1ff2'..'\u1ff4' + | '\u1ff6'..'\u1ffc' + | '\u200c'..'\u200f' + | '\u202a'..'\u202e' + | '\u203f'..'\u2040' + | '\u2054' + | '\u2060'..'\u2063' + | '\u206a'..'\u206f' + | '\u2071' + | '\u207f' + | '\u20a0'..'\u20b1' + | '\u20d0'..'\u20dc' + | '\u20e1' + | '\u20e5'..'\u20ea' + | '\u2102' + | '\u2107' + | '\u210a'..'\u2113' + | '\u2115' + | '\u2119'..'\u211d' + | '\u2124' + | '\u2126' + | '\u2128' + | '\u212a'..'\u212d' + | '\u212f'..'\u2131' + | '\u2133'..'\u2139' + | '\u213d'..'\u213f' + | '\u2145'..'\u2149' + | '\u2160'..'\u2183' + | '\u3005'..'\u3007' + | '\u3021'..'\u302f' + | '\u3031'..'\u3035' + | '\u3038'..'\u303c' + | '\u3041'..'\u3096' + | '\u3099'..'\u309a' + | 
'\u309d'..'\u309f' + | '\u30a1'..'\u30ff' + | '\u3105'..'\u312c' + | '\u3131'..'\u318e' + | '\u31a0'..'\u31b7' + | '\u31f0'..'\u31ff' + | '\u3400'..'\u4db5' + | '\u4e00'..'\u9fa5' + | '\ua000'..'\ua48c' + | '\uac00'..'\ud7a3' + | '\uf900'..'\ufa2d' + | '\ufa30'..'\ufa6a' + | '\ufb00'..'\ufb06' + | '\ufb13'..'\ufb17' + | '\ufb1d'..'\ufb28' + | '\ufb2a'..'\ufb36' + | '\ufb38'..'\ufb3c' + | '\ufb3e' + | '\ufb40'..'\ufb41' + | '\ufb43'..'\ufb44' + | '\ufb46'..'\ufbb1' + | '\ufbd3'..'\ufd3d' + | '\ufd50'..'\ufd8f' + | '\ufd92'..'\ufdc7' + | '\ufdf0'..'\ufdfc' + | '\ufe00'..'\ufe0f' + | '\ufe20'..'\ufe23' + | '\ufe33'..'\ufe34' + | '\ufe4d'..'\ufe4f' + | '\ufe69' + | '\ufe70'..'\ufe74' + | '\ufe76'..'\ufefc' + | '\ufeff' + | '\uff04' + | '\uff10'..'\uff19' + | '\uff21'..'\uff3a' + | '\uff3f' + | '\uff41'..'\uff5a' + | '\uff65'..'\uffbe' + | '\uffc2'..'\uffc7' + | '\uffca'..'\uffcf' + | '\uffd2'..'\uffd7' + | '\uffda'..'\uffdc' + | '\uffe0'..'\uffe1' + | '\uffe5'..'\uffe6' + | '\ufff9'..'\ufffb' + | ('\ud800'..'\udbff') ('\udc00'..'\udfff') + ; + +