/*
 * Copyright (c) 1994, 2004, Oracle and/or its affiliates. All rights reserved.
 * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
 *
 * This code is free software; you can redistribute it and/or modify it
 * under the terms of the GNU General Public License version 2 only, as
 * published by the Free Software Foundation. Oracle designates this
 * particular file as subject to the "Classpath" exception as provided
 * by Oracle in the LICENSE file that accompanied this code.
 *
 * This code is distributed in the hope that it will be useful, but WITHOUT
 * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
 * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
 * version 2 for more details (a copy is included in the LICENSE file that
 * accompanied this code).
 *
 * You should have received a copy of the GNU General Public License version
 * 2 along with this work; if not, write to the Free Software Foundation,
 * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA.
 *
 * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA
 * or visit www.oracle.com if you need additional information or have any
 * questions.
 */

/**
 * A Scanner for Java tokens. Errors are reported
 * to the environment object.<p>
 *
 * The scanner keeps track of the current token,
 * the value of the current token (if any), and the start
 * position of the current token.<p>
 *
 * The scan() method advances the scanner to the next
 * token in the input.<p>
 *
 * The match() method is used to quickly match opening
 * brackets (i.e. '(', '{', or '[') with their closing
 * counterpart. This is useful during error recovery.<p>
 *
 * A position consists of: ((linenr << WHEREOFFSETBITS) | offset).
 * This means that both the line number and the exact offset into
 * the file are encoded in each position value.<p>
 *
 * The compiler treats either "\n", "\r" or "\r\n" as the
 * end of a line.<p>
 *
 * WARNING: The contents of this source file are not part of any
 * supported API. Code that depends on them does so at its own risk:
 * they are subject to change or removal without notice.
 *
 * @author Arthur van Hoff
 */

    /**
     * The increment for each character.
     */

    /**
     * The increment for each line.
     */

    public static final int EOF = -1;
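    // Illustration only, not part of the original file: a minimal sketch of
    // the position encoding described in the class comment,
    // ((linenr << WHEREOFFSETBITS) | offset). The class PositionSketch, its
    // method names, and the 32-bit offset width are assumptions made for this
    // example; the scanner's real WHEREOFFSETBITS constant is defined
    // elsewhere and may have a different value.

```java
public class PositionSketch {
    // Assumed width of the offset field (hypothetical value for this sketch).
    static final int WHERE_OFFSET_BITS = 32;
    static final long OFFSET_MASK = (1L << WHERE_OFFSET_BITS) - 1;

    // Pack a line number and a file offset into a single long.
    static long encode(long line, long offset) {
        return (line << WHERE_OFFSET_BITS) | (offset & OFFSET_MASK);
    }

    // Recover the line number from the high bits.
    static long lineOf(long pos) {
        return pos >>> WHERE_OFFSET_BITS;
    }

    // Recover the file offset from the low bits.
    static long offsetOf(long pos) {
        return pos & OFFSET_MASK;
    }

    public static void main(String[] args) {
        long pos = encode(12, 345);
        System.out.println(lineOf(pos) + ":" + offsetOf(pos)); // prints "12:345"
    }
}
```

    // Packing both fields into one value lets a token position be stored and
    // passed around as a single primitive while still carrying the line
    // number and the offset.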
    /**
     * Where errors are reported
     */

    /**
     * If true, present all comments as tokens.
     * Contents are not saved, but positions are recorded accurately,
     * so the comment can be recovered from the text.
     * Line terminations are also returned as comment tokens,
     * and may be distinguished by their start and end positions,
     * which are equal (meaning, these tokens contain no chars).
     */

    /**
     * The position of the current token
     */

    /**
     * The position of the previous token
     */

    /**
     * The current character
     */

    public int radix;   // Radix, when reading int or long

    /**
     * A doc comment preceding the most recent token
     */

    /**
     * A growable character buffer.
     */

    // The following two methods have been hand-inlined in
    // scanDocComment. If you make changes here, you should
    // check to see if scanDocComment also needs modification.

    /**
     * Create a scanner to scan an input stream.
     */

    /**
     * Set up input from the given input stream,
     * and scan the first token from it.
     */

    /**
     * Create a scanner to scan an input stream.
     */
        // Expect the subclass to call useInputStream at the right time.

    /**
     * Initialized keyword and token Hashtables
     */
        // Statement keywords
        // Type keywords
        // Expression keywords
        // Declaration keywords
        // Modifier keywords
        // reserved keywords

    /**
     * Scan a comment. This method should be
     * called once the initial /, * and the next
     * character have been read.
     */

    /**
     * Scan a doc comment. This method should be called
     * once the initial /, * and * have been read. It gathers
     * the content of the comment (without leading spaces and '*'s)
     * in the string buffer.
     */
        // Note: this method has been hand-optimized to yield
        // better performance. This was done after it was noted
        // that javadoc spent a great deal of its time here.
        // This should also help the performance of the compiler
        // as well -- it scans the doc comments to find
        // @deprecated tags.
        //
        // The logic of the method has been completely rewritten
        // to avoid the use of flags that need to be looked at
        // for every character read. Members that are accessed
        // more than once have been stored in local variables.
        // The methods putc() and bufferString() have been
        // inlined by hand. Extra cases have been added to
        // switch statements to trick the compiler into generating
        // a tableswitch instead of a lookupswitch.
        //
        // This implementation aims to preserve the previous
        // behavior of this method.
        // Put `in' in a local variable.

        // We maintain the buffer locally rather than calling putc().

        // We are called pointing at the second star of the doc
        // Input: /** the rest of the comment ... */
        // We rely on this in the code below.

        // Consume any number of stars.

        // Is the comment of the form /**/, /***/, /****/, etc.?

        // Set ch and return

        // Skip a newline on the first line of the comment.

        // The outerLoop processes the doc comment, looping once
        // for each line. For each line, it first strips off
        // whitespace, then it consumes any stars, then it
        // puts the rest of the line into our buffer.

        // The wsLoop consumes whitespace from the beginning

        // We could check for other forms of whitespace
        // as well, but this is left as is for minimum
        // disturbance of functionality.

        // Just skip whitespace.

        // We have added extra cases here to trick the
        // compiler into using a tableswitch instead of
        // a lookupswitch. They can be removed without
        // a change in meaning.
        case 10: case 11: case 12: case 13: case 14: case 15:
        case 16: case 17: case 18: case 19: case 20: case 21:
        case 22: case 23: case 24: case 25: case 26: case 27:
        case 28: case 29: case 30: case 31:
        // We've seen something that isn't whitespace,

        // Are there stars here? If so, consume them all
        // and check for the end of comment.

        // Skip all of the stars...

        // ...then check for the closing slash.

        // We're done with the doc comment.
        // Set ch and break out.

        // The textLoop processes the rest of the characters
        // on the line, adding them to our buffer.

        // We've seen a premature EOF. Break out

        // Is this just a star? Or is this the
        // end of a comment?

        // This is the end of the comment,
        // set ch and return our buffer.

        // This is just an ordinary star. Add it to

        // We've seen a newline. Add it to our
        // buffer and break out of this loop,
        // starting fresh on a new line.

        // Again, the extra cases here are a trick
        // to get the compiler to generate a tableswitch.
        case 0: case 1: case 2: case 3: case 4: case 5:
        case 6: case 7: case 8: case 11: case 12: case 13:
        case 14: case 15: case 16: case 17: case 18: case 19:
        case 20: case 21: case 22: case 23: case 24: case 25:
        case 26: case 27: case 28: case 29: case 30: case 31:
        case 32: case 33: case 34: case 35: case 36: case 37:
        case 38: case 39: case 40:
        // Add the character to our buffer.

        // We have scanned our doc comment. It is stored in
        // buffer. The previous implementation of scanDocComment
        // stripped off all trailing spaces and stars from the comment.
        // We will do this as well, so as to cause a minimum of
        // disturbance. Is this what we want?

        // And again, the extra cases here are a trick
        // to get the compiler to generate a tableswitch.
        case 0: case 1: case 2: case 3: case 4: case 5:
        case 6: case 7: case 8: case 10: case 11: case 12:
        case 13: case 14: case 15: case 16: case 17: case 18:
        case 19: case 20: case 21: case 22: case 23: case 24:
        case 25: case 26: case 27: case 28: case 29: case 30:
        case 31: case 33: case 34: case 35: case 36: case 37:
        case 38: case 39: case 40:
        // Return the text of the doc comment.

    /**
     * Scan a number. The first digit of the number should be the current
     * character. We may be scanning hex, decimal, or octal at this point
     */
        boolean seenDigit = false; // used to detect invalid hex number 0xL

        // We can't yet throw an error if reading an octal. We might
        // discover we're really reading a real.
        case '0': case '1': case '2': case '3':
        case '4': case '5': case '6': case '7':

        case 'd': case 'D': case 'e': case 'E': case 'f': case 'F':

        case 'a': case 'A': case 'b': case 'B': case 'c': case 'C':
        // If the first character is a '0' and this is the second
        // letter, then read in a hexadecimal number. Otherwise, error.

        // We'll get an illegal character error.

        // We have just finished reading the number. The next thing better
        // not be a letter or digit.
        // Note: There will be deprecation warnings against these uses
        // of Character.isJavaLetterOrDigit and Character.isJavaLetter.
        // Do not fix them yet; allow the compiler to run on pre-JDK1.1 VMs.

        // A bogus octal literal.

        // A hex literal with no digits, 0xL, for example.

        // Check for overflow. Note that base 10 literals
        // have different rules than base 8 and 16.

        // Give a specific error message which tells
        // the user the range.

        // Give a specific error message which tells
        // the user the range.

    /**
     * Scan a float. We are either looking at the decimal, or we have already
     * seen it and put it into the buffer. We haven't seen an exponent.
     */

    /**
     * Scan a float. Should be called when the current character is either
     * the 'e', 'E' or '.'
     */
        case '0': case '1': case '2': case '3': case '4':
        case '5': case '6': case '7': case '8': case '9':
        // We have just finished reading the number. The next thing better
        // not be a letter or digit.

        // We have a token that parses as a number. Is this token possibly zero?
        // i.e. does it have a non-zero value in the mantissa?
        case '1': case '2': case '3': case '4': case '5':
        case '6': case '7': case '8': case '9':

        case 'e': case 'E': case 'f': case 'F':
    /**
     * Scan an escape character.
     * @return the character or -1 if it escaped an
     */
        case '0': case '1': case '2': case '3':
        case '4': case '5': case '6': case '7': {
            for (int i = 2 ; i > 0 ; i--) {
                case '0': case '1': case '2': case '3':
                case '4': case '5': case '6': case '7':
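    // Illustration only, not part of the original file: a minimal sketch of
    // how an octal escape of up to three digits can be accumulated, in the
    // spirit of the "for (int i = 2 ; i > 0 ; i--)" loop above. The class
    // OctalEscapeSketch and the method scanOctal are hypothetical names, the
    // input is a String rather than the scanner's input stream, and the
    // caller is assumed to have checked that the first character is an octal
    // digit. Error reporting for out-of-range values is left out.

```java
public class OctalEscapeSketch {
    /**
     * Reads an octal escape whose first digit is at s.charAt(start)
     * (the character right after the backslash).
     * Returns a two-element array: {value, index of the next unread char}.
     */
    static int[] scanOctal(String s, int start) {
        int n = s.charAt(start) - '0';
        int next = start + 1;
        // After the first digit, accept at most two more octal digits.
        for (int i = 2; i > 0 && next < s.length(); i--) {
            char c = s.charAt(next);
            if (c < '0' || c > '7') {
                break; // not an octal digit: the escape ends here
            }
            n = (n << 3) + (c - '0');
            next++;
        }
        return new int[] { n, next };
    }

    public static void main(String[] args) {
        int[] r = scanOctal("101x", 0);
        System.out.println(r[0] + " " + r[1]); // prints "65 3"
    }
}
```

    // Three octal digits can encode values up to 0777, so a caller that
    // needs a character in the range 0..0xFF has to check the result.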
    /**
     * Scan a string. The current character
     * should be the opening " of the string.
     */

    /**
     * Scan a character. The current character should be
     * the opening ' of the character constant.
     */
        // There are two standard problems this case deals with. One
        // is the malformed single quote constant (i.e. the programmer
        // uses ''' instead of '\'') and the other is the empty
        // character constant (i.e. ''). Just consume any number of
        // single quotes and emit an error message.

    /**
     * Scan an Identifier. The current character should
     * be the first character of the identifier.
     */
        case 'a': case 'b': case 'c': case 'd': case 'e':
        case 'f': case 'g': case 'h': case 'i': case 'j':
        case 'k': case 'l': case 'm': case 'n': case 'o':
        case 'p': case 'q': case 'r': case 's': case 't':
        case 'u': case 'v': case 'w': case 'x': case 'y':
        case 'A': case 'B': case 'C': case 'D': case 'E':
        case 'F': case 'G': case 'H': case 'I': case 'J':
        case 'K': case 'L': case 'M': case 'N': case 'O':
        case 'P': case 'Q': case 'R': case 'S': case 'T':
        case 'U': case 'V': case 'W': case 'X': case 'Y':
        case '0': case '1': case '2': case '3': case '4':
        case '5': case '6': case '7': case '8': case '9':
    /**
     * The ending position of the current token
     */
    // Note: This should be part of the pos itself.

    /**
     * If the current token is IDENT, return the identifier occurrence.
     * It will be freshly allocated.
     */

    /**
     * Scan the next token.
     * @return the position of the previous token.
     */
        // Avoid this path the next time around.
        // Do not just call in.read; we want to present
        // a null token (and also avoid read-ahead).

        // Parse a // comment

        case '0': case '1': case '2': case '3': case '4':
        case '5': case '6': case '7': case '8': case '9':

        case '0': case '1': case '2': case '3': case '4':
        case '5': case '6': case '7': case '8': case '9':

        case 'a': case 'b': case 'c': case 'd': case 'e': case 'f':
        case 'g': case 'h': case 'i': case 'j': case 'k': case 'l':
        case 'm': case 'n': case 'o': case 'p': case 'q': case 'r':
        case 's': case 't': case 'u': case 'v': case 'w': case 'x':
        case 'A': case 'B': case 'C': case 'D': case 'E': case 'F':
        case 'G': case 'H': case 'I': case 'J': case 'K': case 'L':
        case 'M': case 'N': case 'O': case 'P': case 'Q': case 'R':
        case 'S': case 'T': case 'U': case 'V': case 'W': case 'X':
        // Our one concession to DOS.

     * Scan to a matching '}', ']' or ')'. The current token must be
     * a '{', '[' or '(';