JFlexTokenizer.java revision 1461
 * The contents of this file are subject to the terms of the
 * Common Development and Distribution License (the "License").
 * You may not use this file except in compliance with the License.
 * language governing permissions and limitations under the License.
 * When distributing Covered Code, include this CDDL HEADER in each
 * If applicable, add the following below this CDDL HEADER, with the
 * fields enclosed by brackets "[]" replaced with your own identifying
 * information: Portions Copyright [yyyy] [name of copyright owner]

 * Copyright (c) 2009, 2010, Oracle and/or its affiliates. All rights reserved.

 * This class was created because of Lucene:
 * <li>2.4.1 update which introduced char[] in Tokens instead of String</li>
 * <li>3.0.0 uses AttributeSource instead of Tokens to make things even easier :-D</li>
 * <li>3.5.0 uses CharTermAttribute</li>
 * Generally this is a "template" for all new Tokenizers, so be careful when
 * changing it; it will impact almost ALL symbol tokenizers in OpenGrok ...
 * Created on August 24, 2009

 /** Stack to remember the order of relevant states for the current parser. */

 * Run the scanner to get the next token from the input.

 * Closes the current input stream, and resets the scanner to read from the
 * given input stream. All internal variables are reset, the old input
 * stream cannot be reused (content of the internal buffer is discarded and
 * lost). The lexical state is set to {@code YY_INITIAL}.
 * @param reader the new input stream to operate on.*/

 * Closes the input stream in use. All subsequent calls to the scanning
 * method will return the end of file value.

 * Enter the given lexical state.
 * @param newState state to enter

 * Get the current lexical state of the scanner.
 * Create a new tokenizer using the given stream.
 * @param input input to process. Might be {@code null}.

 * Reinitialize the tokenizer with new contents.
 * @param contents a char buffer with text to tokenize
 * @param length the number of characters to use from the char buffer

 * Close the scanner including the input stream in use.

 /** term text of the current Token */
 /** start and end character offset of the current Token */
 /** position of the current Token relative to the previous Token within the

 * Go forward to the next available token.
 * @return {@code false} if no more tokens available.
 * @throws java.io.IOException

 * Reset the attributes for the current Token.
 * NOTE: For now PositionIncrement gets automatically set to {@code 1}.
 * @param str Token text to set.
 * @param start the start position of the current Token
 * @param end the end position of the current Token
 //FIXME increasing below by one(default) might be tricky, need more analysis
 // after lucene upgrade to 3.5 below is most probably not even needed

 * Push the current state to the state order stack and enter the given state.
 * @param newState new state to enter.

 * Pop the last entry from the state order stack and enter it.
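
The last two comments describe a stack of lexical states: push saves the current state and enters a new one, and pop returns to the most recently saved state. The following is a minimal, self-contained sketch of that idea and not the code from this file. The class name `StateStackSketch`, the field names, the initial state value `0`, and the method names `yypush`/`yypop` are assumptions made for illustration; `yybegin` and `yystate` follow the naming used by JFlex-generated scanners.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical stand-in for a scanner that remembers nested lexical states. */
class StateStackSketch {
    /** Assumed initial lexical state; the real value comes from the generated scanner. */
    private static final int INITIAL_STATE = 0;

    /** States saved by yypush, most recent first. */
    private final Deque<Integer> stack = new ArrayDeque<>();

    /** The lexical state the scanner is currently in. */
    private int state = INITIAL_STATE;

    /** Enter the given lexical state. */
    void yybegin(int newState) {
        state = newState;
    }

    /** Get the current lexical state. */
    int yystate() {
        return state;
    }

    /** Push the current state to the stack and enter the given state. */
    void yypush(int newState) {
        stack.push(yystate());
        yybegin(newState);
    }

    /** Pop the last saved state from the stack and enter it. */
    void yypop() {
        // Throws java.util.NoSuchElementException if nothing was pushed.
        yybegin(stack.pop());
    }
}
```

With this sketch, calling `yypush(3)` and then `yypush(7)` leaves the scanner in state 7; each `yypop()` then steps back, first to 3 and then to the initial state. A grammar would typically push when entering a nested construct (such as a string or comment) and pop when leaving it.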