Scanner.java revision 3471
 * Copyright (c) 2003, 2011, Oracle and/or its affiliates. All rights reserved.
 * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
 *
 * This code is free software; you can redistribute it and/or modify it
 * under the terms of the GNU General Public License version 2 only, as
 * published by the Free Software Foundation. Oracle designates this
 * particular file as subject to the "Classpath" exception as provided
 * by Oracle in the LICENSE file that accompanied this code.
 *
 * This code is distributed in the hope that it will be useful, but WITHOUT
 * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
 * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
 * version 2 for more details (a copy is included in the LICENSE file that
 * accompanied this code).
 *
 * You should have received a copy of the GNU General Public License version
 * 2 along with this work; if not, write to the Free Software Foundation,
 * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA.
 *
 * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA
 * or visit www.oracle.com if you need additional information or have any
 * questions.

 * A simple text scanner which can parse primitive types and strings using
 * regular expressions.
 *
 * <p>A <code>Scanner</code> breaks its input into tokens using a
 * delimiter pattern, which by default matches whitespace. The resulting
 * tokens may then be converted into values of different types using the
 * various <tt>next</tt> methods.
 *
 * <p>For example, this code allows a user to read a number from
 * <tt>System.in</tt>:
 * <blockquote><pre>
 *     Scanner sc = new Scanner(System.in);
 *     int i = sc.nextInt();
 * </pre></blockquote>
 *
 * <p>As another example, this code allows <code>long</code> types to be
 * assigned from entries in a file <code>myNumbers</code>:
 * <blockquote><pre>
 *     Scanner sc = new Scanner(new File("myNumbers"));
 *     while (sc.hasNextLong()) {
 *         long aLong = sc.nextLong();
 *     }
 * </pre></blockquote>
 *
 * <p>The scanner can also use delimiters other than whitespace. This
 * example reads several items in from a string:
 * <blockquote><pre>
 *     String input = "1 fish 2 fish red fish blue fish";
 *     Scanner s = new Scanner(input).useDelimiter("\\s*fish\\s*");
 *     System.out.println(s.nextInt());
 *     System.out.println(s.nextInt());
 *     System.out.println(s.next());
 *     System.out.println(s.next());
 *     s.close();
 * </pre></blockquote>
 * prints the following output:
 * <blockquote><pre>
 *     1
 *     2
 *     red
 *     blue
 * </pre></blockquote>
 *
 * <p>The same output can be generated with this code, which uses a regular
 * expression to parse all four tokens at once:
 * <blockquote><pre>
 *     String input = "1 fish 2 fish red fish blue fish";
 *     Scanner s = new Scanner(input);
 *     s.findInLine("(\\d+) fish (\\d+) fish (\\w+) fish (\\w+)");
 *     MatchResult result = s.match();
 *     for (int i=1; i<=result.groupCount(); i++)
 *         System.out.println(result.group(i));
 *     s.close();
 * </pre></blockquote>
 *
 * <p>The <a name="default-delimiter">default whitespace delimiter</a> used
 * by a scanner is as recognized by {@link java.lang.Character}.{@link
 * java.lang.Character#isWhitespace(char) isWhitespace}. The {@link #reset}
 * method will reset the value of the scanner's delimiter to the default
 * whitespace delimiter regardless of whether it was previously changed.
 *
 * <p>A scanning operation may block waiting for input.
 *
 * <p>The {@link #next} and {@link #hasNext} methods and their
 * primitive-type companion methods (such as {@link #nextInt} and
 * {@link #hasNextInt}) first skip any input that matches the delimiter
 * pattern, and then attempt to return the next token. Both the
 * <tt>hasNext</tt> and <tt>next</tt> methods may block waiting for further
 * input. Whether a <tt>hasNext</tt> method blocks has no connection to
 * whether or not its associated <tt>next</tt> method will block.
 *
 * <p> The {@link #findInLine}, {@link #findWithinHorizon}, and {@link #skip}
 * methods operate independently of the delimiter pattern. These methods will
 * attempt to match the specified pattern with no regard to delimiters in the
 * input and thus can be used in special circumstances where delimiters are
 * not relevant. These methods may block waiting for more input.
 *
 * <p>When a scanner throws an {@link InputMismatchException}, the scanner
 * will not pass the token that caused the exception, so that it may be
 * retrieved or skipped via some other method.
 *
 * <p>Depending upon the type of delimiting pattern, empty tokens may be
 * returned. For example, the pattern <tt>"\\s+"</tt> will return no empty
 * tokens since it matches multiple instances of the delimiter. The delimiting
 * pattern <tt>"\\s"</tt> could return empty tokens since it only passes one
 * space at a time.
 *
 * <p> A scanner can read text from any object which implements the {@link
 * java.lang.Readable} interface. If an invocation of the underlying
 * readable's {@link java.lang.Readable#read} method throws an {@link
 * java.io.IOException} then the scanner assumes that the end of the input
 * has been reached. The most recent <tt>IOException</tt> thrown by the
 * underlying readable can be retrieved via the {@link #ioException} method.
 *
 * <p>When a <code>Scanner</code> is closed, it will close its input source
 * if the source implements the {@link java.io.Closeable} interface.
 *
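The two behaviours described above (empty tokens from a single-character delimiter, and the offending token staying available after an `InputMismatchException`) can be illustrated with a small sketch. The class name `ScannerTokenSketch` and its helper methods are hypothetical and exist only for this illustration; they are not part of `Scanner`.

```java
import java.util.ArrayList;
import java.util.InputMismatchException;
import java.util.List;
import java.util.Scanner;

// Hypothetical demo class; not part of java.util.Scanner.
class ScannerTokenSketch {

    // Collects every token the scanner yields for the given delimiter regex.
    static List<String> tokens(String input, String delimiterRegex) {
        List<String> out = new ArrayList<>();
        try (Scanner sc = new Scanner(input).useDelimiter(delimiterRegex)) {
            while (sc.hasNext()) {
                out.add(sc.next());
            }
        }
        return out;
    }

    // Tries nextInt(); on a mismatch the offending token is still pending,
    // so next() retrieves it instead of losing it.
    static String readIntOrRaw(Scanner sc) {
        try {
            return "int:" + sc.nextInt();
        } catch (InputMismatchException e) {
            return "raw:" + sc.next();
        }
    }

    public static void main(String[] args) {
        // "\\s+" swallows both spaces as one delimiter match.
        System.out.println(tokens("a  b", "\\s+"));
        // "\\s" treats each space as its own delimiter, leaving an empty token.
        System.out.println(tokens("a  b", "\\s"));
        try (Scanner sc = new Scanner("abc 42")) {
            System.out.println(readIntOrRaw(sc));
            System.out.println(readIntOrRaw(sc));
        }
    }
}
```

Catching the exception and then calling `next()` is only one way to recover; skipping the token with `skip` or `nextLine` would serve equally well, depending on the input format.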
 * <p>A <code>Scanner</code> is not safe for multithreaded use without
 * external synchronization.
 *
 * <p>Unless otherwise mentioned, passing a <code>null</code> parameter into
 * any method of a <code>Scanner</code> will cause a
 * <code>NullPointerException</code> to be thrown.
 *
 * <p>A scanner will default to interpreting numbers as decimal unless a
 * different radix has been set by using the {@link #useRadix} method. The
 * {@link #reset} method will reset the value of the scanner's radix to
 * <code>10</code> regardless of whether it was previously changed.
 *
 * <a name="localized-numbers">
 * <h4> Localized numbers </h4>
 *
 * <p> An instance of this class is capable of scanning numbers in the standard
 * formats as well as in the formats of the scanner's locale. A scanner's
 * <a name="initial-locale">initial locale </a>is the value returned by the {@link
 * java.util.Locale#getDefault} method; it may be changed via the {@link
 * #useLocale} method. The {@link #reset} method will reset the value of the
 * scanner's locale to the initial locale regardless of whether it was
 * previously changed.
 *
 * <p>The localized formats are defined in terms of the following parameters,
 * which for a particular locale are taken from that locale's {@link
 * java.text.DecimalFormat DecimalFormat} object, <tt>df</tt>, and its
 * {@link java.text.DecimalFormatSymbols DecimalFormatSymbols} object,
 * <tt>dfs</tt>.
 *
 * <tr><td valign="top"><i>LocalGroupSeparator </i></td>
 *     <td valign="top">The character used to separate thousands groups,
 *     <i>i.e.,</i> <tt>dfs.</tt>{@link
 *     java.text.DecimalFormatSymbols#getGroupingSeparator
 *     getGroupingSeparator()}</td></tr>
 * <tr><td valign="top"><i>LocalDecimalSeparator </i></td>
 *     <td valign="top">The character used for the decimal point,
 *     <i>i.e.,</i> <tt>dfs.</tt>{@link
 *     java.text.DecimalFormatSymbols#getDecimalSeparator
 *     getDecimalSeparator()}</td></tr>
 * <tr><td valign="top"><i>LocalPositivePrefix </i></td>
 *     <td valign="top">The string that appears before a positive number (may
 *     be empty),
<i>i.e.,</i> <tt>df.</tt>{@link * java.text.DecimalFormat#getPositivePrefix * getPositivePrefix()}</td></tr> * <tr><td valign="top"><i>LocalPositiveSuffix </i></td> * <td valign="top">The string that appears after a positive number (may be * empty), <i>i.e.,</i> <tt>df.</tt>{@link * java.text.DecimalFormat#getPositiveSuffix * getPositiveSuffix()}</td></tr> * <tr><td valign="top"><i>LocalNegativePrefix </i></td> * <td valign="top">The string that appears before a negative number (may * be empty), <i>i.e.,</i> <tt>df.</tt>{@link * java.text.DecimalFormat#getNegativePrefix * getNegativePrefix()}</td></tr> * <tr><td valign="top"><i>LocalNegativeSuffix </i></td> * <td valign="top">The string that appears after a negative number (may be * empty), <i>i.e.,</i> <tt>df.</tt>{@link * java.text.DecimalFormat#getNegativeSuffix * getNegativeSuffix()}</td></tr> * <tr><td valign="top"><i>LocalNaN </i></td> * <td valign="top">The string that represents not-a-number for * <i>i.e.,</i> <tt>dfs.</tt>{@link * java.text.DecimalFormatSymbols#getNaN * <tr><td valign="top"><i>LocalInfinity </i></td> * <td valign="top">The string that represents infinity for floating-point * values, <i>i.e.,</i> <tt>dfs.</tt>{@link * java.text.DecimalFormatSymbols#getInfinity * getInfinity()}</td></tr> * <a name="number-syntax"> * <h4> Number syntax </h4> * <p> The strings that can be parsed as numbers by an instance of this class * are specified in terms of the following regular-expression grammar, where * Rmax is the highest digit in the radix being used (for example, Rmax is 9 * <table cellspacing=0 cellpadding=0 align=center> * <tr><td valign=top align=right><i>NonASCIIDigit</i> ::</td> * <td valign=top>= A non-ASCII character c for which * {@link java.lang.Character#isDigit Character.isDigit}<tt>(c)</tt> * returns true</td></tr> * <tr><td> </td></tr> * <tr><td align=right><i>Non0Digit</i> ::</td> * <td><tt>= [1-</tt><i>Rmax</i><tt>] | </tt><i>NonASCIIDigit</i></td></tr> * <tr><td> </td></tr> * <tr><td 
align=right><i>Digit</i> ::</td> * <td><tt>= [0-</tt><i>Rmax</i><tt>] | </tt><i>NonASCIIDigit</i></td></tr> * <tr><td> </td></tr> * <tr><td valign=top align=right><i>GroupedNumeral</i> ::</td> * <table cellpadding=0 cellspacing=0> * <tr><td><tt>= ( </tt></td> * <td><i>Non0Digit</i><tt> * </tt><i>Digit</i><tt>?</tt></td></tr> * <td><tt>( </tt><i>LocalGroupSeparator</i><tt> * </tt><i>Digit</i><tt> )+ )</tt></td></tr> * <tr><td> </td></tr> * <tr><td align=right><i>Numeral</i> ::</td> * <td><tt>= ( ( </tt><i>Digit</i><tt>+ ) * | </tt><i>GroupedNumeral</i><tt> )</tt></td></tr> * <tr><td> </td></tr> * <tr><td valign=top align=right> * <a name="Integer-regex"><i>Integer</i> ::</td> * <td valign=top><tt>= ( [-+]? ( </tt><i>Numeral</i><tt> * <td><tt>| </tt><i>LocalPositivePrefix</i><tt> </tt><i>Numeral</i><tt> * </tt><i>LocalPositiveSuffix</i></td></tr> * <td><tt>| </tt><i>LocalNegativePrefix</i><tt> </tt><i>Numeral</i><tt> * </tt><i>LocalNegativeSuffix</i></td></tr> * <tr><td> </td></tr> * <tr><td align=right><i>DecimalNumeral</i> ::</td> * <td><tt>= </tt><i>Numeral</i></td></tr> * <td><tt>| </tt><i>Numeral</i><tt> * </tt><i>LocalDecimalSeparator</i><tt> * </tt><i>Digit</i><tt>*</tt></td></tr> * <td><tt>| </tt><i>LocalDecimalSeparator</i><tt> * </tt><i>Digit</i><tt>+</tt></td></tr> * <tr><td> </td></tr> * <tr><td align=right><i>Exponent</i> ::</td> * <td><tt>= ( [eE] [+-]? </tt><i>Digit</i><tt>+ )</tt></td></tr> * <tr><td> </td></tr> * <a name="Decimal-regex"><i>Decimal</i> ::</td> * <td><tt>= ( [-+]? </tt><i>DecimalNumeral</i><tt> * </tt><i>Exponent</i><tt>? )</tt></td></tr> * <td><tt>| </tt><i>LocalPositivePrefix</i><tt> * </tt><i>DecimalNumeral</i><tt> * </tt><i>LocalPositiveSuffix</i> * </tt><i>Exponent</i><tt>?</td></tr> * <td><tt>| </tt><i>LocalNegativePrefix</i><tt> * </tt><i>DecimalNumeral</i><tt> * </tt><i>LocalNegativeSuffix</i> * </tt><i>Exponent</i><tt>?</td></tr> * <tr><td> </td></tr> * <tr><td align=right><i>HexFloat</i> ::</td> * <td><tt>= [-+]? 
0[xX][0-9a-fA-F]*\.[0-9a-fA-F]+ * ([pP][-+]?[0-9]+)?</tt></td></tr> * <tr><td> </td></tr> * <tr><td align=right><i>NonNumber</i> ::</td> * <td valign=top><tt>= NaN * | </tt><i>LocalNan</i><tt> * | </tt><i>LocalInfinity</i></td></tr> * <tr><td> </td></tr> * <tr><td align=right><i>SignedNonNumber</i> ::</td> * <td><tt>= ( [-+]? </tt><i>NonNumber</i><tt> )</tt></td></tr> * <td><tt>| </tt><i>LocalPositivePrefix</i><tt> * </tt><i>NonNumber</i><tt> * </tt><i>LocalPositiveSuffix</i></td></tr> * <td><tt>| </tt><i>LocalNegativePrefix</i><tt> * </tt><i>NonNumber</i><tt> * </tt><i>LocalNegativeSuffix</i></td></tr> * <tr><td> </td></tr> * <tr><td valign=top align=right> * <a name="Float-regex"><i>Float</i> ::</td> * <td valign=top><tt>= </tt><i>Decimal</i><tt></td></tr> * <td><tt>| </tt><i>HexFloat</i><tt></td></tr> * <td><tt>| </tt><i>SignedNonNumber</i><tt></td></tr> * <p> Whitespace is not significant in the above regular expressions. // Internal buffer used to hold input // Size of internal character buffer private static final int BUFFER_SIZE =
1024; // change to 1024;

// The index into the buffer currently held by the Scanner
// Internal matcher used for finding delimiters
// Pattern used to delimit tokens
// Pattern found in last hasNext operation
// Position after last hasNext operation
// Result after last hasNext operation
// Boolean is true if source is done
// Boolean indicating more input is required
// Boolean indicating if a delim has been skipped this operation
// A store of a position that the scanner may fall back to
// A cache of the last primitive type scanned
// Boolean indicating if a match result is available
// Boolean indicating if this scanner has been closed
private boolean closed = false;

// The current radix used by this scanner
// The default radix for this scanner
// The locale used by this scanner
// A cache of the last few recently used Patterns
// A holder of the last IOException encountered
// A pattern for java whitespace
// A pattern for any token
// A pattern for non-ASCII digits
"[\\p{javaDigit}&&[^0-9]]");

// Fields and methods to support scanning primitive types

 * Locale dependent values used to scan numbers

 * Fields and an accessor method to match booleans

 * Fields and methods to match bytes, shorts, ints, and longs
private String digits = "0123456789abcdefghijklmnopqrstuvwxyz";

// \\p{javaDigit} is not guaranteed to be appropriate
// here but what can we do? The final authority will be
// whatever parse method is invoked, so ultimately the
// Scanner will do the right thing

// digit++ is the possessive form which is necessary for reducing
// backtracking that would otherwise cause unacceptable performance

 * Fields and an accessor method to match line separators
"\r\n|[\n\r\u2028\u2029\u0085]";

 * Fields and methods to match floats and doubles

// \\p{javaDigit} may not be perfect, see above
// Once again digit++ is used for performance, as above
"[-+]?0[xX][0-9a-fA-F]*\\.[0-9a-fA-F]+([pP][-+]?[0-9]+)?";
 * Constructs a <code>Scanner</code> that returns values scanned
 * from the specified source delimited by the specified pattern.
 *
 * @param source A character source implementing the Readable interface
 * @param pattern A delimiting pattern
 * @return A scanner with the specified source and pattern
assert source != null : "source should not be null";
 * Constructs a new <code>Scanner</code> that produces values scanned
 * from the specified source.
 *
 * @param source A character source implementing the {@link Readable}
 *        interface

 * Constructs a new <code>Scanner</code> that produces values scanned
 * from the specified input stream. Bytes from the stream are converted
 * into characters using the underlying platform's
 * {@linkplain java.nio.charset.Charset#defaultCharset() default charset}.
 *
 * @param source An input stream to be scanned

 * Constructs a new <code>Scanner</code> that produces values scanned
 * from the specified input stream. Bytes from the stream are converted
 * into characters using the specified charset.
 *
 * @param source An input stream to be scanned
 * @param charsetName The encoding type used to convert bytes from the
 *        stream into characters to be scanned
 * @throws IllegalArgumentException if the specified character set
 *         does not exist

 * Returns a charset object for the given charset name.
 * @throws NullPointerException if csn is null
 * @throws IllegalArgumentException if the charset is not supported

// IllegalArgumentException should be thrown

 * Constructs a new <code>Scanner</code> that produces values scanned
 * from the specified file. Bytes from the file are converted into
 * characters using the underlying platform's
 * {@linkplain java.nio.charset.Charset#defaultCharset() default charset}.
 *
 * @param source A file to be scanned
 * @throws FileNotFoundException if source is not found

 * Constructs a new <code>Scanner</code> that produces values scanned
 * from the specified file. Bytes from the file are converted into
 * characters using the specified charset.
 *
 * @param source A file to be scanned
 * @param charsetName The encoding type used to convert bytes from the file
 *        into characters to be scanned
 * @throws FileNotFoundException if source is not found
 * @throws IllegalArgumentException if the specified encoding is
 *         not found

 * Constructs a new <code>Scanner</code> that produces values scanned
 * from the specified file.
Bytes from the file are converted into * characters using the underlying platform's * {@linkplain java.nio.charset.Charset#defaultCharset() default charset}. * the path to the file to be scanned * if an I/O error occurs opening source * Constructs a new <code>Scanner</code> that produces values scanned * from the specified file. Bytes from the file are converted into * characters using the specified charset. * the path to the file to be scanned * The encoding type used to convert bytes from the file * into characters to be scanned * if an I/O error occurs opening source * @throws IllegalArgumentException * if the specified encoding is not found * Constructs a new <code>Scanner</code> that produces values scanned * from the specified string. * @param source A string to scan * Constructs a new <code>Scanner</code> that produces values scanned * from the specified channel. Bytes from the source are converted into * characters using the underlying platform's * {@linkplain java.nio.charset.Charset#defaultCharset() default charset}. * @param source A channel to scan * Constructs a new <code>Scanner</code> that produces values scanned * from the specified channel. Bytes from the source are converted into * characters using the specified charset. * @param source A channel to scan * @param charsetName The encoding type used to convert bytes from the * channel into characters to be scanned * @throws IllegalArgumentException if the specified character set // Private primitives used to support scanning // Clears both regular cache and type cache // Also clears both the regular cache and the type cache // Also clears both the regular cache and the type cache // Tries to read more input. May block. 
// Prepare to receive data // Restore current position and limit for reading // After this method is called there will either be an exception // or else there will be space in the buffer // Gain space by compacting buffer // Gain space by growing buffer // be modified appropriately // If we are at the end of input then NoSuchElement; // If there is still input left then InputMismatch // Returns true if a complete token or partial token is in the buffer. // It is not necessary to find a complete token since a partial token // means that there will be another token with or without more input. // If we are sitting at the end, no more tokens in buffer * Returns a "complete token" that matches the specified pattern * A token is complete if surrounded by delims; a partial token * is prefixed by delims but not postfixed by them * The position is advanced to the end of that complete token * Pattern == null means accept any token at all * 1. valid string means it was found * 2. null with needInput=false means we won't ever find it * 3. null with needInput=true means try again after readInput if (!
skipped) {
// Enforcing only one skip of leading delims // If more input could extend the delimiters then we must wait // The delims were whole and the matcher should skip them // If we are sitting at the end, no more tokens in buffer // Must look for next delims. Simply attempting to match the // pattern at this point may find a match but it might not be // the first longest match because of missing input, or it might // match a partial token instead of the whole thing. // Then look for next delims // Zero length delimiter match; we should find the next one // using the automatic advance past a zero length match; // Otherwise we have just found the same one we just skipped // In the rare case that more input could cause the match // to be lost and there is more input coming we must wait // for more input. Note that hitting the end is okay as long // as the match cannot go away. It is the beginning of the // next delims we want to be sure about, we don't care if // they potentially extend further. // There is a complete token. // Must continue with match to provide valid MatchResult // Attempt to match against the desired pattern }
else {
// Complete token but it does not match // If we can't find the next delims but no more input is coming, // then we can treat the remainder as a whole token // Must continue with match to provide valid MatchResult // Last token; Match the pattern here or throw // Last piece does not match // There is a partial token in the buffer; must read more // Finds the specified pattern in the buffer up to horizon. // Returns a match for the specified input pattern. // The match may be longer if didn't hit horizon or real end // Hit an artificial end; try to extend the match // The match could go away depending on what is next // Rare case: we hit the end of input and it happens // that it is at the horizon and the end of input is // required for the match. // Did not hit end, or hit real end, or hit horizon // If there is no specified horizon, or if we have not searched // to the specified horizon yet, get more input // Returns a match for the specified input pattern anchored at // Get more input and try again // Read more to find pattern // Throws if the scanner is closed * <p> If this scanner has not yet been closed then if its underlying * {@linkplain java.lang.Readable readable} also implements the {@link * java.io.Closeable} interface then the readable's <tt>close</tt> method * will be invoked. If this scanner is already closed then invoking this * method will have no effect. * <p>Attempting to perform search operations after a scanner has * been closed will result in an {@link IllegalStateException}. * Returns the <code>IOException</code> last thrown by this * <code>Scanner</code>'s underlying <code>Readable</code>. This method * returns <code>null</code> if no such exception exists. * @return the last exception thrown by this scanner's readable * Returns the <code>Pattern</code> this <code>Scanner</code> is currently * using to match delimiters. * @return this scanner's delimiting pattern. * Sets this scanner's delimiting pattern to the specified pattern. 
* @param pattern A delimiting pattern * Sets this scanner's delimiting pattern to a pattern constructed from * the specified <code>String</code>. * <p> An invocation of this method of the form * <tt>useDelimiter(pattern)</tt> behaves in exactly the same way as the * invocation <tt>useDelimiter(Pattern.compile(pattern))</tt>. * <p> Invoking the {@link #reset} method will set the scanner's delimiter * to the <a href= "#default-delimiter">default</a>. * @param pattern A string specifying a delimiting pattern * Returns this scanner's locale. * <p>A scanner's locale affects many elements of its default * primitive matching regular expressions; see * <a href= "#localized-numbers">localized numbers</a> above. * @return this scanner's locale * Sets this scanner's locale to the specified locale. * <p>A scanner's locale affects many elements of its default * primitive matching regular expressions; see * <a href= "#localized-numbers">localized numbers</a> above. * <p>Invoking the {@link #reset} method will set the scanner's locale to * the <a href= "#initial-locale">initial locale</a>. * @param locale A string specifying the locale to use // These must be literalized to avoid collision with regex // metacharacters such as dot or parenthesis // Quoting the nonzero length locale-specific things // to avoid potential conflict with metacharacters // Force rebuilding and recompilation of locale dependent * Returns this scanner's default radix. * <p>A scanner's radix affects elements of its default * number matching regular expressions; see * <a href= "#localized-numbers">localized numbers</a> above. * @return the default radix of this scanner * Sets this scanner's default radix to the specified radix. * <p>A scanner's radix affects elements of its default * number matching regular expressions; see * <a href= "#localized-numbers">localized numbers</a> above. 
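As a rough illustration of how the locale and radix settings described above affect number scanning, and of how `reset` restores the radix, consider the sketch below. The class `ScannerLocaleRadixSketch` and its methods are hypothetical names chosen for this example; the expected values assume the usual number formats for `Locale.GERMANY` and `Locale.US`.

```java
import java.util.Locale;
import java.util.Scanner;

// Hypothetical demo class; not part of java.util.Scanner.
class ScannerLocaleRadixSketch {

    // Parses one double using the number formats of the given locale.
    static double parseLocalized(String text, Locale locale) {
        try (Scanner sc = new Scanner(text).useLocale(locale)) {
            return sc.nextDouble();
        }
    }

    // Parses one int in the given radix, then reports the radix after reset().
    static int[] parseThenReset(String text, int radix) {
        try (Scanner sc = new Scanner(text).useRadix(radix)) {
            int value = sc.nextInt();
            sc.reset(); // restores the default delimiter, locale, and radix
            return new int[] { value, sc.radix() };
        }
    }

    public static void main(String[] args) {
        // German formats use '.' for grouping and ',' as the decimal separator.
        System.out.println(parseLocalized("1.234,5", Locale.GERMANY));
        // US formats use ',' for grouping and '.' as the decimal separator.
        System.out.println(parseLocalized("1,234.5", Locale.US));
        int[] r = parseThenReset("ff", 16);
        System.out.println(r[0] + " (radix after reset: " + r[1] + ")");
    }
}
```

Setting the locale explicitly, as done here, avoids depending on the platform default returned by `Locale.getDefault`.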
* <p>If the radix is less than <code>Character.MIN_RADIX</code> * or greater than <code>Character.MAX_RADIX</code>, then an * <code>IllegalArgumentException</code> is thrown. * <p>Invoking the {@link #reset} method will set the scanner's radix to * @param radix The radix to use when scanning numbers * @throws IllegalArgumentException if radix is out of range // Force rebuilding and recompilation of radix dependent patterns // The next operation should occur in the specified radix but // the default is left untouched. // Force rebuilding and recompilation of radix dependent patterns * Returns the match result of the last scanning operation performed * by this scanner. This method throws <code>IllegalStateException</code> * if no match has been performed, or if the last match was * <p>The various <code>next</code>methods of <code>Scanner</code> * make a match result available if they complete without throwing an * exception. For instance, after an invocation of the {@link #nextInt} * method that returned an int, this method returns a * <code>MatchResult</code> for the search of the * <a href="#Integer-regex"><i>Integer</i></a> regular expression * defined above. Similarly the {@link #findInLine}, * {@link #findWithinHorizon}, and {@link #skip} methods will make a * match available if they succeed. * @return a match result for the last match operation * @throws IllegalStateException If no match result is available * <p>Returns the string representation of this <code>Scanner</code>. The * string representation of a <code>Scanner</code> contains information * that may be useful for debugging. The exact format is unspecified. * @return The string representation of this scanner * Returns true if this scanner has another token in its input. * This method may block while waiting for input to scan. * The scanner does not advance past any input. 
* @return true if and only if this scanner has another token * @throws IllegalStateException if this scanner is closed * @see java.util.Iterator * Finds and returns the next complete token from this scanner. * A complete token is preceded and followed by input that matches * the delimiter pattern. This method may block while waiting for input * to scan, even if a previous invocation of {@link #hasNext} returned * @throws NoSuchElementException if no more tokens are available * @throws IllegalStateException if this scanner is closed * @see java.util.Iterator * The remove operation is not supported by this implementation of * @throws UnsupportedOperationException if this method is invoked. * @see java.util.Iterator * Returns true if the next token matches the pattern constructed from the * specified string. The scanner does not advance past any input. * <p> An invocation of this method of the form <tt>hasNext(pattern)</tt> * behaves in exactly the same way as the invocation * <tt>hasNext(Pattern.compile(pattern))</tt>. * @param pattern a string specifying the pattern to scan * @return true if and only if this scanner has another token matching * @throws IllegalStateException if this scanner is closed * Returns the next token if it matches the pattern constructed from the * specified string. If the match is successful, the scanner advances * past the input that matched the pattern. * <p> An invocation of this method of the form <tt>next(pattern)</tt> * behaves in exactly the same way as the invocation * <tt>next(Pattern.compile(pattern))</tt>. * @param pattern a string specifying the pattern to scan * @throws NoSuchElementException if no such tokens are available * @throws IllegalStateException if this scanner is closed * Returns true if the next complete token matches the specified pattern. * A complete token is prefixed and postfixed by input that matches * the delimiter pattern. This method may block while waiting for input. 
* The scanner does not advance past any input. * @param pattern the pattern to scan for * @return true if and only if this scanner has another token matching * @throws IllegalStateException if this scanner is closed * Returns the next token if it matches the specified pattern. This * method may block while waiting for input to scan, even if a previous * invocation of {@link #hasNext(Pattern)} returned <code>true</code>. * If the match is successful, the scanner advances past the input that * @param pattern the pattern to scan for * @throws NoSuchElementException if no more tokens are available * @throws IllegalStateException if this scanner is closed // Did we already find this pattern? // Search for the pattern * Returns true if there is another line in the input of this scanner. * This method may block while waiting for input. The scanner does not * advance past any input. * @return true if and only if this scanner has another line of input * @throws IllegalStateException if this scanner is closed * Advances this scanner past the current line and returns the input * This method returns the rest of the current line, excluding any line * separator at the end. The position is set to the beginning of the next * <p>Since this method continues to search through the input looking * for a line separator, it may buffer all of the input searching for * the line to skip if no line separators are present. * @return the line that was skipped * @throws NoSuchElementException if no line was found * @throws IllegalStateException if this scanner is closed // Public methods that ignore delimiters * Attempts to find the next occurrence of a pattern constructed from the * specified string, ignoring delimiters. * <p>An invocation of this method of the form <tt>findInLine(pattern)</tt> * behaves in exactly the same way as the invocation * <tt>findInLine(Pattern.compile(pattern))</tt>. 
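A minimal sketch of `findInLine` in use may help here: it searches only the current line, ignores delimiters, and makes a match result available through `match()` on success. The class `FindInLineSketch`, its `demo` method, and the sample input are assumptions made for this illustration.

```java
import java.util.Scanner;
import java.util.regex.MatchResult;

// Hypothetical demo class; not part of java.util.Scanner.
class FindInLineSketch {

    // Returns { matched text, capture group 2, result of a failed search,
    // remainder of the current line }.
    static String[] demo(String input) {
        try (Scanner sc = new Scanner(input)) {
            // Delimiters are ignored; the regex is matched against the line.
            String hit = sc.findInLine("(\\w+)=(\\d+)");
            MatchResult m = sc.match();
            String digits = m.group(2);
            // "second" occurs only on the next line, so this returns null
            // and leaves the scanner where it was.
            String miss = sc.findInLine("second");
            String rest = sc.nextLine();
            return new String[] { hit, digits, miss, rest };
        }
    }

    public static void main(String[] args) {
        String[] r = demo("key=42 rest\nsecond line");
        for (String s : r) {
            System.out.println("[" + s + "]");
        }
    }
}
```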
* @param pattern a string specifying the pattern to search for * @return the text that matched the specified pattern * @throws IllegalStateException if this scanner is closed * Attempts to find the next occurrence of the specified pattern ignoring * delimiters. If the pattern is found before the next line separator, the * scanner advances past the input that matched and returns the string that * If no such pattern is detected in the input up to the next line * separator, then <code>null</code> is returned and the scanner's * position is unchanged. This method may block waiting for input that * <p>Since this method continues to search through the input looking * for the specified pattern, it may buffer all of the input searching for * the desired token if no line separators are present. * @param pattern the pattern to scan for * @return the text that matched the specified pattern * @throws IllegalStateException if this scanner is closed // Expand buffer to include the next newline or end of input break;
// up to next newline break;
// up to end of input

// If there is nothing between the current pos and the next
// newline simply return null, invoking findWithinHorizon
// with "horizon=0" will scan beyond the line bound.

// Search for the pattern

 * Attempts to find the next occurrence of a pattern constructed from the
 * specified string, ignoring delimiters.
 *
 * <p>An invocation of this method of the form
 * <tt>findWithinHorizon(pattern, horizon)</tt> behaves in exactly the same
 * way as the invocation
 * <tt>findWithinHorizon(Pattern.compile(pattern), horizon)</tt>.
 *
 * @param pattern a string specifying the pattern to search for
 * @return the text that matched the specified pattern
 * @throws IllegalStateException if this scanner is closed
 * @throws IllegalArgumentException if horizon is negative

 * Attempts to find the next occurrence of the specified pattern.
 *
 * <p>This method searches through the input up to the specified
 * search horizon, ignoring delimiters. If the pattern is found the
 * scanner advances past the input that matched and returns the string
 * that matched the pattern. If no such pattern is detected then
 * <code>null</code> is returned and the scanner's position remains
 * unchanged. This method may block waiting for input that matches the
 * pattern.
 *
 * <p>A scanner will never search more than <code>horizon</code> code
 * points beyond its current position. Note that a match may be clipped
 * by the horizon; that is, an arbitrary match result may have been
 * different if the horizon had been larger. The scanner treats the
 * horizon as a transparent, non-anchoring bound (see {@link
 * Matcher#useTransparentBounds} and {@link Matcher#useAnchoringBounds}).
 *
 * <p>If horizon is <code>0</code>, then the horizon is ignored and
 * this method continues to search through the input looking for the
 * specified pattern without bound. In this case it may buffer all of
 * the input searching for the pattern.
 *
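The effect of the horizon argument can be sketched as follows: a horizon of `0` searches without bound (and therefore across line separators), a small positive horizon limits how far the search looks, and a negative horizon is rejected. The class `HorizonSketch` and its methods are hypothetical names used only for this example.

```java
import java.util.Scanner;

// Hypothetical demo class; not part of java.util.Scanner.
class HorizonSketch {

    // Searches for the regex within the given horizon; 0 means unbounded.
    static String find(String input, String regex, int horizon) {
        try (Scanner sc = new Scanner(input)) {
            return sc.findWithinHorizon(regex, horizon);
        }
    }

    // Returns true if a negative horizon is rejected with
    // IllegalArgumentException.
    static boolean rejectsNegativeHorizon() {
        try (Scanner sc = new Scanner("x")) {
            sc.findWithinHorizon("x", -1);
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        String input = "aaa\nbbb 77";
        // Unbounded search crosses the line separator and finds the digits.
        System.out.println(find(input, "\\d+", 0));
        // Only the first three characters are examined, so nothing matches.
        System.out.println(find(input, "\\d+", 3));
        System.out.println(rejectsNegativeHorizon());
    }
}
```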
* <p>If horizon is negative, then an IllegalArgumentException is * @param pattern the pattern to scan for * @return the text that matched the specified pattern * @throws IllegalStateException if this scanner is closed * @throws IllegalArgumentException if horizon is negative // Search for the pattern break;
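The horizon and line-bound behavior described above can be illustrated with a minimal sketch. The class and method names (`HorizonDemo`, `firstDigits`, `digitsOnCurrentLine`) are hypothetical helpers introduced only for this example; they are not part of `Scanner`.

```java
import java.util.Scanner;

class HorizonDemo {
    // Returns the first run of digits found within `horizon` code points
    // of the start of the input, or null if none is found.
    // A horizon of 0 means "search without bound".
    static String firstDigits(String input, int horizon) {
        try (Scanner sc = new Scanner(input)) {
            return sc.findWithinHorizon("\\d+", horizon);
        }
    }

    // Returns the first run of digits on the current (first) line, or null
    // if the pattern does not occur before the next line separator.
    static String digitsOnCurrentLine(String input) {
        try (Scanner sc = new Scanner(input)) {
            return sc.findInLine("\\d+");
        }
    }

    public static void main(String[] args) {
        System.out.println(firstDigits("abc 12345 xyz", 0)); // 12345
        System.out.println(firstDigits("abc 12345 xyz", 6)); // 12 (match clipped by the horizon)
        System.out.println(firstDigits("abc 12345 xyz", 3)); // null
        System.out.println(digitsOnCurrentLine("no digits here\n42")); // null
    }
}
```

The second call shows a clipped match: only the first six characters are searched, so the result is shorter than it would be with a larger horizon.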
// up to end of input * Skips input that matches the specified pattern, ignoring delimiters. * This method will skip input if an anchored match of the specified * <p>If a match to the specified pattern is not found at the * current position, then no input is skipped and a * <tt>NoSuchElementException</tt> is thrown. * <p>Since this method seeks to match the specified pattern starting at * the scanner's current position, patterns that can match a lot of * input (".*", for example) may cause the scanner to buffer a large * <p>Note that it is possible to skip something without risking a * <code>NoSuchElementException</code> by using a pattern that can * match nothing, e.g., <code>sc.skip("[ \t]*")</code>. * @param pattern a string specifying the pattern to skip over * @throws NoSuchElementException if the specified pattern is not found * @throws IllegalStateException if this scanner is closed // Search for the pattern * Skips input that matches a pattern constructed from the specified * <p> An invocation of this method of the form <tt>skip(pattern)</tt> * behaves in exactly the same way as the invocation * <tt>skip(Pattern.compile(pattern))</tt>. * @param pattern a string specifying the pattern to skip over * @throws IllegalStateException if this scanner is closed // Convenience methods for scanning primitives * Returns true if the next token in this scanner's input can be * interpreted as a boolean value using a case insensitive pattern * created from the string "true|false". The scanner does not * advance past the input that matched. * @return true if and only if this scanner's next token is a valid * @throws IllegalStateException if this scanner is closed * Scans the next token of the input into a boolean value and returns * that value. This method will throw <code>InputMismatchException</code> * if the next token cannot be translated into a valid boolean value. 
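A small sketch of the two `skip` outcomes described above: a pattern that can match nothing never throws, while a pattern that fails to match at the current position raises `NoSuchElementException`. The names `SkipDemo`, `afterOptionalIndent`, and `skipThrows` are hypothetical and exist only for illustration.

```java
import java.util.NoSuchElementException;
import java.util.Scanner;

class SkipDemo {
    // Skips optional leading blanks and tabs, then returns the rest of the line.
    // "[ \t]*" can match the empty string, so skip() does not throw here.
    static String afterOptionalIndent(String input) {
        try (Scanner sc = new Scanner(input)) {
            sc.skip("[ \t]*");
            return sc.nextLine();
        }
    }

    // Reports whether skip(regex) fails at the start of the input.
    static boolean skipThrows(String input, String regex) {
        try (Scanner sc = new Scanner(input)) {
            sc.skip(regex);
            return false;
        } catch (NoSuchElementException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(afterOptionalIndent("   hello world")); // hello world
        System.out.println(afterOptionalIndent("hello"));          // hello
        System.out.println(skipThrows("abc", "\\d+"));             // true
    }
}
```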
* If the match is successful, the scanner advances past the input that * @return the boolean scanned from the input * @throws InputMismatchException if the next token is not a valid boolean * @throws NoSuchElementException if input is exhausted * @throws IllegalStateException if this scanner is closed * Returns true if the next token in this scanner's input can be * interpreted as a byte value in the default radix using the * {@link #nextByte} method. The scanner does not advance past any input. * @return true if and only if this scanner's next token is a valid * @throws IllegalStateException if this scanner is closed * Returns true if the next token in this scanner's input can be * interpreted as a byte value in the specified radix using the * {@link #nextByte} method. The scanner does not advance past any input. * @param radix the radix used to interpret the token as a byte value * @return true if and only if this scanner's next token is a valid * @throws IllegalStateException if this scanner is closed * Scans the next token of the input as a <tt>byte</tt>. * <p> An invocation of this method of the form * <tt>nextByte()</tt> behaves in exactly the same way as the * invocation <tt>nextByte(radix)</tt>, where <code>radix</code> * is the default radix of this scanner. * @return the <tt>byte</tt> scanned from the input * @throws InputMismatchException * if the next token does not match the <i>Integer</i> * regular expression, or is out of range * @throws NoSuchElementException if input is exhausted * @throws IllegalStateException if this scanner is closed * Scans the next token of the input as a <tt>byte</tt>. * This method will throw <code>InputMismatchException</code> * if the next token cannot be translated into a valid byte value as * described below. If the translation is successful, the scanner advances * past the input that matched. 
* <p> If the next token matches the <a * href="#Integer-regex"><i>Integer</i></a> regular expression defined * above then the token is converted into a <tt>byte</tt> value as if by * removing all locale specific prefixes, group separators, and locale * specific suffixes, then mapping non-ASCII digits into ASCII * digits via {@link Character#digit Character.digit}, prepending a * negative sign (-) if the locale specific negative prefixes and suffixes * were present, and passing the resulting string to * {@link Byte#parseByte(String, int) Byte.parseByte} with the * @param radix the radix used to interpret the token as a byte value * @return the <tt>byte</tt> scanned from the input * @throws InputMismatchException * if the next token does not match the <i>Integer</i> * regular expression, or is out of range * @throws NoSuchElementException if input is exhausted * @throws IllegalStateException if this scanner is closed * Returns true if the next token in this scanner's input can be * interpreted as a short value in the default radix using the * {@link #nextShort} method. The scanner does not advance past any input. * @return true if and only if this scanner's next token is a valid * short value in the default radix * @throws IllegalStateException if this scanner is closed * Returns true if the next token in this scanner's input can be * interpreted as a short value in the specified radix using the * {@link #nextShort} method. The scanner does not advance past any input. * @param radix the radix used to interpret the token as a short value * @return true if and only if this scanner's next token is a valid * short value in the specified radix * @throws IllegalStateException if this scanner is closed * Scans the next token of the input as a <tt>short</tt>. * <p> An invocation of this method of the form * <tt>nextShort()</tt> behaves in exactly the same way as the * invocation <tt>nextShort(radix)</tt>, where <code>radix</code> * is the default radix of this scanner. 
* @return the <tt>short</tt> scanned from the input * @throws InputMismatchException * if the next token does not match the <i>Integer</i> * regular expression, or is out of range * @throws NoSuchElementException if input is exhausted * @throws IllegalStateException if this scanner is closed * Scans the next token of the input as a <tt>short</tt>. * This method will throw <code>InputMismatchException</code> * if the next token cannot be translated into a valid short value as * described below. If the translation is successful, the scanner advances * past the input that matched. * <p> If the next token matches the <a * href="#Integer-regex"><i>Integer</i></a> regular expression defined * above then the token is converted into a <tt>short</tt> value as if by * removing all locale specific prefixes, group separators, and locale * specific suffixes, then mapping non-ASCII digits into ASCII * digits via {@link Character#digit Character.digit}, prepending a * negative sign (-) if the locale specific negative prefixes and suffixes * were present, and passing the resulting string to * {@link Short#parseShort(String, int) Short.parseShort} with the * @param radix the radix used to interpret the token as a short value * @return the <tt>short</tt> scanned from the input * @throws InputMismatchException * if the next token does not match the <i>Integer</i> * regular expression, or is out of range * @throws NoSuchElementException if input is exhausted * @throws IllegalStateException if this scanner is closed * Returns true if the next token in this scanner's input can be * interpreted as an int value in the default radix using the * {@link #nextInt} method. The scanner does not advance past any input. * @return true if and only if this scanner's next token is a valid * @throws IllegalStateException if this scanner is closed * Returns true if the next token in this scanner's input can be * interpreted as an int value in the specified radix using the * {@link #nextInt} method. 
The scanner does not advance past any input. * @param radix the radix used to interpret the token as an int value * @return true if and only if this scanner's next token is a valid * @throws IllegalStateException if this scanner is closed * The integer token must be stripped of prefixes, group separators, * and suffixes, non ascii digits must be converted into ascii digits * before parse will accept it. * Scans the next token of the input as an <tt>int</tt>. * <p> An invocation of this method of the form * <tt>nextInt()</tt> behaves in exactly the same way as the * invocation <tt>nextInt(radix)</tt>, where <code>radix</code> * is the default radix of this scanner. * @return the <tt>int</tt> scanned from the input * @throws InputMismatchException * if the next token does not match the <i>Integer</i> * regular expression, or is out of range * @throws NoSuchElementException if input is exhausted * @throws IllegalStateException if this scanner is closed * Scans the next token of the input as an <tt>int</tt>. * This method will throw <code>InputMismatchException</code> * if the next token cannot be translated into a valid int value as * described below. If the translation is successful, the scanner advances * past the input that matched. 
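As a minimal sketch of how `hasNextInt(radix)` and `nextInt(radix)` are typically paired, the helper below sums integer tokens until it meets a token that is not a valid `int` in the given radix. `IntScanDemo` and `sumLeadingInts` are hypothetical names used only for this example, and the inputs are assumed to contain plain ASCII digits without group separators.

```java
import java.util.Scanner;

class IntScanDemo {
    // Sums the leading int tokens of `input`, interpreted in `radix`,
    // stopping at the first token that hasNextInt(radix) rejects.
    static int sumLeadingInts(String input, int radix) {
        int sum = 0;
        try (Scanner sc = new Scanner(input)) {
            // hasNextInt does not consume input; nextInt advances past the token.
            while (sc.hasNextInt(radix)) {
                sum += sc.nextInt(radix);
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(sumLeadingInts("10 20 x 30", 10)); // 30
        System.out.println(sumLeadingInts("ff 10 zz", 16));   // 271
    }
}
```

Checking with `hasNextInt` first avoids the `InputMismatchException` that `nextInt` would throw on a non-integer or out-of-range token.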
* <p> If the next token matches the <a * href="#Integer-regex"><i>Integer</i></a> regular expression defined * above then the token is converted into an <tt>int</tt> value as if by * removing all locale specific prefixes, group separators, and locale * specific suffixes, then mapping non-ASCII digits into ASCII * digits via {@link Character#digit Character.digit}, prepending a * negative sign (-) if the locale specific negative prefixes and suffixes * were present, and passing the resulting string to * {@link Integer#parseInt(String, int) Integer.parseInt} with the * @param radix the radix used to interpret the token as an int value * @return the <tt>int</tt> scanned from the input * @throws InputMismatchException * if the next token does not match the <i>Integer</i> * regular expression, or is out of range * @throws NoSuchElementException if input is exhausted * @throws IllegalStateException if this scanner is closed * Returns true if the next token in this scanner's input can be * interpreted as a long value in the default radix using the * {@link #nextLong} method. The scanner does not advance past any input. * @return true if and only if this scanner's next token is a valid * @throws IllegalStateException if this scanner is closed * Returns true if the next token in this scanner's input can be * interpreted as a long value in the specified radix using the * {@link #nextLong} method. The scanner does not advance past any input. * @param radix the radix used to interpret the token as a long value * @return true if and only if this scanner's next token is a valid * @throws IllegalStateException if this scanner is closed * Scans the next token of the input as a <tt>long</tt>. * <p> An invocation of this method of the form * <tt>nextLong()</tt> behaves in exactly the same way as the * invocation <tt>nextLong(radix)</tt>, where <code>radix</code> * is the default radix of this scanner. 
* @return the <tt>long</tt> scanned from the input * @throws InputMismatchException * if the next token does not match the <i>Integer</i> * regular expression, or is out of range * @throws NoSuchElementException if input is exhausted * @throws IllegalStateException if this scanner is closed * Scans the next token of the input as a <tt>long</tt>. * This method will throw <code>InputMismatchException</code> * if the next token cannot be translated into a valid long value as * described below. If the translation is successful, the scanner advances * past the input that matched. * <p> If the next token matches the <a * href="#Integer-regex"><i>Integer</i></a> regular expression defined * above then the token is converted into a <tt>long</tt> value as if by * removing all locale specific prefixes, group separators, and locale * specific suffixes, then mapping non-ASCII digits into ASCII * digits via {@link Character#digit Character.digit}, prepending a * negative sign (-) if the locale specific negative prefixes and suffixes * were present, and passing the resulting string to * {@link Long#parseLong(String, int) Long.parseLong} with the * @param radix the radix used to interpret the token as a long value * @return the <tt>long</tt> scanned from the input * @throws InputMismatchException * if the next token does not match the <i>Integer</i> * regular expression, or is out of range * @throws NoSuchElementException if input is exhausted * @throws IllegalStateException if this scanner is closed * The float token must be stripped of prefixes, group separators, * and suffixes, and non-ASCII digits must be converted into ASCII digits * before parseFloat will accept it. * If there are non-ASCII digits in the token, these digits must * be processed before the token is passed to parseFloat. // Translate non-ASCII digits * Returns true if the next token in this scanner's input can be * interpreted as a float value using the {@link #nextFloat} * method. 
The scanner does not advance past any input. * @return true if and only if this scanner's next token is a valid * @throws IllegalStateException if this scanner is closed * Scans the next token of the input as a <tt>float</tt>. * This method will throw <code>InputMismatchException</code> * if the next token cannot be translated into a valid float value as * described below. If the translation is successful, the scanner advances * past the input that matched. * <p> If the next token matches the <a * href="#Float-regex"><i>Float</i></a> regular expression defined above * then the token is converted into a <tt>float</tt> value as if by * removing all locale specific prefixes, group separators, and locale * specific suffixes, then mapping non-ASCII digits into ASCII * digits via {@link Character#digit Character.digit}, prepending a * negative sign (-) if the locale specific negative prefixes and suffixes * were present, and passing the resulting string to * {@link Float#parseFloat Float.parseFloat}. If the token matches * the localized NaN or infinity strings, then either "NaN" or "Infinity" * is passed to {@link Float#parseFloat(String) Float.parseFloat} as * @return the <tt>float</tt> scanned from the input * @throws InputMismatchException * if the next token does not match the <i>Float</i> * regular expression, or is out of range * @throws NoSuchElementException if input is exhausted * @throws IllegalStateException if this scanner is closed * Returns true if the next token in this scanner's input can be * interpreted as a double value using the {@link #nextDouble} * method. The scanner does not advance past any input. * @return true if and only if this scanner's next token is a valid * @throws IllegalStateException if this scanner is closed * Scans the next token of the input as a <tt>double</tt>. * This method will throw <code>InputMismatchException</code> * if the next token cannot be translated into a valid double value. 
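The locale-specific handling of floating-point tokens can be sketched as follows. The example assumes that `Locale.GERMANY` uses a comma as its decimal separator and `Locale.US` uses a period; `LocaleDoubleDemo` and `parseDouble` are hypothetical names used only here.

```java
import java.util.Locale;
import java.util.Scanner;

class LocaleDoubleDemo {
    // Reads a single double token from `token` using the number format of `locale`.
    static double parseDouble(String token, Locale locale) {
        try (Scanner sc = new Scanner(token).useLocale(locale)) {
            // The locale's decimal separator is translated before the
            // resulting string is handed to Double.parseDouble.
            return sc.nextDouble();
        }
    }

    public static void main(String[] args) {
        System.out.println(parseDouble("1.5", Locale.US));      // 1.5
        System.out.println(parseDouble("1,5", Locale.GERMANY)); // 1.5
    }
}
```

Setting the locale explicitly keeps the result independent of the default locale of the machine that runs the code.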
* If the translation is successful, the scanner advances past the input * <p> If the next token matches the <a * href="#Float-regex"><i>Float</i></a> regular expression defined above * then the token is converted into a <tt>double</tt> value as if by * removing all locale specific prefixes, group separators, and locale * specific suffixes, then mapping non-ASCII digits into ASCII * digits via {@link Character#digit Character.digit}, prepending a * negative sign (-) if the locale specific negative prefixes and suffixes * were present, and passing the resulting string to * {@link Double#parseDouble Double.parseDouble}. If the token matches * the localized NaN or infinity strings, then either "NaN" or "Infinity" * is passed to {@link Double#parseDouble(String) Double.parseDouble} as * @return the <tt>double</tt> scanned from the input * @throws InputMismatchException * if the next token does not match the <i>Float</i> * regular expression, or is out of range * @throws NoSuchElementException if the input is exhausted * @throws IllegalStateException if this scanner is closed // Convenience methods for scanning multi precision numbers * Returns true if the next token in this scanner's input can be * interpreted as a <code>BigInteger</code> in the default radix using the * {@link #nextBigInteger} method. The scanner does not advance past any * @return true if and only if this scanner's next token is a valid * <code>BigInteger</code> * @throws IllegalStateException if this scanner is closed * Returns true if the next token in this scanner's input can be * interpreted as a <code>BigInteger</code> in the specified radix using * the {@link #nextBigInteger} method. 
The scanner does not advance past any input. * @param radix the radix used to interpret the token as an integer * @return true if and only if this scanner's next token is a valid * <code>BigInteger</code> * @throws IllegalStateException if this scanner is closed * Scans the next token of the input as a {@link java.math.BigInteger}. * <p> An invocation of this method of the form * <tt>nextBigInteger()</tt> behaves in exactly the same way as the * invocation <tt>nextBigInteger(radix)</tt>, where <code>radix</code> * is the default radix of this scanner. * @return the <tt>BigInteger</tt> scanned from the input * @throws InputMismatchException * if the next token does not match the <i>Integer</i> * regular expression, or is out of range * @throws NoSuchElementException if the input is exhausted * @throws IllegalStateException if this scanner is closed * Scans the next token of the input as a {@link java.math.BigInteger}. * <p> If the next token matches the <a * href="#Integer-regex"><i>Integer</i></a> regular expression defined * above then the token is converted into a <tt>BigInteger</tt> value as if * by removing all group separators, mapping non-ASCII digits into ASCII * digits via {@link Character#digit Character.digit}, and passing the * resulting string to the {@link * java.math.BigInteger#BigInteger(java.lang.String, int) * BigInteger(String, int)} constructor with the specified radix. * @param radix the radix used to interpret the token * @return the <tt>BigInteger</tt> scanned from the input * @throws InputMismatchException * if the next token does not match the <i>Integer</i> * regular expression, or is out of range * @throws NoSuchElementException if the input is exhausted * @throws IllegalStateException if this scanner is closed * Returns true if the next token in this scanner's input can be * interpreted as a <code>BigDecimal</code> using the * {@link #nextBigDecimal} method. 
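A brief sketch of reading a value too large for `long` with `nextBigInteger(radix)`. The helper names (`BigIntegerScanDemo`, `readBig`, `fitsInLong`) are hypothetical, and the inputs are assumed to be plain ASCII digits without group separators.

```java
import java.math.BigInteger;
import java.util.Scanner;

class BigIntegerScanDemo {
    // Reads one token as a BigInteger in the given radix.
    static BigInteger readBig(String token, int radix) {
        try (Scanner sc = new Scanner(token)) {
            return sc.nextBigInteger(radix);
        }
    }

    // Reports whether the next token would be accepted by nextLong().
    static boolean fitsInLong(String token) {
        try (Scanner sc = new Scanner(token)) {
            return sc.hasNextLong();
        }
    }

    public static void main(String[] args) {
        String big = "123456789012345678901234567890";
        System.out.println(fitsInLong(big));   // false: out of range for long
        System.out.println(readBig(big, 10));  // 123456789012345678901234567890
        System.out.println(readBig("ff", 16)); // 255
    }
}
```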
The scanner does not advance past any input. * @return true if and only if this scanner's next token is a valid * <code>BigDecimal</code> * @throws IllegalStateException if this scanner is closed * Scans the next token of the input as a {@link java.math.BigDecimal}. * <p> If the next token matches the <a * href="#Decimal-regex"><i>Decimal</i></a> regular expression defined * above then the token is converted into a <tt>BigDecimal</tt> value as if * by removing all group separators, mapping non-ASCII digits into ASCII * digits via {@link Character#digit Character.digit}, and passing the * resulting string to the {@link * java.math.BigDecimal#BigDecimal(java.lang.String) BigDecimal(String)} * @return the <tt>BigDecimal</tt> scanned from the input * @throws InputMismatchException * if the next token does not match the <i>Decimal</i> * regular expression, or is out of range * @throws NoSuchElementException if the input is exhausted * @throws IllegalStateException if this scanner is closed * <p> Resetting a scanner discards all of its explicit state * information which may have been changed by invocations of {@link * #useDelimiter}, {@link #useLocale}, or {@link #useRadix}. * <p> An invocation of this method of the form * <tt>scanner.reset()</tt> behaves in exactly the same way as the * scanner.useDelimiter("\\p{javaWhitespace}+") * .useLocale(Locale.getDefault())
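The effect of `reset()` on state changed through `useDelimiter` and `useRadix` can be checked with a short sketch. `ResetDemo` and its methods are hypothetical names for this example; the expected values assume the defaults described above (the whitespace delimiter and radix 10).

```java
import java.util.Scanner;

class ResetDemo {
    // Changes the radix and delimiter, resets, and returns the radix afterwards.
    static int radixAfterReset() {
        try (Scanner sc = new Scanner("")) {
            sc.useRadix(16).useDelimiter(",");
            sc.reset();
            return sc.radix();
        }
    }

    // Changes the delimiter, resets, and returns the delimiter pattern afterwards.
    static String delimiterAfterReset() {
        try (Scanner sc = new Scanner("")) {
            sc.useDelimiter(",");
            sc.reset();
            return sc.delimiter().pattern();
        }
    }

    public static void main(String[] args) {
        System.out.println(radixAfterReset());     // 10
        System.out.println(delimiterAfterReset()); // \p{javaWhitespace}+
    }
}
```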