Lines Matching defs:dictionary

51  * A subclass of RuleBasedBreakIterator that adds the ability to use a dictionary
57 * repeatedly compared against a list of known words (i.e., the dictionary)
61 * but adds one more special substitution name: <dictionary>. This substitution
62 * name is used to identify characters in words in the dictionary. The idea is that
64 * in a row that are included in <dictionary>, it goes back through that range and
65 * derives additional break positions (if possible) using the dictionary.
67 * DictionaryBasedBreakIterator is also constructed with the filename of a dictionary
68 * file. It follows a prescribed search path to locate the dictionary (right now,
71 * dictionary file is in a serialized binary format. We have a very primitive (and
72 * slow) BuildDictionaryFile utility for creating dictionary files, but aren't
81 private BreakDictionary dictionary;
85 * the dictionary file (this is used to determine which ranges of characters
86 * to apply the dictionary to)
91 * a temporary hiding place for the number of dictionary characters in the
97 * when a range of characters is divided up using the dictionary, the break
99 * to use either the dictionary or the state table again until the iterator
113 * except for the special meaning of "<dictionary>". This parameter is just
115 * @param dictionaryFilename The filename of the dictionary file to use
125 dictionary = new BreakDictionary(dictionaryFile);
277 // value. dictionaryCharCount tells us how many dictionary characters
283 // if we passed over more than one dictionary character, then we use
314 // passed over any dictionary characters. It calls the inherited lookupCategory()
316 // categories represented in the dictionary. If it is, bump the dictionary-
326 * This is the function that actually implements the dictionary-based
328 * dictionary to determine the positions of any boundaries in this
336 // the range we're dividing may begin or end with non-dictionary characters
339 // range to the first dictionary character
353 // in the dictionary], we back up, possibly delete some breaks from
363 // the dictionary is implemented as a trie, which is treated as a state
365 // dictionary is represented by a path from the root node to -1. A path
373 // dictionary. In this case, we "bless" the break positions that got us the
386 if (dictionary.getNextState(state, 0) == -1) {
390 // look up the new state to transition to in the dictionary
391 state = dictionary.getNextStateFromCharacter(state, c);
394 // the "end of word" state, then it was a non-dictionary character
494 // because the range actually ended with non-dictionary characters we want to
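
The matched comments around source lines 363-394 describe the dictionary as a trie that is walked like a state machine, with backing up when a path fails to end on a word. A minimal sketch of that idea follows. It is not the BreakDictionary class from the listing: the names TrieSketch, next, isWordEnd and segment are hypothetical, and the sketch marks word ends with a boolean per state, whereas the listing appears to encode them as a transition to -1.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not the API of the file in the listing.
class TrieSketch {
    static final int NONE = -1; // in this sketch: "no transition from this state"

    // One transition map per state (character -> next state). State 0 is the root.
    private final List<Map<Character, Integer>> transitions = new ArrayList<>();
    private final List<Boolean> wordEnds = new ArrayList<>();

    TrieSketch(String... words) {
        newState(); // root
        for (String w : words) {
            int state = 0;
            for (char c : w.toCharArray()) {
                Integer next = transitions.get(state).get(c);
                if (next == null) {
                    next = newState();
                    transitions.get(state).put(c, next);
                }
                state = next;
            }
            wordEnds.set(state, true); // a complete word ends at this state
        }
    }

    private int newState() {
        transitions.add(new HashMap<>());
        wordEnds.add(false);
        return transitions.size() - 1;
    }

    int next(int state, char c) {
        Integer n = transitions.get(state).get(c);
        return n == null ? NONE : n;
    }

    boolean isWordEnd(int state) {
        return wordEnds.get(state);
    }

    // Returns the end offset of each word when text can be split entirely
    // into dictionary words; returns an empty list when it cannot.
    List<Integer> segment(String text) {
        List<Integer> breaks = new ArrayList<>();
        return split(text, 0, breaks) ? breaks : new ArrayList<>();
    }

    private boolean split(String text, int start, List<Integer> breaks) {
        if (start == text.length()) {
            return true;
        }
        // Walk the trie from the root, recording every offset where a word ends.
        List<Integer> ends = new ArrayList<>();
        int state = 0;
        for (int i = start; i < text.length(); i++) {
            state = next(state, text.charAt(i));
            if (state == NONE) {
                break;
            }
            if (isWordEnd(state)) {
                ends.add(i + 1);
            }
        }
        // Try the longest candidate first; back up and drop the break on failure.
        for (int k = ends.size() - 1; k >= 0; k--) {
            breaks.add(ends.get(k));
            if (split(text, ends.get(k), breaks)) {
                return true;
            }
            breaks.remove(breaks.size() - 1);
        }
        return false;
    }

    public static void main(String[] args) {
        TrieSketch trie = new TrieSketch("a", "ab", "abc", "cd");
        System.out.println(trie.segment("abcd"));
    }
}
```

With the words "a", "ab", "abc" and "cd", the longest first match "abc" leaves "d" unmatched, so the sketch backs up to "ab" and then matches "cd". The recursion is chosen for brevity; the comments in the listing suggest the real code keeps explicit bookkeeping of candidate break positions instead.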