A Query is a series of clauses. A clause may be prefixed by:
- a plus "+" or a minus "-" sign, indicating that the clause
is required or prohibited respectively; or
- a term followed by a colon ":", indicating the
field to be searched. This enables one to construct queries
which search multiple fields.
A clause may be either:
- a term, indicating all the documents that contain this term;
- a phrase, i.e. a group of words surrounded by double quotes
" ", e.g. "hello dolly";
- a nested query, enclosed in parentheses "(" ")" (also
called query/field grouping). Note that this may be used
with a +/- prefix to require any of a set of terms; or
- boolean operators, which allow terms to be combined through
logic operators. Supported are AND (&&), "+",
OR (||), NOT (!) and "-" (Note: the word operators AND, OR and NOT
must be ALL CAPS).
Wildcard, Fuzzy, Proximity & Range Searches:
- to perform a single character wildcard search use the "?" symbol,
e.g. te?t
- to perform a multiple character wildcard search use the "*"
symbol, e.g. test* or te*t
- you cannot use a * or ? symbol as the first character of a search
(unless enabled using indexer option -a).
- to do a fuzzy search (find words similar in spelling, based on the
Levenshtein Distance, or Edit Distance, algorithm) use the tilde,
"~", e.g. rcs~
- to do a proximity search use the tilde, "~", symbol at the end of a
phrase. For example, to search for "opengrok" and "help" within 10
words of each other enter: "opengrok help"~10
- range queries allow one to match documents whose field(s) values are
between the lower and upper bound specified by the Range Query. Range
Queries can be inclusive or exclusive of the upper and lower bounds.
Sorting is done lexicographically. Inclusive queries are denoted by
square brackets [ ], exclusive by curly brackets { }.
For example: title:{Aida TO Carmen} - will find all documents whose
title is between Aida and Carmen, exclusive of Aida and Carmen.
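The difference between inclusive and exclusive bounds can be sketched in
Python as follows. This is only an illustration of lexicographic comparison:
the helper in_range and the sample titles are hypothetical, and the sketch
compares whole strings, whereas the search engine compares the terms produced
by its analyzers.

```python
def in_range(value: str, lower: str, upper: str, inclusive: bool) -> bool:
    """Lexicographic range check, mirroring [lower TO upper] (inclusive)
    or {lower TO upper} (exclusive). Hypothetical helper for illustration."""
    if inclusive:
        return lower <= value <= upper
    return lower < value < upper


# Hypothetical sample data, not taken from any real index.
titles = ["Aida", "Boris Godunov", "Carmen", "Don Giovanni"]

# title:{Aida TO Carmen} - bounds excluded
exclusive_hits = [t for t in titles if in_range(t, "Aida", "Carmen", False)]
# title:[Aida TO Carmen] - bounds included
inclusive_hits = [t for t in titles if in_range(t, "Aida", "Carmen", True)]

print(exclusive_hits)  # ['Boris Godunov']
print(inclusive_hits)  # ['Aida', 'Boris Godunov', 'Carmen']
```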
Escaping special characters:
OpenGrok supports escaping special characters that are part of the query
syntax. Current special characters are:
+ - && || ! ( ) { } [ ] ^ " ~ * ? : \
To escape these characters use the \ before the character. For example, to search
for (1+1):2 use the query: \(1\+1\)\:2
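When queries are generated by a script, the escaping rule above can be applied
automatically. The following is a minimal Python sketch; escape_query is a
hypothetical helper, not part of OpenGrok, and it assumes that escaping each
"&" and "|" individually is an acceptable way to handle && and ||.

```python
# Characters listed above as part of the query syntax.
# Assumption: "&&" and "||" are covered by escaping every "&" and "|".
SPECIAL_CHARS = set('+-&|!(){}[]^"~*?:\\')


def escape_query(text: str) -> str:
    """Prefix every query-syntax character in text with a backslash."""
    return "".join("\\" + ch if ch in SPECIAL_CHARS else ch for ch in text)


print(escape_query("(1+1):2"))  # \(1\+1\)\:2
```

Escape only the literal part of a query: passing a whole query through such a
helper would also escape operators, wildcards and field prefixes that are
meant to be interpreted.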
NOTE on analyzers: Indexed words are made up of alphanumeric and underscore
characters. One-letter words are usually not indexed as symbols!
Most other characters (including single and double quotes) are treated as
"spaces/whitespace" (so even if you escape them, they will not be found, since
most analyzers ignore them).
The exceptions are: @ $ % ^ & = ? . : which are mostly indexed as
separate words.
Because some of them are part of the query syntax, they must be escaped with a
backslash as noted above.
So searching for \+1 or \+ 1 will both find +1 and + 1.
Valid FIELDs are:
- full
- Search through all text tokens (words, strings, identifiers, numbers) in index.
- defs
- Only finds symbol definitions.
- refs
- Only finds symbols.
- path
- Path of the source file.
- hist
- History log comments.
A term (or phrase) can be boosted (made more relevant) using a caret
^, e.g. help^4 opengrok - will boost the term help.
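The pieces described so far (a +/- prefix, a field name, a term and an
optional boost) can be assembled programmatically. The Python sketch below is
only an illustration: clause and build_query are hypothetical helpers, not an
OpenGrok API, and they assume that the term has already been escaped where
necessary.

```python
from typing import Optional

# Field names as listed above.
VALID_FIELDS = {"full", "defs", "refs", "path", "hist"}


def clause(term: str, field: Optional[str] = None, prefix: str = "",
           boost: Optional[int] = None) -> str:
    """Build one clause: [+|-][field:]term[^boost]. Hypothetical helper."""
    if prefix not in ("", "+", "-"):
        raise ValueError("prefix must be '', '+' or '-'")
    if field is not None and field not in VALID_FIELDS:
        raise ValueError("unknown field: " + field)
    text = f"{field}:{term}" if field else term
    if boost is not None:
        text = f"{text}^{boost}"
    return prefix + text


def build_query(*clauses: str) -> str:
    """A query is a series of clauses, separated here by single spaces."""
    return " ".join(clauses)


print(build_query(clause("sprintf", field="refs"),
                  clause("usr/src/cmd/cmd-inet/usr.sbin", field="path")))
# refs:sprintf path:usr/src/cmd/cmd-inet/usr.sbin
print(build_query(clause("help", boost=4), clause("opengrok")))
# help^4 opengrok
```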
Examples:
To find where setResourceMonitors is defined: defs:setResourceMonitors
To find files that use sprintf in usr/src/cmd/cmd-inet/usr.sbin/:
refs:sprintf path:usr/src/cmd/cmd-inet/usr.sbin
To find assignments to variable Asign:
"Asign="
To find Makefiles where pstack binary is being built:
pstack path:Makefile
To search for the phrase "Bill Joy":
"Bill Joy"
To find perl files that do not use /usr/bin/perl but something else:
-"/usr/bin/perl" +"/bin/perl"
To find all strings beginning with foo use the wildcard:
foo*
To find all files which have . c in their name (dot is a token!):
". c"
OpenGrok search is powered by Lucene. For more detail on the query syntax, refer to the Lucene docs.