Class QueryParserBase
- All Implemented Interfaces:
CommonQueryParserConfiguration
- Direct Known Subclasses:
QueryParser
-
Nested Class Summary
Nested classes/interfaces inherited from class org.apache.lucene.util.QueryBuilder
QueryBuilder.TermAndBoost
-
Field Summary
Fields
Modifier and Type, Field, Description
(package private) boolean allowLeadingWildcard
static final QueryParser.Operator AND_OPERATOR
    Alternative form of QueryParser.Operator.AND
(package private) boolean autoGeneratePhraseQueries
(package private) static final int CONJ_AND
(package private) static final int CONJ_NONE
(package private) static final int CONJ_OR
(package private) DateTools.Resolution dateResolution
(package private) int determinizeWorkLimit
protected String field
(package private) Map<String,DateTools.Resolution> fieldToDateResolution
(package private) float fuzzyMinSim
(package private) int fuzzyPrefixLength
(package private) Locale locale
(package private) static final int MOD_NONE
(package private) static final int MOD_NOT
(package private) static final int MOD_REQ
(package private) MultiTermQuery.RewriteMethod multiTermRewriteMethod
(package private) QueryParser.Operator operator
    The actual operator that parser uses to combine query terms
static final QueryParser.Operator OR_OPERATOR
    Alternative form of QueryParser.Operator.OR
(package private) int phraseSlop
(package private) TimeZone timeZone
private static final Pattern WILDCARD_PATTERN
Fields inherited from class org.apache.lucene.util.QueryBuilder
analyzer, autoGenerateMultiTermSynonymsPhraseQuery, enableGraphQueries, enablePositionIncrements
-
Constructor Summary
Constructors
protected QueryParserBase()
-
Method Summary
Modifier and Type, Method, Description
protected void addClause(List<BooleanClause> clauses, int conj, int mods, Query q)
protected void addMultiTermClauses(List<BooleanClause> clauses, Query q)
    Adds clauses generated from analysis over text containing whitespace.
private PhraseQuery addSlopToPhrase(PhraseQuery query, int slop)
    Rebuild a phrase query with a slop value
private BytesRef analyzeWildcard(String field, String termStr)
(package private) String discardEscapeChar(String input)
    Returns a String where the escape char has been removed, or kept only once if there was a double escape.
static String escape(String s)
    Returns a String where those characters that QueryParser expects to be escaped are escaped by a preceding \.
boolean getAllowLeadingWildcard()
final boolean getAutoGeneratePhraseQueries()
protected Query getBooleanQuery(List<BooleanClause> clauses)
    Factory method for generating query, given a set of clauses.
DateTools.Resolution getDateResolution(String fieldName)
    Returns the date resolution that is used by RangeQueries for the given field.
QueryParser.Operator getDefaultOperator()
    Gets implicit operator setting, which will be either AND_OPERATOR or OR_OPERATOR.
int getDeterminizeWorkLimit()
String getField()
protected Query getFieldQuery(String field, String queryText, boolean quoted)
protected Query getFieldQuery(String field, String queryText, int slop)
    Base implementation delegates to getFieldQuery(String,String,boolean).
protected float getFuzzyDistance(Token fuzzyToken, String termStr)
    Determines the similarity distance for the given fuzzy token and term string.
float getFuzzyMinSim()
    Get the minimal similarity for fuzzy queries.
int getFuzzyPrefixLength()
    Get the prefix length for fuzzy queries.
protected Query getFuzzyQuery(String field, String termStr, float minSimilarity)
    Factory method for generating a query (similar to getWildcardQuery(java.lang.String, java.lang.String)).
Locale getLocale()
    Returns current locale, allowing access by subclasses.
MultiTermQuery.RewriteMethod getMultiTermRewriteMethod()
int getPhraseSlop()
    Gets the default slop for phrases.
protected Query getPrefixQuery(String field, String termStr)
    Factory method for generating a query (similar to getWildcardQuery(java.lang.String, java.lang.String)).
protected Query getRangeQuery(String field, String part1, String part2, boolean startInclusive, boolean endInclusive)
protected Query getRegexpQuery(String field, String termStr)
    Factory method for generating a query.
TimeZone getTimeZone()
protected Query getWildcardQuery(String field, String termStr)
    Factory method for generating a query.
(package private) Query handleBareFuzzy(String qfield, Token fuzzySlop, String termImage)
(package private) Query handleBareTokenQuery(String qfield, Token term, Token fuzzySlop, boolean prefix, boolean wildcard, boolean fuzzy, boolean regexp)
(package private) Query handleBoost(Query q, Token boost)
(package private) Query handleQuotedTerm(String qfield, Token term, Token fuzzySlop)
(package private) static final int hexToInt(char c)
    Returns the numeric value of the hexadecimal character
void init(String f, Analyzer a)
    Initializes a query parser.
protected BooleanClause newBooleanClause(Query q, BooleanClause.Occur occur)
    Builds a new BooleanClause instance
protected Query newFieldQuery(Analyzer analyzer, String field, String queryText, boolean quoted)
protected Query newFuzzyQuery(Term term, float minimumSimilarity, int prefixLength)
    Builds a new FuzzyQuery instance
protected Query newMatchAllDocsQuery()
    Builds a new MatchAllDocsQuery instance
protected Query newPrefixQuery(Term prefix)
    Builds a new PrefixQuery instance
protected Query newRangeQuery(String field, String part1, String part2, boolean startInclusive, boolean endInclusive)
    Builds a new TermRangeQuery instance
protected Query newRegexpQuery(Term regexp)
    Builds a new RegexpQuery instance
protected Query newWildcardQuery(Term t)
    Builds a new WildcardQuery instance
Query parse(String query)
    Parses a query string, returning a Query.
abstract void ReInit(CharStream stream)
void setAllowLeadingWildcard(boolean allowLeadingWildcard)
    Set to true to allow leading wildcard characters.
void setAutoGeneratePhraseQueries(boolean value)
    Set to true if phrase queries will be automatically generated when the analyzer returns more than one term from whitespace-delimited text.
void setDateResolution(String fieldName, DateTools.Resolution dateResolution)
    Sets the date resolution used by RangeQueries for a specific field.
void setDateResolution(DateTools.Resolution dateResolution)
    Sets the default date resolution used by RangeQueries for fields for which no specific date resolution has been set.
void setDefaultOperator(QueryParser.Operator op)
    Sets the boolean operator of the QueryParser.
void setDeterminizeWorkLimit(int determinizeWorkLimit)
void setFuzzyMinSim(float fuzzyMinSim)
    Set the minimum similarity for fuzzy queries.
void setFuzzyPrefixLength(int fuzzyPrefixLength)
    Set the prefix length for fuzzy queries.
void setLocale(Locale locale)
    Set locale used by date range parsing, lowercasing, and other locale-sensitive operations.
void setMultiTermRewriteMethod(MultiTermQuery.RewriteMethod method)
    By default QueryParser uses MultiTermQuery.CONSTANT_SCORE_REWRITE when creating a PrefixQuery, WildcardQuery or TermRangeQuery.
void setPhraseSlop(int phraseSlop)
    Sets the default slop for phrases.
void setTimeZone(TimeZone timeZone)
abstract Query TopLevelQuery(String field)
Methods inherited from class org.apache.lucene.util.QueryBuilder
add, analyzeBoolean, analyzeGraphBoolean, analyzeGraphPhrase, analyzeMultiBoolean, analyzeMultiPhrase, analyzePhrase, analyzeTerm, createBooleanQuery, createBooleanQuery, createFieldQuery, createFieldQuery, createMinShouldMatchQuery, createPhraseQuery, createPhraseQuery, getAnalyzer, getAutoGenerateMultiTermSynonymsPhraseQuery, getEnableGraphQueries, getEnablePositionIncrements, newBooleanQuery, newGraphSynonymQuery, newMultiPhraseQueryBuilder, newSynonymQuery, newTermQuery, setAnalyzer, setAutoGenerateMultiTermSynonymsPhraseQuery, setEnableGraphQueries, setEnablePositionIncrements
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface org.apache.lucene.queryparser.flexible.standard.CommonQueryParserConfiguration
getAnalyzer, getEnablePositionIncrements, setEnablePositionIncrements
-
Field Details
-
CONJ_NONE
static final int CONJ_NONE
- See Also:
Constant Field Values
-
CONJ_AND
static final int CONJ_AND
- See Also:
Constant Field Values
-
CONJ_OR
static final int CONJ_OR
- See Also:
Constant Field Values
-
MOD_NONE
static final int MOD_NONE
- See Also:
Constant Field Values
-
MOD_NOT
static final int MOD_NOT
- See Also:
Constant Field Values
-
MOD_REQ
static final int MOD_REQ
- See Also:
Constant Field Values
-
AND_OPERATOR
public static final QueryParser.Operator AND_OPERATOR
Alternative form of QueryParser.Operator.AND
-
OR_OPERATOR
public static final QueryParser.Operator OR_OPERATOR
Alternative form of QueryParser.Operator.OR
-
operator
QueryParser.Operator operator
The actual operator that parser uses to combine query terms
-
multiTermRewriteMethod
MultiTermQuery.RewriteMethod multiTermRewriteMethod
-
allowLeadingWildcard
boolean allowLeadingWildcard
-
field
protected String field
-
phraseSlop
int phraseSlop
-
fuzzyMinSim
float fuzzyMinSim
-
fuzzyPrefixLength
int fuzzyPrefixLength
-
locale
Locale locale
-
timeZone
TimeZone timeZone
-
dateResolution
DateTools.Resolution dateResolution
-
fieldToDateResolution
Map<String,DateTools.Resolution> fieldToDateResolution
-
autoGeneratePhraseQueries
boolean autoGeneratePhraseQueries
-
determinizeWorkLimit
int determinizeWorkLimit
-
WILDCARD_PATTERN
private static final Pattern WILDCARD_PATTERN
-
-
Constructor Details
-
QueryParserBase
protected QueryParserBase()
-
-
Method Details
-
init
public void init(String f, Analyzer a)
Initializes a query parser. Called by the QueryParser constructor.
- Parameters:
f - the default field for query terms.
a - used to find terms in the query text.
-
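ReInit
public abstract void ReInit(CharStream stream)
-
TopLevelQuery
public abstract Query TopLevelQuery(String field) throws ParseException
- Throws:
ParseException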
-
parse
public Query parse(String query) throws ParseException
Parses a query string, returning a Query.
- Parameters:
query - the query string to be parsed.
- Throws:
ParseException - if the parsing fails
-
getField
public String getField()
- Returns:
- Returns the default field.
-
getAutoGeneratePhraseQueries
public final boolean getAutoGeneratePhraseQueries()
- See Also:
setAutoGeneratePhraseQueries(boolean)
-
setAutoGeneratePhraseQueries
public void setAutoGeneratePhraseQueries(boolean value)
Set to true if phrase queries will be automatically generated when the analyzer returns more than one term from whitespace-delimited text. NOTE: this behavior may not be suitable for all languages.
Set to false if phrase queries should only be generated when surrounded by double quotes.
-
getFuzzyMinSim
public float getFuzzyMinSim()
Get the minimal similarity for fuzzy queries.
- Specified by:
getFuzzyMinSim in interface CommonQueryParserConfiguration
-
setFuzzyMinSim
public void setFuzzyMinSim(float fuzzyMinSim)
Set the minimum similarity for fuzzy queries. Default is 2f.
- Specified by:
setFuzzyMinSim in interface CommonQueryParserConfiguration
-
getFuzzyPrefixLength
public int getFuzzyPrefixLength()
Get the prefix length for fuzzy queries.
- Specified by:
getFuzzyPrefixLength in interface CommonQueryParserConfiguration
- Returns:
- Returns the fuzzyPrefixLength.
-
setFuzzyPrefixLength
public void setFuzzyPrefixLength(int fuzzyPrefixLength)
Set the prefix length for fuzzy queries. Default is 0.
- Specified by:
setFuzzyPrefixLength in interface CommonQueryParserConfiguration
- Parameters:
fuzzyPrefixLength - The fuzzyPrefixLength to set.
-
setPhraseSlop
public void setPhraseSlop(int phraseSlop)
Sets the default slop for phrases. If zero, then exact phrase matches are required. Default value is zero.
- Specified by:
setPhraseSlop in interface CommonQueryParserConfiguration
-
getPhraseSlop
public int getPhraseSlop()
Gets the default slop for phrases.
- Specified by:
getPhraseSlop in interface CommonQueryParserConfiguration
-
setAllowLeadingWildcard
public void setAllowLeadingWildcard(boolean allowLeadingWildcard)
Set to true to allow leading wildcard characters.
When set, * or ? are allowed as the first character of a PrefixQuery and WildcardQuery. Note that this can produce very slow queries on big indexes.
Default: false.
- Specified by:
setAllowLeadingWildcard in interface CommonQueryParserConfiguration
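The effect of this flag can be pictured with a small stand-alone sketch. It is only an illustration of the rule described above, not the library's code: the class LeadingWildcardSketch and the method checkTerm are hypothetical names, and IllegalArgumentException stands in for the parser's ParseException so that the example compiles without Lucene.

```java
/** Hypothetical stand-in for the leading-wildcard check; not the Lucene source. */
final class LeadingWildcardSketch {
    // Mirrors the documented default: leading wildcards are rejected.
    private boolean allowLeadingWildcard = false;

    void setAllowLeadingWildcard(boolean allowLeadingWildcard) {
        this.allowLeadingWildcard = allowLeadingWildcard;
    }

    /** Returns the term unchanged, or throws if it starts with * or ? while the flag is off. */
    String checkTerm(String termStr) {
        boolean leading = termStr.startsWith("*") || termStr.startsWith("?");
        if (leading && !allowLeadingWildcard) {
            // Assumption: the real parser signals this with a ParseException.
            throw new IllegalArgumentException(
                "'*' or '?' not allowed as first character: " + termStr);
        }
        return termStr;
    }

    public static void main(String[] args) {
        LeadingWildcardSketch sketch = new LeadingWildcardSketch();
        System.out.println(sketch.checkTerm("foo*"));  // trailing wildcard is always fine
        sketch.setAllowLeadingWildcard(true);
        System.out.println(sketch.checkTerm("*foo"));  // accepted once the flag is set
    }
}
```

Keeping the flag off by default matches the note above: a term such as *foo cannot use the sorted term dictionary to narrow the search, so it may be slow on large indexes.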
-
getAllowLeadingWildcard
public boolean getAllowLeadingWildcard()
- Specified by:
getAllowLeadingWildcard in interface CommonQueryParserConfiguration
- See Also:
setAllowLeadingWildcard(boolean)
-
setDefaultOperator
public void setDefaultOperator(QueryParser.Operator op)
Sets the boolean operator of the QueryParser. In default mode (OR_OPERATOR) terms without any modifiers are considered optional: for example capital of Hungary is equal to capital OR of OR Hungary.
In AND_OPERATOR mode terms are considered to be in conjunction: the above mentioned query is parsed as capital AND of AND Hungary
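The equivalence stated above can be sketched without the library. The class DefaultOperatorSketch, its Operator enum and the method explicitForm are hypothetical names introduced for this example, and the sketch assumes a query made only of plain whitespace-separated terms (no quotes, modifiers or explicit operators).

```java
/** Toy illustration of the implicit operator; handles plain whitespace-separated terms only. */
final class DefaultOperatorSketch {
    // Stand-in for QueryParser.Operator (assumption: two values, OR and AND).
    enum Operator { OR, AND }

    /** Spells out the operator that the parser would apply implicitly between terms. */
    static String explicitForm(String query, Operator defaultOperator) {
        String[] terms = query.trim().split("\\s+");
        return String.join(" " + defaultOperator.name() + " ", terms);
    }

    public static void main(String[] args) {
        System.out.println(explicitForm("capital of Hungary", Operator.OR));
        System.out.println(explicitForm("capital of Hungary", Operator.AND));
    }
}
```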
-
getDefaultOperator
public QueryParser.Operator getDefaultOperator()
Gets implicit operator setting, which will be either AND_OPERATOR or OR_OPERATOR.
-
setMultiTermRewriteMethod
public void setMultiTermRewriteMethod(MultiTermQuery.RewriteMethod method)
By default QueryParser uses MultiTermQuery.CONSTANT_SCORE_REWRITE when creating a PrefixQuery, WildcardQuery or TermRangeQuery. This implementation is generally preferable because it a) runs faster, b) does not have the scarcity of terms unduly influence score, and c) avoids any IndexSearcher.TooManyClauses exception. However, if your application really needs to use the old-fashioned BooleanQuery expansion rewriting and the above points are not relevant, then use this to change the rewrite method.
- Specified by:
setMultiTermRewriteMethod in interface CommonQueryParserConfiguration
-
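getMultiTermRewriteMethod
public MultiTermQuery.RewriteMethod getMultiTermRewriteMethod()
- Specified by:
getMultiTermRewriteMethod in interface CommonQueryParserConfiguration
- See Also:
setMultiTermRewriteMethod(MultiTermQuery.RewriteMethod)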
-
setLocale
public void setLocale(Locale locale)
Set locale used by date range parsing, lowercasing, and other locale-sensitive operations.
- Specified by:
setLocale in interface CommonQueryParserConfiguration
-
getLocale
public Locale getLocale()
Returns current locale, allowing access by subclasses.
- Specified by:
getLocale in interface CommonQueryParserConfiguration
-
setTimeZone
public void setTimeZone(TimeZone timeZone)
- Specified by:
setTimeZone in interface CommonQueryParserConfiguration
-
getTimeZone
public TimeZone getTimeZone()
- Specified by:
getTimeZone in interface CommonQueryParserConfiguration
-
setDateResolution
public void setDateResolution(DateTools.Resolution dateResolution)
Sets the default date resolution used by RangeQueries for fields for which no specific date resolution has been set. Field-specific resolutions can be set with setDateResolution(String, org.apache.lucene.document.DateTools.Resolution).
- Specified by:
setDateResolution in interface CommonQueryParserConfiguration
- Parameters:
dateResolution - the default date resolution to set
-
setDateResolution
public void setDateResolution(String fieldName, DateTools.Resolution dateResolution)
Sets the date resolution used by RangeQueries for a specific field.
- Parameters:
fieldName - field for which the date resolution is to be set
dateResolution - date resolution to set
-
getDateResolution
public DateTools.Resolution getDateResolution(String fieldName)
Returns the date resolution that is used by RangeQueries for the given field. Returns null if no default or field-specific date resolution has been set for the given field.
-
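The lookup order implied by the two setDateResolution overloads and by getDateResolution (field-specific value first, then the default, then null) can be sketched as follows. This is a minimal stand-alone illustration, not the library's implementation: DateResolutionSketch is a hypothetical class and its nested Resolution enum is a stand-in for DateTools.Resolution.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical model of the default / per-field date resolution lookup. */
final class DateResolutionSketch {
    // Stand-in for DateTools.Resolution; the constants are illustrative.
    enum Resolution { YEAR, MONTH, DAY, HOUR, MINUTE, SECOND, MILLISECOND }

    private Resolution dateResolution;  // default, initially unset (null)
    private final Map<String, Resolution> fieldToDateResolution = new HashMap<>();

    /** Sets the default used for fields without a specific resolution. */
    void setDateResolution(Resolution resolution) {
        this.dateResolution = resolution;
    }

    /** Sets the resolution for one field. */
    void setDateResolution(String fieldName, Resolution resolution) {
        fieldToDateResolution.put(fieldName, resolution);
    }

    /** Field-specific value if present, otherwise the default, otherwise null. */
    Resolution getDateResolution(String fieldName) {
        Resolution specific = fieldToDateResolution.get(fieldName);
        return specific != null ? specific : dateResolution;
    }

    public static void main(String[] args) {
        DateResolutionSketch sketch = new DateResolutionSketch();
        sketch.setDateResolution(Resolution.DAY);
        sketch.setDateResolution("timestamp", Resolution.MINUTE);
        System.out.println(sketch.getDateResolution("timestamp"));
        System.out.println(sketch.getDateResolution("published"));
    }
}
```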
setDeterminizeWorkLimit
public void setDeterminizeWorkLimit(int determinizeWorkLimit)
- Parameters:
determinizeWorkLimit - the maximum effort that determinizing a regexp query can spend. If the query requires more effort, a TooComplexToDeterminizeException is thrown.
-
getDeterminizeWorkLimit
public int getDeterminizeWorkLimit()
- Returns:
- the maximum effort that determinizing a regexp query can spend. If the query requires more effort, a TooComplexToDeterminizeException is thrown.
-
addClause
protected void addClause(List<BooleanClause> clauses, int conj, int mods, Query q)
-
addMultiTermClauses
protected void addMultiTermClauses(List<BooleanClause> clauses, Query q)
Adds clauses generated from analysis over text containing whitespace. There are no operators, so the query's clauses can either be MUST (if the default operator is AND) or SHOULD (default OR).
If all of the clauses in the given Query are TermQuery-s, this method flattens the result by adding the TermQuery-s individually to the output clause list; otherwise, the given Query is added as a single clause including its nested clauses.
-
getFieldQuery
protected Query getFieldQuery(String field, String queryText, boolean quoted) throws ParseException
- Throws:
ParseException - throw in overridden method to disallow
-
newFieldQuery
protected Query newFieldQuery(Analyzer analyzer, String field, String queryText, boolean quoted) throws ParseException
- Throws:
ParseException - throw in overridden method to disallow
-
getFieldQuery
protected Query getFieldQuery(String field, String queryText, int slop) throws ParseException
Base implementation delegates to getFieldQuery(String,String,boolean). This method may be overridden, for example, to return a SpanNearQuery instead of a PhraseQuery.
- Throws:
ParseException - throw in overridden method to disallow
-
addSlopToPhrase
private PhraseQuery addSlopToPhrase(PhraseQuery query, int slop)
Rebuild a phrase query with a slop value
-
getRangeQuery
protected Query getRangeQuery(String field, String part1, String part2, boolean startInclusive, boolean endInclusive) throws ParseException
- Throws:
ParseException
-
newBooleanClause
protected BooleanClause newBooleanClause(Query q, BooleanClause.Occur occur)
Builds a new BooleanClause instance
- Parameters:
q - sub query
occur - how this clause should occur when matching documents
- Returns:
- new BooleanClause instance
-
newPrefixQuery
protected Query newPrefixQuery(Term prefix)
Builds a new PrefixQuery instance
- Parameters:
prefix - Prefix term
- Returns:
- new PrefixQuery instance
-
newRegexpQuery
protected Query newRegexpQuery(Term regexp)
Builds a new RegexpQuery instance
- Parameters:
regexp - Regexp term
- Returns:
- new RegexpQuery instance
-
newFuzzyQuery
protected Query newFuzzyQuery(Term term, float minimumSimilarity, int prefixLength)
Builds a new FuzzyQuery instance
- Parameters:
term - Term
minimumSimilarity - minimum similarity
prefixLength - prefix length
- Returns:
- new FuzzyQuery instance
-
newRangeQuery
protected Query newRangeQuery(String field, String part1, String part2, boolean startInclusive, boolean endInclusive)
Builds a new TermRangeQuery instance
- Parameters:
field - Field
part1 - min
part2 - max
startInclusive - true if the start of the range is inclusive
endInclusive - true if the end of the range is inclusive
- Returns:
- new TermRangeQuery instance
-
newMatchAllDocsQuery
protected Query newMatchAllDocsQuery()
Builds a new MatchAllDocsQuery instance
- Returns:
- new MatchAllDocsQuery instance
-
newWildcardQuery
protected Query newWildcardQuery(Term t)
Builds a new WildcardQuery instance
- Parameters:
t - wildcard term
- Returns:
- new WildcardQuery instance
-
getBooleanQuery
protected Query getBooleanQuery(List<BooleanClause> clauses) throws ParseException
Factory method for generating query, given a set of clauses. By default creates a boolean query composed of clauses passed in.
Can be overridden by extending classes, to modify query being returned.
- Parameters:
clauses - List that contains BooleanClause instances to join.
- Returns:
- Resulting Query object.
- Throws:
ParseException - throw in overridden method to disallow
-
getWildcardQuery
protected Query getWildcardQuery(String field, String termStr) throws ParseException
Factory method for generating a query. Called when parser parses an input term token that contains one or more wildcard characters (? and *), but is not a prefix term token (one that has just a single * character at the end).
Depending on settings, prefix term may be lower-cased automatically. It will not go through the default Analyzer, however, since normal Analyzers are unlikely to work properly with wildcard templates.
Can be overridden by extending classes, to provide custom handling for wildcard queries, which may be necessary due to missing analyzer calls.
- Parameters:
field - Name of the field query will use.
termStr - Term token that contains one or more wildcard characters (? or *), but is not simple prefix term
- Returns:
- Resulting Query built for the term
- Throws:
ParseException - throw in overridden method to disallow
-
analyzeWildcard
private BytesRef analyzeWildcard(String field, String termStr)
-
getRegexpQuery
protected Query getRegexpQuery(String field, String termStr) throws ParseException
Factory method for generating a query. Called when parser parses an input term token that contains a regular expression query.
Depending on settings, pattern term may be lower-cased automatically. It will not go through the default Analyzer, however, since normal Analyzers are unlikely to work properly with regular expression templates.
Can be overridden by extending classes, to provide custom handling for regular expression queries, which may be necessary due to missing analyzer calls.
- Parameters:
field - Name of the field query will use.
termStr - Term token that contains a regular expression
- Returns:
- Resulting Query built for the term
- Throws:
ParseException - throw in overridden method to disallow
-
getPrefixQuery
protected Query getPrefixQuery(String field, String termStr) throws ParseException
Factory method for generating a query (similar to getWildcardQuery(java.lang.String, java.lang.String)). Called when parser parses an input term token that uses prefix notation; that is, contains a single '*' wildcard character as its last character. Since this is a special case of generic wildcard term, and such a query can be optimized easily, this usually results in a different query object.
Depending on settings, a prefix term may be lower-cased automatically. It will not go through the default Analyzer, however, since normal Analyzers are unlikely to work properly with wildcard templates.
Can be overridden by extending classes, to provide custom handling for wildcard queries, which may be necessary due to missing analyzer calls.
- Parameters:
field - Name of the field query will use.
termStr - Term token to use for building term for the query (without trailing '*' character!)
- Returns:
- Resulting Query built for the term
- Throws:
ParseException - throw in overridden method to disallow
-
getFuzzyQuery
protected Query getFuzzyQuery(String field, String termStr, float minSimilarity) throws ParseException
Factory method for generating a query (similar to getWildcardQuery(java.lang.String, java.lang.String)). Called when parser parses an input term token that has the fuzzy suffix (~) appended.
- Parameters:
field - Name of the field query will use.
termStr - Term token to use for building term for the query
- Returns:
- Resulting Query built for the term
- Throws:
ParseException - throw in overridden method to disallow
-
handleBareTokenQuery
Query handleBareTokenQuery(String qfield, Token term, Token fuzzySlop, boolean prefix, boolean wildcard, boolean fuzzy, boolean regexp) throws ParseException
- Throws:
ParseException
-
getFuzzyDistance
protected float getFuzzyDistance(Token fuzzyToken, String termStr)
Determines the similarity distance for the given fuzzy token and term string.
The default implementation uses the string image of the fuzzyToken in an attempt to parse it to a primitive float value. Otherwise, the minimal similarity distance is returned. Subclasses can override this method to return a similarity distance, say based on the termStr, if the fuzzyToken does not specify a distance.
- Parameters:
fuzzyToken - The Fuzzy token
termStr - The Term string
- Returns:
- The similarity distance
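The default behaviour described above (try to parse the token image as a float, otherwise fall back to the minimal similarity) might look roughly like this. The sketch is an assumption-laden illustration rather than the library's code: FuzzyDistanceSketch and fuzzyDistance are hypothetical names, the token is reduced to its string image, and the image is assumed to begin with the ~ character so that the number starts at index 1.

```java
/** Hypothetical sketch of the "parse the image, else fall back" rule. */
final class FuzzyDistanceSketch {
    /**
     * @param fuzzyImage  string image of the fuzzy token, assumed to look like "~0.8" or "~"
     * @param fuzzyMinSim fallback returned when the image carries no parsable number
     */
    static float fuzzyDistance(String fuzzyImage, float fuzzyMinSim) {
        try {
            // Skip the leading '~' and parse the remainder as a primitive float.
            return Float.parseFloat(fuzzyImage.substring(1));
        } catch (NumberFormatException | IndexOutOfBoundsException e) {
            // No usable distance in the token: use the configured minimum.
            return fuzzyMinSim;
        }
    }

    public static void main(String[] args) {
        System.out.println(fuzzyDistance("~0.8", 2f));  // distance given in the query
        System.out.println(fuzzyDistance("~", 2f));     // bare tilde, falls back
    }
}
```

A subclass that wants a term-dependent distance would replace only the fallback branch, for example by choosing a smaller value for short terms.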
-
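handleBareFuzzy
Query handleBareFuzzy(String qfield, Token fuzzySlop, String termImage) throws ParseException
- Throws:
ParseException
-
handleQuotedTerm
Query handleQuotedTerm(String qfield, Token term, Token fuzzySlop) throws ParseException
- Throws:
ParseException
-
handleBoost
Query handleBoost(Query q, Token boost)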
-
discardEscapeChar
String discardEscapeChar(String input) throws ParseException
Returns a String where the escape char has been removed, or kept only once if there was a double escape.
Supports escaped unicode characters, e.g. translates \\u0041 to A.
- Throws:
ParseException
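The unescaping rule can be illustrated with a short stand-alone sketch. It is a simplified reading of the description above and not the library's implementation: UnescapeSketch is a hypothetical class, IllegalArgumentException replaces ParseException so the example needs no Lucene classes, and a unicode escape is assumed to be a backslash, the letter u, and exactly four hexadecimal digits.

```java
/** Hypothetical sketch: drops escape characters and decodes four-digit unicode escapes. */
final class UnescapeSketch {
    static String discardEscapeChar(String input) {
        StringBuilder out = new StringBuilder(input.length());
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (c != '\\') {           // ordinary character: copy as is
                out.append(c);
                i++;
                continue;
            }
            if (i + 1 >= input.length()) {
                throw new IllegalArgumentException("Term can not end with escape character.");
            }
            char next = input.charAt(i + 1);
            if (next == 'u') {         // unicode escape: four hex digits follow
                if (i + 6 > input.length()) {
                    throw new IllegalArgumentException("Truncated unicode escape sequence.");
                }
                int code = 0;
                for (int k = i + 2; k < i + 6; k++) {
                    int digit = Character.digit(input.charAt(k), 16);
                    if (digit < 0) {
                        throw new IllegalArgumentException("Non-hex character in unicode escape sequence.");
                    }
                    code = code * 16 + digit;
                }
                out.append((char) code);
                i += 6;
            } else {                   // escaped character, including a doubled backslash
                out.append(next);
                i += 2;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(discardEscapeChar("a\\+b"));     // escape removed
        System.out.println(discardEscapeChar("\\u0041"));   // unicode escape decoded
    }
}
```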
-
hexToInt
static final int hexToInt(char c) throws ParseException
Returns the numeric value of the hexadecimal character
- Throws:
ParseException
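A conversion of this kind can be sketched in a few lines. HexSketch is a hypothetical class written for this example; it throws IllegalArgumentException where the documented method throws ParseException, purely so the sketch compiles on its own.

```java
/** Hypothetical sketch of mapping one hexadecimal character to its numeric value. */
final class HexSketch {
    static int hexToInt(char c) {
        if ('0' <= c && c <= '9') {
            return c - '0';
        } else if ('a' <= c && c <= 'f') {
            return c - 'a' + 10;
        } else if ('A' <= c && c <= 'F') {
            return c - 'A' + 10;
        }
        // Assumption: the real method reports this case with a ParseException.
        throw new IllegalArgumentException("Non-hex character: " + c);
    }

    public static void main(String[] args) {
        System.out.println(hexToInt('7'));
        System.out.println(hexToInt('f'));
    }
}
```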
-
escape
public static String escape(String s)
Returns a String where those characters that QueryParser expects to be escaped are escaped by a preceding \.
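Escaping is typically applied to user-supplied text before it is handed to parse, so that syntax characters are treated literally. The sketch below shows the general technique only. EscapeSketch is a hypothetical class, and the set of special characters in SPECIAL is an assumption based on the classic query syntax, not something this page specifies; consult the query syntax documentation for the authoritative list.

```java
/** Hypothetical sketch: prefix each query-syntax character with a backslash. */
final class EscapeSketch {
    // Assumed set of characters with special meaning in the query syntax.
    private static final String SPECIAL = "\\+-!():^[]\"{}~*?|&/";

    static String escape(String s) {
        StringBuilder sb = new StringBuilder(s.length() + 8);
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (SPECIAL.indexOf(c) >= 0) {
                sb.append('\\');  // escape character goes in front of the special one
            }
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(escape("(1+1):2"));
        System.out.println(escape("plain text"));
    }
}
```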
-