Uses of Package org.apache.lucene.util

Packages that use org.apache.lucene.util
Text analysis.
Analyzer for Arabic.
Analyzer for Bulgarian.
Analyzer for Bengali Language.
Provides various convenience classes for creating boosts on Tokens.
Analyzer for Brazilian Portuguese.
Normalization of text before the tokenizer.
Analyzer for Chinese, Japanese, and Korean, which indexes bigrams.
Analyzer for Sorani Kurdish.
Fast, general-purpose grammar-based tokenizers.
Analyzer for Simplified Chinese, which indexes words.
SmartChineseAnalyzer Hidden Markov Model package.
Construct n-grams for frequently occurring terms and phrases.
A filter that decomposes compound words found in many Germanic languages into their component parts.
Basic, general-purpose analysis components.
A general-purpose Analyzer that can be created with a builder-style API (see the CustomAnalyzer sketch after this list).
Analyzer for Czech.
Analyzer for German.
Analyzer for Greek.
Fast, general-purpose tokenizers for URLs and email addresses.
Analyzer for English.
Analyzer for Spanish.
Analyzer for Persian.
Analyzer for Finnish.
Analyzer for French.
Analyzer for Irish.
Analyzer for Galician.
Analyzer for Hindi.
Analyzer for Hungarian.
A Java implementation of Hunspell stemming and spell-checking algorithms (Hunspell), and a stemming TokenFilter (HunspellStemFilter) based on it.
Analysis components based on ICU.
Tokenizer that breaks text into words with the Unicode Text Segmentation algorithm.
Additional ICU-specific Attributes for text analysis.
Analyzer for Indonesian.
Analyzer for Indian languages.
Analyzer for Italian.
Analyzer for Japanese.
Kuromoji dictionary implementation.
Additional Kuromoji-specific Attributes for text analysis.
Analyzer for Korean.
Korean dictionary implementation.
Additional Korean-specific Attributes for text analysis.
Analyzer for Latvian.
MinHash filtering (for LSH).
Miscellaneous TokenStreams.
Character n-gram tokenizers and filters.
Analyzer for Norwegian.
Analysis components for path-like strings such as filenames.
Set of components for pattern-based (regex) analysis.
Provides various convenience classes for creating payloads on Tokens.
Analysis components for phonetic search.
Analyzer for Portuguese.
Filter to reverse token text.
Analyzer for Russian.
Word n-gram filters.
TokenFilter and Analyzer implementations that use a modified version of Snowball stemmers.
Analyzer for Serbian.
Fast, general-purpose grammar-based tokenizer. StandardTokenizer implements the Word Break rules from the Unicode Text Segmentation algorithm, as specified in Unicode Standard Annex #29.
Stempel: algorithmic stemmer.
Analyzer for Swedish.
Analysis components for Synonyms.
Analyzer for Telugu Language.
Analyzer for Thai.
General-purpose attributes for text analysis.
Analyzer for Turkish.
Utility functions for text analysis.
Tokenizer that is aware of Wikipedia syntax.
Compressing helper classes.
BlockTree terms dictionary.
Lucene 5.0 file format.
Lucene 5.0 compressing format.
Components from the Lucene 7.0 index format.
Components from the Lucene 8.0 index format.
Lucene 8.4 file format.
Lucene 8.6 file format.
Lucene 8.7 file format.
Lucene 9.0 file format.
Lucene 9.1 file format.
Lucene 9.2 file format.
Lucene 9.4 file format.
Legacy PackedInts methods.
Uses already seen data (the indexed documents) to classify an input (which can be simple text or a structured document).
Uses already seen data (the indexed documents) to classify new documents.
Utilities for evaluation, data preparation, etc.
Codecs API: API for customization of the encoding and structure of the index.
Pluggable term index / block terms dictionary implementations.
Same postings format as Lucene50, except the terms dictionary also supports ords, i.e. retrieving a term by its ordinal and looking up the ordinal of a term.
Codec PostingsFormat for fast access to low-frequency terms such as primary key fields.
Compressing helper classes.
Lucene 9.0 file format.
BlockTree terms dictionary.
Lucene 9.0 compressing format.
Lucene 9.5 file format.
Term dictionary, DocValues or Postings formats that are read entirely into memory.
Postings format that can delegate to different formats per-field.
Simpletext Codec: writes human readable postings.
Pluggable term index / block terms dictionary implementations.
Pluggable term index / block terms dictionary implementations.
Unicode collation support.
Custom AttributeImpl for indexing collation keys as index terms.
The logical representation of a Document for indexing and searching.
Code to maintain and access indices.
High-performance single-document main-memory Apache Lucene fulltext search index (see the MemoryIndex sketch after this list).
Miscellaneous Lucene utilities that don't really fit anywhere else.
Misc extensions of the Document/Field API.
Misc index tools and index support.
Misc search implementations.
Misc Directory implementations.
Misc FST classes.
Monitoring framework.
Queries that compute score based upon a function.
FunctionValues for different data types.
A variety of functions to use with FunctionQuery.
Interval queries.
Document similarity query generators.
The payloads package provides Query mechanisms for finding and using payloads.
The calculus of spans.
A simple query parser implemented with JavaCC.
QueryParser which permits complex phrase query syntax, e.g. "(john jon jonathan~) peters*".
Extendable QueryParser provides a simple and flexible extension mechanism by overloading query field names.
Standard Lucene Query Nodes.
A simple query parser for human-entered queries.
This package contains SrndQuery and its subclasses.
A primary-key postings format that associates a version (long) with each term and can provide fail-fast lookups by ID and version.
This package contains several point types: BigIntegerPoint for 128-bit integers and LatLonPoint for latitude/longitude geospatial points.
Experimental index-related classes.
Additional queries (some may have caveats or limitations).
This package contains a flexible graph-based proximity query, TermAutomatonQuery, and geospatial queries.
Code to search indices.
Comparators, used to compare hits so as to determine their sort order when collecting the top results with TopFieldCollector.
Grouping.
Highlighting search terms.
Support for index-time and query-time joins.
This package contains several components useful to build a highlighter on top of the Matches API.
Suggest alternate spellings for words.
Support for autocomplete/autosuggest.
Analyzer-based autosuggest.
Support for document suggestion.
Finite-state-based autosuggest.
Ternary search tree-based autosuggest.
The UnifiedHighlighter: a flexible highlighter that can get offsets from postings, term vectors, or analysis.
Lucene field & query support for the spatial geometry implemented in org.apache.lucene.spatial3d.geom.
Binary i/o API, used for all index data.
Some utility classes, such as BytesRef and FixedBitSet (see the sketch after this list).
Finite-state automaton for regular expressions.
Block KD-tree, implementing the generic spatial data structure described in the Bkd-Tree paper.
Finite state transducers.
Utility classes for working with token streams as graphs.
Navigable small-world graph, nominally hierarchical but currently using only a single layer.
Comparable object wrappers.
Packed integer arrays and streams.
Egothor stemmer API.
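
The utility-class entry above points at org.apache.lucene.util itself. As a minimal sketch of two of its most widely used classes, the snippet below exercises FixedBitSet and BytesRef; the class name UtilDemo and the literal values are illustrative only, and it assumes the lucene-core jar is on the classpath.

    import org.apache.lucene.util.BytesRef;
    import org.apache.lucene.util.FixedBitSet;

    public class UtilDemo {
      public static void main(String[] args) {
        // FixedBitSet: a fixed-size bit set, used throughout Lucene for document-id sets.
        FixedBitSet bits = new FixedBitSet(128);
        bits.set(7);
        bits.set(42);
        System.out.println(bits.get(42));        // true
        System.out.println(bits.cardinality());  // 2

        // BytesRef: a slice of a byte[], Lucene's standard representation of a term.
        BytesRef term = new BytesRef("lucene");
        System.out.println(term.utf8ToString()); // lucene
      }
    }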
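
For the builder-style Analyzer entry above, the following is a hedged sketch of how an analysis chain can be assembled with CustomAnalyzer and read back through the attribute API. The factory names "standard", "lowercase" and "stop" are the SPI names provided by the analysis-common module; the field name and sample text are arbitrary.

    import java.io.IOException;
    import org.apache.lucene.analysis.Analyzer;
    import org.apache.lucene.analysis.TokenStream;
    import org.apache.lucene.analysis.custom.CustomAnalyzer;
    import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;

    public class CustomAnalyzerDemo {
      public static void main(String[] args) throws IOException {
        // Assemble a chain by SPI factory name: standard tokenizer,
        // then lowercasing and stop-word removal.
        Analyzer analyzer = CustomAnalyzer.builder()
            .withTokenizer("standard")
            .addTokenFilter("lowercase")
            .addTokenFilter("stop")
            .build();

        // Consume the produced tokens through the attribute API.
        try (TokenStream ts = analyzer.tokenStream("field", "The Quick Brown Fox")) {
          CharTermAttribute term = ts.addAttribute(CharTermAttribute.class);
          ts.reset();
          while (ts.incrementToken()) {
            System.out.println(term.toString()); // quick, brown, fox
          }
          ts.end();
        }
      }
    }

Because lowercasing runs before stop-word removal, the leading "The" is dropped and the remaining tokens come out lowercased.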
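
For the single-document main-memory index entry above, this sketch shows the basic MemoryIndex round trip: add one analyzed field, then score a query against it. The field name "body", the sample text, and the choice of StandardAnalyzer are illustrative assumptions; the classes come from the lucene-memory and lucene-core modules.

    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.index.Term;
    import org.apache.lucene.index.memory.MemoryIndex;
    import org.apache.lucene.search.TermQuery;

    public class MemoryIndexDemo {
      public static void main(String[] args) {
        // Index a single in-memory document with one analyzed field
        // (field name and text are arbitrary sample values).
        MemoryIndex index = new MemoryIndex();
        index.addField("body", "Lucene is a search library", new StandardAnalyzer());

        // search() returns a relevance score greater than 0 when the query matches.
        float score = index.search(new TermQuery(new Term("body", "lucene")));
        System.out.println(score > 0 ? "match" : "no match");
      }
    }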