Class TokenInfoDictionaryBuilder
java.lang.Object
    org.apache.lucene.analysis.ja.util.TokenInfoDictionaryBuilder
class TokenInfoDictionaryBuilder extends java.lang.Object
Field Summary
Fields
Modifier and Type                             Field        Description
private java.lang.String                      encoding
private DictionaryBuilder.DictionaryFormat    format
private java.text.Normalizer.Form             normalForm
private int                                   offset       Internal word id, incrementally assigned as entries are read and added.
Constructor Summary
Constructors
Constructor    Description
TokenInfoDictionaryBuilder(DictionaryBuilder.DictionaryFormat format, java.lang.String encoding, boolean normalizeEntries)
Method Summary
Methods
Modifier and Type                    Method                                                          Description
TokenInfoDictionaryWriter            build(java.nio.file.Path dir)
private TokenInfoDictionaryWriter    buildDictionary(java.util.List<java.nio.file.Path> csvFiles)
private java.lang.String[]           formatEntry(java.lang.String[] features)
Field Detail
encoding
private final java.lang.String encoding
normalForm
private final java.text.Normalizer.Form normalForm
format
private final DictionaryBuilder.DictionaryFormat format
offset
private int offset
Internal word id, incrementally assigned as entries are read and added. This is the byte offset within the dictionary file.
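The idea of an id that doubles as a byte offset can be illustrated with a minimal, self-contained sketch. This is not the Lucene implementation: the class OffsetIdSketch, its add method, and the use of UTF-8 bytes as the stored form are all hypothetical, chosen only to show how a running offset can serve as an incrementally assigned id.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.List;

// Hypothetical illustration only; names and storage format are assumptions.
final class OffsetIdSketch {
    // Running byte position; also the id of the next entry to be added.
    private int offset = 0;
    private final ByteArrayOutputStream buffer = new ByteArrayOutputStream();

    /** Appends one entry and returns the id (its starting byte offset). */
    int add(String entry) {
        int wordId = offset;
        byte[] bytes = entry.getBytes(StandardCharsets.UTF_8);
        buffer.write(bytes, 0, bytes.length);
        // Advance by the number of bytes written so the next id points past this entry.
        offset += bytes.length;
        return wordId;
    }

    /** Total number of bytes written so far. */
    int size() {
        return buffer.size();
    }

    public static void main(String[] args) {
        OffsetIdSketch sketch = new OffsetIdSketch();
        for (String entry : List.of("tokyo", "osaka", "kyoto")) {
            System.out.println(entry + " -> id " + sketch.add(entry));
        }
    }
}
```

Because each id is the position at which the entry starts, an entry can later be located by seeking directly to its id, with no separate index from id to position.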
Constructor Detail
TokenInfoDictionaryBuilder
public TokenInfoDictionaryBuilder(DictionaryBuilder.DictionaryFormat format, java.lang.String encoding, boolean normalizeEntries)
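The constructor takes a boolean normalizeEntries while the class stores a java.text.Normalizer.Form in normalForm. The page does not say how one maps to the other. The following sketch assumes NFKC when normalization is requested and null (no normalization) otherwise; the class NormalizeFlagSketch and its methods formFor and apply are hypothetical names used only for illustration.

```java
import java.text.Normalizer;

// Hypothetical illustration only; the flag-to-form mapping is an assumption.
final class NormalizeFlagSketch {
    /** Assumed mapping: NFKC when normalization is requested, otherwise null. */
    static Normalizer.Form formFor(boolean normalizeEntries) {
        return normalizeEntries ? Normalizer.Form.NFKC : null;
    }

    /** Normalizes a line with the given form, or returns it unchanged when form is null. */
    static String apply(String line, Normalizer.Form form) {
        return form == null ? line : Normalizer.normalize(line, form);
    }

    public static void main(String[] args) {
        // Full-width Latin letters A, B, C (U+FF21..U+FF23).
        String fullWidth = "\uFF21\uFF22\uFF23";
        System.out.println(apply(fullWidth, formFor(true)));
        System.out.println(apply(fullWidth, formFor(false)).equals(fullWidth));
    }
}
```

Under this assumption, full-width and half-width variants in dictionary source lines collapse to a single compatibility form before entries are stored.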
Method Detail
build
public TokenInfoDictionaryWriter build(java.nio.file.Path dir) throws java.io.IOException
Throws:
java.io.IOException
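Since build accepts a directory and buildDictionary accepts a list of CSV paths, one plausible reading is that build gathers the CSV files found in dir before delegating. The page does not confirm this. The sketch below shows only that gathering step with standard java.nio.file calls; the class CsvListingSketch, the method listCsvFiles, the ".csv" suffix filter, and the sort order are assumptions.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical illustration only; not taken from the Lucene sources.
final class CsvListingSketch {
    /** Returns the files directly inside dir whose names end in ".csv", sorted by path. */
    static List<Path> listCsvFiles(Path dir) throws IOException {
        // Files.list keeps the directory open, so close the stream with try-with-resources.
        try (Stream<Path> entries = Files.list(dir)) {
            return entries
                .filter(p -> p.getFileName().toString().endsWith(".csv"))
                .sorted()
                .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("dict-src");
        Files.createFile(dir.resolve("Noun.csv"));
        Files.createFile(dir.resolve("Adverb.csv"));
        Files.createFile(dir.resolve("README.txt"));
        for (Path csv : listCsvFiles(dir)) {
            System.out.println(csv.getFileName());
        }
    }
}
```

Sorting the paths makes the order in which files are read deterministic, which matters when ids are assigned incrementally as entries are read.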
buildDictionary
private TokenInfoDictionaryWriter buildDictionary(java.util.List<java.nio.file.Path> csvFiles) throws java.io.IOException
Throws:
java.io.IOException
formatEntry
private java.lang.String[] formatEntry(java.lang.String[] features)