Class BinaryDictionary
- java.lang.Object
-
- org.apache.lucene.analysis.ko.dict.BinaryDictionary
-
- All Implemented Interfaces:
Dictionary
- Direct Known Subclasses:
TokenInfoDictionary, UnknownDictionary
public abstract class BinaryDictionary extends java.lang.Object implements Dictionary
Base class for a binary-encoded in-memory dictionary.
-
-
Nested Class Summary
Nested Classes (modifier and type, class, description)
static class BinaryDictionary.ResourceScheme
  Used to specify where (dictionary) resources get loaded from.
Nested classes/interfaces inherited from interface org.apache.lucene.analysis.ko.dict.Dictionary
Dictionary.Morpheme
-
-
Field Summary
Fields (modifier and type, field, description)
private java.nio.ByteBuffer buffer
static java.lang.String DICT_FILENAME_SUFFIX
static java.lang.String DICT_HEADER
static int HAS_READING
  Flag indicating that the entry has reading data.
static int HAS_SINGLE_POS
  Flag indicating that the entry has a single part of speech (leftPOS).
private POS.Tag[] posDict
static java.lang.String POSDICT_FILENAME_SUFFIX
static java.lang.String POSDICT_HEADER
private java.lang.String resourcePath
private BinaryDictionary.ResourceScheme resourceScheme
private int[] targetMap
static java.lang.String TARGETMAP_FILENAME_SUFFIX
static java.lang.String TARGETMAP_HEADER
private int[] targetMapOffsets
static int VERSION
-
Constructor Summary
Constructors (modifier, constructor, description)
protected BinaryDictionary()
protected BinaryDictionary(BinaryDictionary.ResourceScheme resourceScheme, java.lang.String resourcePath)
-
Method Summary
Methods (modifier and type, method, description)
static java.io.InputStream getClassResource(java.lang.Class<?> clazz, java.lang.String suffix)
private java.io.InputStream getClassResource(java.lang.String path)
int getLeftId(int wordId)
  Get the left id of the specified word.
POS.Tag getLeftPOS(int wordId)
  Get the left POS.Tag of the specified word.
Dictionary.Morpheme[] getMorphemes(int wordId, char[] surfaceForm, int off, int len)
  Get the morphemes of the specified word (e.g. 가깝으나: 가깝 + 으나).
POS.Type getPOSType(int wordId)
  Get the POS.Type of the specified word (morpheme, compound, inflect or pre-analysis).
java.lang.String getReading(int wordId)
  Get the reading of the specified word (mainly used for Hanja to Hangul conversion).
protected java.io.InputStream getResource(java.lang.String suffix)
int getRightId(int wordId)
  Get the right id of the specified word.
POS.Tag getRightPOS(int wordId)
  Get the right POS.Tag of the specified word.
int getWordCost(int wordId)
  Get the word cost of the specified word.
private boolean hasReadingData(int wordId)
private boolean hasSinglePOS(int wordId)
void lookupWordIds(int sourceId, IntsRef ref)
private java.lang.String readString(int offset)
-
-
-
Field Detail
-
TARGETMAP_FILENAME_SUFFIX
public static final java.lang.String TARGETMAP_FILENAME_SUFFIX
- See Also:
- Constant Field Values
-
DICT_FILENAME_SUFFIX
public static final java.lang.String DICT_FILENAME_SUFFIX
- See Also:
- Constant Field Values
-
POSDICT_FILENAME_SUFFIX
public static final java.lang.String POSDICT_FILENAME_SUFFIX
- See Also:
- Constant Field Values
-
DICT_HEADER
public static final java.lang.String DICT_HEADER
- See Also:
- Constant Field Values
-
TARGETMAP_HEADER
public static final java.lang.String TARGETMAP_HEADER
- See Also:
- Constant Field Values
-
POSDICT_HEADER
public static final java.lang.String POSDICT_HEADER
- See Also:
- Constant Field Values
-
VERSION
public static final int VERSION
- See Also:
- Constant Field Values
-
resourceScheme
private final BinaryDictionary.ResourceScheme resourceScheme
-
resourcePath
private final java.lang.String resourcePath
-
buffer
private final java.nio.ByteBuffer buffer
-
targetMapOffsets
private final int[] targetMapOffsets
-
targetMap
private final int[] targetMap
-
posDict
private final POS.Tag[] posDict
-
HAS_SINGLE_POS
public static final int HAS_SINGLE_POS
Flag indicating that the entry has a single part of speech (leftPOS).
- See Also:
- Constant Field Values
-
HAS_READING
public static final int HAS_READING
Flag indicating that the entry has reading data; otherwise the reading is the surface form.
- See Also:
- Constant Field Values
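The two flags above read like bit masks tested against a per-entry flags value. The following is a minimal sketch of that idea, not the class's actual implementation: the class name `EntryFlagsSketch`, the bit values, and the `reading` helper are all assumptions made for illustration (the real constant values are listed under Constant Field Values).

```java
final class EntryFlagsSketch {
    // Hypothetical bit values for illustration only; the real ones are
    // listed under "Constant Field Values".
    static final int HAS_SINGLE_POS = 1;
    static final int HAS_READING = 1 << 1;

    // True when the single-POS bit is set in the entry's flags.
    static boolean hasSinglePOS(int flags) {
        return (flags & HAS_SINGLE_POS) != 0;
    }

    // True when the reading bit is set in the entry's flags.
    static boolean hasReadingData(int flags) {
        return (flags & HAS_READING) != 0;
    }

    // Returns the stored reading when the flag is set, else the surface form.
    static String reading(int flags, String storedReading, String surfaceForm) {
        return hasReadingData(flags) ? storedReading : surfaceForm;
    }

    public static void main(String[] args) {
        int hanjaEntry = HAS_SINGLE_POS | HAS_READING; // carries its own reading
        int hangulEntry = HAS_SINGLE_POS;              // reading is the surface form
        System.out.println(reading(hanjaEntry, "한국", "韓國"));
        System.out.println(reading(hangulEntry, null, "한국"));
    }
}
```

Storing both properties in one integer keeps each entry compact, which suits a binary-encoded in-memory dictionary.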
-
-
Constructor Detail
-
BinaryDictionary
protected BinaryDictionary() throws java.io.IOException
- Throws:
java.io.IOException
-
BinaryDictionary
protected BinaryDictionary(BinaryDictionary.ResourceScheme resourceScheme, java.lang.String resourcePath) throws java.io.IOException
- Parameters:
resourceScheme - scheme for loading resources (FILE or CLASSPATH).
resourcePath - where to load resources (dictionaries) from. If null, which is permitted with the CLASSPATH scheme only, this class's name is used as the path.
- Throws:
java.io.IOException
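The `resourcePath` rule described for this constructor can be sketched as a small standalone function. This is an illustration under stated assumptions, not the constructor's actual code: the names `ResourcePathSketch` and `resolve` are hypothetical, and both the dot-to-slash conversion of the class name and the `IllegalArgumentException` for a null path with the FILE scheme are guesses at reasonable behavior.

```java
final class ResourcePathSketch {
    // Mirrors the two schemes named in the parameter description.
    enum ResourceScheme { CLASSPATH, FILE }

    // Hypothetical helper: picks the effective resource path.
    static String resolve(ResourceScheme scheme, String resourcePath, Class<?> owner) {
        if (resourcePath != null) {
            return resourcePath; // an explicit path is used as given
        }
        if (scheme != ResourceScheme.CLASSPATH) {
            // Assumption: a null path makes no sense for file loading.
            throw new IllegalArgumentException("resourcePath is required with the FILE scheme");
        }
        // Assumption: the class name is turned into a classpath-style path.
        return owner.getName().replace('.', '/');
    }

    public static void main(String[] args) {
        System.out.println(resolve(ResourceScheme.CLASSPATH, null, ResourcePathSketch.class));
        System.out.println(resolve(ResourceScheme.FILE, "/tmp/dict/custom", ResourcePathSketch.class));
    }
}
```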
-
-
Method Detail
-
getResource
protected final java.io.InputStream getResource(java.lang.String suffix) throws java.io.IOException
- Throws:
java.io.IOException
-
getClassResource
public static final java.io.InputStream getClassResource(java.lang.Class<?> clazz, java.lang.String suffix) throws java.io.IOException
- Throws:
java.io.IOException
-
getClassResource
private java.io.InputStream getClassResource(java.lang.String path) throws java.io.IOException
- Throws:
java.io.IOException
-
lookupWordIds
public void lookupWordIds(int sourceId, IntsRef ref)
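This method carries no description here. The field names `targetMapOffsets` and `targetMap` suggest an offsets-plus-values layout, where the word ids for a source id occupy a contiguous slice of `targetMap`. The sketch below illustrates that layout only. It assumes the offsets array has one extra trailing entry marking the end of the last slice, and it returns a copy, whereas the real method reports its result through the `IntsRef` argument. The names `TargetMapSketch` and `wordIdsFor` are hypothetical.

```java
import java.util.Arrays;

final class TargetMapSketch {
    // Returns the word ids mapped to sourceId, assuming that
    // targetMapOffsets[i]..targetMapOffsets[i + 1] delimits the slice for id i.
    static int[] wordIdsFor(int sourceId, int[] targetMapOffsets, int[] targetMap) {
        int start = targetMapOffsets[sourceId];
        int end = targetMapOffsets[sourceId + 1]; // extra trailing entry closes the last slice
        return Arrays.copyOfRange(targetMap, start, end);
    }

    public static void main(String[] args) {
        int[] offsets = {0, 2, 3};        // two source ids, plus the end marker
        int[] targetMap = {10, 11, 20};   // source 0 maps to 10 and 11, source 1 maps to 20
        System.out.println(Arrays.toString(wordIdsFor(0, offsets, targetMap)));
        System.out.println(Arrays.toString(wordIdsFor(1, offsets, targetMap)));
    }
}
```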
-
getLeftId
public int getLeftId(int wordId)
Description copied from interface: Dictionary
Get the left id of the specified word.
- Specified by:
getLeftId in interface Dictionary
-
getRightId
public int getRightId(int wordId)
Description copied from interface: Dictionary
Get the right id of the specified word.
- Specified by:
getRightId in interface Dictionary
-
getWordCost
public int getWordCost(int wordId)
Description copied from interface: Dictionary
Get the word cost of the specified word.
- Specified by:
getWordCost in interface Dictionary
-
getPOSType
public POS.Type getPOSType(int wordId)
Description copied from interface: Dictionary
Get the POS.Type of the specified word (morpheme, compound, inflect or pre-analysis).
- Specified by:
getPOSType in interface Dictionary
-
getLeftPOS
public POS.Tag getLeftPOS(int wordId)
Description copied from interface: Dictionary
Get the left POS.Tag of the specified word. For POS.Type.MORPHEME and POS.Type.COMPOUND the left and right POS are the same.
- Specified by:
getLeftPOS in interface Dictionary
-
getRightPOS
public POS.Tag getRightPOS(int wordId)
Description copied from interface: Dictionary
Get the right POS.Tag of the specified word. For POS.Type.MORPHEME and POS.Type.COMPOUND the left and right POS are the same.
- Specified by:
getRightPOS in interface Dictionary
-
getReading
public java.lang.String getReading(int wordId)
Description copied from interface: Dictionary
Get the reading of the specified word (mainly used for Hanja to Hangul conversion).
- Specified by:
getReading in interface Dictionary
-
getMorphemes
public Dictionary.Morpheme[] getMorphemes(int wordId, char[] surfaceForm, int off, int len)
Description copied from interface: Dictionary
Get the morphemes of the specified word (e.g. 가깝으나: 가깝 + 으나).
- Specified by:
getMorphemes in interface Dictionary
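To make the `surfaceForm`, `off` and `len` parameters concrete, here is a sketch that decomposes the window `surfaceForm[off, off + len)` into consecutive morphemes, using the 가깝으나 example from the description. It is illustrative only. The `Morpheme` class is a simplified stand-in for `Dictionary.Morpheme`, the per-morpheme lengths and the POS tag strings are invented inputs, and how the real dictionary encodes this information is not documented here.

```java
final class MorphemeSketch {
    // Simplified stand-in for Dictionary.Morpheme (assumed shape).
    static final class Morpheme {
        final String posTag;
        final String surfaceForm;

        Morpheme(String posTag, String surfaceForm) {
            this.posTag = posTag;
            this.surfaceForm = surfaceForm;
        }
    }

    // Splits surfaceForm[off, off + len) into consecutive morphemes.
    // lengths[i] is the number of chars in morpheme i (hypothetical input).
    static Morpheme[] split(char[] surfaceForm, int off, int len,
                            String[] posTags, int[] lengths) {
        if (posTags.length != lengths.length) {
            throw new IllegalArgumentException("one POS tag per length expected");
        }
        Morpheme[] result = new Morpheme[lengths.length];
        int pos = off;
        for (int i = 0; i < lengths.length; i++) {
            result[i] = new Morpheme(posTags[i], new String(surfaceForm, pos, lengths[i]));
            pos += lengths[i];
        }
        if (pos != off + len) {
            throw new IllegalArgumentException("lengths must cover exactly len chars");
        }
        return result;
    }

    public static void main(String[] args) {
        char[] text = "가깝으나".toCharArray();
        // Tag names are illustrative placeholders.
        Morpheme[] parts = split(text, 0, text.length,
                new String[] {"VA", "EC"}, new int[] {2, 2});
        for (Morpheme m : parts) {
            System.out.println(m.surfaceForm + "/" + m.posTag);
        }
    }
}
```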
-
readString
private java.lang.String readString(int offset)
-
hasSinglePOS
private boolean hasSinglePOS(int wordId)
-
hasReadingData
private boolean hasReadingData(int wordId)
-
-