org.apache.commons.codec.language

Class DoubleMetaphone

public class DoubleMetaphone extends Object implements StringEncoder

Encodes a string into a double metaphone value. This implementation is based on the algorithm by Lawrence Philips.

Version: $Id: DoubleMetaphone.java 130375 2004-06-05 18:32:04Z ggregory $

Author: Apache Software Foundation

Nested Class Summary

class DoubleMetaphone.DoubleMetaphoneResult
Inner class for storing results, since there is the optional alternate encoding.

Field Summary

static String[] ES_EP_EB_EL_EY_IB_IL_IN_IE_EI_ER
static String[] L_R_N_M_B_H_F_V_W_SPACE
static String[] L_T_K_S_N_M_B_Z
protected int maxCodeLen
Maximum length of an encoding; the default is 4
static String[] SILENT_START
Prefixes that are not pronounced when present
static String VOWELS
"Vowels" to test for

Constructor Summary

DoubleMetaphone()
Creates an instance of this DoubleMetaphone encoder

Method Summary

protected char charAt(String value, int index)
Gets the character at index index if available; otherwise returns Character.MIN_VALUE as a default
String cleanInput(String input)
Cleans the input
boolean conditionC0(String value, int index)
Complex condition 0 for 'C'
boolean conditionCH0(String value, int index)
Complex condition 0 for 'CH'
boolean conditionCH1(String value, int index)
Complex condition 1 for 'CH'
boolean conditionL0(String value, int index)
Complex condition 0 for 'L'
boolean conditionM0(String value, int index)
Complex condition 0 for 'M'
static boolean contains(String value, int start, int length, String criteria)
Shortcut method with 1 criterion
static boolean contains(String value, int start, int length, String criteria1, String criteria2)
Shortcut method with 2 criteria
static boolean contains(String value, int start, int length, String criteria1, String criteria2, String criteria3)
Shortcut method with 3 criteria
static boolean contains(String value, int start, int length, String criteria1, String criteria2, String criteria3, String criteria4)
Shortcut method with 4 criteria
static boolean contains(String value, int start, int length, String criteria1, String criteria2, String criteria3, String criteria4, String criteria5)
Shortcut method with 5 criteria
static boolean contains(String value, int start, int length, String criteria1, String criteria2, String criteria3, String criteria4, String criteria5, String criteria6)
Shortcut method with 6 criteria
protected static boolean contains(String value, int start, int length, String[] criteria)
Determines whether value contains any of the criteria, starting at index start and matching up to length characters
String doubleMetaphone(String value)
Encode a value with Double Metaphone
String doubleMetaphone(String value, boolean alternate)
Encode a value with Double Metaphone, optionally using the alternate encoding.
Object encode(Object obj)
Encode the value using DoubleMetaphone.
String encode(String value)
Encode the value using DoubleMetaphone.
int getMaxCodeLen()
Returns the maxCodeLen.
int handleAEIOUY(String value, DoubleMetaphone.DoubleMetaphoneResult result, int index)
Handles 'A', 'E', 'I', 'O', 'U', and 'Y' cases
int handleC(String value, DoubleMetaphone.DoubleMetaphoneResult result, int index)
Handles 'C' cases
int handleCC(String value, DoubleMetaphone.DoubleMetaphoneResult result, int index)
Handles 'CC' cases
int handleCH(String value, DoubleMetaphone.DoubleMetaphoneResult result, int index)
Handles 'CH' cases
int handleD(String value, DoubleMetaphone.DoubleMetaphoneResult result, int index)
Handles 'D' cases
int handleG(String value, DoubleMetaphone.DoubleMetaphoneResult result, int index, boolean slavoGermanic)
Handles 'G' cases
int handleGH(String value, DoubleMetaphone.DoubleMetaphoneResult result, int index)
Handles 'GH' cases
int handleH(String value, DoubleMetaphone.DoubleMetaphoneResult result, int index)
Handles 'H' cases
int handleJ(String value, DoubleMetaphone.DoubleMetaphoneResult result, int index, boolean slavoGermanic)
Handles 'J' cases
int handleL(String value, DoubleMetaphone.DoubleMetaphoneResult result, int index)
Handles 'L' cases
int handleP(String value, DoubleMetaphone.DoubleMetaphoneResult result, int index)
Handles 'P' cases
int handleR(String value, DoubleMetaphone.DoubleMetaphoneResult result, int index, boolean slavoGermanic)
Handles 'R' cases
int handleS(String value, DoubleMetaphone.DoubleMetaphoneResult result, int index, boolean slavoGermanic)
Handles 'S' cases
int handleSC(String value, DoubleMetaphone.DoubleMetaphoneResult result, int index)
Handles 'SC' cases
int handleT(String value, DoubleMetaphone.DoubleMetaphoneResult result, int index)
Handles 'T' cases
int handleW(String value, DoubleMetaphone.DoubleMetaphoneResult result, int index)
Handles 'W' cases
int handleX(String value, DoubleMetaphone.DoubleMetaphoneResult result, int index)
Handles 'X' cases
int handleZ(String value, DoubleMetaphone.DoubleMetaphoneResult result, int index, boolean slavoGermanic)
Handles 'Z' cases
boolean isDoubleMetaphoneEqual(String value1, String value2)
Check if the Double Metaphone values of two String values are equal.
boolean isDoubleMetaphoneEqual(String value1, String value2, boolean alternate)
Check if the Double Metaphone values of two String values are equal, optionally using the alternate value.
boolean isSilentStart(String value)
Determines whether or not the value starts with a silent letter.
boolean isSlavoGermanic(String value)
Determines whether or not a value is of Slavo-Germanic origin.
boolean isVowel(char ch)
Determines whether or not a character is a vowel
void setMaxCodeLen(int maxCodeLen)
Sets the maxCodeLen.

Field Detail

ES_EP_EB_EL_EY_IB_IL_IN_IE_EI_ER

private static final String[] ES_EP_EB_EL_EY_IB_IL_IN_IE_EI_ER

L_R_N_M_B_H_F_V_W_SPACE

private static final String[] L_R_N_M_B_H_F_V_W_SPACE

L_T_K_S_N_M_B_Z

private static final String[] L_T_K_S_N_M_B_Z

maxCodeLen

protected int maxCodeLen
Maximum length of an encoding; the default is 4

SILENT_START

private static final String[] SILENT_START
Prefixes that are not pronounced when present

VOWELS

private static final String VOWELS
"Vowels" to test for

Constructor Detail

DoubleMetaphone

public DoubleMetaphone()
Creates an instance of this DoubleMetaphone encoder

Method Detail

charAt

protected char charAt(String value, int index)
Gets the character at index index if available; otherwise returns Character.MIN_VALUE as a default
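
The bounds-safe lookup described above can be sketched as follows. This is a minimal stand-alone illustration, not the library source; the class name SafeCharAtSketch is hypothetical, and treating a negative index as "not available" is an assumption.

```java
// Hypothetical stand-alone sketch; not taken from DoubleMetaphone itself.
final class SafeCharAtSketch {
    private SafeCharAtSketch() {
    }

    /**
     * Returns the character at index, or Character.MIN_VALUE when the index
     * falls outside the string (assumed to include negative indexes).
     */
    static char charAt(String value, int index) {
        if (index < 0 || index >= value.length()) {
            return Character.MIN_VALUE;
        }
        return value.charAt(index);
    }
}
```

Returning a sentinel instead of throwing lets look-ahead code inspect neighbouring positions near the ends of the input without guarding each access.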

cleanInput

private String cleanInput(String input)
Cleans the input

conditionC0

private boolean conditionC0(String value, int index)
Complex condition 0 for 'C'

conditionCH0

private boolean conditionCH0(String value, int index)
Complex condition 0 for 'CH'

conditionCH1

private boolean conditionCH1(String value, int index)
Complex condition 1 for 'CH'

conditionL0

private boolean conditionL0(String value, int index)
Complex condition 0 for 'L'

conditionM0

private boolean conditionM0(String value, int index)
Complex condition 0 for 'M'

contains

private static boolean contains(String value, int start, int length, String criteria)
Shortcut method with 1 criterion

contains

private static boolean contains(String value, int start, int length, String criteria1, String criteria2)
Shortcut method with 2 criteria

contains

private static boolean contains(String value, int start, int length, String criteria1, String criteria2, String criteria3)
Shortcut method with 3 criteria

contains

private static boolean contains(String value, int start, int length, String criteria1, String criteria2, String criteria3, String criteria4)
Shortcut method with 4 criteria

contains

private static boolean contains(String value, int start, int length, String criteria1, String criteria2, String criteria3, String criteria4, String criteria5)
Shortcut method with 5 criteria

contains

private static boolean contains(String value, int start, int length, String criteria1, String criteria2, String criteria3, String criteria4, String criteria5, String criteria6)
Shortcut method with 6 criteria

contains

protected static boolean contains(String value, int start, int length, String[] criteria)
Determines whether value contains any of the criteria, starting at index start and matching up to length characters
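
One plausible reading of this check is sketched below, assuming that the window of length characters beginning at start is compared for exact equality with each criterion, and that a window falling outside the string never matches. The class name ContainsSketch is hypothetical, and varargs stand in for the fixed-arity shortcut overloads.

```java
// Hypothetical stand-alone sketch; the exact matching rule is an assumption.
final class ContainsSketch {
    private ContainsSketch() {
    }

    /**
     * Returns true when the substring value[start, start + length) equals
     * any of the given criteria. Out-of-range windows yield false.
     */
    static boolean contains(String value, int start, int length, String... criteria) {
        if (start < 0 || length < 0 || start + length > value.length()) {
            return false;
        }
        String target = value.substring(start, start + length);
        for (String criterion : criteria) {
            if (target.equals(criterion)) {
                return true;
            }
        }
        return false;
    }
}
```

Under these assumptions, a call such as ContainsSketch.contains("KNIGHT", 0, 2, "GN", "KN") tests whether the word begins with either two-letter group.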

doubleMetaphone

public String doubleMetaphone(String value)
Encode a value with Double Metaphone

Parameters: value - String to encode

Returns: an encoded string

doubleMetaphone

public String doubleMetaphone(String value, boolean alternate)
Encode a value with Double Metaphone, optionally using the alternate encoding.

Parameters: value - String to encode. alternate - use the alternate encoding if true.

Returns: an encoded string

encode

public Object encode(Object obj)
Encode the value using DoubleMetaphone. It will only work if obj is a String (like Metaphone).

Parameters: obj - Object to encode (should be of type String)

Returns: An encoded Object (will be of type String)

Throws: EncoderException - if the encode parameter is not of type String
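
The type check described above might look like the following sketch. It is self-contained, so SketchEncoderException is a hypothetical stand-in for EncoderException, and the String overload is only a placeholder (trim and upper-case), not the Double Metaphone algorithm.

```java
// Hypothetical stand-alone sketch of an Object-accepting encode wrapper.
final class ObjectEncodeSketch {
    private ObjectEncodeSketch() {
    }

    /** Stand-in for the library's checked EncoderException (assumption). */
    static final class SketchEncoderException extends Exception {
        SketchEncoderException(String message) {
            super(message);
        }
    }

    /** Placeholder String encoder; NOT the Double Metaphone algorithm. */
    static String encode(String value) {
        return value == null ? null : value.trim().toUpperCase(java.util.Locale.ROOT);
    }

    /** Rejects anything that is not a String, then delegates to the String overload. */
    static Object encode(Object obj) throws SketchEncoderException {
        if (!(obj instanceof String)) {
            throw new SketchEncoderException("Parameter must be of type String");
        }
        return encode((String) obj);
    }
}
```

Because instanceof is false for null, this sketch also rejects a null argument passed through the Object overload.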

encode

public String encode(String value)
Encode the value using DoubleMetaphone.

Parameters: value - String to encode

Returns: An encoded String

getMaxCodeLen

public int getMaxCodeLen()
Returns the maxCodeLen.

Returns: int, the current maxCodeLen

handleAEIOUY

private int handleAEIOUY(String value, DoubleMetaphone.DoubleMetaphoneResult result, int index)
Handles 'A', 'E', 'I', 'O', 'U', and 'Y' cases

handleC

private int handleC(String value, DoubleMetaphone.DoubleMetaphoneResult result, int index)
Handles 'C' cases

handleCC

private int handleCC(String value, DoubleMetaphone.DoubleMetaphoneResult result, int index)
Handles 'CC' cases

handleCH

private int handleCH(String value, DoubleMetaphone.DoubleMetaphoneResult result, int index)
Handles 'CH' cases

handleD

private int handleD(String value, DoubleMetaphone.DoubleMetaphoneResult result, int index)
Handles 'D' cases

handleG

private int handleG(String value, DoubleMetaphone.DoubleMetaphoneResult result, int index, boolean slavoGermanic)
Handles 'G' cases

handleGH

private int handleGH(String value, DoubleMetaphone.DoubleMetaphoneResult result, int index)
Handles 'GH' cases

handleH

private int handleH(String value, DoubleMetaphone.DoubleMetaphoneResult result, int index)
Handles 'H' cases

handleJ

private int handleJ(String value, DoubleMetaphone.DoubleMetaphoneResult result, int index, boolean slavoGermanic)
Handles 'J' cases

handleL

private int handleL(String value, DoubleMetaphone.DoubleMetaphoneResult result, int index)
Handles 'L' cases

handleP

private int handleP(String value, DoubleMetaphone.DoubleMetaphoneResult result, int index)
Handles 'P' cases

handleR

private int handleR(String value, DoubleMetaphone.DoubleMetaphoneResult result, int index, boolean slavoGermanic)
Handles 'R' cases

handleS

private int handleS(String value, DoubleMetaphone.DoubleMetaphoneResult result, int index, boolean slavoGermanic)
Handles 'S' cases

handleSC

private int handleSC(String value, DoubleMetaphone.DoubleMetaphoneResult result, int index)
Handles 'SC' cases

handleT

private int handleT(String value, DoubleMetaphone.DoubleMetaphoneResult result, int index)
Handles 'T' cases

handleW

private int handleW(String value, DoubleMetaphone.DoubleMetaphoneResult result, int index)
Handles 'W' cases

handleX

private int handleX(String value, DoubleMetaphone.DoubleMetaphoneResult result, int index)
Handles 'X' cases

handleZ

private int handleZ(String value, DoubleMetaphone.DoubleMetaphoneResult result, int index, boolean slavoGermanic)
Handles 'Z' cases

isDoubleMetaphoneEqual

public boolean isDoubleMetaphoneEqual(String value1, String value2)
Check if the Double Metaphone values of two String values are equal.

Parameters: value1 - The left-hand side of the encoded String.equals(Object). value2 - The right-hand side of the encoded String.equals(Object).

Returns: true if the encoded Strings are equal; false otherwise.

See Also: DoubleMetaphone

isDoubleMetaphoneEqual

public boolean isDoubleMetaphoneEqual(String value1, String value2, boolean alternate)
Check if the Double Metaphone values of two String values are equal, optionally using the alternate value.

Parameters: value1 - The left-hand side of the encoded String.equals(Object). value2 - The right-hand side of the encoded String.equals(Object). alternate - use the alternate value if true.

Returns: true if the encoded Strings are equal; false otherwise.
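
The comparison can be pictured as encoding both inputs with the same alternate flag and comparing the results with String.equals(Object). The sketch below takes the encoder as a parameter so that it stays self-contained; EncodedEqualitySketch and isEncodedEqual are hypothetical names, and the encoder is assumed never to return null.

```java
// Hypothetical stand-alone sketch; the encoder is supplied by the caller.
final class EncodedEqualitySketch {
    private EncodedEqualitySketch() {
    }

    /**
     * Encodes both values with the same alternate flag and compares the
     * encodings. Assumes the encoder does not return null.
     */
    static boolean isEncodedEqual(
            java.util.function.BiFunction<String, Boolean, String> encoder,
            String value1, String value2, boolean alternate) {
        String left = encoder.apply(value1, alternate);
        String right = encoder.apply(value2, alternate);
        return left.equals(right);
    }
}
```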

isSilentStart

private boolean isSilentStart(String value)
Determines whether or not the value starts with a silent letter. It will return true if the value starts with any of 'GN', 'KN', 'PN', 'WR' or 'PS'.
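
A minimal sketch of this prefix test, using the five prefixes listed above. The class name SilentStartSketch is hypothetical, and the input is assumed to be upper-cased already, so the comparison is case-sensitive.

```java
// Hypothetical stand-alone sketch; assumes upper-case input.
final class SilentStartSketch {
    /** Prefixes named in the description above. */
    private static final String[] SILENT_START = {"GN", "KN", "PN", "WR", "PS"};

    private SilentStartSketch() {
    }

    /** Returns true when value begins with one of the silent prefixes. */
    static boolean isSilentStart(String value) {
        for (String prefix : SILENT_START) {
            if (value.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }
}
```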

isSlavoGermanic

private boolean isSlavoGermanic(String value)
Determines whether or not a value is of Slavo-Germanic origin. A value is of Slavo-Germanic origin if it contains any of 'W', 'K', 'CZ', or 'WITZ'.
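
The rule above reduces to four substring checks, sketched here under the assumption that the input is already upper-cased. The class name SlavoGermanicSketch is hypothetical.

```java
// Hypothetical stand-alone sketch; assumes upper-case input.
final class SlavoGermanicSketch {
    private SlavoGermanicSketch() {
    }

    /** Returns true when value contains 'W', 'K', "CZ", or "WITZ". */
    static boolean isSlavoGermanic(String value) {
        return value.indexOf('W') > -1
                || value.indexOf('K') > -1
                || value.contains("CZ")
                || value.contains("WITZ");
    }
}
```

Any string containing "WITZ" also contains 'W', so the last check never changes the result; it is kept here only to mirror the description.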

isVowel

private boolean isVowel(char ch)
Determines whether or not a character is a vowel
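
A sketch of the vowel test follows. The value of VOWELS is not shown on this page, so "AEIOUY" is an assumption inferred from the handleAEIOUY method name; the class name VowelSketch is hypothetical, and the input is assumed to be upper-case.

```java
// Hypothetical stand-alone sketch; the VOWELS value is an assumption.
final class VowelSketch {
    /** Assumed vowel set, inferred from the handleAEIOUY method name. */
    private static final String VOWELS = "AEIOUY";

    private VowelSketch() {
    }

    /** Returns true when ch appears in the assumed vowel set. */
    static boolean isVowel(char ch) {
        return VOWELS.indexOf(ch) != -1;
    }
}
```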

setMaxCodeLen

public void setMaxCodeLen(int maxCodeLen)
Sets the maxCodeLen.

Parameters: maxCodeLen - The maxCodeLen to set
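
To illustrate how a maximum code length could bound both the primary and the alternate encoding, here is a hedged sketch of a capped result holder. It is not the DoubleMetaphoneResult source; the class name CappedResultSketch and all of its members are hypothetical.

```java
// Hypothetical stand-alone sketch of a length-capped result holder.
final class CappedResultSketch {
    private final StringBuilder primary = new StringBuilder();
    private final StringBuilder alternate = new StringBuilder();
    private final int maxLength;

    CappedResultSketch(int maxLength) {
        this.maxLength = maxLength;
    }

    /** Appends the same code to both encodings. */
    void append(String both) {
        append(both, both);
    }

    /** Appends different codes to the primary and alternate encodings. */
    void append(String primaryCode, String alternateCode) {
        appendCapped(primary, primaryCode);
        appendCapped(alternate, alternateCode);
    }

    /** Appends only as many characters as still fit under maxLength. */
    private void appendCapped(StringBuilder target, String code) {
        int room = maxLength - target.length();
        if (code.length() <= room) {
            target.append(code);
        } else {
            target.append(code, 0, room);
        }
    }

    /** True once both encodings have reached maxLength. */
    boolean isComplete() {
        return primary.length() >= maxLength && alternate.length() >= maxLength;
    }

    String getPrimary() {
        return primary.toString();
    }

    String getAlternate() {
        return alternate.toString();
    }
}
```

With a cap of 4, the default stated for maxCodeLen, any characters appended beyond the fourth are dropped, and an encoding loop can stop early once isComplete() returns true.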

commons-codec version 1.3 - Copyright © 2002-2004 - Apache Software Foundation