net.sf.saxon.charcode

Class UTF8CharacterSet

public final class UTF8CharacterSet extends Object implements CharacterSet

This class defines properties of the UTF-8 character set.

Method Summary
static int decodeUTF8(byte[] in, int used)
Decode a UTF-8 character
String getCanonicalName()
static UTF8CharacterSet getInstance()
Get the singular instance of this class
static int getUTF8Encoding(char in, char in2, byte[] out)
Static method to generate the UTF-8 representation of a Unicode character
boolean inCharset(int c)

Method Detail

decodeUTF8

public static int decodeUTF8(byte[] in, int used)
Decode a UTF-8 character

Parameters:
in - array of bytes representing a single UTF-8 encoded character
used - number of bytes in the array that are actually used

Returns: the Unicode codepoint of this character

Throws: IllegalArgumentException if the byte sequence is not a valid UTF-8 representation
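The decoding contract described above can be illustrated with a minimal, standalone sketch. This is not Saxon's implementation: the class name Utf8DecodeSketch and the method name decode are hypothetical, and the validity checks (overlong forms, surrogates, range) are an assumption about what "not a valid UTF-8 representation" might cover.

```java
// Hypothetical standalone sketch; not the Saxon source.
final class Utf8DecodeSketch {
    // Smallest code point that legitimately needs 1, 2, 3, or 4 bytes.
    private static final int[] MIN_VALUE = {0x0, 0x80, 0x800, 0x10000};

    // Decodes the first `used` bytes of `in` as one UTF-8 encoded character.
    static int decode(byte[] in, int used) {
        if (used < 1 || used > 4 || used > in.length) {
            throw new IllegalArgumentException("used must be 1..4 and fit the array");
        }
        int lead = in[0] & 0xFF;
        int length;
        int codePoint;
        // The lead byte determines the sequence length and carries the top bits.
        if (lead < 0x80) {
            length = 1;
            codePoint = lead;
        } else if ((lead & 0xE0) == 0xC0) {
            length = 2;
            codePoint = lead & 0x1F;
        } else if ((lead & 0xF0) == 0xE0) {
            length = 3;
            codePoint = lead & 0x0F;
        } else if ((lead & 0xF8) == 0xF0) {
            length = 4;
            codePoint = lead & 0x07;
        } else {
            throw new IllegalArgumentException("invalid lead byte");
        }
        if (length != used) {
            throw new IllegalArgumentException("lead byte does not match byte count");
        }
        // Each continuation byte has the form 10xxxxxx and adds six bits.
        for (int i = 1; i < used; i++) {
            int b = in[i] & 0xFF;
            if ((b & 0xC0) != 0x80) {
                throw new IllegalArgumentException("invalid continuation byte");
            }
            codePoint = (codePoint << 6) | (b & 0x3F);
        }
        // Assumed extra checks: reject overlong forms, surrogates, and out-of-range values.
        if (codePoint < MIN_VALUE[length - 1]
                || codePoint > 0x10FFFF
                || (codePoint >= 0xD800 && codePoint <= 0xDFFF)) {
            throw new IllegalArgumentException("not a valid UTF-8 representation");
        }
        return codePoint;
    }

    public static void main(String[] args) {
        byte[] euro = {(byte) 0xE2, (byte) 0x82, (byte) 0xAC};
        System.out.println(Integer.toHexString(decode(euro, 3))); // prints 20ac
    }
}
```

With the real class, the equivalent call would presumably be UTF8CharacterSet.decodeUTF8(euro, 3), using the signature shown above.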

getCanonicalName

public String getCanonicalName()

getInstance

public static UTF8CharacterSet getInstance()
Get the singular instance of this class

Returns: the singular instance of this class

getUTF8Encoding

public static int getUTF8Encoding(char in, char in2, byte[] out)
Static method to generate the UTF-8 representation of a Unicode character

Parameters:
in - the Unicode character, or the high half of a surrogate pair
in2 - the low half of a surrogate pair (ignored unless the first argument is in the range for a surrogate pair)
out - an array of at least 4 bytes to hold the UTF-8 representation

Returns: the number of bytes in the UTF-8 representation
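The encoding contract described above can be illustrated with a minimal, standalone sketch that mirrors the (char, char, byte[]) signature. This is not Saxon's implementation: the names Utf8EncodeSketch and encode are hypothetical, and treating "in the range for a surrogate pair" as Character.isHighSurrogate, as well as throwing on a mismatched low half, are assumptions.

```java
// Hypothetical standalone sketch; not the Saxon source.
final class Utf8EncodeSketch {
    // Writes the UTF-8 form of the character into `out` and returns the byte count.
    static int encode(char in, char in2, byte[] out) {
        int codePoint = in;
        // Assumption: the second argument matters only when the first is a high surrogate.
        if (Character.isHighSurrogate(in)) {
            if (!Character.isLowSurrogate(in2)) {
                // Assumed behavior for this sketch; the documentation does not say.
                throw new IllegalArgumentException("high surrogate without low surrogate");
            }
            codePoint = Character.toCodePoint(in, in2);
        }
        if (codePoint < 0x80) {
            // 0xxxxxxx
            out[0] = (byte) codePoint;
            return 1;
        }
        if (codePoint < 0x800) {
            // 110xxxxx 10xxxxxx
            out[0] = (byte) (0xC0 | (codePoint >> 6));
            out[1] = (byte) (0x80 | (codePoint & 0x3F));
            return 2;
        }
        if (codePoint < 0x10000) {
            // 1110xxxx 10xxxxxx 10xxxxxx
            out[0] = (byte) (0xE0 | (codePoint >> 12));
            out[1] = (byte) (0x80 | ((codePoint >> 6) & 0x3F));
            out[2] = (byte) (0x80 | (codePoint & 0x3F));
            return 3;
        }
        // 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
        out[0] = (byte) (0xF0 | (codePoint >> 18));
        out[1] = (byte) (0x80 | ((codePoint >> 12) & 0x3F));
        out[2] = (byte) (0x80 | ((codePoint >> 6) & 0x3F));
        out[3] = (byte) (0x80 | (codePoint & 0x3F));
        return 4;
    }

    public static void main(String[] args) {
        byte[] out = new byte[4];
        int n = encode('\u20AC', '\0', out);
        System.out.println(n); // prints 3
    }
}
```

The 4-byte output array matches the documented minimum: a surrogate pair always encodes to exactly four bytes, and no single character needs more.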

inCharset

public boolean inCharset(int c)