org.gjt.xpp.impl.pullparser
Class PullParser
public class PullParser
extends Object
implements XmlPullParser, XmlPullParserBufferControl, XmlPullParserEventPosition
XML Pull Parser (XPP) lets an application pull XML events from an input stream.
Advantages:
- very simple pull interface
- ideal for deserializing XML objects (like SOAP)
- simple and efficient thin wrapper around Tokenizer class
- compared with using Tokenizer directly, adds
about 10% more processing time for big documents and
at most 50% more for small documents
- lightweight memory model - minimized memory allocation:
element content and attributes are only read on explicit
method calls,
both StartTag and EndTag can be reused during parsing
- small - total compiled size around 20K
- by default supports namespaces parsing
(can be switched off)
- support for mixed content can be explicitly disabled
Limitations:
- this is a beta version and may still have bugs :-)
- does not parse DTD (recognizes only predefined entities)
Author: Aleksander Slominski
Field Summary |
protected Attribute[] | attrPos temporary array of current attributes |
protected int | attrPosEnd index for last attribute in attrPos array |
protected int | attrPosSize size of attrPos array |
protected static boolean | CHECK_ATTRIB_UNIQ
Should attribute uniqueness be checked,
as required by the XML and Namespaces specifications? |
protected String | elContent Content of current element if in CONTENT state |
protected ElementContent[] | elStack temporary array to keep ElementContent stack |
protected int | elStackDepth how many elements are on elStack |
protected int | elStackSize size of elStack array |
protected boolean | emptyElement Have we read empty element? |
protected int | eventEnd end position of current event in tokenizer buffer |
protected int | eventStart start position of current event in tokenizer buffer |
protected Hashtable | prefix2Ns mapping of namespace prefixes to URIs |
protected boolean | reportNsAttribs should parser report namespace xmlns* attributes? |
protected boolean | seenRootElement Have we seen root element |
protected byte | state what is current event type as returned from next()? |
protected boolean | supportNs should parser support namespaces? |
protected byte | token what is current token returned from tokenizer |
protected Tokenizer | tokenizer XML tokenizer that does the actual tokenizing of the input stream. |
protected static boolean | USE_QNAMEBUF |
Method Summary |
protected void | ensureAttribs(int size)
Make sure that the temporary attributes array has enough space. |
protected void | ensureCapacity(int size)
Make sure that we have enough space to keep an element stack of the passed size. |
int | getBufferShrinkOffset() |
int | getColumnNumber() |
int | getContentLength() |
int | getDepth() |
char[] | getEventBuffer() |
int | getEventEnd() |
int | getEventStart() |
byte | getEventType() |
int | getHardLimit() |
int | getLineNumber() |
String | getLocalName() |
int | getNamespacesLength(int depth) |
String | getNamespaceUri() |
String | getPosDesc()
Return string describing current position of parser in input stream. |
String | getPrefix() |
String | getQNameLocal(String qName) |
String | getQNameUri(String qName) |
String | getRawName() |
int | getSoftLimit() |
boolean | isAllowedMixedContent() |
boolean | isBufferShrinkable() |
boolean | isNamespaceAttributesReporting() |
boolean | isNamespaceAware() |
boolean | isWhitespaceContent()
Return true if the CONTENT just read contained only white space. |
byte | next()
This is the key method: it reads more from the input stream
and returns the next event type
(such as START_TAG, END_TAG, CONTENT),
or END_DOCUMENT if there is no more input.
|
String | readContent()
Return String that contains just read CONTENT. |
void | readEndTag(XmlEndTag etag)
Read the END_TAG just read into the EndTag passed as argument. |
void | readNamespacesPrefixes(int depth, String[] prefixes, int off, int len)
Return namespace prefixes for element at depth |
void | readNamespacesUris(int depth, String[] uris, int off, int len)
Return namespace URIs for element at depth |
byte | readNode(XmlNode node) |
void | readNodeWithoutChildren(XmlNode node) |
void | readStartTag(XmlStartTag stag)
Read the START_TAG just read into the StartTag passed as argument. |
void | reset()
Reset parser state so it can be used to parse new |
protected void | resetState() |
void | setAllowedMixedContent(boolean enable)
Allow for mixed element content.
|
void | setBufferShrinkable(boolean shrinkable) |
void | setHardLimit(int value) |
void | setInput(Reader reader)
Reset parser and set new input. |
void | setInput(char[] buf)
Reset parser and set new input. |
void | setInput(char[] buf, int off, int len) |
void | setNamespaceAttributesReporting(boolean enable)
Make the parser report xmlns* attributes. |
void | setNamespaceAware(boolean awareness)
Set support of namespaces. |
void | setSoftLimit(int value) |
byte | skipNode()
If the parser has just read a start tag, skip the whole
subtree contained in this element. |
protected Attribute[] attrPos
temporary array of current attributes
protected int attrPosEnd
index for last attribute in attrPos array
protected int attrPosSize
size of attrPos array
protected static final boolean CHECK_ATTRIB_UNIQ
Should attribute uniqueness be checked,
as required by the XML and Namespaces specifications?
protected String elContent
Content of current element if in CONTENT state
protected ElementContent[] elStack
temporary array to keep ElementContent stack
protected int elStackDepth
how many elements are on elStack
protected int elStackSize
size of elStack array
protected boolean emptyElement
Have we read empty element?
protected int eventEnd
end position of current event in tokenizer buffer
protected int eventStart
start position of current event in tokenizer buffer
protected Hashtable prefix2Ns
mapping of namespace prefixes to URIs
protected boolean reportNsAttribs
should parser report namespace xmlns* attributes?
protected boolean seenRootElement
Have we seen root element
protected byte state
what is current event type as returned from next()?
protected boolean supportNs
should parser support namespaces?
protected byte token
what is current token returned from tokenizer
protected Tokenizer tokenizer
XML tokenizer that does the actual tokenizing of the input stream.
protected static final boolean USE_QNAMEBUF
public PullParser()
Create instance of pull parser.
protected void ensureAttribs(int size)
Make sure that the temporary attributes array has enough space.
protected void ensureCapacity(int size)
Make sure that we have enough space to keep an element stack of the passed size.
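The capacity check described above can be illustrated with a small stand-alone sketch. It is not the XPP source: the class GrowableStack, the String element type, and the doubling growth policy are all assumptions made for illustration; only the names ensureCapacity, elStack and elStackDepth come from this page.

```java
import java.util.Arrays;

// Illustrative sketch only; not the XPP implementation.
class GrowableStack {
    private String[] elStack = new String[2]; // hypothetical initial size
    private int elStackDepth;                 // how many elements are on elStack

    // Grow the backing array when it cannot hold `size` entries.
    // Doubling is an assumed policy, chosen to keep pushes amortized O(1).
    void ensureCapacity(int size) {
        if (size > elStack.length) {
            int newSize = Math.max(size, elStack.length * 2);
            elStack = Arrays.copyOf(elStack, newSize);
        }
    }

    void push(String name) {
        ensureCapacity(elStackDepth + 1);
        elStack[elStackDepth++] = name;
    }

    int depth() { return elStackDepth; }

    int capacity() { return elStack.length; }

    public static void main(String[] args) {
        GrowableStack s = new GrowableStack();
        for (int i = 0; i < 5; i++) {
            s.push("el" + i);
        }
        System.out.println(s.depth() + " elements, capacity " + s.capacity());
    }
}
```

Reusing one growing array instead of allocating per element fits the "minimized memory allocation" goal stated in the class description.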
public int getBufferShrinkOffset()
public int getColumnNumber()
public int getContentLength()
public int getDepth()
public char[] getEventBuffer()
public int getEventEnd()
public int getEventStart()
public byte getEventType()
public int getHardLimit()
public int getLineNumber()
public String getLocalName()
public int getNamespacesLength(int depth)
public String getNamespaceUri()
public String getPosDesc()
Return string describing current position of parser in input stream.
public String getPrefix()
public String getQNameLocal(String qName)
public String getQNameUri(String qName)
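This page does not describe getQNameLocal and getQNameUri, so the following is only a guess at the idea behind them: split a qualified name at the colon and look the prefix up in a prefix-to-URI table such as prefix2Ns. The class QNameSketch, both method names, and the treatment of unprefixed names (looked up under the empty prefix) are assumptions, not documented XPP behavior.

```java
import java.util.Hashtable;

// Illustrative sketch only; not the XPP implementation.
class QNameSketch {
    // Local part of "prefix:local"; the whole name when there is no colon.
    static String localPart(String qName) {
        int colon = qName.indexOf(':');
        return colon < 0 ? qName : qName.substring(colon + 1);
    }

    // URI bound to the prefix of qName, or null when the prefix is not declared.
    // Assumption: an unprefixed name is looked up under the empty prefix "".
    static String uriFor(String qName, Hashtable<String, String> prefix2Ns) {
        int colon = qName.indexOf(':');
        String prefix = colon < 0 ? "" : qName.substring(0, colon);
        return prefix2Ns.get(prefix);
    }

    public static void main(String[] args) {
        Hashtable<String, String> prefix2Ns = new Hashtable<>();
        prefix2Ns.put("soap", "http://schemas.xmlsoap.org/soap/envelope/");
        System.out.println(localPart("soap:Body"));
        System.out.println(uriFor("soap:Body", prefix2Ns));
    }
}
```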
public String getRawName()
public int getSoftLimit()
public boolean isAllowedMixedContent()
public boolean isBufferShrinkable()
public boolean isNamespaceAttributesReporting()
public boolean isNamespaceAware()
public boolean isWhitespaceContent()
Return true if the CONTENT just read contained only white space.
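A check of this kind can be sketched over a character buffer and a start/end range, similar in spirit to what getEventBuffer(), getEventStart() and getEventEnd() expose. The helper below is hypothetical and assumes that "white space" means the four XML white space characters (space, tab, carriage return, line feed); the page does not say which characters the parser actually tests.

```java
// Illustrative sketch only; not the XPP implementation.
class WhitespaceCheck {
    // True if buf[start, end) holds only XML white space.
    // An empty range is reported as white space.
    static boolean isXmlWhitespace(char[] buf, int start, int end) {
        for (int i = start; i < end; i++) {
            char c = buf[i];
            if (c != ' ' && c != '\t' && c != '\r' && c != '\n') {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        char[] buf = "<a> \n\t</a>".toCharArray();
        // Characters 3..5 are the content between <a> and </a>.
        System.out.println(isXmlWhitespace(buf, 3, 6));
    }
}
```

A caller that deserializes data-oriented XML could use such a test to ignore indentation between elements.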
public byte next()
This is the key method: it reads more from the input stream
and returns the next event type
(such as START_TAG, END_TAG, CONTENT),
or END_DOCUMENT if there is no more input.
This is a simple automaton (in pseudo-code):
    byte next() {
        while (state != END_DOCUMENT) {
            token = tokenizer.next(); // get next XML token
            switch (token) {
            case Tokenizer.END_DOCUMENT:
                return state = END_DOCUMENT;
            case Tokenizer.CONTENT:
                // check if content is allowed - only inside an element
                return state = CONTENT;
            case Tokenizer.ETAG_NAME:
                // pop element from stack - check that start and end tag match
                // if namespaces are supported, restore namespace prefix mappings
                return state = END_TAG;
            case Tokenizer.STAG_NAME:
                // create new element and push it on stack
                // process attributes (including namespaces)
                // set emptyElement = true if empty element
                // check attribute uniqueness (including namespace prefixes)
                return state = START_TAG;
            }
        }
    }
Actual parsing is more complex, especially for start tags, because
attributes are reported separately by the tokenizer and
namespace prefixes and URIs must be declared.
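The automaton can be tried out with a toy, self-contained version. This is a sketch under stated assumptions, not the XPP source: ToyPullParser is a hypothetical class, the input is already split into tokens ("<name" for a start tag, "/name" for an end tag, anything else for content), and attributes, namespaces and empty elements are left out.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Illustrative sketch only; not the XPP implementation.
class ToyPullParser {
    static final byte END_DOCUMENT = 1, START_TAG = 2, END_TAG = 3, CONTENT = 4;

    private final String[] tokens;                         // hypothetical token stream
    private final Deque<String> elStack = new ArrayDeque<>(); // names of open elements
    private int pos;                                       // index of next token
    private String value;                                  // name or text of last event

    ToyPullParser(String... tokens) { this.tokens = tokens; }

    byte next() {
        if (pos == tokens.length) {
            if (!elStack.isEmpty()) {
                throw new IllegalStateException("unclosed element: " + elStack.peek());
            }
            return END_DOCUMENT;
        }
        String t = tokens[pos++];
        if (t.startsWith("<")) {              // start tag: push element on stack
            value = t.substring(1);
            elStack.push(value);
            return START_TAG;
        }
        if (t.startsWith("/")) {              // end tag: pop and compare names
            value = t.substring(1);
            if (elStack.isEmpty() || !elStack.pop().equals(value)) {
                throw new IllegalStateException("unexpected end tag: " + value);
            }
            return END_TAG;
        }
        if (elStack.isEmpty()) {              // content is only allowed inside an element
            throw new IllegalStateException("content outside root element");
        }
        value = t;
        return CONTENT;
    }

    // Pull every event and render one line per event.
    static String trace(String... tokens) {
        ToyPullParser p = new ToyPullParser(tokens);
        StringBuilder sb = new StringBuilder();
        byte ev;
        while ((ev = p.next()) != END_DOCUMENT) {
            sb.append(ev == START_TAG ? "START_TAG " : ev == END_TAG ? "END_TAG " : "CONTENT ")
              .append(p.value).append('\n');
        }
        return sb.append("END_DOCUMENT").toString();
    }

    public static void main(String[] args) {
        System.out.println(trace("<doc", "<item", "hello", "/item", "/doc"));
    }
}
```

The loop in trace() shows the pull style from the caller's side: the application asks for the next event when it is ready, instead of receiving callbacks.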
public String readContent()
Return String that contains just read CONTENT.
public void readEndTag(XmlEndTag etag)
Read the END_TAG just read into the EndTag passed as argument.
public void readNamespacesPrefixes(int depth, String[] prefixes, int off, int len)
Return namespace prefixes for element at depth
public void readNamespacesUris(int depth, String[] uris, int off, int len)
Return namespace URIs for element at depth
public byte readNode(XmlNode node)
public void readNodeWithoutChildren(XmlNode node)
public void readStartTag(XmlStartTag stag)
Read the START_TAG just read into the StartTag passed as argument.
public void reset()
Reset parser state so it can be used to parse new
protected void resetState()
public void setAllowedMixedContent(boolean enable)
Allow for mixed element content.
Enabled by default.
When disabled, an element must contain either text
or other elements.
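The restriction can be made concrete with a small checker over a sequence of event codes. This is a hypothetical sketch, not how the parser enforces the rule: the class name, the event constants, and the decision to count every CONTENT event as text (including white-space-only content) are assumptions, and the input is assumed to be well-formed.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Illustrative sketch only; not the XPP implementation.
class MixedContentCheck {
    static final byte START_TAG = 2, END_TAG = 3, CONTENT = 4;

    // True if no element has both text and child elements.
    // Assumes a well-formed event sequence (every END_TAG has a START_TAG).
    static boolean hasNoMixedContent(byte[] events) {
        // One flag pair per open element: [0] = saw text, [1] = saw child element.
        Deque<boolean[]> open = new ArrayDeque<>();
        for (byte ev : events) {
            if (ev == START_TAG) {
                boolean[] parent = open.peek();
                if (parent != null) {
                    if (parent[0]) return false;   // child element after text
                    parent[1] = true;
                }
                open.push(new boolean[2]);
            } else if (ev == END_TAG) {
                open.pop();
            } else if (ev == CONTENT) {
                boolean[] current = open.peek();
                if (current != null) {
                    if (current[1]) return false;  // text after child element
                    current[0] = true;
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // <a>text<b></b></a> mixes text and a child element inside <a>.
        byte[] mixed = {START_TAG, CONTENT, START_TAG, END_TAG, END_TAG};
        System.out.println(hasNoMixedContent(mixed));
    }
}
```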
public void setBufferShrinkable(boolean shrinkable)
public void setHardLimit(int value)
public void setInput(Reader reader)
Reset parser and set new input.
public void setInput(char[] buf)
Reset parser and set new input.
public void setInput(char[] buf, int off, int len)
public void setNamespaceAttributesReporting(boolean enable)
Make the parser report xmlns* attributes. Disabled by default.
Only meaningful when namespaces are enabled (when namespaces
are disabled all attributes are always reported).
public void setNamespaceAware(boolean awareness)
Set support of namespaces. Disabled by default.
public void setSoftLimit(int value)
public byte skipNode()
If the parser has just read a start tag, skip the whole
subtree contained in this element. Returns when it encounters
the end tag matching the start tag.
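Skipping a subtree amounts to counting nesting depth until the matching end tag is reached. The sketch below shows that idea over an array of event codes; the class, the constants and the array-based input are hypothetical stand-ins for repeated calls to next(), not the XPP source.

```java
// Illustrative sketch only; not the XPP implementation.
class SkipNodeSketch {
    static final byte START_TAG = 2, END_TAG = 3, CONTENT = 4;

    // events[pos] must be the START_TAG just read.
    // Returns the index of the END_TAG that closes it.
    static int skipNode(byte[] events, int pos) {
        if (events[pos] != START_TAG) {
            throw new IllegalStateException("skipNode must follow a start tag");
        }
        int depth = 1;                 // the element being skipped is open
        int i = pos;
        while (depth > 0) {
            i++;
            if (i >= events.length) {
                throw new IllegalStateException("no matching end tag");
            }
            if (events[i] == START_TAG) {
                depth++;               // nested element opens
            } else if (events[i] == END_TAG) {
                depth--;               // nested or outer element closes
            }
        }
        return i;
    }

    public static void main(String[] args) {
        // <a><b>text</b>text</a><c></c>
        byte[] events = {START_TAG, START_TAG, CONTENT, END_TAG, CONTENT, END_TAG,
                         START_TAG, END_TAG};
        System.out.println(skipNode(events, 0));
    }
}
```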
Copyright (c) 2003 IU Extreme! Lab http://www.extreme.indiana.edu/ All Rights Reserved.
Note: this package is deprecated by XPP3, which implements the XmlPull API.