XML::Parser::PerlSAX - Perl SAX parser using XML::Parser
use XML::Parser::PerlSAX;
$parser = XML::Parser::PerlSAX->new( [OPTIONS] );
$result = $parser->parse( [OPTIONS] );
$result = $parser->parse($string);
XML::Parser::PerlSAX is a PerlSAX parser using the XML::Parser
module. This man page summarizes the specific options, handlers, and
properties supported by XML::Parser::PerlSAX; please refer to the
PerlSAX standard in `PerlSAX.pod' for general usage information.
Options passed to `parse()' override the default options in the
parser object for the duration of the parse.
The `location()' method returns the location as a hash with the
following properties:

    ColumnNumber    The column number of the parse.
    LineNumber      The line number of the parse.
    BytePosition    The current byte position of the parse.
    PublicId        A string containing the public identifier, or
                    undef if none is available.
    SystemId        A string containing the system identifier, or
                    undef if none is available.
    Base            The current value of the base for resolving
                    relative URIs.
ALPHA WARNING: The `SystemId' and `PublicId' properties returned
are the system and public identifiers of the document passed to
`parse()', not the identifiers of the external entity currently
being parsed. The column, line, and byte positions are those of the
current entity being parsed.
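As a rough illustration of reading these properties, the sketch below
calls `location()' from inside a `start_element()' handler. The
`LocationLogger' class, its `Parser' and `Lines' fields, and the sample
document are assumptions made for this example; it also assumes that
`location()' may be called on the parser object while a parse is in
progress.

```perl
use strict;
use warnings;
use XML::Parser::PerlSAX;

# Hypothetical handler (not part of the module): records the line on
# which each start tag was reported.
{
    package LocationLogger;

    sub new { return bless { Parser => undef, Lines => {} }, shift }

    sub start_element {
        my ($self, $element) = @_;
        # Assumption: location() is callable during the parse.
        my $location = $self->{Parser}->location;
        $self->{Lines}{ $element->{Name} } = $location->{LineNumber};
        return;
    }
}

my $handler = LocationLogger->new;
my $parser  = XML::Parser::PerlSAX->new( Handler => $handler );
$handler->{Parser} = $parser;    # let the handler query the parser

$parser->parse("<doc>\n  <item/>\n</doc>\n");

for my $name (sort keys %{ $handler->{Lines} }) {
    print "$name starts on line $handler->{Lines}{$name}\n";
}
```

The handler keeps a reference to the parser only so that it can ask for
the location; any other way of sharing the parser object would serve
equally well.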
The following options are supported by XML::Parser::PerlSAX:
    Handler              default handler to receive events
    DocumentHandler      handler to receive document events
    DTDHandler           handler to receive DTD events
    ErrorHandler         handler to receive error events
    EntityResolver       handler to resolve entities
    Locale               locale to provide localisation for errors
    Source               hash containing the input source for parsing
    UseAttributeOrder    set to true to provide AttributeOrder and
                         Defaulted properties in `start_element()'
If no handlers are provided then all events will be silently ignored,
except for `fatal_error()', which will cause a `die()' to be called
after calling `end_document()'.
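The failure case described above can be trapped with `eval'. This is a
minimal sketch, assuming a deliberately malformed document and a parser
created with no handlers at all; the exact text of the error message
comes from the underlying parser and is not specified here.

```perl
use strict;
use warnings;
use XML::Parser::PerlSAX;

# No handlers: events are ignored, but a fatal error still dies.
my $parser = XML::Parser::PerlSAX->new;

# The document is malformed on purpose (<b> is never closed).
my $ok    = eval { $parser->parse('<doc><b></doc>'); 1 };
my $error = $@;

if ($ok) {
    print "parsed without error\n";
}
else {
    print "parse failed: $error\n";
}
```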
If a single string argument is passed to the `parse()' method, it
is treated as if a `Source' option was given with a `String'
parameter.
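The two calling forms can be compared side by side. The sketch below
assumes a hypothetical `ElementCounter' handler class and a small
sample document, neither of which is part of the module; it parses the
same text once as a bare string and once through an explicit `Source'
hash.

```perl
use strict;
use warnings;
use XML::Parser::PerlSAX;

# Hypothetical handler (not part of the module): counts start tags.
{
    package ElementCounter;

    sub new { return bless { Count => 0 }, shift }

    sub start_element {
        my ($self, $element) = @_;
        $self->{Count}++;
        return;
    }
}

my $xml = '<doc><item/><item/></doc>';

# Short form: a single string argument.
my $short = ElementCounter->new;
XML::Parser::PerlSAX->new( Handler => $short )->parse($xml);

# Long form: the same document given as a Source hash.
my $long = ElementCounter->new;
XML::Parser::PerlSAX->new( Handler => $long )
    ->parse( Source => { String => $xml } );

print "short form: $short->{Count} elements\n";
print "long form:  $long->{Count} elements\n";
```

Each parse uses its own parser and handler object so that the two
counts stay independent.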
The `Source' hash may contain the following parameters:
    ByteStream    The raw byte stream (file handle) containing the
                  document.
    String        A string containing the document.
    SystemId      The system identifier (URI) of the document.
    PublicId      The public identifier.
    Encoding      A string describing the character encoding.
If more than one of `ByteStream', `String', or `SystemId' is
provided, then preference is given first to `ByteStream', then
`String', then `SystemId'.
The following handlers and properties are supported by
XML::Parser::PerlSAX:
start_document

    No properties defined.

end_document

    No properties defined.
start_element

    Name          The element type name.
    Attributes    A hash containing the attributes attached to the
                  element, if any.
The `Attributes' hash contains only string values.
If the `UseAttributeOrder' parser option is true, the following
properties are also passed to `start_element':
    AttributeOrder    An array of attribute names in the order they
                      were specified, followed by the defaulted
                      attribute names.
    Defaulted         The index number of the first defaulted
                      attribute in `AttributeOrder'. If this index is
                      equal to the length of `AttributeOrder', there
                      were no defaulted values.
Note to XML::Parser users: `Defaulted' will be half the value of
XML::Parser::Expat's `specified_attr()' function because only
attribute names are provided, not their values.
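To show how `AttributeOrder' and `Defaulted' fit together, the sketch
below splits the attribute names into those written in the document and
those supplied by the DTD. The `AttrOrderLogger' class and the sample
document are assumptions made for this example; the document declares a
default for `lang' in its internal subset and does not specify it on
the element.

```perl
use strict;
use warnings;
use XML::Parser::PerlSAX;

# Hypothetical handler (not part of the module): keeps the ordering
# information reported for each element.
{
    package AttrOrderLogger;

    sub new { return bless { Elements => {} }, shift }

    sub start_element {
        my ($self, $element) = @_;
        $self->{Elements}{ $element->{Name} } = {
            Order     => $element->{AttributeOrder},
            Defaulted => $element->{Defaulted},
        };
        return;
    }
}

my $xml = <<'XML';
<!DOCTYPE doc [
  <!ATTLIST doc lang CDATA "en">
]>
<doc id="1" title="t"/>
XML

my $handler = AttrOrderLogger->new;
my $parser  = XML::Parser::PerlSAX->new(
    Handler           => $handler,
    UseAttributeOrder => 1,
);
$parser->parse($xml);

my $info      = $handler->{Elements}{doc};
my @order     = @{ $info->{Order} };
my $defaulted = $info->{Defaulted};

# Names before the Defaulted index were written in the document;
# names from that index onward came from DTD defaults.
my @specified = @order[ 0 .. $defaulted - 1 ];
my @defaults  = @order[ $defaulted .. $#order ];

print "specified: @specified\n";
print "defaulted: @defaults\n";
```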
end_element

    Name    The element type name.

characters

    Data    The characters from the XML document.

processing_instruction

    Target    The processing instruction target.
    Data      The processing instruction data, if any.

comment

    Data    The comment data, if any.

start_cdata

    No properties defined.

end_cdata

    No properties defined.
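entity_reference

If this handler is defined, internal entities will not be expanded
and will not be passed to the `characters()' handler. If this handler
is not defined, internal entities will be expanded if possible and
passed to the `characters()' handler.

    Name     The entity reference name.
    Value    The entity reference value.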
notation_decl

    Name        The notation name.
    PublicId    The notation's public identifier, if any.
    SystemId    The notation's system identifier, if any.
    Base        The base for resolving a relative URI, if any.

unparsed_entity_decl

    Name        The unparsed entity's name.
    SystemId    The entity's system identifier.
    PublicId    The entity's public identifier, if any.
    Base        The base for resolving a relative URI, if any.
entity_decl

    Name        The entity name.
    Value       The entity value, if any.
    PublicId    The entity's public identifier, if any.
    SystemId    The entity's system identifier, if any.
    Notation    The notation declared for this entity, if any.
For internal entities, the `Value' parameter will contain the value
and the `PublicId', `SystemId', and `Notation' parameters will be
undefined. For external entities, the `Value' parameter will be
undefined, the `SystemId' parameter will have the system id, the
`PublicId' parameter will have the public id if it was provided (it
will be undefined otherwise), and the `Notation' parameter will
contain the notation name for unparsed entities. If this is a
parameter entity declaration, then a '%' will be prefixed to the
entity name.
Note that `entity_decl()' and `unparsed_entity_decl()' overlap.
If both methods are implemented by a handler, then this handler will
not be called for unparsed entities.
element_decl

    Name     The element type name.
    Model    The content model as a string.
attlist_decl

This handler is called for each attribute in an ATTLIST declaration
found in the internal subset. So an ATTLIST declaration that has
multiple attributes will generate multiple calls to this handler.

    ElementName      The element type name.
    AttributeName    The attribute name.
    Type             The attribute type.
    Fixed            True if this is a fixed attribute.
The default for `Type' is the default value, which will either be
"#REQUIRED", "#IMPLIED" or a quoted string (i.e. the returned string
will begin and end with a quote character).
doctype_decl

    Name        The document type name.
    SystemId    The document's system identifier.
    PublicId    The document's public identifier, if any.
    Internal    The internal subset as a string, if any.
Internal will contain all whitespace, comments, processing instructions, and declarations seen in the internal subset. The declarations will be there whether or not they have been processed by another handler (except for unparsed entities processed by the Unparsed handler). However, comments and processing instructions will not appear if they've been processed by their respective handlers.
xml_decl

    Version       The version.
    Encoding      The encoding string, if any.
    Standalone    True, false, or undefined if not declared.
    Name        The notation name.
    SystemId    The notation's system identifier.
    PublicId    The notation's public identifier, if any.
    Base        The base for resolving a relative URI, if any.
`resolve_entity()' should return either undef, to request that the
parser open a regular URI connection to the system identifier, or a
hash describing the new input source. This hash has the same
properties as the `Source' parameter to `parse()':
    PublicId           The public identifier of the external entity
                       being referenced, or undef if none was
                       supplied.
    SystemId           The system identifier of the external entity
                       being referenced.
    String             String containing XML text.
    ByteStream         An open file handle.
    CharacterStream    An open file handle.
    Encoding           The character encoding, if known.
Ken MacLeod, ken@bitsko.slc.ut.us
perl(1), PerlSAX.pod(3)
Extensible Markup Language (XML) <http://www.w3c.org/XML/>
SAX 1.0: The Simple API for XML <http://www.megginson.com/SAX/>