Pod::Simple - framework for parsing Pod
TODO
Pod::Simple is a Perl library for parsing text in the Pod ("plain old
documentation") markup language that is typically used for writing
documentation for Perl and for Perl modules. The Pod format is explained
in the perlpod manpage; the most common formatter is called perldoc.
Be sure to read ENCODING if your Pod contains non-ASCII characters.
Pod formatters can use Pod::Simple to parse Pod documents and render them into
plain text, HTML, or any number of other formats. Typically, such formatters
will be subclasses of Pod::Simple, and so they will inherit its methods, like
parse_file.
If you're reading this document just because you have a Pod-processing subclass that you want to use, this document (plus the documentation for the subclass) is probably all you need to read.
If you're reading this document because you want to write a formatter subclass, continue reading it and then read the Pod::Simple::Subclassing manpage, and then possibly even read the perlpodspec manpage (some of which is for parser-writers, but much of which is notes to formatter-writers).
$parser = SomeClass->new();
This returns a new parser object, where SomeClass is a subclass of Pod::Simple.
$parser->output_fh( *OUT );
This sets the filehandle that $parser's output will be written to.
You can pass *STDOUT or *STDERR; otherwise you should probably do
something like this:
  my $outfile = "output.txt";
  open TXTOUT, '>', $outfile or die "Can't write to $outfile: $!";
  $parser->output_fh(*TXTOUT);
...before you call one of the $parser->parse_whatever methods.
$parser->output_string( \$somestring );
This sets the string that $parser's output will be sent to,
instead of any filehandle.
$parser->parse_file( $some_filename );
$parser->parse_file( *INPUT_FH );
This reads the Pod content of the file (or filehandle) that you specify,
and processes it with that $parser object, according to however
$parser's class works, and according to whatever parser options you
have set up for this $parser object.
$parser->parse_string_document( $all_content );
This works just like parse_file except that it reads the Pod
content not from a file, but from a string that you already have
in memory.
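As a minimal sketch of how output_string and parse_string_document fit together, the following assumes the Pod::Simple::Text formatter (the subclass used in the filter one-liner elsewhere in this document) is installed; the variable names and the sample Pod are only illustrative.

```perl
use strict;
use warnings;
use Pod::Simple::Text;

# A small Pod document held entirely in memory (illustrative content).
my $pod = "=head1 NAME\n\nDemo - a tiny example\n\n=cut\n";

my $parser   = Pod::Simple::Text->new();
my $rendered = '';

# Collect the formatter's output in $rendered instead of a filehandle.
$parser->output_string( \$rendered );

# Parse the in-memory string rather than a file.
$parser->parse_string_document( $pod );

print $rendered;
```

Setting the output destination before calling the parse method matters, since the formatter writes as it parses.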
$parser->parse_lines( ...@lines..., undef );
This processes the lines in @lines (where each list item must be a
defined value, and must contain exactly one line of content -- so no
items like "foo\nbar" are allowed). The final undef is used to
indicate the end of the document being parsed.
The other parse_whatever methods are meant to be called only once
per $parser object; but parse_lines can be called as many times per
$parser object as you want, as long as the last call (and only
the last call) ends with an undef value.
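The incremental use of parse_lines described above can be sketched as follows. This assumes Pod::Simple::Text as the formatter; the two-call split and the sample lines are arbitrary choices for illustration.

```perl
use strict;
use warnings;
use Pod::Simple::Text;

my $parser = Pod::Simple::Text->new();
my $out    = '';
$parser->output_string( \$out );

# First batch: one line of content per item, no trailing undef yet.
$parser->parse_lines( "=head1 NAME", "", "Demo - fed in pieces" );

# Final batch: only this last call ends with undef, marking end of document.
$parser->parse_lines( "", "=cut", undef );

print $out;
```

This shape suits input that arrives a chunk at a time, for example from a socket or a generator, where the whole document is never available as one string.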
$parser->content_seen
SomeClass->filter( $filename );
SomeClass->filter( *INPUT_FH );
SomeClass->filter( \$document_content );
perl -MPod::Simple::Text -e "Pod::Simple::Text->filter('thingy.pod')"
Some of these methods might be of interest to general users, as well as of interest to formatter-writers.
Note that the general pattern here is that the accessor-methods
read the attribute's value with $value = $parser->attribute
and set the attribute's value with $parser->attribute(newvalue).
For each accessor, I typically only mention one syntax or another,
based on which I think you are actually most likely to use.
$parser->parse_characters( SOMEVALUE )
The Pod parser normally expects to read octets and to convert those octets
to characters based on the =encoding declaration in the Pod source. Set
this option to a true value to indicate that the Pod source is already a Perl
character stream. This tells the parser to ignore any =encoding command
and to skip all the code paths involving decoding octets.
$parser->no_whining( SOMEVALUE )
Note that setting this attribute to true won't suppress one or two kinds of complaints about rarely occurring unrecoverable errors.
$parser->no_errata_section( SOMEVALUE )
$parser->complain_stderr( SOMEVALUE )
Setting complain_stderr also sets no_errata_section.
$parser->source_filename
$parser->doc_has_started
This returns true if $parser has read from a source, and has seen
Pod content in it.
$parser->source_dead
This returns true if $parser has read from a source, and come to the
end of that source.
$parser->strip_verbatim_indent( SOMEVALUE )
If the POD you're parsing adheres to a consistent indentation policy, you can have such indentation stripped from the beginning of every line of your verbatim blocks. This method tells Pod::Simple what to strip. For two-space indents, you'd use:
$parser->strip_verbatim_indent('  ');
For tab indents, you'd use a tab character:
$parser->strip_verbatim_indent("\t");
If the POD is inconsistent about the indentation of verbatim blocks, but you have figured out a heuristic to determine how much a particular verbatim block is indented, you can pass a code reference instead. The code reference will be executed with one argument, an array reference of all the lines in the verbatim block, and should return the value to be stripped from each line. For example, if you decide that you're fine to use the first line of the verbatim block to set the standard for indentation of the rest of the block, you can look at the first line and return the appropriate value, like so:
  $parser->strip_verbatim_indent(sub {
      my $lines = shift;
      # Everything before the first non-space character of line one
      (my $indent = $lines->[0]) =~ s/\S.*//;
      return $indent;
  });
If you'd rather treat each line individually, you can do that, too, by just
transforming them in-place in the code reference and returning undef. Say
that you don't want any lines indented. You can do something like this:
  $parser->strip_verbatim_indent(sub {
      my $lines = shift;
      # Remove leading whitespace from every line, in place
      s/^\s+// for @{ $lines };
      return undef;
  });
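To see the effect of strip_verbatim_indent end to end, here is a small sketch that parses a verbatim block into a tree and inspects the stored text. It assumes the Pod::Simple::SimpleTree subclass is available and that its tree nodes are array references of the form [name, attributes, children...]; the sample Pod and the four-space policy are illustrative.

```perl
use strict;
use warnings;
use Pod::Simple::SimpleTree;

# A verbatim block indented by four spaces (illustrative content).
my $pod = join "\n",
    '=head1 EXAMPLE', '',
    '    my $x = 1;',
    '    print $x;', '',
    '=cut', '';

my $parser = Pod::Simple::SimpleTree->new;
$parser->strip_verbatim_indent('    ');    # four spaces

my $root = $parser->parse_string_document($pod)->root;

# Find the first Verbatim node among the document's children.
my ($verbatim) = grep { ref $_ eq 'ARRAY' && $_->[0] eq 'Verbatim' } @$root;

# Elements after the name and attribute hash are the text content.
my $text = join '', @{$verbatim}[ 2 .. $#$verbatim ];
print "$text\n";
```

A tree-building subclass is used here only because text formatters usually add their own indentation to verbatim blocks, which would hide the stripping.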
$parser->abandon_output_fh()
Cancels output to the filehandle. Any Pod read by the $parser is not
affected.
$parser->abandon_output_string()
Cancels output to the output string. Any Pod read by the $parser is not
affected.
$parser->accept_code( @codes )
$parser->accept_codes( @codes )
Allows $parser to accept a list of formatting codes (see "Formatting Codes"
in the perlpod manpage). This can be used to implement user-defined codes.
$parser->accept_directive_as_data( @directives )
Allows $parser to accept a list of directives for data paragraphs. A
directive is the label of a command paragraph (see "Command Paragraph" in
the perlpod manpage). A data paragraph is one delimited by =begin/=for/=end
directives. This can be used to implement user-defined directives.
$parser->accept_directive_as_processed( @directives )
Allows $parser to accept a list of directives for processed paragraphs. A
directive is the label of a command paragraph (see "Command Paragraph" in
the perlpod manpage). A processed paragraph is also known as an ordinary
paragraph (see "Ordinary Paragraph" in the perlpod manpage). This can be
used to implement user-defined directives.
$parser->accept_directive_as_verbatim( @directives )
Allows $parser to accept a list of directives for verbatim paragraphs (see
"Verbatim Paragraph" in the perlpod manpage). A directive is the label of a
command paragraph (see "Command Paragraph" in the perlpod manpage). This
can be used to implement user-defined directives.
$parser->accept_target( @targets )
$parser->accept_target_as_text( @targets )
$parser->accept_targets( @targets )
Accepts targets for =begin/=for/=end sections of the POD.
$parser->accept_targets_as_text( @targets )
Accepts targets for =begin/=for/=end sections that should be parsed as
POD. For details, see "About Data Paragraphs" in the perlpodspec manpage.
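The effect of accepting a target can be sketched by dumping the parse events with and without it. This assumes the Pod::Simple::DumpAsXML subclass is available; the target name "mynote" and the helper dump_pod are hypothetical names used only for this illustration.

```perl
use strict;
use warnings;
use Pod::Simple::DumpAsXML;

# Pod with a =begin/=end region for a made-up target called "mynote".
my $pod = "=head1 NAME\n\nDemo\n\n"
        . "=begin mynote\n\nInternal remark.\n\n=end mynote\n\n=cut\n";

# Hypothetical helper: parse $pod, optionally accepting some targets
# as text, and return the XML dump of what the parser emitted.
sub dump_pod {
    my (@targets) = @_;
    my $parser = Pod::Simple::DumpAsXML->new;
    $parser->accept_targets_as_text(@targets) if @targets;
    my $xml = '';
    $parser->output_string( \$xml );
    $parser->parse_string_document($pod);
    return $xml;
}

my $ignored  = dump_pod();            # region for an unaccepted target
my $accepted = dump_pod('mynote');    # region parsed as ordinary Pod

print "without accept: ", ( $ignored  =~ /Internal remark/ ? "kept" : "dropped" ), "\n";
print "with accept:    ", ( $accepted =~ /Internal remark/ ? "kept" : "dropped" ), "\n";
```

A formatter subclass would normally call accept_targets or accept_targets_as_text once, in its constructor, for the formats it knows how to handle.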
$parser->any_errata_seen()
Example:
die "too many errors\n" if $parser->any_errata_seen();
$parser->errata_seen()
Example:
  if ( $parser->any_errata_seen() ) {
      $logger->log( $parser->errata_seen() );
  }
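For a self-contained variant of the examples above, the following sketch parses deliberately malformed Pod and reports what was recorded. It assumes Pod::Simple::Text as the formatter and, as an assumption about the return value, that errata_seen yields a hash reference keyed by line number whose values are array references of messages.

```perl
use strict;
use warnings;
use Pod::Simple::Text;

# Deliberately malformed Pod: the =over block is never closed with =back.
my $pod = "=head1 NAME\n\nDemo\n\n=over\n\n=item one\n\nAn unclosed list.\n";

my $parser = Pod::Simple::Text->new;
my $out    = '';
$parser->output_string( \$out );
$parser->parse_string_document($pod);

if ( $parser->any_errata_seen ) {
    # Assumed shape: { line_number => [ message, ... ], ... }
    my $errata = $parser->errata_seen;
    for my $line ( sort { $a <=> $b } keys %$errata ) {
        print STDERR "line $line: $_\n" for @{ $errata->{$line} };
    }
}
```

Checking any_errata_seen first keeps the common, error-free path cheap and avoids walking an empty structure.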
$parser->detected_encoding()
Returns the encoding corresponding to =encoding, but only if the
encoding was recognized and handled.
$parser->encoding()
$parser->parse_from_file( $source, $to )
Parses from the $source file to the $to file. Similar to parse_from_file
in the Pod::Parser manpage.
$parser->scream( @error_messages )
$parser->unaccept_code( @codes )
$parser->unaccept_codes( @codes )
Removes @codes as valid codes for the parse.
$parser->unaccept_directive( @directives )
$parser->unaccept_directives( @directives )
Removes @directives as valid directives for the parse.
$parser->unaccept_target( @targets )
$parser->unaccept_targets( @targets )
Removes @targets as valid targets for the parse.
$parser->version_report()
$parser->whine( @error_messages )
Logs an error, unless whining has been turned off with
$parser->no_whining( TRUE ).
The Pod::Simple parser expects to read octets. The parser will decode the
octets into Perl's internal character string representation using the value of
the =encoding declaration in the POD source.
If the POD source does not include an =encoding declaration, the parser will
attempt to guess the encoding (selecting one of UTF-8 or CP 1252) by examining
the first non-ASCII bytes and applying the heuristic described in
the perlpodspec manpage. (If the POD source contains only ASCII bytes, the
encoding is assumed to be ASCII.)
If you set the parse_characters option to a true value, the parser will
expect characters rather than octets, will ignore any =encoding, and will
make no attempt to decode the input.
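The character-input case can be sketched as follows, assuming Pod::Simple::Text as the formatter and a Pod::Simple recent enough to provide parse_characters. The sample string is illustrative and is already a Perl character string, not encoded octets.

```perl
use strict;
use warnings;
use Pod::Simple::Text;

# An already-decoded character string; \x{e9} is the character e-acute.
my $pod = "=head1 NAME\n\nCaf\x{e9} - characters, not octets\n\n=cut\n";

my $parser = Pod::Simple::Text->new;
$parser->parse_characters(1);    # input is characters; skip all decoding

my $out = '';
$parser->output_string( \$out );
$parser->parse_string_document($pod);

# Encode on the way out, since $out may now hold non-ASCII characters.
binmode STDOUT, ':encoding(UTF-8)';
print $out;
```

When the Pod comes from a file opened without a decoding layer, leaving parse_characters unset and relying on =encoding is the path the text above describes.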
the Pod::Simple::Subclassing manpage
Questions or discussion about POD and Pod::Simple should be sent to the pod-people@perl.org mail list. Send an empty email to pod-people-subscribe@perl.org to subscribe.
This module is managed in an open GitHub repository, https://github.com/perl-pod/pod-simple/. Feel free to fork and contribute, or to clone git://github.com/perl-pod/pod-simple.git and send patches!
Patches against Pod::Simple are welcome. Please send bug reports to <bug-pod-simple@rt.cpan.org>.
Copyright (c) 2002 Sean M. Burke.
This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
This program is distributed in the hope that it will be useful, but without any warranty; without even the implied warranty of merchantability or fitness for a particular purpose.
Pod::Simple was created by Sean M. Burke <sburke@cpan.org>. But don't bother him, he's retired.
Pod::Simple is maintained by:
Documentation has been contributed by: