Getopt::Declare - Declaratively Expressed Command-Line Arguments via Regular Expressions
This document describes version 1.14 of Getopt::Declare
use Getopt::Declare;
$args = Getopt::Declare->new($specification_string, $optional_source);
# or:
use Getopt::Declare $specification_string => $args;
Getopt::Declare is yet another command-line argument parser, one which is specifically designed to be powerful but exceptionally easy to use.
To parse the command-line in @ARGV, one simply creates a Getopt::Declare object, by passing Getopt::Declare::new() a specification of the various parameters that may be encountered:
use Getopt::Declare;
$args = Getopt::Declare->new($specification);
This may also be done in a one-liner:
use Getopt::Declare $specification => $args;
The specification is a single string such as this:
$specification = q(
-a	Process all data
-b <N:n>	Set mean byte length threshold to <N>
		{ $bytelen = $N; }
+c <FILE>	Create new file <FILE>
--del	Delete old file
		{ delold() }
delete	[ditto]
e <H:i>x<W:i>	Expand image to height <H> and width <W>
		{ expand($H,$W); }
-F <file>...	Process named file(s)
		{ defer {for (@file) {process()}} }
=getrand [<N>]	Get a random number (or, optionally, <N> of them)
		{ $N = 1 unless defined $N; }
--	Traditionally indicates end of arguments
		{ finish }
);
Note that in each of the cases above, there is a tab between each parameter definition and description (even if you can't see it)! In the specification, the syntax of each parameter is declared, along with a description and (optionally) one or more actions to be performed when the parameter is encountered. The specification string may also include other usage formatting information (such as group headings or separators) as well as standard Perl comments (which are ignored).
Calling Getopt::Declare::new() parses the contents of the array @ARGV, extracting any arguments which match the parameters defined in the specification string, and storing the parsed values as hash elements within the new Getopt::Declare object being created.
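The following is a minimal sketch of that workflow, not a definitive recipe. It assumes a simulated command line (assigned to @ARGV by hand) and writes the mandatory tabs as visible "\t" markers that are converted before parsing; both are conveniences of this example rather than features of the module.

```perl
use strict;
use warnings;
use Getopt::Declare;

# The specification is written with visible "\t" markers, which are then
# converted to the real tabs that separate definitions from descriptions.
(my $spec = <<'EOSPEC') =~ s/\\t/\t/g;
-v\tVerbose mode
\t{ $::verbose = 1; }
-q\tQuiet mode
\t{ $::verbose = 0; }
EOSPEC

our $verbose = 0;               # package variable set by the actions above

local @ARGV = ('-v');           # stand-in for a real command line
my $args = Getopt::Declare->new($spec);

print "verbose is ", ($verbose ? "on" : "off"), "\n";
```

The actions refer to $::verbose (that is, $main::verbose) so that they work regardless of any lexical variables in the calling script.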
Other features of the Getopt::Declare package include:
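Strict or non-strict parsing (unrecognized command-line elements may either trigger an error or simply be left in @ARGV).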
Declarative specification of various inter-parameter relationships (for example, two parameters may be declared mutually exclusive and this relationship will then be automatically enforced).
Intelligent clustering of adjacent flags (for example: the command-line sequence ``-a -b -c'' may be abbreviated to ``-abc'', unless there is also a -abc flag declared).
Selective or global case-insensitivity of parameters.
The ability to parse files (especially configuration files) instead of the command-line.
The terminology of command-line processing is often confusing, with various terms (such as ``argument'', ``parameter'', ``option'', ``flag'', etc.) frequently being used interchangeably and inconsistently in the various Getopt:: packages available. In this documentation, the following terms are used consistently:
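The ``command-line'' is the contents of @ARGV at the time a Getopt::Declare object is created.
A ``parameter specification'' (or just ``parameter'') describes a single entity which may appear in the command-line. For example, the following is a single parameter specification with two variants:
--window <height> x <width>	Set window to <height> by <width>
		{ setwin($width,$height); }
--window <h>x<w>@<x>,<y>	Set window size and centroid
		{ setwin($w,$h,$x,$y); }
An ``argument'' is a part of the command-line which matches a parameter. An argument may be a single element of @ARGV, or part of a single @ARGV element, or the concatenation of several adjacent @ARGV elements.
In the above example:
--window <height> x <width> is a parameter definition.
--window is the parameter flag.
<height>, <width>, <h>, <w>, <x>, and <y> are all parameter variables.
x and @ are punctuators.
Set window to <height> by <width> is a parameter description.
{ setwin($width,$height); } is a parameter action.
The two definitions are variants of the --window parameter.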
As indicated above, a parameter specification consists of three parts: the parameter definition, a textual description, and any actions to be performed when the parameter is matched.
The parameter definition consists of a leading flag or parameter variable, followed by any number of parameter variables or punctuators, optionally separated by spaces. The parameter definition is terminated by the first tab that is encountered after the start of the parameter definition. At least one trailing tab must be present.
For example, all of the following are valid Getopt::Declare parameter definitions:
-v	
in=<infile>	
+range <from>..<to>	
--lines <start> - <stop>	
ignore bad lines	
<outfile>	
Note that each of the above examples has at least one trailing tab (even if you can't see them)! Note too that this hodge-podge of parameter styles is certainly not recommended within a single program, but is shown so as to illustrate some of the range of parameter syntax conventions Getopt::Declare supports.
The spaces between components of the parameter definition are optional but significant, both in the definition itself and in the arguments that the definition may match. If there is no space between components in the specification, then no space is allowed between corresponding arguments on the command-line. If there is space between components of the specification, then space between those components is optional on the command-line.
For example, the --lines parameter above matches:
--lines1-10
--lines 1-10
--lines 1 -10
--lines 1 - 10
--lines1- 10
If it were instead specified as:
--lines <start>-<stop>
then it would match only:
--lines1-10
--lines 1-10
Note that the optional nature of spaces in parameter specification implies that flags and punctuators cannot contain the character '<' (which is taken as the delimiter for a parameter variable) nor the character '[' (which introduces an optional parameter component - see Optional parameter components).
By default, a parameter variable will match a single blank-terminated or comma-delimited string. For example, the parameter:
-val <value>
would match any of the following arguments:
-value            # <value> <- "ue"
-val abcd         # <value> <- "abcd"
-val 1234         # <value> <- "1234"
-val "a value"    # <value> <- "a value"
It is also possible to restrict the types of values which may be matched by a given parameter variable. For example:
-limit <threshold:n>	Set threshold to some (real) value
-count <N:i>	Set count to <N> (must be an integer)
If a parameter variable is suffixed with ``:n'', it will match any reasonable numeric value, whilst the ``:i'' suffix restricts a parameter variable to only matching integer values. These two ``type specifiers'' are the simplest examples of a much more powerful mechanism, which allows parameter variables to be restricted to matching any specific regular expression. See Defining new parameter variable types.
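As an illustration of the two type specifiers, here is a small sketch under stated assumptions: the command line is simulated by assigning to @ARGV, the tabs are written as "\t" markers and converted before parsing, and the package variables $limit and $count are names invented for this example.

```perl
use strict;
use warnings;
use Getopt::Declare;

# Visible "\t" markers are converted to the real tabs the module requires.
(my $spec = <<'EOSPEC') =~ s/\\t/\t/g;
-limit <threshold:n>\tSet threshold to some (real) value
\t{ $::limit = $threshold; }
-count <N:i>\tSet count to <N> (must be an integer)
\t{ $::count = $N; }
EOSPEC

our ($limit, $count);           # filled in by the actions above

local @ARGV = ('-limit', '2.5', '-count', '42');   # simulated command line
Getopt::Declare->new($spec);

print "limit=$limit count=$count\n";
```

Each action copies its parameter variable ($threshold or $N) into a package variable, since the parameter variables themselves exist only inside the action block.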
Parameter variables are treated as scalars by default, but this too can be altered. Any parameter variable immediately followed by an ellipsis (...) is treated as a list variable, and matches its specified type sequentially as many times as possible. For example, the parameter specification:
-pages <page:i>...
would match either of the following arguments:
-pages 1
-pages 1 2 7 20
Note that both scalar and list parameter variables are ``respectful'' of the flags of other parameters as well as their own trailing punctuators. For example, given the specifications:
-a
-b <b_list>...
-c <c_list>... ;
The following arguments will be parsed as indicated:
-b -d -e -a    # <b_list> <- ("-d", "-e")
-b -d ;        # <b_list> <- ("-d", ";")
-c -d ;        # <c_list> <- ("-d")
List parameter variables are also ``respectful'' of the needs of subsequent parameter variables. That is, a parameter specification like:
-copy <files>... <dir>
will behave as expected, putting all but the last string after the -copy flag into the parameter variable <files>, whilst the very last string is assigned to <dir>.
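A short sketch of a list parameter variable in use follows. The simulated @ARGV, the "\t" markers (converted to tabs before parsing) and the package variables @pages and $verbose are assumptions made for this example only.

```perl
use strict;
use warnings;
use Getopt::Declare;

# Visible "\t" markers are converted to the real tabs the module requires.
(my $spec = <<'EOSPEC') =~ s/\\t/\t/g;
-pages <page:i>...\tSpecify pages to list
\t{ @::pages = @page; }
-v\tVerbose mode
\t{ $::verbose = 1; }
EOSPEC

our (@pages, $verbose);         # filled in by the actions above

local @ARGV = qw(-pages 1 2 7 20 -v);   # simulated command line
Getopt::Declare->new($spec);

print "pages: @pages\n";
print "verbose: ", ($verbose ? "yes" : "no"), "\n";
```

Inside the action the list variable <page> is visible as the array @page, and the list stops where the next argument no longer looks like an integer.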
Except for the leading flag, any part of a parameter definition may be made optional by placing it in square brackets. For example:
+range <from> [..] [<to>]
which matches any of:
+range 1..10
+range 1..
+range 1 10
+range 1
List parameter variables may also be made optional (the ellipsis must follow the parameter variable name immediately, so it goes inside the square brackets):
-list [<page>...]
Two or more parameter components may be made jointly optional, by specifying them in the same pair of brackets. Optional components may also be nested. For example:
-range <from> [.. [<to>] ]
Scalar optional parameter variables (such as [<to>]) are given undefined values if they are skipped during a successful parameter match. List optional parameter variables (such as [<page>...]) are assigned an empty list if unmatched.
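The behaviour of a skipped optional scalar can be sketched as follows. The helper parse_range, the simulated command lines and the package variables $from and $to are hypothetical names introduced for this example; tabs are written as "\t" markers and converted before parsing.

```perl
use strict;
use warnings;
use Getopt::Declare;

# Visible "\t" markers are converted to the real tabs the module requires.
(my $spec = <<'EOSPEC') =~ s/\\t/\t/g;
+range <from> [..] [<to>]\tSpecify a range
\t{ $::from = $from; $::to = $to; }
EOSPEC

our ($from, $to);               # filled in by the action above

# Hypothetical helper: parse one simulated command line and return
# whatever the action stored.
sub parse_range {
    local @ARGV = @_;
    ($from, $to) = (undef, undef);
    Getopt::Declare->new($spec);
    return ($from, $to);
}

for my $cmd (['+range', '1'], ['+range', '1', '10']) {
    my ($f, $t) = parse_range(@$cmd);
    printf "%-12s from=%s to=%s\n",
        "@$cmd",
        defined $f ? $f : '(undef)',
        defined $t ? $t : '(undef)';
}
```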
One important use for optional punctuators is to provide abbreviated versions of specific flags. For example:
-num[eric]             # Match "-num" or "-numeric"
-lexic[ographic]al     # Match "-lexical" or "-lexicographical"
-b[ells+]w[histles]    # Match "-bw" or "-bells+whistles"
Note that the actual flags for these three parameters are -num, -lexic and -b, respectively.
Providing a textual description for each parameter (or parameter variant) is optional, but strongly recommended. Apart from providing internal documentation, parameter descriptions are used in the automatically-generated usage information provided by Getopt::Declare.
Descriptions may be placed after the first tab(s) following the parameter definition and may be continued on subsequent lines, provided those lines do not contain any tabs after the first non-whitespace character (because any such line will instead be treated as a new parameter specification). The description is terminated by a blank line, an action specification (see below) or another parameter specification.
For example:
-v	Verbose mode
in=<infile>	Specify input file
	(will fail if file does not exist)

+range <from>..<to>	Specify range of columns to consider
--lines <start> - <stop>	Specify range of lines to process

ignore bad lines	Ignore bad lines :-)

<outfile>	Specify an output file
The parameter description may also contain special directives which alter the way in which the parameter is parsed. See the various subsections of ADVANCED FEATURES for more information.
A common mistake is to use tabs to separate components of a parameter description:
-delete	<filename>	Delete the named file
-d	<filename>	Delete the named file
The tabs after "-delete" and "-d" do a good job of lining up the two "<filename>" parameter variables, but they also mark the start of the description, which means that after descriptions are stripped, the two parameters are:
-delete
-d
The solution is to use spaces, not tabs, to align components within a parameter specification.
Each parameter specification may also include one or more blocks of Perl code, specified in a pair of curly brackets (which must start on a new line).
Each action is executed as soon as the corresponding parameter is successfully matched in the command-line (but see Deferred actions for a means of delaying this response).
For example:
-v	Verbose mode
		{ $::verbose = 1; }
-q	Quiet mode
		{ $::verbose = 0; }
Actions are executed (as do blocks) in the package in which the Getopt::Declare object containing them was created. Hence they have access to all package variables and functions in that namespace.
In addition, each parameter variable belonging to the corresponding parameter is made available as a (block-scoped) Perl variable with the same name. For example:
+range <from>..<to>	Set range
		{ setrange($from, $to); }
-list <page:i>...	Specify pages to list
		{ foreach (@page) { list($_) if $_ > 0; } }
Note that scalar parameter variables become scalar Perl variables, and list parameter variables become Perl arrays.
Within an action the following variables are also available:
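$_PARAM_ identifies the current parameter (for a parameter with a leading flag, that flag).
%_PUNCT_ records the punctuators matched in the current parameter, indexed by the punctuator itself. Its main use is to let an action check whether an optional punctuator was in fact matched. For example:
-v[erbose]	Set verbose mode (doubly verbose if full word used)
		{ if ($_PUNCT_{"erbose"}) { $verbose = 2; }
		  else { $verbose = 1; } }
%_FOUND_ records which parameters have already been found, indexed by their leading flags. For instance, the following specification makes the -q and -v parameters mutually exclusive (but see Parameter dependencies for a much easier way to achieve this effect):
-v	Set verbose mode
		{ die "Can't be verbose *and* quiet!\n" if $_FOUND_{"-q"}; }
-q	Set quiet mode
		{ die "Can't be quiet *and* verbose!\n" if $_FOUND_{"-v"}; }
For reasons that will be explained in Termination and rejection, a given parameter is not marked as found until after its associated actions are executed. That is, $_FOUND_{$_PARAM_} will not (usually) be true during a parameter action.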
Note that, although numerous other internal variables on which the generated parser relies are also visible within parameter actions, accessing any of them may have Dire Consequences. Moreover, these other variables may no longer be accessible (or even present) in future versions of Getopt::Declare. All such internal variables have names beginning with an underscore. Avoiding such variable names will ensure there are no conflicts between actions and the parser itself.
Whenever a Getopt::Declare object is created, the current command-line is parsed sequentially, by attempting to match each parameter in the object's specification string against the current elements in the @ARGV array (but see Parsing from other sources). The order in which parameters are compared against the arguments in @ARGV is determined by three rules:
Parameters with longer flags are tried first. Hence the argument ``-quiet'' would be parsed as matching the parameter -quiet rather than the parameter -q <string>, even if the -q parameter was defined first.
Parameter variants with the most components are matched first. Hence the argument ``-rand 12345'' would be parsed as matching the parameter variant -rand <seed>, rather than the variant -rand, even if the ``shorter'' -rand variant was defined first.
Otherwise, parameters are matched in the order they are defined.
Note, however, that the arguments themselves are considered strictly in the order they appear on the command line. That is: Getopt::Declare takes the first (leftmost) argument and compares it against all the parameter specifications in the order described above. Then it gets the second argument and does the same. Et cetera. So, whilst parameters are considered ``flags-first-by-length'', arguments are considered ``left-to-right''. If that seems paradoxical, you probably need to review the difference between ``arguments'' and ``parameters'', as explained in Terminology.
Elements of @ARGV which do not match any defined parameter are collected during the parse and are eventually put back into @ARGV (see Strict and non-strict command-line parsing).
By default, a Getopt::Declare object parses the command-line in a case-sensitive manner. The [nocase] directive enables a specific parameter (or, alternatively, all parameters) to be matched case-insensitively.
If a [nocase] directive is included in the description of a specific parameter variant, then that variant (only) will be matched without regard for case. For example, the specification:
-q	Quiet mode [nocase]
-v	Verbose mode
means that the arguments ``-q'' and ``-Q'' will both match the -q parameter, but that only ``-v'' (and not ``-V'') will match the -v parameter.
If a [nocase] directive appears anywhere outside a parameter description, then the entire specification is declared case-insensitive and all parameters defined in that specification are matched without regard to case.
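The per-parameter form of [nocase] can be tried out with a sketch like the one below. The simulated @ARGV, the "\t" markers (converted to tabs before parsing) and the package variables $quiet and $verbose are assumptions of this example.

```perl
use strict;
use warnings;
use Getopt::Declare;

# Visible "\t" markers are converted to the real tabs the module requires.
(my $spec = <<'EOSPEC') =~ s/\\t/\t/g;
-q\tQuiet mode [nocase]
\t{ $::quiet = 1; }
-v\tVerbose mode
\t{ $::verbose = 1; }
EOSPEC

our ($quiet, $verbose) = (0, 0);   # set by the actions above

local @ARGV = ('-Q', '-V');        # simulated command line, upper case
Getopt::Declare->new($spec);

print "quiet:   ", ($quiet   ? "matched" : "not matched"), "\n";
print "verbose: ", ($verbose ? "matched" : "not matched"), "\n";
```

Only the -q parameter carries the directive, so only ``-Q'' is expected to match; ``-V'' is simply an unrecognized argument in this (non-strict) specification.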
It is sometimes useful to be able to terminate command-line processing before all arguments have been parsed. To this end, Getopt::Declare provides a special local operator (finish) which may be used within actions. The finish operator takes a single optional argument. If the argument is true (or omitted), command-line processing is terminated at once (although the current parameter is still marked as having been successfully matched). For example:
--	Traditional argument list terminator
		{ finish }
-no--	Use non-traditional terminator instead
		{ $nontrad = 1; }
##	Non-traditional terminator (only valid if -no-- flag seen)
		{ finish($nontrad); }
It is also possible to reject a single parameter match from within an action (and then continue trying other candidates). This allows actions to be used to perform more sophisticated tests on the type of a parameter variable, or to implement complicated parameter interdependencies.
To reject a parameter match, the reject operator is used. The reject operator takes an optional argument. If the argument is true (or was omitted), the current parameter match is immediately rejected. For example:
-ar <R:n>	Set aspect ratio (must be in the range (0..1])
		{ $::sawaspect++; reject $R <= 0 || $R > 1 ; setaspect($R); }
-q	Quiet option (not available on Wednesdays)
		{ reject((localtime)[6] == 3); $::verbose = 0; }
Note that any actions performed before the call to reject will still have effect (for example, the variable $::sawaspect remains incremented even if the aspect ratio parameter is subsequently rejected).
The reject operator may also take a second argument, which is used as an error message if the rejected argument subsequently fails to match any other parameter. For example:
-q	Quiet option (not available on Wednesdays)
		{ reject((localtime)[6] == 3 => "Not today!"); $::verbose = 0; }
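The effect of reject on a range check might be exercised as in the following sketch. The helper aspect_for, the simulated command lines and the package variable $aspect are hypothetical; tabs are written as "\t" markers and converted before parsing. The specification is non-strict, so a rejected argument is assumed simply to remain unprocessed.

```perl
use strict;
use warnings;
use Getopt::Declare;

# Visible "\t" markers are converted to the real tabs the module requires.
(my $spec = <<'EOSPEC') =~ s/\\t/\t/g;
-ar <R:n>\tSet aspect ratio (greater than 0, at most 1)
\t{ reject($R <= 0 || $R > 1); $::aspect = $R; }
EOSPEC

our $aspect;                    # set by the action only if not rejected

# Hypothetical helper: parse one simulated command line and return the
# aspect ratio that was accepted, if any.
sub aspect_for {
    local @ARGV = @_;
    undef $aspect;
    Getopt::Declare->new($spec);
    return $aspect;
}

for my $value ('0.5', '2') {
    my $got = aspect_for('-ar', $value);
    print "-ar $value: ", (defined $got ? "accepted ($got)" : "rejected"), "\n";
}
```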
As was mentioned in Type of parameter variables, parameter variables can be restricted to matching only numbers or only integers by using the type specifiers ``:n'' and ``:i''. Getopt::Declare provides seven other inbuilt type specifiers, as well as two mechanisms for defining new restrictions on parameter variables.
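The other inbuilt type specifiers are:
:+i matches positive (non-zero) integers.
:+n matches positive (non-zero) numbers.
:0+i matches non-negative integers.
:0+n matches non-negative numbers.
:s matches any string (the default for untyped parameter variables).
:if matches the name of an input file (one that can be read).
:of matches the name of an output file (one that can be written).
For example: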
-repeat <count:+i> Repeat <count> times (must be > 0)
-scale <factor:0+n> Set scaling factor (cannot be negative)
Alternatively, parameter variables can be restricted to matching a specific regular expression, by providing the required pattern explicitly (in matched ``/'' delimiters after the ``:''). For example:
-parity <p:/even|odd|both/> Set parity (<p> must be "even", "odd" or "both")
-file <name:/\w*\.[A-Z]{3}/> File name must have a three- capital-letter extension
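An explicit pattern of this kind can be checked with a sketch such as the following. The helper parity_for, the simulated command lines and the package variable $parity are hypothetical names; tabs are written as "\t" markers and converted before parsing.

```perl
use strict;
use warnings;
use Getopt::Declare;

# Visible "\t" markers are converted to the real tabs the module requires.
(my $spec = <<'EOSPEC') =~ s/\\t/\t/g;
-parity <p:/even|odd|both/>\tSet parity
\t{ $::parity = $p; }
EOSPEC

our $parity;                    # set by the action when <p> matches

# Hypothetical helper: parse one simulated command line and return the
# parity that was accepted, if any.
sub parity_for {
    local @ARGV = @_;
    undef $parity;
    Getopt::Declare->new($spec);
    return $parity;
}

for my $value (qw(odd maybe)) {
    my $got = parity_for('-parity', $value);
    print "-parity $value: ", (defined $got ? "matched '$got'" : "no match"), "\n";
}
```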
If an explicit regular expression is used, there are three ``convenience'' extensions available:
If the sequence %T appears in a pattern, it is translated to a negative lookahead containing the parameter variable's trailing context. Hence the parameter definition:
-find <what:/(%T\.)+/> ;
ensures that the command line argument ``-find abcd;'' causes <what> to match ``abcd'', not ``abcd;''.
If the sequence %D appears in a pattern, it is translated into a subpattern which matches any single digit (like a \d), but only if that digit would not match the parameter variable's trailing context. Hence %D is just a convenient short-hand for (?:%T\d) (and is actually implemented that way).
If the sequence %F appears anywhere in a pattern, it causes the pattern not to reject strings which would otherwise match another flag. By default, no inbuilt type allows arguments to match a flag.
Explicit regular expressions are very powerful, but also cumbersome to use (or reuse) in some situations. Getopt::Declare provides a general ``parameter variable type definition'' mechanism to simplify such cases.
To declare a new parameter variable type, the [pvtype:...] directive is used. A [pvtype:...] directive specifies the name, matching pattern, and action for the new parameter variable type (though both the pattern and action are optional).
The name string may be any whitespace-terminated sequence of characters which does not include a ``>''. The name may also be specified within a pair of quotation marks (single or double) or within any Perl quotelike operation. For example:
[pvtype: num     ]    # Makes this valid: -count <N:num>
[pvtype: 'a num' ]    # Makes this valid: -count <N:a num>
[pvtype: q{nbr}  ]    # Makes this valid: -count <N:nbr>
The pattern is used in initial matching of the parameter variable. Patterns are normally specified as a ``/.../''-delimited Perl regular expression:
[pvtype: num     /\d+/          ]
[pvtype: 'a num' /\d+(?:\.\d*)/ ]
[pvtype: q{nbr}  /[+-]?\d+/     ]
Note that the regular expression should not contain any capturing parentheses, as this will interfere with the correct processing of subsequent parameter variables.
Alternatively the pattern associated with a new type may be specified as a ``:'' followed by the name of another parameter variable type (in quotes if necessary). In this case the new type matches the same pattern (and action! - see below) as the named type. For example:
[pvtype: num     :+i      ]    # <X:num> is the same as <X:+i>
[pvtype: 'a num' :n       ]    # <X:a num> is the same as <X:n>
[pvtype: q{nbr}  :'a num' ]    # <X:nbr> is also the same as <X:n>
As a third alternative, the pattern may be omitted altogether, in which case the new type matches whatever the inbuilt pattern ``:s'' matches.
The optional action which may be included in any [pvtype:...] directive is executed after the corresponding parameter variable matches the command line but before any actions belonging to the enclosing parameter are executed. Typically, such type actions will call the reject operator (see Termination and rejection) to test extra conditions, but any valid Perl code is acceptable. For example:
[pvtype: num     /\d+/    { reject if (localtime)[6]==3 } ]
[pvtype: 'a num' :n       { print "a num!" }              ]
[pvtype: q{nbr}  :'a num' { reject $::no_nbr }            ]
If a new type is defined in terms of another (for example, ``:a num'' and ``:nbr'' above), any action specified by that new type is prepended to the action of that other type. Hence a match against the ``:nbr'' type is first rejected if the $::no_nbr variable is true. Otherwise it next prints out ``a num!'' (like its parent type ``:a num''), and finally rejects the match if it's Wednesday (like its grandparent type ``:num'').
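A derived type with its own action could be sketched as below. The type name ``days'' with a 14-day minimum, the helper donated_for, the simulated command lines and the package variable $donated are chosen for this example only; tabs are written as "\t" markers and converted before parsing. The specification is non-strict, so a rejected argument is assumed simply to remain unprocessed.

```perl
use strict;
use warnings;
use Getopt::Declare;

# Visible "\t" markers are converted to the real tabs the module requires.
(my $spec = <<'EOSPEC') =~ s/\\t/\t/g;
[pvtype: days :+i { reject($_VAL_ < 14) } ]

-donated <D:days>\tDays since last donation
\t{ $::donated = $D; }
EOSPEC

our $donated;                   # set by the action when <D> is accepted

# Hypothetical helper: parse one simulated command line and return the
# number of days that was accepted, if any.
sub donated_for {
    local @ARGV = @_;
    undef $donated;
    Getopt::Declare->new($spec);
    return $donated;
}

for my $days (30, 7) {
    my $got = donated_for('-donated', $days);
    print "-donated $days: ", (defined $got ? "accepted" : "rejected by type action"), "\n";
}
```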
When a type action is executed (as part of a particular parameter match), three local variables are available:
$_VAL_ holds the value matched by the type's pattern. If a type action changes the value of $_VAL_, that changed value becomes the ``real'' value of the corresponding parameter variable (see the Roman numeral example below).
$_VAR_ holds the name of the parameter variable being matched.
$_PARAM_ holds the name of the parameter being matched.
Here is an example of the use of these variables:
$specs = q{
[pvtype: type /[OAB]|AB/ ]
[pvtype: Rh?  /Rh[+-]/ ]
[pvtype: days :+i { reject $_VAL_<14 => " $_PARAM_ (too soon!)"} ]

-donated <D:days>	Days since last donation
-applied <A:days>	Days since applied to donate

-blood <type:type> [<rh:Rh?>]	Specify blood type and (optionally) rhesus factor
};
$args = Getopt::Declare->new($specs);
In the above example, the ``:days'' parameter variable type is defined to match whatever the ``:+i'' type matches (that is positive, non-zero integers), with the proviso that the matching value ($_VAL_) must be at least 14. If a smaller value is specified for the <D> or <A> parameter variables, then Getopt::Declare would issue the following (respective) error messages:
Error: -donated (too soon!)
Error: -applied (too soon!)
Note that the ``inbuilt'' parameter variable types (``i'', ``n'', etc.) are really just predefined type names, and hence can be altered if necessary:
$args = Getopt::Declare->new(<<'EOPARAM');
[pvtype: 'n' /[MDCLXVI]+/ { reject !($_VAL_=to_roman $_VAL_) } ]

-index <number:n>	Index number
		{ print $data[$number]; }
EOPARAM
The above [pvtype:...] directive means that all parameter variables specified with a type ``:n'' henceforth only match valid Roman numerals, but that any such numerals are automatically converted to ordinary numbers (by passing $_VAL_ through the to_roman function).
Hence the requirement that all ``:n'' numbers now must be Roman can be imposed transparently, at least as far as the actual parameter variables which use the ``:n'' type are concerned. Thus $number can still be used to index the array @data despite the new restrictions placed upon it by the redefinition of type ``:n''.
Note too that, because the ``:+n'' and ``:0+n'' types are implicitly defined in terms of the original ``:n'' type (as if the directives:
[pvtype: '+n'  :n { reject if $_VAL_ <= 0 } ]
[pvtype: '0+n' :n { reject if $_VAL_ < 0  } ]
were included in every specification), the above redefinition of ``:n'' affects those types as well. In such cases the format conversion is performed before the ``sign'' tests (in other words, the ``inherited'' actions are performed after any newly defined ones).
Parameter variable type definitions may appear anywhere in a Getopt::Declare specification and are effective for the entire scope of the specification. In particular, new parameter variable types may be defined after they are used.
If a parameter description is omitted, or consists entirely of whitespace, or contains the special directive [undocumented], then the parameter is still parsed as normal, but will not appear in the automatically generated usage information (see Usage information).
Apart from allowing for ``secret'' parameters (a dubious benefit), this feature enables the programmer to specify some undocumented action which is to be taken on encountering an otherwise unknown argument. For example:
<unknown>	
		{ handle_unknown($unknown); }
Sometimes it is desirable to provide two or more alternate flags for the same behaviour (typically, a short form and a long form). To reduce the burden of specifying such pairs, the special directive [ditto] is provided. If the description of a parameter begins with a [ditto] directive, that directive is replaced with the description for the immediately preceding parameter (including any other directives). For example:
-v	Verbose mode
--verbose	[ditto] (long form)
In the automatically generated usage information this would be displayed as:
-v           Verbose mode
--verbose       "     "   (long form)
Furthermore, if the ``dittoed'' parameter has no action(s) specified, the action(s) of the preceding parameter are reused. For example, the specification:
-v	Verbose mode
		{ $::verbose = 1; }
--verbose	[ditto]
would result in the --verbose option setting $::verbose just like the -v option. On the other hand, the specification:
-v	Verbose mode
		{ $::verbose = 1; }
--verbose	[ditto]
		{ $::verbose = 2; }
would give separate actions to each flag.
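The reuse of an action through [ditto] can be sketched as follows, assuming a simulated @ARGV and a package variable $verbose introduced for this example; tabs are written as "\t" markers and converted before parsing.

```perl
use strict;
use warnings;
use Getopt::Declare;

# Visible "\t" markers are converted to the real tabs the module requires.
(my $spec = <<'EOSPEC') =~ s/\\t/\t/g;
-v\tVerbose mode
\t{ $::verbose = 1; }
--verbose\t[ditto]
EOSPEC

our $verbose = 0;               # set by the (shared) action above

local @ARGV = ('--verbose');    # simulated command line, long form only
Getopt::Declare->new($spec);

print "verbose is ", ($verbose ? "on" : "off"), "\n";
```

The --verbose parameter has no action of its own here, so it is expected to inherit the action of -v.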
It is often desirable or necessary to defer actions taken in response to particular flags until the entire command-line has been parsed. The most obvious case is where modifier flags must be able to be specified after the command-line arguments they modify.
To support this, Getopt::Declare provides a local operator (defer) which delays the execution of a particular action until the command-line processing is finished. The defer operator takes a single block, the execution of which is deferred until the command-line is fully and successfully parsed. If command-line processing fails for some reason (see DIAGNOSTICS), the deferred blocks are never executed.
For example:
<files>...	Files to be processed
		{ defer { foreach (@files) { proc($_); } } }
-rev[erse]	Process in reverse order
		{ $::ordered = -1; }
-rand[om]	Process in random order
		{ $::ordered = 0; }
With the above specification, the -rev and/or -rand flags can be specified after the list of files, but still affect the processing of those files. Moreover, if the command-line parsing fails for some reason (perhaps due to an unrecognized argument), the deferred processing will not be performed.
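A compact sketch of a deferred action follows. The file names, the simulated @ARGV and the package variables @order and $reverse are assumptions of this example; tabs are written as "\t" markers and converted before parsing.

```perl
use strict;
use warnings;
use Getopt::Declare;

# Visible "\t" markers are converted to the real tabs the module requires.
(my $spec = <<'EOSPEC') =~ s/\\t/\t/g;
<files>...\tFiles to be processed
\t{ defer { @::order = $::reverse ? reverse(@files) : @files; } }
-rev[erse]\tProcess in reverse order
\t{ $::reverse = 1; }
EOSPEC

our (@order, $reverse);         # set by the actions above

# The modifier flag comes after the files it modifies.
local @ARGV = qw(a.txt b.txt -rev);
Getopt::Declare->new($spec);

print "processing order: @order\n";
```

Because the block is deferred, it should only run after -rev has been seen, even though the files appear first on the command line.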
Like some other Getopt:: packages, Getopt::Declare allows parameter flags to be ``clustered''. That is, if two or more flags have the same ``flag prefix'' (one or more leading non-whitespace, non-alphanumeric characters), those flags may be concatenated behind a single copy of that flag prefix. For example, given the parameter specifications:
-+	Swap signs
-a	Append mode
-b	Bitwise compare
-c <FILE>	Create new file
+del	Delete old file
+e <NICE:i>	Execute (at specified nice level) when complete
The following command-lines (amongst others) are all exactly equivalent:
-a -b -c newfile +e20 +del
-abc newfile +dele20
-abcnewfile+dele20
-abcnewfile +e 20del
The last two alternatives are correctly parsed because Getopt::Declare allows flag clustering at any point where the remainder of the command-line being processed starts with a non-whitespace character and where the remaining substring would not otherwise immediately match a parameter flag.
Hence the trailing ``+dele20'' in the third command-line example is parsed as ``+del +e20'' and not ``-+ del +e20''. This is because the previous ``-'' prefix is not propagated (since the leading ``+del'' is a valid flag).
In contrast, the trailing ``+e 20del'' in the fourth example is parsed as ``+e 20 +del'' because, after the `` 20'' is parsed (as the integer parameter variable <NICE>), the next characters are ``del'', which do not form a flag themselves unless prefixed with the controlling ``+''.
In some circumstances a clustered sequence of flags on the command-line might also match a single (multicharacter) parameter flag. For example, given the specifications:
-a	Blood type is A
-b	Blood type is B
-ab	Blood type is AB
-ba	Donor has a Bachelor of Arts
A command-line argument ``-aba'' might be parsed as ``-a -b -a'' or ``-a -ba'' or ``-ab -a''. In all such cases, Getopt::Declare prefers the longest unmatched flag first. Hence the previous example would be parsed as ``-ab -a'', unless the -ab flag had already appeared in the command-line (in which case, it would be parsed as ``-a -ba'').
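Basic clustering behind a shared ``-'' prefix can be tried with a sketch like this one. The simulated @ARGV and the package hash %seen are assumptions of this example; tabs are written as "\t" markers and converted before parsing.

```perl
use strict;
use warnings;
use Getopt::Declare;

# Visible "\t" markers are converted to the real tabs the module requires.
(my $spec = <<'EOSPEC') =~ s/\\t/\t/g;
-a\tAppend mode
\t{ $::seen{a}++; }
-b\tBitwise compare
\t{ $::seen{b}++; }
-c <FILE>\tCreate new file
\t{ $::seen{c} = $FILE; }
EOSPEC

our %seen;                      # records which parameters were matched

local @ARGV = ('-abc', 'newfile');   # clustered form of: -a -b -c newfile
Getopt::Declare->new($spec);

print "$_ => $seen{$_}\n" for sort keys %seen;
```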
These rules are designed to produce consistency and ``least surprise'', but (as the above example illustrates) may not always do so. If the idea of unconstrained flag clustering is too libertarian for a particular application, the feature may be restricted (or removed completely), by including a [cluster:...] directive anywhere in the specification string.
The options are:
[cluster: any] allows any parameter to be clustered (the default behaviour).
[cluster: flags] allows only ``pure flags'' (parameters with no variables or punctuators) to be clustered.
[cluster: singles] allows only pure flags consisting of a flag prefix followed by a single character to be clustered.
[cluster: none] turns clustering off completely.
For example:
$args = Getopt::Declare->new(<<'EOSPEC');
-a	Append mode
-b	Back-up mode
-bu	[ditto]
-c <file>	Copy mode
-d [<file>]	Delete mode
-e[xec]	Execute mode

[cluster:singles]
EOSPEC
In the above example, only the -a and -b parameters may be clustered. The -bu parameter is excluded because it consists of more than one letter, whilst the -c and -d parameters are excluded because they take (or may take, in -d's case) a variable. The -e[xec] parameter is excluded because it may take a trailing punctuator ([xec]).
By comparison, if the directive had been [cluster: flags], then -bu could be clustered, though -c, -d and -e[xec] would still be excluded since they are not ``pure flags''.
``Strictness'' in Getopt::Declare refers to the way in which unrecognized command-line arguments are handled. By default, Getopt::Declare is ``non-strict'', in that it simply skips silently over any unrecognized command-line argument, leaving it in @ARGV at the conclusion of command-line processing (but only if the arguments were originally parsed from @ARGV).
No matter where they came from, the remaining arguments are also available by calling the unused method on the Getopt::Declare object, after it has parsed. In a list context, this method returns a list of the unprocessed arguments; in a scalar context a single string with the unused arguments concatenated is returned.
Likewise, there is a used method that returns the arguments that were successfully processed by the parser.
However, if a new Getopt::Declare object is created with a specification string containing the [strict] directive (at any point in the specification):
$args = Getopt::Declare->new(<<'EOSPEC');
[strict]

-a	Append mode
-b	Back-up mode
-c	Copy mode
EOSPEC
then the command-line is parsed ``strictly''. In this case, any unrecognized argument causes an error message (see DIAGNOSTICS) to be written to STDERR, and command-line processing to (eventually) fail. On such a failure, the call to Getopt::Declare::new() returns undef instead of the usual hash reference.
The only concession that ``strict'' mode makes to the unknown is that, if command-line processing is prematurely terminated via the finish operator, any command-line arguments which have not yet been examined are left in @ARGV and do not cause the parse to fail (of course, if any unknown arguments were encountered before the finish was executed, those earlier arguments will cause command-line processing to fail).
The ``strict'' option is useful when all possible parameters can be specified in a single Getopt::Declare object, whereas the ``non-strict'' approach is needed when unrecognized arguments are either to be quietly tolerated, or processed at a later point (possibly in a second Getopt::Declare object).
Getopt::Declare provides five other directives which modify the
behaviour of the command-line parser in some way. One or more of these
directives may be included in any parameter description. In addition,
the [mutex:...]
directive may also appear in any usage ``decoration''
(see Usage information).
Each directive specifies a particular set of conditions that a
command-line must fulfil (for example, that certain parameters may not
appear on the same command-line). If any such condition is violated,
an appropriate error message is printed (see DIAGNOSTICS).
Furthermore, once the command-line is completely parsed, if any
condition was violated, the program terminates
(whilst still inside Getopt::Declare::new()
).
The directives are:
[required]
If an argument matching a ``required'' flag is not found in the
command-line, an error message to that effect is issued,
command-line processing fails, and Getopt::Declare::new()
returns
undef
.
[repeatable]
By default, once a parameter has matched it is excluded as a potential
match for later arguments. However, it is sometimes useful to allow a
particular parameter to match more than once. Any parameter whose
description includes the directive [repeatable] is never excluded as a
potential argument match, no matter how many times it has matched
previously:
-nice	Increase nice value (linearly if repeated)
	[repeatable]
		{ set_nice( get_nice()+1 ); }

-w	Toggle warnings [repeatable] for the rest of the command-line
		{ $warn = !$warn; }
As a more general mechanism, if a [repeatable] directive appears in a
specification anywhere other than a flag's description, then all parameters
are marked repeatable:
[repeatable]

-nice	Increase nice value (linearly if repeated)
		{ set_nice( get_nice()+1 ); }

-w	Toggle warnings for the rest of the command-line
		{ $warn = !$warn; }
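As a rough, runnable illustration of a repeatable parameter, the sketch below counts how often -nice appears. The package variable $main::nice and the sample command line are assumptions made for the demonstration (the set_nice()/get_nice() helpers above are not defined here), and tabs are again written as "\t".

```perl
use strict;
use warnings;
use Getopt::Declare;

our $nice = 0;    # hypothetical counter, updated by the action below

# One repeatable flag whose action increments the counter.
my $spec = "-nice\tIncrease nice value [repeatable]\n"
         . "\t\t" . '{ $main::nice++; }' . "\n";

@ARGV = ('-nice', '-nice', '-nice');    # made-up command line

Getopt::Declare->new($spec)
    or die "command-line parsing failed\n";

print "nice level raised $nice time(s)\n";
```

Without the [repeatable] directive, only the first -nice would be expected to match, and the later ones would be treated as unrecognized arguments.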
[mutex: <flag list>]
The [mutex:...] directive specifies that the parameters whose
flags it lists are mutually exclusive. That is, no two or more of them
may appear in the same command-line. For example:
-case	set to all lower case
-CASE	SET TO ALL UPPER CASE
-Case	Set to sentence case
-CaSe	SeT tO "RAnSom nOTe" CasE

[mutex: -case -CASE -Case -CaSe]
The interaction of the [mutex:...]
and [required]
directives is
potentially awkward in the case where two ``required'' arguments are
also mutually exclusive (since the [required]
directives insist
that both parameters must appear in the command-line, whilst the
[mutex:...]
directive expressly forbids this).
Getopt::Declare resolves such contradictory constraints by relaxing the meaning of ``required'' slightly. If a flag is marked ``required'', it is considered ``found'' for the purposes of error checking if it or any other flag with which it is mutually exclusive appears on the command-line.
Hence the specifications:
-case	set to all lower case		[required]
-CASE	SET TO ALL UPPER CASE		[required]
-Case	Set to sentence case		[required]
-CaSe	SeT tO "RAnSom nOTe" CasE	[required]

[mutex: -case -CASE -Case -CaSe]
mean that exactly one of these four flags must appear on the command-line, but that the presence of any one of them will suffice to satisfy the ``requiredness'' of all four.
It should also be noted that mutual exclusion is only tested for after a parameter has been completely matched (that is, after the execution of its actions, if any). This prevents ``rejected'' parameters (see Termination and rejection) from incorrectly generating mutual exclusion errors. However, it also sometimes makes it necessary to defer the actions of a pair of mutually exclusive parameters (for example, if those actions are expensive or irreversible).
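The relaxed meaning of ``required'' can be tried out with a small sketch such as the one below. The three justification flags and the sample command line are chosen only for illustration; the point is that supplying any one member of the mutex group is expected to satisfy the [required] directives of all of them.

```perl
use strict;
use warnings;
use Getopt::Declare;

# Three flags, each required, but mutually exclusive as a group.
my $spec = "-left\tJustify to left margin [required]\n"
         . "-right\tJustify to right margin [required]\n"
         . "-centre\tCentre each line [required]\n"
         . "\n"
         . "[mutex: -left -right -centre]\n";

@ARGV = ('-right');    # made-up command line: exactly one of the three

my $args = Getopt::Declare->new($spec)
    or die "command-line parsing failed\n";

# Report which of the three flags was actually matched.
my @chosen = grep { $args->{$_} } qw(-left -right -centre);
print "justification flag(s) seen: @chosen\n";
```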
[excludes: <flag list>]
The [excludes:...] directive provides a ``pairwise'' version of
mutual exclusion, specifying that the current parameter is mutually exclusive
with all the other parameters listed, but those other parameters are not
mutually exclusive with each other. That is, whereas the specification:
-left	Justify to left margin
-right	Justify to right margin
-centre	Centre each line

[mutex: -left -right -centre]
means that only one of these three justification alternatives can ever be used at once, the specification:
-left	Justify to left margin
-right	Justify to right margin
-centre	Centre each line [excludes: -left -right]
means that -left and -right can still be used together
(probably to indicate ``left and right'' justification), but that
neither can be used with -centre. Note that the [excludes:...]
directive also differs from the [mutex:...] directive in that it is always
connected with a particular parameter, implicitly
using the flag of that parameter as the target of exclusion.
[requires: <condition>]
The [requires] directive specifies a set of flags which
must also appear in order for a particular flag to be permitted in the
command-line. The condition is a boolean expression, in which the
terms are the flags of various parameters, and the operations are
&&, ||, !, and bracketing. For example, the specification:
-num	Use numeric sort order
-lex	Use "dictionary" sort order
-len	Sort on length of line (or field)

-field <N:+i>	Sort on value of field <N>

-rev	Reverse sort order
	[requires: -num || -lex || !(-len && -field)]
means that the -rev
flag is allowed only if either the -num
or the
-lex
parameter has been used, or if it is not true that
both the -len
and the -field
parameters have been used.
Note that the operators &&
, ||
, and !
retain their normal
Perl precedences.
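To see how those precedences combine, the condition above can be restated as an ordinary Perl expression. The following is only an illustrative sketch: rev_allowed() is a hypothetical helper, not part of Getopt::Declare, and it simply evaluates the same boolean expression over a list of flags assumed to have appeared on the command-line.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Getopt::Declare): evaluates
#     -num || -lex || !(-len && -field)
# for a given list of flags that appeared on the command-line.
sub rev_allowed {
    my %seen = map { $_ => 1 } @_;
    return ( $seen{'-num'} || $seen{'-lex'}
             || !( $seen{'-len'} && $seen{'-field'} ) ) ? 1 : 0;
}

# Try a few combinations of flags.
for my $flags ( [], ['-len'], ['-len', '-field'], ['-num', '-len', '-field'] ) {
    printf "%-22s => -rev %s\n",
        "(@$flags)", rev_allowed(@$flags) ? 'allowed' : 'rejected';
}
```

Of the combinations tried, only the one with both -len and -field (and neither -num nor -lex) makes the condition false.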
Getopt::Declare normally parses the contents of @ARGV
, but can
be made to parse specified files instead. To accommodate this,
Getopt::Declare::new()
takes an optional second parameter, which specifies
a file to be parsed. The parameter may be either:
An IO::Handle reference or a filehandle GLOB reference
Getopt::Declare::new() reads the corresponding handle until
end-of-file, and parses the resulting text (even if it is an empty string).
'-CONFIG'
Getopt::Declare::new()
looks for the files
$ENV{HOME}/.${progname}rc and $ENV{PWD}/.${progname}rc,
concatenates their contents, and parses that.
If neither file is found (or if both are inaccessible)
Getopt::Declare::new()
immediately returns zero. If a
file is found but the parse subsequently fails, undef
is returned.
'-BUILD'
Getopt::Declare::new()
builds a parser from the
supplied grammar and returns a reference to it, but does not parse anything.
See The Getopt::Declare::code() method and
The Getopt::Declare::parse() method.
'-SKIP' or the single value undef or nothing
Getopt::Declare::new() immediately returns zero.
This alternative is useful when using a FileHandle
:
my $args = Getopt::Declare->new($grammar, new FileHandle ($filename) || -SKIP);
because it makes explicit what happens if FileHandle::new() fails. Of course,
if the -SKIP alternative were omitted, Getopt::Declare::new() would
still return immediately, having found undef as its second argument.
A reference to an array
Getopt::Declare::new() treats the array elements as a
list of filenames, concatenates the contents of those files, and parses that.
If the list does not denote any accessible file(s)
Getopt::Declare::new()
immediately returns zero. If matching files
are found, but not successfully parsed, undef
is returned.
A string
Getopt::Declare::new() parses that string directly.
Note that when Getopt::Declare::new()
parses from a
source other than @ARGV
, unrecognized arguments are not
placed back in @ARGV
.
After command-line processing is completed, the object returned by
Getopt::Declare::new()
will have the following features:
For each parameter that was successfully matched, the object contains a hash element whose key is the parameter's leading flag or parameter variable name. The value of the element will be a reference to another hash which contains the names and values of each distinct parameter variable and/or punctuator which was matched by the parameter. Punctuators generate string values containing the actual text matched. Scalar parameter variables generate scalar values. List parameter variables generate array references.
As a special case, if a parameter consists of a single component (either a single flag or a single parameter variable), then the value for the corresponding hash key is not a hash reference, but the actual value matched.
The following example illustrates the various possibilities:
$args = Getopt::Declare->new( q{

	-v <value> [etc]	One or more values
	<infile>		Input file [required]
	-o <outfiles>...	Output files
} );

if ( $args->{'-v'} ) {
	print "Using value: ", $args->{'-v'}{'<value>'};
	print " (et cetera)" if $args->{'-v'}{'etc'};
	print "\n";
}

open INFILE, $args->{'<infile>'} or die;
@data = <INFILE>;

foreach $outfile ( @{$args->{'-o'}{'<outfiles>'}} ) {
	open OUTFILE, ">$outfile" or die;
	print OUTFILE process(@data);
	close OUTFILE;
}
The values which are assigned to the various hash elements are copied from the corresponding block-scoped variables which are available within actions. In particular, if the value of any of those block-scoped variables is changed within an action, that changed value is saved in the hash. For example, given the specification:
$args = Getopt::Declare->new( q{

	-ar <R:n>	Set aspect ratio (will be clipped to [0..1])
			{
				$R = 0 if $R < 0;
				$R = 1 if $R > 1;
			}
} );
then the value of $args->{'-ar'}{'<R>'}
will always be between zero and one.
The @ARGV array
In ``non-strict'' mode, any unrecognized arguments are left in @ARGV
(whereas all recognized arguments will have been removed).
Note that these remaining arguments will be in sequential elements
(starting at $ARGV[0]), not in their original positions in @ARGV.
The Getopt::Declare::usage() method
The usage() method may be called
to explicitly print out usage information corresponding to the specification
with which the object was built. See Usage information for more details.
If the usage()
method is called with an argument, that argument is passed
to exit
after the usage information is printed (the no-argument version of
usage()
simply returns at that point).
The Getopt::Declare::version() method
Getopt::Declare objects also provide a version() method,
which prints out the name of the enclosing program, the last time it
was modified, and the value of $::VERSION, if it is defined.
Note that this implies that all Getopt::Declare objects in a
single program will print out identical version information.
Like the usage()
method, if version
is passed an argument, it
will exit with that value after printing.
The Getopt::Declare::parse() method
If Getopt::Declare::new() is called with a second parameter '-BUILD'
(see Parsing from other sources), it constructs and returns a
parser, without parsing anything.
The resulting parser object can then be used to parse multiple sources,
by calling its parse() method.
Getopt::Declare::parse()
takes an optional parameter which specifies
the source of the text to be parsed (it parses @ARGV
if the
parameter is omitted). This parameter takes the same set of values as the
optional second parameter of Getopt::Declare::new()
(see Parsing from other sources).
Getopt::Declare::parse()
returns true if the source is located and
parsed successfully. It returns a defined false (zero) if the source is
not located. An undef
is returned if the source is located, but not
successfully parsed.
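Since the three outcomes differ only in definedness and truth, calling code has to test them in the right order. A minimal sketch of that check follows; describe_parse_result() is a hypothetical helper introduced here for illustration, and the values passed to it merely stand in for what parse() is documented to return.

```perl
use strict;
use warnings;

# Hypothetical helper: classify the three-way result described above.
sub describe_parse_result {
    my ($result) = @_;
    return 'source located, but not parsed successfully' unless defined $result;
    return 'source not located'                          unless $result;
    return 'source located and parsed successfully';
}

# Stand-ins for the three documented return values.
print describe_parse_result(undef), "\n";
print describe_parse_result(0),     "\n";
print describe_parse_result(1),     "\n";
```

In a real program the argument would be the value returned by a call such as $parser->parse($source). Testing for undef first matters, because both undef and zero are false.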
Thus, the following code first constructs a parser for a series of alternate configuration files and the command line, and then parses them:
# BUILD PARSERS
my $config  = Getopt::Declare->new($config_grammar, -BUILD);
my $cmdline = Getopt::Declare->new($cmdline_grammar, -BUILD);

# TRY STANDARD CONFIG FILES
$config->parse(-CONFIG)

# OTHERWISE, TRY GLOBAL CONFIG
or $config->parse('/usr/local/config/.demo_rc')

# OTHERWISE, TRY OPENING A FILEHANDLE (OR JUST GIVE UP)
or $config->parse(new FileHandle (".config") || -SKIP);

# NOW PARSE THE COMMAND LINE

$cmdline->parse() or die;
The Getopt::Declare::code() method
It is also possible to retrieve the command-line parsing code generated
internally by Getopt::Declare::new(). The Getopt::Declare::code()
method returns a string containing the complete command-line processing
code, as a single do block plus a leading package declaration.
Getopt::Declare::code()
takes as its sole argument a string
containing the complete name of the package (for the leading
package
declaration in the generated code). If this string is empty
or undefined, the package name defaults to ``main''.
Since the default behaviour of Getopt::Declare::new()
is to execute
the command-line parsing code it generates, if the goal is only to
generate the parser code, the optional second '-BUILD' parameter
(see Parsing from other sources) should be specified when calling
Getopt::Declare::new().
For example, the following program ``inlines'' a Getopt::Declare
specification, by extracting it from between the first ``=for
Getopt::Declare'' and the next ``=cut'' appearing on STDIN
:
use Getopt::Declare;

sub encode { return Getopt::Declare->new(shift,-BUILD)->code() || die }

undef $/;
# slurp the whole input into $_ (a bare <> in an "if" does not set $_)
if (defined($_ = <>))
{
	s {^=for\s+Getopt::Declare\s*\n(.*?)\n=cut}
	  {'my ($self,$source) = ({});'.encode($1).' or die "\n";'}esm;
}

print;
Note that the generated inlined version expects to find a lexical variable
named $source
, which tells it what to parse (this variable is
normally set by the optional parameters of Getopt::Declare::new()
or
Getopt::Declare::parse()
).
The inlined code leaves all extracted parameters in the lexical
variable $self
and does not autogenerate help or version flags
(since there is no actual Getopt::Declare object in the inlined code
through which to generate them).
The specification passed to Getopt::Declare::new
is used (almost
verbatim) as a ``usage'' display whenever usage information is
requested.
Such requests may be made either by specifying an argument matching
the help parameter (see Help parameters) or by explicitly calling
the Getopt::Declare::usage()
method (through an action or after
command-line processing):
$args = Getopt::Declare->new( q{

	-usage	Show usage information and exit
			{ $self->usage(0); }

	+usage	Show usage information at end of program
} );

# PROGRAM HERE

$args->usage() if $args->{'+usage'};
The following changes are made to the original specification before it is displayed:
any [ditto] directive is converted to an appropriate set of ``ditto'' marks,
any text in matching square brackets (including any directive) is deleted,
any parameter variable type specifier (``:i'', ``:n'', ``:/pat/'', etc.) is deleted.
Otherwise, the usage information displayed retains all the formatting present in the original specification.
In addition to this information, if the input source is @ARGV, Getopt::Declare displays three sample command-lines: one indicating the normal usage (including any required parameter variables), one indicating how to invoke help (see Help parameters), and one indicating how to determine the current version of the program (see Version parameters).
The usage information is printed to STDOUT
and (since Getopt::Declare
tends to encourage longer and better-documented parameter lists) if
the IO::Pager package is available, an IO::Pager object is used to
page out the entire usage documentation.
It is sometimes convenient to add other ``decorative'' features to a program's usage information, such as subheadings, general notes, separators, etc. Getopt::Declare accommodates this need by ignoring such items when interpreting a specification string, but printing them when asked for usage information.
Any line which cannot be interpreted as either a parameter
definition, a parameter description, or a parameter action, is treated
as a ``decorator'' line, and is printed verbatim (after any
square-bracketed substrings have been removed from it). If your decoration
needs square brackets, you need to escape the opening square bracket with a
backslash, e.g. \[decoration].
The key to successfully decorating Getopt::Declare usage information is to ensure that decorator lines are separated from any preceding parameter specification, either by an action or by an empty line. In addition, like a parameter description, a decorator line cannot contain a tab character after the first non-whitespace character (because it would then be treated as a parameter specification).
The following specification demonstrates various forms of usage
decoration. In fact, there are only four actual parameters (-in
,
-r
, -p
, and -out
) specified. Note in particular that leading
tabs are perfectly acceptable in decorator lines.
$args = Getopt::Declare->new(<<'EOPARAM');

============================================================
Required parameter:

	-in <infile>		Input file [required]

------------------------------------------------------------

Optional parameters:

	(The first two are mutually exclusive) [mutex: -r -p]

	-r[and[om]]		Output in random order
	-p[erm[ute]]		Output all permutations

	---------------------------------------------------

	-out <outfile>		Optional output file

------------------------------------------------------------
Note: this program is known to run very slowly on files with
      long individual lines.
============================================================
EOPARAM
By default, Getopt::Declare automatically defines all of the following parameters:
-help	Show usage information [undocumented]
		{ $self->usage(0); }
-Help	[ditto]
-HELP	[ditto]
--help	[ditto]
--Help	[ditto]
--HELP	[ditto]
-h	[ditto]
-H	[ditto]
Hence, most attempts by the user to get help will automatically work successfully.
Note however that, if a parameter with any of these flags is
explicitly specified in the string passed to Getopt::Declare::new()
,
that flag (only) is removed from the list of possible help flags. For
example:
-w <pixels:+i>	Specify width in pixels
-h <pixels:+i>	Specify height in pixels
would cause the -h
help parameter to be removed (although help
would still be accessible through the other seven alternatives).
Getopt::Declare also automatically creates a set of parameters which can be used to retrieve program version information:
-version	Show version information [undocumented]
		{ $self->version(0); }
-Version	[ditto]
-VERSION	[ditto]
--version	[ditto]
--Version	[ditto]
--VERSION	[ditto]
-v	[ditto]
-V	[ditto]
As with the various help commands, explicitly specifying a parameter with any of the above flags removes that flag from the list of version flags.
Getopt::Declare may issue the following diagnostics whilst parsing a command-line. All of them are fatal (the first five, instantly so):
The condition specified as part of the indicated [requires:...]
directive was not a well-formed boolean expression. Common problems
include: omitting a &&/|| operator between two flags,
mismatched brackets, or using and/or instead of &&/||.
The indicated parameter has a [requires:...] directive, which
was not satisfied.
Damian Conway <damian@conway.org>
There are undoubtedly serious bugs lurking somewhere in this code.
If nothing else, it shouldn't take 1500 lines to explain a package that was designed for intuitive ease of use!
Bug reports and other feedback are most welcome at: https://rt.cpan.org/Public/Bug/Report.html?Queue=Getopt-Declare
Copyright (c) 1997-2000, Damian Conway. All Rights Reserved. This module is free software. It may be used, redistributed and/or modified under the terms of the Perl Artistic License (see http://www.perl.com/perl/misc/Artistic.html)