lex

lexical analyzer generator 

Command


SYNOPSIS

lex [-achlntTvw] [-LC] [-D defnfile] [-o outfile] [-p prefix] [-P proto] [-W filename] [file.l ...]


DESCRIPTION

The lex utility reads a description of a lexical syntax, in the form of regular expressions and actions, from file.l, or the standard input if no file.l is provided, or if the file is named -. It produces a set of tables that, together with additional prototype code, constitute a lexical analyzer to scan those expressions. The resulting scanner is suitable for use with yacc. For detailed information regarding the use of LEX, see the LEX Programming Guide.

By default, lex generates C code. You can generate C++ output with the -LC option. You can also generate Microsoft Windows compatible resource files by using the -w option.

The code prototype is taken from a different file depending on what language you want to use. The generated code is placed in a file named lex_yy with an appropriate language extension.

Language   Codefile     Definitions   Prototype

C          lex_yy.c     none          ROOTDIR/etc/yylex.c
C++        lex_yy.cpp   lex_yy.hpp    ROOTDIR/etc/yylex.cpp

You can use the -o option to change the default code file, and the -P option to change the prototype file. For C++, both a code and a definitions file are generated. The definitions go into lex_yy.hpp. To use a different definition file for C++, use the -D option.
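As an illustration of the kind of input described above, the following is a minimal sketch of a file.l. The token labels it prints (NUMBER, IDENT, OTHER) are invented for this example, and it assumes that yywrap() is supplied by the lex library or by other code linked with the scanner. It is not taken from the LEX Programming Guide.

```lex
%{
/* Illustrative only: the token labels printed here are made up. */
#include <stdio.h>
%}

%%
[0-9]+                  { printf("NUMBER %s\n", yytext); }
[A-Za-z_][A-Za-z0-9_]*  { printf("IDENT  %s\n", yytext); }
[ \t\n]+                { /* skip white space */ }
.                       { printf("OTHER  %s\n", yytext); }
%%

/* Assumption: yywrap() is supplied by the lex library or elsewhere. */
int main(void)
{
    /* yylex() returns 0 at end of input when no action returns a value. */
    return yylex();
}
```

Run with default options, lex would place the generated C code for a file like this in lex_yy.c. With -LC it would write lex_yy.cpp and lex_yy.hpp instead.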

Options

-a 

allows character classes to refer to the 8-bit ASCII characters (0200 through 0377). Normally, to save table space, character classes only apply to the 7-bit ASCII character set.

-c 

generates C code (lex only). As this is the default, we provide it only for compatibility with other implementations.

-D defnfile 

with -LC, outputs the C++ header and definitions into defnfile instead of lex_yy.hpp.

-h 

prints a brief list of the options, and quits.

-l 

suppresses #line directives in the generated code.

-LC 

generates C++ code into lex_yy.cpp and C++ headers and definitions into lex_yy.hpp.

-n 

prevents changing the table sizes from turning on the -v option.

-o outfile 

writes the lexical analyzer onto the named outfile, instead of the default code file.

-p prefix 

uses the given prefix instead of the prefix yy in the generated code.

-P proto 

uses the named file as the code prototype, instead of the default prototype file.

-t 

writes the lexical analyzer onto standard output, instead of the default code file. With -LC, the header is still placed in lex_yy.hpp.

-T 

writes a description of the analyzer to the l.out file.

-v 

prints the amount of space used by the various tables on the standard error stream.

-w 

produces a Microsoft Windows compatible resource output file (with the default name lex_yy.rc) in addition to the default output file.

-W filename 

writes a Microsoft Windows compatible resource output file to the specified filename.

LEX Tables

Some lex programs may cause one or more tables within lex to overflow. The ones most likely to be affected are: the NFA table, the DFA table, and the move table. You can change table sizes by inserting the appropriate line into the definition section of the LEX input, with the number size giving the number of entries to use.

Line       Table Size Affected   Default   Maximum

%e size    #NFA entries          800       4678
%n size    #DFA entries          400       2977
%p size    #move entries         2500      7000

Often, the NFA and DFA space may be reduced to make room for more move entries.
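A sketch of how these lines might appear in the definition section is shown here. The sizes are arbitrary values chosen to fall within the maximums listed in the table, and the two rules are placeholders; appropriate values depend on the scanner being built.

```lex
%{
/* Table sizes below are illustrative values, not recommendations. */
%}
%e 1500
%n 800
%p 5000

%%
[0-9]+    { /* placeholder action for integer tokens */ }
.|\n      { /* placeholder: ignore everything else */ }
```

As noted under the -n option, changing the table sizes turns on the -v report unless -n is given.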


DIAGNOSTICS

Possible exit status values are:

0 

Successful completion.

1 

Failure because of any of the following:

— Cannot create output file
— Cannot open file
— Missing output file name after -D, -o, -P, -W
— Missing prefix after -p
— No lex rules
— No memory for DFA moves
— Out of NFA state space
— Out of DFA move space
— Out of DFA state space
— Push-back buffer overflow
— Read error on file
— Table too large for machine
— Too many character classes
— Too many translations
— Unknown option
— Write error on file
— Incomplete '%{' declaration
— Token buffer overflow


LOCALE

A locale is the subset of a user's environment that depends on language and cultural conventions. A locale defines things such as the definition of characters, and the collation sequence of those characters. POSIX.2 defines a POSIX Locale, which is essentially USASCII, so that, for example, the characters a through z include all the lowercase letters.

Since LEX generates code that is then compiled before being executed, it is difficult for LEX to act properly on collation information. The POSIX.2 standard therefore does not require lex to accept any locales other than the POSIX Locale. MKS LEX accepts regular expressions in this locale only.


FILES

l.out 

scanner machine description

lex_yy.c 

generated C code

lex_yy.cpp 

generated C++ code

lex_yy.hpp 

generated C++ header file

lex_yy.rc 

generated resource file name

ROOTDIR/etc/yylex.c 

prototype LEX scanner for C

ROOTDIR/etc/yylex.cpp 

prototype LEX scanner for C++


PORTABILITY

Windows 10. Windows Server 2016. Windows Server 2019. Windows 11. Windows Server 2022. Windows Server 2025. All UNIX systems. POSIX.2.

The -T option writes the internal state tables to the l.out file.

The -a, -D, -h, -l, -LC, -o, -p, -P, -T, -w, and -W options are specific to this version of lex.

The ability to generate Microsoft Windows resource files is specific to MKS LEX.


SEE ALSO

Commands:
yacc

Miscellaneous:
lex, mks_lexplus

LEX Programming Guide


PTC MKS Toolkit 10.5 Documentation Build 40.