NAME
    EP3 - The Extensible Perl PreProcessor

SYNOPSIS
        use EP3;
        [use EP3::{Extension}]    # Language Specific Modules
        my $object = new EP3 file;
        $object->ep3_execute;

        [other methods that can be invoked]
        $object->ep3_process([$filename, [$condition]]);
        $object->ep3_output_file([$filename]);
        $object->ep3_parse_command_line;
        $object->ep3_modules([@modules]);
        $object->ep3_includes([@include_directories]);
        $object->ep3_reset;
        $object->ep3_end_comment([$string]);
        $object->ep3_start_comment([$string]);
        $object->ep3_line_comment([$string]);
        $object->ep3_delimeter([$string]);
        $object->ep3_gen_depend_list([$value]);
        $object->ep3_keep_comments([$value]);
        $object->ep3_protect_comments([$value]);
        $object->ep3_defines($string1=$string2);

DESCRIPTION
    EP3 is a Perl5 program that preprocesses STDIN or a set of input
    files and produces an output file. EP3 works only on input files and
    produces output files. It seems to me that if you want to preprocess
    arrays or somesuch, you should be using Perl.

    EP3 was first developed to provide a flexible preprocessor for the
    Verilog hardware description language. Verilog presents some problems
    that were not easily solved by using cpp or m4. I wanted to be able to
    use a normal preprocessor, but extend its functionality. So I wrote
    EP3 - the Extensible Perl PreProcessor.

    The main difference between EP3 and other preprocessors is its
    built-in extensibility. Every directive in EP3 is really a method
    defined in EP3, in one of its submodules, or embedded in the file that
    is being processed. Because each directive name is linked to a method
    of the same name, new methods can be added, thus extending the
    preprocessor.

    Many of the features of EP3 can be modified via command line switches.
    For every command line switch, there is also an accessor method.

  Directives and Method Invocation
    Directives are preceded by a user-defined delimiter. The default
    delimiter is `@'.
    This delimiter was chosen to avoid conflicts with other preprocessor
    delimiters (`#' and the Verilog backtick), as well as Verilog syntax
    that might be found at the beginning of a line (`$', `&', etc.). A
    directive is defined in Perl as the beginning of the line, any amount
    of whitespace, and the delimiter immediately followed by Perl word
    characters (0-9A-Za-z_).

    EP3 looks for directives, strips off the delimiter, and then invokes a
    method of the same name. The standard directives are defined within
    the EP3 program. Library or user-defined directives may be loaded as
    Perl modules, either via the use command or from a command line switch
    for inclusion at the beginning of the EP3 run. Using the "include"
    directive coupled with the "perl_begin/end" directives, Perl
    subroutines (and hence EP3 directives) may be dynamically included
    during the EP3 run.

    Directive Extension Method 1: The use command.
        A module may be included with the use statement provided that it
        pushes its package name onto EP3's @ISA array (thus telling EP3 to
        inherit its methods). For a Verilog module whose filename is
        Verilog.pm and whose package name is Text::EP3::Verilog, the
        following line must be included ...

            push (@Text::EP3::ISA, qw(Text::EP3::Verilog));

        This package can then be simply included in whatever script you
        are using to call EP3 with the line:

            use Text::EP3::Verilog;

        All methods within the module are now available to EP3 as
        directives.

    Directive Extension Method 2: The command line switch.
        A module can be included at run time with the -module modulename
        switch on the command line (assuming the ep3_parse_command_line
        method is invoked). The modulename is assumed to have a .pm
        extension and to exist somewhere in the directories specified in
        @INC. All methods within the module are now available to EP3 as
        directives.

    Directive Extension Method 3: The ep3_modules accessor method.
        Modules can be added by using the accessor method ep3_modules.
            $object->ep3_modules("module1","module2", ....);

        All methods within the module are now available to EP3 as
        directives.

    Directive Extension Method 4: Embedded in the source code or included
    files.
        Using the perl_begin and perl_end directives to delineate Perl
        sections, subroutines can be declared (as methods) anywhere in a
        processed file or in a file that the processed file includes. In
        this way, runtime methods are made available to EP3. For example
        ...

            1 Text to be printed ...
            @perl_begin
            sub hello {
                my $self = shift;
                print "Hello there\n";
            }
            @perl_end
            2 Text to be printed ...
            @hello
            3 Text to be printed ...

        would result in

            1 Text to be printed ...
            2 Text to be printed ...
            Hello there
            3 Text to be printed ...

        Using this method, libraries of directives can be built and
        included with the include directive (but it is recommended that
        they be moved into a module when they become static).

  Input Files and Processing
    Input files are processed one line at a time. The EP3 engine attempts
    to perform substitutions with elements stored in the
    macro/define/replace lists. All directive lines are preprocessed
    before being evaluated (the only exception being the key portions of
    the if[n]def and define directives). Directive lines can be extended
    across multiple lines by placing the `\' character at the end of each
    line. Comments are normally protected from the preprocessor, but
    protection can be dynamically turned off and then back on. From a
    command line switch, comments can also be deleted from the output.

  Output Files
    EP3 typically writes output to Perl's STDOUT, but output can be
    directed to any file. EP3 can also be run in "dependency check" mode
    via a command line switch. In this mode, normal output is suppressed,
    and all dependent files are output in the order accessed.

    Most parameters can be modified before invoking EP3, including the
    directive string, comment delimiters, comment protection and
    inclusion, include path, and startup defines.
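    As a rough illustration of the directive lookup described above (a
    line that starts with optional whitespace, the delimiter, and a Perl
    word invokes the method of that name), here is a minimal sketch. The
    ToyPreprocessor package, its process_lines method, and its handling of
    unknown directives are assumptions made for illustration; this is not
    EP3's actual code.

```perl
use strict;
use warnings;

# Illustrative toy class only; NOT the EP3 implementation.
package ToyPreprocessor;

sub new {
    my ($class, %args) = @_;
    my $self = {
        # The delimiter defaults to '@', as the text describes.
        delimiter => defined $args{delimiter} ? $args{delimiter} : '@',
        output    => [],
    };
    return bless $self, $class;
}

# A sample directive: any method of the class can act as one.
sub hello {
    my ($self, $rest) = @_;
    push @{ $self->{output} }, "Hello there\n";
}

sub process_lines {
    my ($self, @lines) = @_;
    my $delim = quotemeta $self->{delimiter};
    while (@lines) {
        my $line = shift @lines;
        if ($line =~ /^\s*${delim}\w+/) {
            # A trailing backslash continues a directive onto the next line.
            while ($line =~ s/\\\n\z// && @lines) {
                $line .= shift @lines;
            }
            my ($name, $rest) = $line =~ /^\s*${delim}(\w+)\s*(.*)/s;
            # Strip the delimiter and invoke the method of the same name.
            if (my $method = $self->can($name)) {
                $self->$method($rest);
                next;
            }
        }
        # Assumption: ordinary lines and unknown directives pass through.
        push @{ $self->{output} }, $line;
    }
    return join '', @{ $self->{output} };
}

package main;

my @input = (
    "1 Text to be printed ...\n",
    "  \@hello\n",
    "2 Text to be printed ...\n",
);
my $result = ToyPreprocessor->new->process_lines(@input);
print $result;
```

    Because the lookup goes through can(), a subclass (or a package pushed
    onto @ISA) that defines further methods gains further directives
    without any change to the loop, which is the idea behind the extension
    methods listed earlier.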
  Standard Directives
    EP3 defines a standard set of preprocessor directives with a few
    special additions that integrate the power of Perl into the coded
    language.

    The define directive
        @define key definition

        The define directive assigns the definition to the key. The
        definition can contain any character, including whitespace. The
        key is searched for as an individual word (i.e., the input to be
        searched is tokenized on Perl word boundaries). The definition
        contains everything from the whitespace following the key until
        the end of the line.

    The replace directive
        @replace key definition

        The replace directive is identical to the define directive except
        that the substitution is performed if the key exists anywhere, not
        just on word boundaries.

    The macro directive
        @macro key(value[,value]*) definition

        The macro directive tokenizes as the define directive does,
        replacing the key(value,...) text with the definition and saving
        the value list. The definition is then parsed and the original
        macro values are replaced with the saved values.

    The eval directive
        @eval key expr

        The eval directive first evaluates the expr using Perl. Any valid
        Perl expr is accepted. The key is then defined with the result of
        the evaluation.

    The include directive
        @include <file> or "file" [condition]

        The include directive looks for the "file" in the present
        directory and anywhere in the include path (definable via command
        line switch). Included files are recursively evaluated by the
        preprocessor. If the optional condition is specified, only those
        lines between the text strings "@mark condition_BEGIN" and
        "@mark condition_END" will be included. The condition can be any
        string.
        For example, if the file "file.V" contains the following lines:

            1 Stuff before
            @mark PORT_BEGIN
            2 Stuff middle
            @mark PORT_END
            3 Stuff after

        then any file with the following line:

            @include "file.V" PORT

        will include the following line from file.V:

            2 Stuff middle

        This is useful for partial inclusion of files (like port list
        specifications in Verilog).

    The enum directive
        @enum a,b,c,d,...

        enum generates multiple defines, with each sequential element
        receiving a count one higher than the previous element. The count
        starts at 0 by default. If any element is a number, the enum value
        will be set to that value.

    The ifdef and ifndef directives
        @ifdef key
        @ifndef key

        Conditional compilation directives. A key counts as defined if it
        was placed in the define/replace list by define, replace, or any
        command that generates a define or replace.

    The if directive
        @if expr

        The expression is evaluated using Perl. The expression can be any
        valid Perl expression. This allows for a wide range of conditional
        compilation.

    The elif [elsif] directive
        @[elif|elsif] key | expr

        The else if directive. Used for either "if[n]def" or "if".

    The else directive
        @else

        The else directive. Used for either "if[n]def" or "if".

    The endif directive
        @endif

        The conclusion of any "if[n]def" or "if" block.

    The comment directive
        @comment on|off|default|previous

        The comment switch can be one of "on", "off", "default", or
        "previous". This is used to turn comments on or off in the
        resultant file. This directive is very useful when including other
        files with commented header descriptions. By surrounding a header
        with "comment off" and "comment previous", the output will not see
        the included file's comments. Using "comment on" with "comment
        previous" ensures that comments are included (as in an attached
        synthesis directive file). The default comment setting is on. This
        can be altered by a command line switch. The "comment default"
        directive will restore the comment setting to the EP3 invocation
        default.
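    The conditional form of the include directive described earlier
    (keeping only the lines between "@mark condition_BEGIN" and
    "@mark condition_END") can be pictured with the small sketch below.
    The function name select_marked_lines is hypothetical, and the sketch
    assumes the @mark lines themselves are dropped, as the file.V example
    suggests; EP3's own implementation may differ.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of EP3: return only the lines found
# between "@mark <condition>_BEGIN" and "@mark <condition>_END".
sub select_marked_lines {
    my ($condition, @lines) = @_;
    my @selected;
    my $inside = 0;
    for my $line (@lines) {
        if ($line =~ /^\s*\@mark\s+\Q$condition\E_BEGIN\s*$/) {
            $inside = 1;    # start of the wanted region
            next;           # assumption: the marker line is not emitted
        }
        if ($line =~ /^\s*\@mark\s+\Q$condition\E_END\s*$/) {
            $inside = 0;    # end of the wanted region
            next;
        }
        push @selected, $line if $inside;
    }
    return @selected;
}

# The contents of "file.V" from the example in the text.
my @file_v = (
    "1 Stuff before\n",
    "\@mark PORT_BEGIN\n",
    "2 Stuff middle\n",
    "\@mark PORT_END\n",
    "3 Stuff after\n",
);

print select_marked_lines('PORT', @file_v);
```

    Run against the file.V contents, the sketch prints only the
    "2 Stuff middle" line, matching the result described for
    @include "file.V" PORT.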
    The ep3 directive
        @ep3 on|off

        The "ep3 off" directive turns off preprocessing until the "ep3 on"
        directive is encountered. This can greatly speed up processing of
        large files where preprocessing is only necessary in small chunks.

    The perl_begin and perl_end directives
        @perl_begin
        perl code here ....
        (Single line and multi-line output mechanisms are available)
        @> text to be output after variable interpolation
        or
        @>>
        text to be output
        after variable interpolation
        @<<
        @perl_end

        The "perl" directives provide the underlying language with all of
        the power of Perl, embedded in the preprocessed code. Anything
        enclosed within the "perl_begin" and "perl_end" directives will be
        evaluated as a Perl script. This can be used to include a
        subroutine that can later be called as a directive. Using this
        type of extension, directive libraries can be developed and
        included to perform a variety of powerful source code development
        features. This construct can also be used to mimic and expand the
        VHDL generate capabilities.

        The "@>" and "@>> @<<" directives within a perl_[begin|end] block
        direct EP3 to perform variable interpolation on the given line or
        lines and then print the result to the output.

    The debug directive
        @debug on|off|value

        The debug directive enables debug statements to go to the output
        file. The debug statements are preceded by the Line Comment
        string. Currently the debug values that will enable printouts are
        the following:

            0x01   1 - Primary messages (Entering Subroutines)
            0x02   2 - ep3_process Engine
            0x04   4 - define (replace, macro, eval, enum)
            0x08   8 - include
            0x10  16 - if (else, ifdef, etc.)
            0x20  32 - perl_begin/end

  EP3 Methods
    EP3 defines several methods that can be invoked by the user.

    ep3_execute
        Execute sets up EP3 to act like a Perl script. It parses the
        command line, loads any modules specified on the command line or
        otherwise, does any preexisting defines, sets up the output files,
        and then processes the input. Sort of the whole shebang.
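    To give a feel for the command-line parsing step that ep3_execute
    performs, the sketch below uses Getopt::Long with option names and
    defaults taken from the EP3 Options section. The function name
    parse_ep3_style_options and the option specifications are assumptions
    made for illustration; this is not EP3's actual parsing code.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Hypothetical helper, not part of EP3: parse EP3-style switches from a
# list of arguments and return the option hash plus the leftover arguments.
sub parse_ep3_style_options {
    my @args = @_;
    # Defaults as listed in the EP3 Options section.
    my %opt = (
        protect       => 1,
        comment       => 1,
        depend        => 0,
        delimeter     => '@',
        line_comment  => '//',
        start_comment => '/*',
        end_comment   => '*/',
    );
    GetOptionsFromArray(
        \@args, \%opt,
        'protect!', 'comment!', 'depend!',            # negatable: -noprotect etc.
        'delimeter=s',
        'define=s@', 'includes=s@', 'modules=s@',     # may be given several times
        'output_filename=s',
        'line_comment=s', 'start_comment=s', 'end_comment=s',
    ) or die "Could not parse options\n";
    # Repeatable options default to empty lists.
    $opt{$_} ||= [] for qw(define includes modules);
    return (\%opt, \@args);
}

my ($opt, $rest) = parse_ep3_style_options(
    qw(-noprotect -define WIDTH=8 -define DEPTH=4 -delimeter % input.v)
);
print "protect=$opt->{protect} defines=@{ $opt->{define} } files=@$rest\n";
```

    Arguments that are not switches (here the input file name) are left
    over after parsing, which fits the description of ep3_process looking
    for filenames in ARGV.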
    ep3_parse_command_line
        ep3_parse_command_line does just that - parses the command line
        looking for EP3 options. It uses the Getopt::Long module.

    ep3_modules
        This method will find and include any modules specified as
        arguments. It expects just the name and will append .pm to it
        before doing a require. It returns the methods specified in the
        object's methods array.

    ep3_output_file
        ep3_output_file determines what the output should be (either the
        processed text or a list of dependencies) and where it should go.
        It then proceeds to open the required output files. The method
        returns the output filename.

    ep3_reset
        ep3_reset resets all of the internal EP3 lists (defines, replaces,
        keycounts, etc.) so that a user can process multiple files
        independently from within one script.

    ep3_process([$filename, [$condition]])
        ep3_process is the guts of the whole thing. It takes a filename as
        input and produces the specified output. This is the method that
        is iteratively called by the include directive. A null filename
        will cause ep3_process to look for filenames in ARGV.

    ep3_includes([@include_directories])
        This method will add the specified directories to the EP3 include
        path.

    ep3_defines($string1=$string2);
        This method will initialize defines with string1 defined as
        string2. It initializes all of the defines in the object's Defines
        array.

    ep3_end_comment([$string]);
        This method sets the end_comment string to the value specified. If
        null, the method returns the current value.

    ep3_start_comment([$string]);
        This method sets the start_comment string to the value specified.
        If null, the method returns the current value.

    ep3_line_comment([$string]);
        This method sets the line_comment string to the value specified.
        If null, the method returns the current value.

    ep3_delimeter([$string]);
        This method sets the delimiter string to the value specified. If
        null, the method returns the current value.

    ep3_gen_depend_list([$value]);
        This method enables/disables dependency list generation.
        When gen_depend_list is 1, a dependency list is generated. When it
        is 0, normal operation occurs. If null, the method returns the
        current value.

    ep3_keep_comments([$value]);
        This method sets the keep_comments variable to the value
        specified. If null, the method returns the current value.

    ep3_protect_comments([$value]);
        This method sets the protect_comments variable to the value
        specified. If null, the method returns the current value.

  EP3 Options
    EP3 options can be set from the command line (if ep3_execute or
    ep3_parse_command_line is invoked), or the internal variables can be
    explicitly set.

    [-no]protect
        Should comments be protected from substitution?
        Default: 1

    [-no]comment
        Should comments be passed to the output?
        Default: 1

    [-no]depend
        Are we generating a dependency list or simply processing?
        Default: 0

    -delimeter string
        The directive delimiter - can be a string.
        Default: @

    -define string1=string2
        Defines from the command line. Multiple -define options can be
        specified.
        Default: ()

    -includes directory
        Where to look for include files. Multiple -includes options can be
        specified.
        Default: ()

    -output_filename filename
        Where to place the output.
        Default: STDOUT

    -modules filename
        Modules to load (just the module name, expecting to find module.pm
        somewhere in @INC). Multiple -modules options can be specified.
        Default: ()

    -line_comment string
        The Line Comment string.
        Default: //

    -start_comment string
        The Start Comment string.
        Default: /*

    -end_comment string
        The End Comment string.
        Default: */

AUTHOR
    Gary Spivey, Dept. of Defense, Ft. Meade, MD.
    spivey@romulus.ncsc.mil

    Many thanks to Steve Bresson for his help, ideas, and code ...

SEE ALSO
    perl(1).