I have a narrow question and a broader one.
The narrow question is how to get SmaCC into the prepackaged Moose
image.
I tried to file in SmaCCDev-lr.23.mcz from SqueakSource (version 24 is
for Pharo 1.1, which I don't think the Moose image is using). It failed
because it needed the SmaCCParser and SmaCCScanner classes.
http://www.squeaksource.com/SmaccDevelopment.html says the runtime is
already in the image; perhaps it's been stripped out?
The broader question is whether anyone has any advice on approaching my
problem of parsing SAS files. Advice could be pointers to other
lexers/scanners or the news that PEG (i.e., PetitParser) works OK for my
purposes (I may want to use it for the main parsing, but this question
is about dealing with the macro language).
To give a taste of the detailed issues: a macro variable reference, &V,
can occur almost anywhere in a SAS program (but not inside of some
comments and some quotes). It is immediately expanded; this may occur
in the middle of what looks like a token, e.g.
(1) data run&V;
becomes
(2) data run04 (start=5);
In perverse cases one could even have
(3) da&v
become (2), including the semicolon.
Macro variables obey scoping rules.
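To make examples (1) to (3) concrete, here is a minimal Python sketch of
expanding macro variable references before tokenizing, with scoping
modelled as a chain of dictionaries. The names expand, tokenize,
global_scope and local_scope are mine, purely for illustration. The
sketch assumes names are case-insensitive and that a trailing period
ends a reference, and it deliberately ignores the comment and quote
exceptions mentioned above.

```python
import re
from collections import ChainMap

# &name, optionally followed by a period that terminates the reference.
_REF = re.compile(r"&([A-Za-z_][A-Za-z0-9_]*)\.?")


def expand(text, scope, max_passes=10):
    """Substitute macro variable references, rescanning until stable.

    Unknown references are left in place. Comments and quoted strings
    are NOT treated specially here (a simplification).
    """
    def lookup(match):
        name = match.group(1).upper()  # assumed case-insensitive
        return str(scope[name]) if name in scope else match.group(0)

    for _ in range(max_passes):
        expanded = _REF.sub(lookup, text)
        if expanded == text:
            break
        text = expanded
    return text


def tokenize(text):
    """Toy tokenizer: identifiers, integers, single punctuation characters."""
    return re.findall(r"[A-Za-z_][A-Za-z0-9_]*|\d+|\S", text)


# Example (1): the value completes a token and adds more tokens after it.
global_scope = {"V": "04 (start=5)"}
# Example (3): an inner scope shadows V with a value that finishes the
# keyword and supplies the rest of the statement, semicolon included.
local_scope = ChainMap({"V": "ta run04 (start=5);"}, global_scope)

print(expand("data run&V;", global_scope))
print(tokenize(expand("da&v", local_scope)))
```

The point of the sketch is the ordering: substitution happens on raw
characters, so the tokenizer only ever sees the expanded text. A scanner
that commits to a token before expansion would get both examples wrong.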
There are also macro invocations like %mymacro(3, abc) which expand at
the closing parenthesis. %INCLUDE brings a whole file into the source.
And the macro language itself has conditional and looping constructs.
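Since an invocation only expands at its closing parenthesis, the scanner
has to find the matching parenthesis and split the arguments first. A
rough Python sketch of that step follows; find_invocation is a
hypothetical helper of mine, and it assumes positional arguments
separated by top-level commas, with no handling of quoting, keyword
parameters, or macros called without parentheses.

```python
def find_invocation(text, start=0):
    """Locate the first %name(...) at or after start.

    Returns (name, args, end), where end is the index just past the
    closing parenthesis, or None if there is no complete invocation.
    """
    i = text.find("%", start)
    while i != -1:
        j = i + 1
        while j < len(text) and (text[j].isalnum() or text[j] == "_"):
            j += 1
        if j > i + 1 and j < len(text) and text[j] == "(":
            depth, args, current = 0, [], []
            for k in range(j, len(text)):
                ch = text[k]
                if ch == "(":
                    depth += 1
                    if depth > 1:          # nested paren belongs to an argument
                        current.append(ch)
                elif ch == ")":
                    depth -= 1
                    if depth == 0:         # the invocation is complete here
                        args.append("".join(current).strip())
                        return text[i + 1:j], args, k + 1
                    current.append(ch)
                elif ch == "," and depth == 1:
                    args.append("".join(current).strip())
                    current = []
                else:
                    current.append(ch)
            return None                    # unbalanced parentheses
        i = text.find("%", i + 1)
    return None


print(find_invocation("x = %mymacro(3, abc);"))
```

Everything from the % up to the returned end index would then be
replaced by the macro's expansion, and scanning would resume on the
result, which may itself contain further references or invocations.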
As an added bonus, SAS macros are not simple preprocessors, since their
expansion can depend on information obtained in the main language at
runtime (in fact, macros can be written at run time). It's unlikely
I'll ever attempt to handle that case, however.
Essentially, there are 2 or 3 different syntaxes operating in the same
program (the main syntax, the macro language, and the expansion of macro
variables via &). This was the setup for the initial Conway paper on
coroutines. I don't currently see any gain from using coroutines.