On Tue, 2010-07-13 at 08:24 +0200, Lukas Renggli wrote:
I tried to file in SmaCCDev-lr.23.mcz from SqueakSource (version 24 is
for Pharo 1.1, which I don't think the Moose image is using). It failed
because it needed the SmaCCParser and SmaCCScanner classes.
http://www.squeaksource.com/SmaccDevelopment.html says the runtime is
already in the image; perhaps it's been stripped out?
The runtime is not part of the image anymore. You need to load it
(SmaCC) before you load the development tools (SmaCCDev). I've updated
the comment on SqueakSource.
Thank you.
The broader question is whether anyone has any advice on approaching my
problem of parsing SAS files. Advice could be pointers to other
lexer/scanners or the news that PEG (i.e., PetitParser) works OK for my
purposes (I may want to use it for the parsing, but this is about
dealing with the macro language).
PetitParser is more general in what it can parse than SmaCC, so I
don't see a reason why it wouldn't work.
I was thinking of using the lexer in SmaCC.
To give a taste of the detailed issues: a macro variable reference, &V,
can occur almost anywhere in a SAS program (but not inside of some
comments and some quotes). It is immediately expanded; this may occur
in the middle of what looks like a token, e.g.
(1) data run&V;
becomes
(2) data run04 (start=5);
In perverse cases one could even have
(3) da&v
become (2), including the semicolon.
Macro variables obey scoping rules.
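
To make the expansion step concrete, here is a minimal Python sketch of
macro variable substitution on raw text, before any tokenizing. It is not
how SAS, SmaCC, or PetitParser implement this. The function name expand,
the use of ChainMap for scoping, and the choice to skip only single-quoted
strings and /* ... */ comments are assumptions made for illustration; the
result of an expansion is not rescanned for further references.

```python
import re
from collections import ChainMap

# Hypothetical scanner: single-quoted strings and /* ... */ comments are
# copied through untouched (an assumed reading of "some comments and some
# quotes"); anywhere else, &NAME with an optional trailing '.' is a reference.
_REF = re.compile(
    r"""(?P<skip>'[^']*'|/\*.*?\*/)           # regions where & is left alone
      | &(?P<name>[A-Za-z_][A-Za-z0-9_]*)\.?  # macro variable reference
    """,
    re.VERBOSE | re.DOTALL,
)

def expand(text, scope):
    """Replace each &NAME in text with its value from scope.

    scope maps upper-cased names to replacement text; a ChainMap lets a
    local scope shadow a global one. Unknown names are left as written.
    """
    def replace(match):
        if match.group("skip") is not None:
            return match.group(0)
        name = match.group("name").upper()
        return scope.get(name, match.group(0))
    return _REF.sub(replace, text)

global_scope = {"V": "04 (start=5)"}
print(expand("data run&V;", ChainMap({}, global_scope)))
# prints: data run04 (start=5);
```

Because the substitution runs on characters rather than tokens, a local
binding such as {"V": "ta run04 (start=5);"} turns da&v into the same
statement, which is the perverse case (3) above.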
Looks to me like you don't want to do that during parsing, but in a
separate step after parsing, on the AST.
That's too late; macro expansion determines what text needs to get
parsed. The text without macro expansion may not even be well-formed.
Essentially, there are 2 or 3 different syntaxes operating in the same
program (the main syntax, the macro language, and the expansion of macro
variables via &). This was the setup for the initial Conway paper on
coroutines. I don't currently see any gain from using coroutines.
PetitParser allows you to define and test these 3 syntaxes separately
and then combine them later on.
Could you say a bit more about the nature of the combination? A simple
sequential approach does not seem feasible. For example:
%let V = data one;
%macro foo(input);
&V &input;
a= b+c;
run;
%mend foo;
%foo(two)
The macro processor needs to register the definition of V in the %let so
it can be used later.
During the macro definition (between %macro and %mend) &V should not be
expanded.
%foo invokes the macro. At this point the macro processor produces the
desired text, and the macro variable processor in turn needs to expand
&V and &input, the latter bound to the local argument. Only then does
the plain SAS parsing happen.
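
The ordering described above can be sketched as a single pass over the
source text. This is a rough Python illustration under strong assumptions,
not a SAS macro processor: the names preprocess and expand_refs and all the
regular expressions are hypothetical, only %let, %macro/%mend with
positional parameters, and invocations with parentheses are recognized, and
quoting, comments, and nested macros are ignored.

```python
import re
from collections import ChainMap

REF = re.compile(r"&([A-Za-z_]\w*)\.?")
LET = re.compile(r"%let\s+(\w+)\s*=\s*(.*?);", re.IGNORECASE | re.DOTALL)
MACRO = re.compile(
    r"%macro\s+(\w+)\s*\(([^)]*)\)\s*;(.*?)%mend(?:\s+\w+)?\s*;",
    re.IGNORECASE | re.DOTALL,
)
CALL = re.compile(r"%(\w+)\s*\(([^)]*)\)")

def expand_refs(text, scope):
    """Substitute &NAME references; unknown names are left as written."""
    return REF.sub(lambda m: scope.get(m.group(1).upper(), m.group(0)), text)

def preprocess(source):
    """Return the text that would be handed to the plain SAS parser."""
    global_vars = {}  # macro variable name -> value
    macros = {}       # macro name -> (parameter names, unexpanded body)
    out = []
    pos = 0
    while pos < len(source):
        if m := MACRO.match(source, pos):
            # Store the body verbatim: &V is NOT expanded at definition time.
            params = [p.strip().upper() for p in m.group(2).split(",") if p.strip()]
            macros[m.group(1).upper()] = (params, m.group(3))
        elif m := LET.match(source, pos):
            # Register the variable so later text can refer to it.
            global_vars[m.group(1).upper()] = expand_refs(m.group(2).strip(), global_vars)
        elif (m := CALL.match(source, pos)) and m.group(1).upper() in macros:
            # Invocation: bind arguments locally, then expand the stored body.
            params, body = macros[m.group(1).upper()]
            args = [a.strip() for a in m.group(2).split(",")]
            local_vars = dict(zip(params, args))
            out.append(expand_refs(body, ChainMap(local_vars, global_vars)))
        else:
            # Ordinary text: copy up to the next '%', expanding global references.
            nxt = source.find("%", pos + 1)
            end = len(source) if nxt == -1 else nxt
            out.append(expand_refs(source[pos:end], global_vars))
            pos = end
            continue
        pos = m.end()
    return "".join(out)

example = """%let V = data one;
%macro foo(input);
&V &input;
a= b+c;
run;
%mend foo;
%foo(two)
"""
print(preprocess(example))
```

Apart from blank lines, the output for the example is the three statements
"data one two;", "a= b+c;", and "run;". Keeping the macro body unexpanded
until invocation means a later %let V = ...; changes what %foo produces,
which is why a purely sequential expand-then-parse pass over the original
text is not enough.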
Ross
The coroutines sound like an optimization that combines the parsing and
macro expansion steps into a single step. I would implement and test
them separately in the beginning.
Lukas