Hi Camillo,
A first-class omit would be better especially given that you could package it as an
extension.
In any case, the current code should be reverted. While you play with another
implementation, could you move the current versions from the PetitParser repository?
Cheers,
Doru
On 11 Aug 2011, at 15:14, Camillo Bruni wrote:
Thanks for the vast answers and pointing out critical
parts of the implementation.
On 2011-08-11, at 08:14, Tudor Girba wrote:
Hi,
Camillo, thanks for taking an interest and for letting us know that you committed.
Lukas, thanks for the answer. I would also like to add one comment: omit is maybe
interesting for small and quick parsers in which you have the grammar and the parser
together. However, for anything significant, if you want to split the grammar and the
various parsers, it all of a sudden becomes less useful because you do not want to
restrict the grammar.
Yeah and that's exactly the reason why I think this is useful. Since we want to use
PetitParser also for small "dirty" projects where portability is a secondary
problem. And I still disagree that it becomes less useful, it is just a way of filtering
your output from parser in a very simple way ;). Furthermore having the omit doesn't
mean that you will have to use it. t mean there is also the #flatten message which goes
somewhat into the same direction IMO, since it is there for modifying directly in the
parser the data. It could be omitted as well and put into the grammar.
I also took a quick look at the implementation.
There are simply too many ifs that add complexity to an already non-trivial code.
Indeed the implementation is not very nice, and more of a preliminary hack to see if it
works, I will definitely do a second pass with lukas' suggestion for a first-class
object to denote the omit property.
All in all, I think we should retire this
add-on.
Cheers,
Doru
On 11 Aug 2011, at 07:31, Lukas Renggli wrote:
> Hi Camillo,
>> I quickly hacked (with a lot of tests) support for omit in PetitParser.
>> This should simplify some parsing efforts as an additional filtering step could
be avoided.
>>
>> parser := ($" asParser omit, $" asParser negate star flatten, $"
asParser omit)
>> parser parse: '"asdfsd"' "yields directly
#('asdfsd') "
>>
>> vs
>>
>> parser := ($" asParser, $" asParser negate star flatten, $"
asParser)
>> parser parse: '"asdfsd"'
>>
>> which yields #($" 'asdfsd' $")
>
> I don't really see use of #omit. The element is still in a collection
> and you need to extract it from there. What is the benefit other than
> it is now at index 1 instead of 2?
>
> Also the implementation has some problems:
>
> 1. #omit makes it really hard to compose grammars, because parsers
> suddenly get very asymmetric and start to affect each others behavior
> depending on how you compose them.
it shouldn't be about the behavior of the grammes, I might have not done everything
correctly in my implementation, but the way the parsing rules are evaluation is not
changed, its should be only affecting the output of the parsers (in this sense a special
PPActionParser).
> 2. For the same reason grammar transformation
becomes impossible, one
> of the design goals of PetitParser.
sure, but having the feature doesn't mean you have to use it. And in my case it would
simply some of the parsers I use.
Actually in the end the Omit is nothing but a special PPActionParser right? (Just a
curious question, how do you handle grammar transformation in this case?)
> 3. #omit makes a typical grammar (the
smalltalk one) roughly 15%
> slower without even using the new feature.
yep thats an implementation issue, as you mentioned addressing this with a first-class
object won't affect existing grammars at all.
> 4. There are various methods (#-, #match...,
#copy..., ...) that need
> to be fixed when you add state to a parser.
>
> 5. There is no need for eager (#initialize) as well as lazy
> initialization (kills the possibility for a quick invocation).
right, thats indeed an ugly property of my hack (=> first class objects will do)
> 6. Please do not remove PPParser
class>>new and do not change how
> initialization works, otherwise you break GemStone, VisualWorks, and
> GNU Smalltalk. The Seaside wiki has some nice documentation of how to
> correctly initialize objects (basically following the pattern of
> Objective-C).
ok, that was a bit hasty
> I think, problems 1, 3 (!), 4, 5 and to some
extent also 2 could be
> solved if the implementation was a bit more object-oriented. That is,
> if you added a PPOmitParser that would dispatch from #parseOn: to some
> other parsing method #parseOmittedOn: you wouldn't need all those
> #ifTrue:ifFalse:.
>
>> I pushed my solution to the petitparser repos. I didn't add full support for
flatten.
>
> Did you see #map: and #permutation:? Both allow you do the same in a
> less invasive way:
Of course in my example this is straight forward, but in a bit more complex parser with a
couple of options or intermediate tokens you don't want it would be nicer to directly
specify it on the parser nodes. And in such a sense only affect how flatten works...
> parser := ($" asParser , $"
asParser negate star flatten, $" asParser).
> parser map: [ :begin :inner :end | inner ]
>
> or
>
> parser := ($" asParser , $" asParser negate star flatten, $"
asParser).
> parser permutation: #(2)
>
> Additionally, did you see the experimental binding/capturing
> functionality in PetitBeta? This kind of provides something much more
> general and (I believe) much more powerful without the above
> drawbacks. Bind-Capture allows you to bind certain parse results
> anywhere down the tree to names and directly access them later on in a
> capture block. It also deals with unpacking and reshuffling
> collections automatically, check the tests for that.
>
> Your example would look like:
>
> parser := ($" asParser , ($" asParser negate star flatten bind:
> #inner) , $" asParser).
> parser capture: [ :inner | inner ]
Ah cool, this is indeed a bit nicer, didn't have at the new branch yet… will do ;)