Re: [Pharo-project] omit support in PetitParser

List overview All Threads
Download

newer

older

EyeSee: setup font size & color...

Re: Issue 464 in moose-technology:...

Lukas Renggli

11 Aug 2011 11 Aug '11

7:31 a.m.

Hi Camillo,

...

I quickly hacked (with a lot of tests) support for omit in PetitParser. This should simplify some parsing efforts as an additional filtering step could be avoided.

parser := ($" asParser omit, $" asParser negate star flatten, $" asParser omit) parser parse: '"asdfsd"' "yields directly #('asdfsd') "

vs

parser := ($" asParser, $" asParser negate star flatten, $" asParser) parser parse: '"asdfsd"'

which yields #($" 'asdfsd' $")

I don't really see use of #omit. The element is still in a collection and you need to extract it from there. What is the benefit other than it is now at index 1 instead of 2?

Also the implementation has some problems:

1. #omit makes it really hard to compose grammars, because parsers suddenly get very asymmetric and start to affect each others behavior depending on how you compose them.

2. For the same reason grammar transformation becomes impossible, one of the design goals of PetitParser.

3. #omit makes a typical grammar (the smalltalk one) roughly 15% slower without even using the new feature.

4. There are various methods (#-, #match..., #copy..., ...) that need to be fixed when you add state to a parser.

5. There is no need for eager (#initialize) as well as lazy initialization (kills the possibility for a quick invocation).

6. Please do not remove PPParser class>>new and do not change how initialization works, otherwise you break GemStone, VisualWorks, and GNU Smalltalk. The Seaside wiki has some nice documentation of how to correctly initialize objects (basically following the pattern of Objective-C).

I think, problems 1, 3 (!), 4, 5 and to some extent also 2 could be solved if the implementation was a bit more object-oriented. That is, if you added a PPOmitParser that would dispatch from #parseOn: to some other parsing method #parseOmittedOn: you wouldn't need all those #ifTrue:ifFalse:.

...

I pushed my solution to the petitparser repos. I didn't add full support for flatten.

Did you see #map: and #permutation:? Both allow you do the same in a less invasive way:

parser := ($" asParser , $" asParser negate star flatten, $" asParser). parser map: [ :begin :inner :end | inner ]

parser := ($" asParser , $" asParser negate star flatten, $" asParser). parser permutation: #(2)

Additionally, did you see the experimental binding/capturing functionality in PetitBeta? This kind of provides something much more general and (I believe) much more powerful without the above drawbacks. Bind-Capture allows you to bind certain parse results anywhere down the tree to names and directly access them later on in a capture block. It also deals with unpacking and reshuffling collections automatically, check the tests for that.

Your example would look like:

parser := ($" asParser , ($" asParser negate star flatten bind: #inner) , $" asParser). parser capture: [ :inner | inner ]

Cheers, Lukas

-- Lukas Renggli www.lukas-renggli.ch

Show replies by date

Tudor Girba

11 Aug 11 Aug

8:14 a.m.

New subject: [Pharo-project] omit support in PetitParser

Hi,

Camillo, thanks for taking an interest and for letting us know that you committed.

Lukas, thanks for the answer. I would also like to add one comment: omit is maybe interesting for small and quick parsers in which you have the grammar and the parser together. However, for anything significant, if you want to split the grammar and the various parsers, it all of a sudden becomes less useful because you do not want to restrict the grammar.

I also took a quick look at the implementation. There are simply too many ifs that add complexity to an already non-trivial code.

All in all, I think we should retire this add-on.

Cheers, Doru

On 11 Aug 2011, at 07:31, Lukas Renggli wrote:

...

Hi Camillo,

...
I quickly hacked (with a lot of tests) support for omit in PetitParser. This should simplify some parsing efforts as an additional filtering step could be avoided.

parser := ($" asParser omit, $" asParser negate star flatten, $" asParser omit) parser parse: '"asdfsd"' "yields directly #('asdfsd') "

vs

parser := ($" asParser, $" asParser negate star flatten, $" asParser) parser parse: '"asdfsd"'

which yields #($" 'asdfsd' $")

I don't really see use of #omit. The element is still in a collection and you need to extract it from there. What is the benefit other than it is now at index 1 instead of 2?

Also the implementation has some problems:

#omit makes it really hard to compose grammars, because parsers

suddenly get very asymmetric and start to affect each others behavior depending on how you compose them.

For the same reason grammar transformation becomes impossible, one

of the design goals of PetitParser.

#omit makes a typical grammar (the smalltalk one) roughly 15%

slower without even using the new feature.

There are various methods (#-, #match..., #copy..., ...) that need

to be fixed when you add state to a parser.

There is no need for eager (#initialize) as well as lazy

initialization (kills the possibility for a quick invocation).

Please do not remove PPParser class>>new and do not change how

initialization works, otherwise you break GemStone, VisualWorks, and GNU Smalltalk. The Seaside wiki has some nice documentation of how to correctly initialize objects (basically following the pattern of Objective-C).

I think, problems 1, 3 (!), 4, 5 and to some extent also 2 could be solved if the implementation was a bit more object-oriented. That is, if you added a PPOmitParser that would dispatch from #parseOn: to some other parsing method #parseOmittedOn: you wouldn't need all those #ifTrue:ifFalse:.

...
I pushed my solution to the petitparser repos. I didn't add full support for flatten.

Did you see #map: and #permutation:? Both allow you do the same in a less invasive way:

parser := ($" asParser , $" asParser negate star flatten, $" asParser). parser map: [ :begin :inner :end | inner ]

or

parser := ($" asParser , $" asParser negate star flatten, $" asParser). parser permutation: #(2)

Additionally, did you see the experimental binding/capturing functionality in PetitBeta? This kind of provides something much more general and (I believe) much more powerful without the above drawbacks. Bind-Capture allows you to bind certain parse results anywhere down the tree to names and directly access them later on in a capture block. It also deals with unpacking and reshuffling collections automatically, check the tests for that.

Your example would look like:

parser := ($" asParser , ($" asParser negate star flatten bind: #inner) , $" asParser). parser capture: [ :inner | inner ]

Cheers, Lukas

-- Lukas Renggli www.lukas-renggli.ch

-- www.tudorgirba.com

"Obvious things are difficult to teach."

Camillo Bruni

3:14 p.m.

New subject: [Pharo-project] omit support in PetitParser

Thanks for the vast answers and pointing out critical parts of the implementation.

On 2011-08-11, at 08:14, Tudor Girba wrote:

...

Hi, Camillo, thanks for taking an interest and for letting us know that you committed.

Lukas, thanks for the answer. I would also like to add one comment: omit is maybe interesting for small and quick parsers in which you have the grammar and the parser together. However, for anything significant, if you want to split the grammar and the various parsers, it all of a sudden becomes less useful because you do not want to restrict the grammar.

Yeah and that's exactly the reason why I think this is useful. Since we want to use PetitParser also for small "dirty" projects where portability is a secondary problem. And I still disagree that it becomes less useful, it is just a way of filtering your output from parser in a very simple way ;). Furthermore having the omit doesn't mean that you will have to use it. t mean there is also the #flatten message which goes somewhat into the same direction IMO, since it is there for modifying directly in the parser the data. It could be omitted as well and put into the grammar.

...

I also took a quick look at the implementation. There are simply too many ifs that add complexity to an already non-trivial code.

Indeed the implementation is not very nice, and more of a preliminary hack to see if it works, I will definitely do a second pass with lukas' suggestion for a first-class object to denote the omit property.

...

All in all, I think we should retire this add-on.

Cheers, Doru

On 11 Aug 2011, at 07:31, Lukas Renggli wrote:

...
Hi Camillo,

...
I quickly hacked (with a lot of tests) support for omit in PetitParser. This should simplify some parsing efforts as an additional filtering step could be avoided.

parser := ($" asParser omit, $" asParser negate star flatten, $" asParser omit) parser parse: '"asdfsd"' "yields directly #('asdfsd') "

vs

parser := ($" asParser, $" asParser negate star flatten, $" asParser) parser parse: '"asdfsd"'

which yields #($" 'asdfsd' $")

I don't really see use of #omit. The element is still in a collection and you need to extract it from there. What is the benefit other than it is now at index 1 instead of 2?

Also the implementation has some problems:

#omit makes it really hard to compose grammars, because parsers

suddenly get very asymmetric and start to affect each others behavior depending on how you compose them.

it shouldn't be about the behavior of the grammes, I might have not done everything correctly in my implementation, but the way the parsing rules are evaluation is not changed, its should be only affecting the output of the parsers (in this sense a special PPActionParser).

...

...

For the same reason grammar transformation becomes impossible, one

of the design goals of PetitParser.

sure, but having the feature doesn't mean you have to use it. And in my case it would simply some of the parsers I use. Actually in the end the Omit is nothing but a special PPActionParser right? (Just a curious question, how do you handle grammar transformation in this case?)

...

...

#omit makes a typical grammar (the smalltalk one) roughly 15%

slower without even using the new feature.

yep thats an implementation issue, as you mentioned addressing this with a first-class object won't affect existing grammars at all.

...

...

There are various methods (#-, #match..., #copy..., ...) that need

to be fixed when you add state to a parser.

There is no need for eager (#initialize) as well as lazy

initialization (kills the possibility for a quick invocation).

right, thats indeed an ugly property of my hack (=> first class objects will do)

...

...

Please do not remove PPParser class>>new and do not change how

initialization works, otherwise you break GemStone, VisualWorks, and GNU Smalltalk. The Seaside wiki has some nice documentation of how to correctly initialize objects (basically following the pattern of Objective-C).

ok, that was a bit hasty

...

...
I think, problems 1, 3 (!), 4, 5 and to some extent also 2 could be solved if the implementation was a bit more object-oriented. That is, if you added a PPOmitParser that would dispatch from #parseOn: to some other parsing method #parseOmittedOn: you wouldn't need all those #ifTrue:ifFalse:.

...
I pushed my solution to the petitparser repos. I didn't add full support for flatten.

Did you see #map: and #permutation:? Both allow you do the same in a less invasive way:

Of course in my example this is straight forward, but in a bit more complex parser with a couple of options or intermediate tokens you don't want it would be nicer to directly specify it on the parser nodes. And in such a sense only affect how flatten works...

...

...
parser := ($" asParser , $" asParser negate star flatten, $" asParser). parser map: [ :begin :inner :end | inner ]

or

parser := ($" asParser , $" asParser negate star flatten, $" asParser). parser permutation: #(2)

Additionally, did you see the experimental binding/capturing functionality in PetitBeta? This kind of provides something much more general and (I believe) much more powerful without the above drawbacks. Bind-Capture allows you to bind certain parse results anywhere down the tree to names and directly access them later on in a capture block. It also deals with unpacking and reshuffling collections automatically, check the tests for that.

Your example would look like:

parser := ($" asParser , ($" asParser negate star flatten bind: #inner) , $" asParser). parser capture: [ :inner | inner ]

Ah cool, this is indeed a bit nicer, didn't have at the new branch yet… will do ;)

Tudor Girba

3:26 p.m.

New subject: [Pharo-project] omit support in PetitParser

Hi Camillo,

A first-class omit would be better especially given that you could package it as an extension.

In any case, the current code should be reverted. While you play with another implementation, could you move the current versions from the PetitParser repository?

Cheers, Doru

On 11 Aug 2011, at 15:14, Camillo Bruni wrote:

...

Thanks for the vast answers and pointing out critical parts of the implementation.

On 2011-08-11, at 08:14, Tudor Girba wrote:

...
Hi, Camillo, thanks for taking an interest and for letting us know that you committed.

Lukas, thanks for the answer. I would also like to add one comment: omit is maybe interesting for small and quick parsers in which you have the grammar and the parser together. However, for anything significant, if you want to split the grammar and the various parsers, it all of a sudden becomes less useful because you do not want to restrict the grammar.

Yeah and that's exactly the reason why I think this is useful. Since we want to use PetitParser also for small "dirty" projects where portability is a secondary problem. And I still disagree that it becomes less useful, it is just a way of filtering your output from parser in a very simple way ;). Furthermore having the omit doesn't mean that you will have to use it. t mean there is also the #flatten message which goes somewhat into the same direction IMO, since it is there for modifying directly in the parser the data. It could be omitted as well and put into the grammar.

...
I also took a quick look at the implementation. There are simply too many ifs that add complexity to an already non-trivial code.

Indeed the implementation is not very nice, and more of a preliminary hack to see if it works, I will definitely do a second pass with lukas' suggestion for a first-class object to denote the omit property.

...
All in all, I think we should retire this add-on.

Cheers, Doru

On 11 Aug 2011, at 07:31, Lukas Renggli wrote:

...
Hi Camillo,

...
I quickly hacked (with a lot of tests) support for omit in PetitParser. This should simplify some parsing efforts as an additional filtering step could be avoided.

parser := ($" asParser omit, $" asParser negate star flatten, $" asParser omit) parser parse: '"asdfsd"' "yields directly #('asdfsd') "

vs

parser := ($" asParser, $" asParser negate star flatten, $" asParser) parser parse: '"asdfsd"'

which yields #($" 'asdfsd' $")

I don't really see use of #omit. The element is still in a collection and you need to extract it from there. What is the benefit other than it is now at index 1 instead of 2?

Also the implementation has some problems:

#omit makes it really hard to compose grammars, because parsers

suddenly get very asymmetric and start to affect each others behavior depending on how you compose them.

it shouldn't be about the behavior of the grammes, I might have not done everything correctly in my implementation, but the way the parsing rules are evaluation is not changed, its should be only affecting the output of the parsers (in this sense a special PPActionParser).

...
...

For the same reason grammar transformation becomes impossible, one

of the design goals of PetitParser.

sure, but having the feature doesn't mean you have to use it. And in my case it would simply some of the parsers I use. Actually in the end the Omit is nothing but a special PPActionParser right? (Just a curious question, how do you handle grammar transformation in this case?)

...
...

#omit makes a typical grammar (the smalltalk one) roughly 15%

slower without even using the new feature.

yep thats an implementation issue, as you mentioned addressing this with a first-class object won't affect existing grammars at all.

...
...

There are various methods (#-, #match..., #copy..., ...) that need

to be fixed when you add state to a parser.

There is no need for eager (#initialize) as well as lazy

initialization (kills the possibility for a quick invocation).

right, thats indeed an ugly property of my hack (=> first class objects will do)

...
...

Please do not remove PPParser class>>new and do not change how

initialization works, otherwise you break GemStone, VisualWorks, and GNU Smalltalk. The Seaside wiki has some nice documentation of how to correctly initialize objects (basically following the pattern of Objective-C).

ok, that was a bit hasty

...
...
I think, problems 1, 3 (!), 4, 5 and to some extent also 2 could be solved if the implementation was a bit more object-oriented. That is, if you added a PPOmitParser that would dispatch from #parseOn: to some other parsing method #parseOmittedOn: you wouldn't need all those #ifTrue:ifFalse:.

...
I pushed my solution to the petitparser repos. I didn't add full support for flatten.

Did you see #map: and #permutation:? Both allow you do the same in a less invasive way:

Of course in my example this is straight forward, but in a bit more complex parser with a couple of options or intermediate tokens you don't want it would be nicer to directly specify it on the parser nodes. And in such a sense only affect how flatten works...

...
...
parser := ($" asParser , $" asParser negate star flatten, $" asParser). parser map: [ :begin :inner :end | inner ]

or

parser := ($" asParser , $" asParser negate star flatten, $" asParser). parser permutation: #(2)

Additionally, did you see the experimental binding/capturing functionality in PetitBeta? This kind of provides something much more general and (I believe) much more powerful without the above drawbacks. Bind-Capture allows you to bind certain parse results anywhere down the tree to names and directly access them later on in a capture block. It also deals with unpacking and reshuffling collections automatically, check the tests for that.

Your example would look like:

parser := ($" asParser , ($" asParser negate star flatten bind: #inner) , $" asParser). parser capture: [ :inner | inner ]

Ah cool, this is indeed a bit nicer, didn't have at the new branch yet… will do ;)

-- www.tudorgirba.com

"Some battles are better lost than fought."

Lukas Renggli

4:10 p.m.

New subject: [Pharo-project] omit support in PetitParser

...

In any case, the current code should be reverted. While you play with another implementation, could you move the current versions from the PetitParser repository?

I will remove it. I assume that the Helvetia image doesn't work anymore, because it always uses the latest version. Maybe changes (other than simple fixes) should be posted somewhere else to give some time for discussion before an eventual merge.

Lukas

-- Lukas Renggli www.lukas-renggli.ch

Stéphane Ducasse

4:25 p.m.

New subject: [Pharo-project] omit support in PetitParser

for smalltalkhub I think that everyproject will have an inbox so that people can deal the way thye want with incoming code.

On Aug 11, 2011, at 4:10 PM, Lukas Renggli wrote:

...

...
In any case, the current code should be reverted. While you play with another implementation, could you move the current versions from the PetitParser repository?

I will remove it. I assume that the Helvetia image doesn't work anymore, because it always uses the latest version. Maybe changes (other than simple fixes) should be posted somewhere else to give some time for discussion before an eventual merge.

Lukas

Lukas Renggli www.lukas-renggli.ch

5398

Age (days ago)

5398

Last active (days ago)

moose-dev@list.inf.unibe.ch

5 comments

4 participants

tags (0)

participants (4)

Camillo Bruni
Lukas Renggli
Stéphane Ducasse
Tudor Girba