Re: [Pharo-dev] Pratt Parsers for PetitParser
by Jan Kurš
Hi,
Did you tried to inspect the PPFailure in the Moose? There is a tab with a
tree structure giving you a pretty good overview, what is going wrong... Or
as an alternative, one can call:
myParser enableDebug parse:myInput
Otherwise, PetitParser really needs some nice error reporting system. I
would have integrated one, but I am not aware of any suitable solution :(
Cheers,
Jan
On Fri, Jun 12, 2015 at 8:07 AM Francisco Garau <francisco.garau(a)gmail.com>
wrote:
>
> Hi Camile
>
> On 11 Jun 2015, at 10:47, Camille <camille.teruel(a)gmail.com> wrote:
>
>
> On 11 Jun 2015, at 09:30, Jan Kurš <kurs(a)iam.unibe.ch> wrote:
>
> That sounds really cool and useful extension.
>
> Thank you Jan!
>
> Regarding the furthest failure, the core of the problem is the distinction
> between an error and a failure. Error reports on a problem in the input,
> while failure is information for choice parser combinator. In general, the
> furthest failure is a better approximation of an error than the last
> failure, so we use it.
>
> I am not sure what exactly is the problem in case of PrattParser. I guess
> the last failure gives better results for a user?
>
> Yes, with furthest failure I get errors from tokenization instead of my
> errors.
> For exemple with the calculator grammar I gave in my first mail when I
> parse ‘1+’ the furthestFailure is ‘$- expected at 2’ whereas I return
> ‘expression expected at 2’ because there's a whole expression missing. Same
> thing with ‘(1+2’ that returns ‘digit expected at 4’ instead of ‘$)
> expected at 4’.
>
> But furthest failure gives wrong messages in other cases to.
> Consider this sequence parser:
>
> keyword := #letter asParser plus , $: asParser.
> keyword parse: ‘foo'
>
> This returns 'letter expected at 3’, but no matter how many letters I add
> to the end I’ll still get ‘letter expected’.
> I want to get what is really missing: '$: expected at 3’.
>
>
> Any of those error messages wouldn't be too bad if the failing production
> rule were also mentioned. Something like "keyword rule failed with xx
> expected at 3"
>
> But yeah, that would require annotating the production rules which can
> pollute the clarity of the grammar definition...
>
>
> Maybe returning the “latest furthest failure” instead of the “first
> furthest failure” could solves the problem here (i.e. replacing > with >=
> in PPContext>>#noteFailure:)?
>
> One has to consider a pratt parser included in the normal parser, e. g.
> Expressions parsed by pratt in a Java Grammar. Depending where an error
> occurs, different strategy for choosing the proper failure is necessary :-/
>
> Indeed, my hack (redefining #parseWithContext:) works only when the Pratt
> parser is the top parser, but a soon as I compose it I’m screwed because
> only parseOn: is sent to the Pratt parser.
> That’s why I wonder if letting the parser decide what to return wouldn’t
> solve the problem: by default the furthest failure but special parsers can
> still decide.
>
> Regarding tokenization, there is a message token, that returns
> PPTokenParser, which transforms a parsed input into the PPToken object.
> Perhaps this might be helpful?
>
> The Pratt tokens are special: a token points back to the parser that
> generated it (its “kind”).
> PPTokenKind subclasses PPFlatteningParser and generates instances of
> PPPrattToken.
> A PPTokenKind stores the precedence, the action to be executed when a
> token of this kind is met at the start of an expression (for terminals and
> prefixes) and the action to be executed when a token is met in the middle
> of an expression (for postfixes and infixes).
>
> Cheers,
> Camille
>
> Cheers Jan
>
> On Wed, Jun 10, 2015, 20:52 Richard Sargent <
> richard.sargent(a)gemtalksystems.com> wrote:
>
>> camille teruel wrote
>> >> On 10 Jun 2015, at 19:11, Chris Cunningham <
>>
>> > cunningham.cb@
>>
>> > > wrote:
>> >>
>> >> Inteteresting....
>> >>
>> >> On Wed, Jun 10, 2015 at 9:36 AM, Camille <
>>
>> > camille.teruel@
>>
>> > <mailto:
>>
>> > camille.teruel@
>>
>> > >> wrote:
>> >> Hello Pharoers and Moosers,
>> >>
>> >> I did a Pratt parser extension for PetitParser.
>> >>
>> >>
>> > <snip>
>> >
>> >> @PP Devs:
>> >> I had trouble with the PPContext furthestFailure that is taken into
>> >> account instead of the failures I return, so I had to redefine
>> >> #parseWithContext: to return the failures I want. The results given by
>> >> furthestFailure were not very meaningful in my case (the same is true
>> for
>> >> PPExpressionParser btw).
>> >> But I guess it was introduced because it gives good results in other
>> >> cases.
>> >> So would it be possible to change this behavior to let the parser
>> decide
>> >> if it returns the furthestFailure or the original failure?
>> >>
>> >> The intent behind the furthestFailure is that it give the failure that
>> >> gets the furthest into the source stream. It is most useful when there
>> >> are embedded choice operators in the parser - the original/pre furthest
>> >> behaviour would return the last failure, which depending on the
>> incoming
>> >> stream and the order of the choice options could be significantly not
>> >> useful.
>> >>
>> >> I ran into this when working with the sql parser, which started off
>> with
>> >> the outer choice of (by memory):
>> >> ^ selectStatement / insertStatement / updateStatement /
>> >> deleteStatement
>> >> If I was trying to part a select statement that had an error at the
>> very
>> >> end of the statement, the parser would return an error talking about
>> how
>> >> the incoming stream failed in deleteStatement. Not useful.
>> >>
>> >> I would be saddened if this further failure was not available.
>> >
>> > Yes in that case returning the furthest failure gives better results.
>> > However, this don’t give meaningful messages in all cases.
>> > For exemple with the calculator I gave in my previous message, if I
>> parse
>> > ‘1+’ I want to get ‘expression expected at: 2’ but instead it returns
>> ‘$-
>> > expected at 2'.
>> > I’m not proposing to remove this feature but to let parsers decide to
>> use
>> > it or not.
>> > Something like (changes in bold):
>> >
>> > PPParser>>parseWithContext: context
>> > | result |
>> > context initializeFor: self.
>> > result := self parseOn: context.
>> >
>> > "Return the furthest failure, it gives better results than the
>> last
>> > failure"
>> > (result isPetitFailure and: [ self wantsFurthestFailure and: [
>> context
>> > furthestFailure notNil ] ])
>> > ifTrue: [ ^ context furthestFailure ].
>> > ^ result
>>
>> This screams at me. Why not just delegate to the context and use a context
>> that returns the preferred failure? e.g. end with:
>> ^context preferredResultFor: result.
>>
>>
>> >
>> > PPParser>>wantsFurthestFailure
>> > ^ true
>> >
>> > Like this, one can return the failures he wants.
>> >
>> > PPPrattParser>>wantsFurthestFailure
>> > ^ false
>> >
>> >
>> > Camille
>> >
>> >>
>> >> -cbc
>>
>>
>>
>>
>>
>> --
>> View this message in context:
>> http://forum.world.st/Pratt-Parsers-for-PetitParser-tp4831456p4831486.html
>> Sent from the Pharo Smalltalk Developers mailing list archive at
>> Nabble.com.
>>
>>
>
8 years, 6 months
PetitMarkdown
by Jan Kurš
Hi All!
I have implemented a PetitMarkdown grammar for PetitParser according to the
CommonMark 0.19 specification (http://spec.commonmark.org/0.19/). The
current implementation can be loaded like this:
Gofer new
url: 'http://smalltalkhub.com/mc/JanKurs/PetitParser/main';
package: 'PetitMarkdown';
load.
The result of the parsing is AST, so I guess there is a possibility to
write a CommonMark -> Pilar AST transformation. I wrote the AST->HTML
visitor only.
CommonMark uses two pass strategy. The block elements are recognized in the
first phase and the inline elements in the second phase. My implementation
handles the first phase without any problems, there are some issues with
the priorities of inline elements and emphasis (I don't understand, why
more than 50% of the whole specification covers this). There are in total
550 example, the current implementation handles 420 of them so far. I guess
for the most users it handles more than enough,...
If anyone is interested in using it/fixing it/improving performance, I am
more than willing to help you. Some of the rules are pretty crazy, so I
guess it will take some time before we find best solutions.
Cheers,
Jan
8 years, 6 months
Re: [Pharo-dev] Pratt Parsers for PetitParser
by Jan Kurš
That sounds really cool and useful extension.
Regarding the furthest failure, the core of the problem is the distinction
between an error and a failure. Error reports on a problem in the input,
while failure is information for choice parser combinator. In general, the
furthest failure is a better approximation of an error than the last
failure, so we use it.
I am not sure what exactly is the problem in case of PrattParser. I guess
the last failure gives better results for a user? One has to consider a
pratt parser included in the normal parser, e. g. Expressions parsed by
pratt in a Java Grammar. Depending where an error occurs, different
strategy for choosing the proper failure is necessary :-/
Regarding tokenization, there is a message token, that returns
PPTokenParser, which transforms a parsed input into the PPToken object.
Perhaps this might be helpful?
Cheers Jan
On Wed, Jun 10, 2015, 20:52 Richard Sargent <
richard.sargent(a)gemtalksystems.com> wrote:
> camille teruel wrote
> >> On 10 Jun 2015, at 19:11, Chris Cunningham <
>
> > cunningham.cb@
>
> > > wrote:
> >>
> >> Inteteresting....
> >>
> >> On Wed, Jun 10, 2015 at 9:36 AM, Camille <
>
> > camille.teruel@
>
> > <mailto:
>
> > camille.teruel@
>
> > >> wrote:
> >> Hello Pharoers and Moosers,
> >>
> >> I did a Pratt parser extension for PetitParser.
> >>
> >>
> > <snip>
> >
> >> @PP Devs:
> >> I had trouble with the PPContext furthestFailure that is taken into
> >> account instead of the failures I return, so I had to redefine
> >> #parseWithContext: to return the failures I want. The results given by
> >> furthestFailure were not very meaningful in my case (the same is true
> for
> >> PPExpressionParser btw).
> >> But I guess it was introduced because it gives good results in other
> >> cases.
> >> So would it be possible to change this behavior to let the parser decide
> >> if it returns the furthestFailure or the original failure?
> >>
> >> The intent behind the furthestFailure is that it give the failure that
> >> gets the furthest into the source stream. It is most useful when there
> >> are embedded choice operators in the parser - the original/pre furthest
> >> behaviour would return the last failure, which depending on the incoming
> >> stream and the order of the choice options could be significantly not
> >> useful.
> >>
> >> I ran into this when working with the sql parser, which started off with
> >> the outer choice of (by memory):
> >> ^ selectStatement / insertStatement / updateStatement /
> >> deleteStatement
> >> If I was trying to part a select statement that had an error at the very
> >> end of the statement, the parser would return an error talking about how
> >> the incoming stream failed in deleteStatement. Not useful.
> >>
> >> I would be saddened if this further failure was not available.
> >
> > Yes in that case returning the furthest failure gives better results.
> > However, this don’t give meaningful messages in all cases.
> > For exemple with the calculator I gave in my previous message, if I parse
> > ‘1+’ I want to get ‘expression expected at: 2’ but instead it returns ‘$-
> > expected at 2'.
> > I’m not proposing to remove this feature but to let parsers decide to use
> > it or not.
> > Something like (changes in bold):
> >
> > PPParser>>parseWithContext: context
> > | result |
> > context initializeFor: self.
> > result := self parseOn: context.
> >
> > "Return the furthest failure, it gives better results than the last
> > failure"
> > (result isPetitFailure and: [ self wantsFurthestFailure and: [
> context
> > furthestFailure notNil ] ])
> > ifTrue: [ ^ context furthestFailure ].
> > ^ result
>
> This screams at me. Why not just delegate to the context and use a context
> that returns the preferred failure? e.g. end with:
> ^context preferredResultFor: result.
>
>
> >
> > PPParser>>wantsFurthestFailure
> > ^ true
> >
> > Like this, one can return the failures he wants.
> >
> > PPPrattParser>>wantsFurthestFailure
> > ^ false
> >
> >
> > Camille
> >
> >>
> >> -cbc
>
>
>
>
>
> --
> View this message in context:
> http://forum.world.st/Pratt-Parsers-for-PetitParser-tp4831456p4831486.html
> Sent from the Pharo Smalltalk Developers mailing list archive at
> Nabble.com.
>
>
8 years, 6 months
Pratt Parsers for PetitParser
by Camille
Hello Pharoers and Moosers,
I did a Pratt parser extension for PetitParser.
A Pratt parser (a.k.a top-down operator precedence parser) handles left-recursion and operator precedence.
It handles grouping, prefix, postfix, infix (right- or left-associative) and "multifix” operators (e.g. "if ... then ... else ...", "... ? ... : ...", Smalltalk keyword messages).
Normally Pratt Parsing needs a tokenization phase but here tokenization is done on the fly with other PP parsers.
Apart from tokenization, no backtracking is needed so parsing is quite fast (approximatively 2 times faster than PPExpressionParser).
Here is an exemple of a calculator:
parser := PPPrattParser new.
"Numbers"
parser terminal: #digit asParser plus do: [ :token | token inputValue asNumber ].
parser skip: #space asParser plus.
"Parentheses"
parser groupLeft: $( asParser right: $) asParser.
"Addition, substraction, multiplication, division: all left infix, * and / have higher precedence than + and -"
parser leftInfix: $+ asParser precedence: 1 do: [ :left :op :right | left + right ].
parser leftInfix: $- asParser precedence: 1 do: [ :left :op :right | left - right ].
parser leftInfix: $* asParser precedence: 2 do: [ :left :op :right | left * right ].
parser leftInfix: $/ asParser precedence: 2 do: [ :left :op :right | left / right ].
"Power: right infix with higher precedence than multiplication and division"
parser rightInfix: $^ asParser precedence: 3 do: [ :left :op :right | left raisedTo: right ].
"Unary minus: prefix with highest precedence"
parser prefix: $- asParser precedence: 4 do: [ :op :right | right negated ].
parser parse: '2*3 + 4^(1/2)*3' ----> 12
To try it:
Gofer it
smalltalkhubUser: 'CamilleTeruel' project: 'PetitPratt';
package: 'PetitPratt';
load
Note that it is in beta stage so it might still change drastically.
@PP Devs:
I had trouble with the PPContext furthestFailure that is taken into account instead of the failures I return, so I had to redefine #parseWithContext: to return the failures I want. The results given by furthestFailure were not very meaningful in my case (the same is true for PPExpressionParser btw).
But I guess it was introduced because it gives good results in other cases.
So would it be possible to change this behavior to let the parser decide if it returns the furthestFailure or the original failure?
Cheers,
Camille
8 years, 6 months
Feedbacks from Moose Course
by Anne Etien
Hi,
Last week, we made (Guillaume Larchevêque and my self) 6 hours about Moose. Guillaume was using Moose 5.0 and me the latest stable 5.1 version.
The purpose was, after an introduction to Moose to be able to analyse a real project with a reasonable size (35 to 420 FAMIXClasses) in order to evaluate its quality.
We met some issues:
- MSE file generated by VerveineJ could not be used in Version 5.0 (FileAnchor issue)
- From the MoosePanel, right clicking on a FAMIXClassesGroup to build the UML diagram:
- sometimes, the panel is completely white (since no error, or exception are raised, it is difficult to know why)
- when the UML class diagram appears, it is no zoomable and there is no way to export it. We had to hack the visualization to inspect the view in order to get access to the zooms and the exports.
- sometimes, I reorganized the figure before exportation, but in that cases, the resulting figure didn’t contain the whole diagram.
- When querying a collection of FAMIXEntity, it would be great to get an answer even if the result is empty.
Where can I put some new issues? Have we migrated from google code?
Am I the only one teaching Moose? Having these issues?
Cheers,
Anne
8 years, 6 months
FAMIXReference not managed as other associations
by Jean-Christophe Bach
Hi,
We are working on enhancing MooseChef. For that purpose, we have looked
at the FAMIXAssociation. FAMIXReference is not managed as the other: its
source and its target are respectively a ContainerEntity. This enables
to compute references between packages. However we believe that such
computation on the contained entities should be let to MooseChef and not
directly to FAMIX. Indeed, FAMIXInvocation is an association between two
behavioral entities; FAMIXAccess is an association between a behavioral
entity and a structural one.
Would you agree to modify FAMIXReference in order to be consistent with
the other associations?
Regards,
Anne & JC
8 years, 6 months
Pharo sprint & Moose dojo this Friday
by Jean-Christophe Bach
Hello Pharoers,
We propose a Pharo sprint / Moose dojo this Friday, 12th June, starting at
10:00am. (Local Time Lille).
It will be at the Inria Lille, Building B, third floor (RMoD offices).
Remotely, you can join us on the official IRC channel #pharo on
irc.freenode.net server. During the sprint, we will try to synchronize
local and remote Pharo sprinters.
As the building is not open to the public, please contact us before if
you plan to come.
JC
8 years, 6 months
why ellipse is so special?
by Natalia Tymchuk
Hello,
I noticed that RTEllipse doesn’t have the border. Why it is like that?
Best regards,
Natalia
8 years, 6 months