[Moose-dev] Re: XML mapping DSL (SimpleXO)

10 Nov 2011

It seems I produced a couple of typos in my previous mail. At least this
needs to be clarified:
...
  Yes, that's the idea. #defineElement: and #defineCData: have in common
 that both create a new object. While #defineElement: uses no data from
 the node at all, #defineCData: gets the string-value and hands it over
 to the constructor.
Ciao, Steffen
On 09.11.2011 at 22:25, Steffen Märcker <merkste(a)web.de> wrote:
...
  Hi Norbert!
  Well, sort of. I think I'm starting to get it. defineElement: and
 defineCData: are meant that you get the content _from_ the element or
 cdata? So defineCData: extracts the text from the Node?
 Yes, that's the idea. #defineElement: and #defineCData: have in common
 that both create a new object. While #defineElement: uses no data from
 the node at all, #defineCData: gets the string-value and hands it over
 to the constructor. (*) The node to which a type is applied can be viewed
 as a pivot node from which further processing, namely the type's
 mappings, starts. Thus, it serves as the context node for the
 XPath expressions.
 (*) Actually, the given class name can refer to any binding. In VW this
 can be classes (of course), Shared Variables and namespaces as well.
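
 A minimal sketch of the difference between the two types, written in
 plain Python rather than with SimpleXO itself. The function names
 apply_element_type and apply_cdata_type are hypothetical and only
 illustrate the idea that one type ignores the node while the other
 passes the node's string value to the constructor:

```python
import xml.etree.ElementTree as ET

def apply_element_type(cls, node):
    # Element-style type: a new object is created, no data is taken from the node.
    return cls()

def apply_cdata_type(cls, node):
    # CData-style type: the node's string value is handed to the constructor.
    return cls("".join(node.itertext()))

node = ET.fromstring("<width>4</width>")
empty = apply_element_type(dict, node)   # the node content is ignored
width = apply_cdata_type(int, node)      # int("4")
```

 In both cases the node would still serve as the context for any further
 mappings; the sketch leaves that part out.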
  From that point of view I think that Cdata is the wrong name anyway.
 CData and PCData are mainly present in the textual form of XML.
 I see your point. =) Perhaps we can go even further and use
 #defineNode:, because a type can be applied not only to elements, but
 all kinds of xml nodes. And how about #defineStringValue: instead of
 #defineText:?
 But there is actually another type, available via #defineStruct:. It
 behaves similarly to the element type but requires that the created
 objects respond to #at:put:. The default class is Dictionary here.
 Remember the rectangle example:
 (builder defineStruct: 'Rect')
        mapPath: ('pos' /@ 'x') toType: 'Int';
        "... and so on"
 Since we use struct now, we get:
 (Dictionary new)
        at: 'x' put: 2;
        "... and so on"
 This third (and last) type proved to be very useful for rapid
 prototyping of a mapping. Later in development, the structs can be
 easily replaced by the actual domain objects.
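
 To illustrate the struct idea outside of SimpleXO, here is a small Python
 sketch. The helper apply_struct_type and its mapping format are
 assumptions made for this example only; the point is that mapped values
 are stored by key (the analogue of #at:put:) and that a dictionary is
 the default target:

```python
import xml.etree.ElementTree as ET

def apply_struct_type(node, mappings, factory=dict):
    # The target only needs item assignment, the analogue of #at:put:.
    target = factory()
    for key, (path, attr, convert) in mappings.items():
        child = node.find(path)
        target[key] = convert(child.get(attr))
    return target

rect_node = ET.fromstring('<rect><pos x="2" y="3"/></rect>')
struct = apply_struct_type(
    rect_node,
    {"x": ("pos", "x", int), "y": ("pos", "y", int)},
)
```

 Passing a different factory later is the sketch's stand-in for replacing
 the struct with a real domain object.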
  Did you consider, instead of having a constructor: setter, providing a
 block? Usually this adds a little complexity but opens a whole
 set of possible use cases.
 Are you thinking of something like a factory block that takes a node and
 produces a new object? E.g. something like
 (defineFactory: 'Hypothetical' block: [:node | node copy])
 This idea looks promising. If it takes the context node, there are
 indeed plenty of new use cases. =) Perhaps we should allow here the
 binding approach too, to avoid wrapping existing facilities in blocks,
 e.g.
 (defineFactory: 'Hypothetical' class: 'MyFactoryClass' call:
 #processNode:)
 This would make it possible to call SimpleXO parsers for specific nodes
 and thus allow a more modular design in complex situations...
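
 A rough Python sketch of the factory idea, assuming a simple registry
 keyed by type name. None of the names below (define_factory, apply_type,
 MyFactoryClass.process_node) are SimpleXO API; they only mirror the two
 hypothetical variants above, one with a block and one with a class and
 selector:

```python
import copy
import xml.etree.ElementTree as ET

factories = {}

def define_factory(name, factory):
    # Register any callable that takes the context node and returns an object.
    factories[name] = factory

def apply_type(name, node):
    return factories[name](node)

# Block variant: the factory simply copies the node.
define_factory("Hypothetical", copy.deepcopy)

class MyFactoryClass:
    @staticmethod
    def process_node(node):
        # Stand-in for an existing facility, e.g. another parser.
        return node.get("x")

# Binding variant: reuse an existing method without wrapping it in a block.
define_factory("FromClass", MyFactoryClass.process_node)

pos = ET.fromstring('<pos x="2" y="3"/>')
clone = apply_type("Hypothetical", pos)
x_value = apply_type("FromClass", pos)
```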
   [Example on ID resolving]
  That sounds interesting but I don't get the example. Can you elaborate
 on this? Or provide a more concrete example. 
 The basic idea is the following: A document may contain nodes with ids
 and other nodes that refer to them by their id. To parse this, we first
 put all elements with ids in a dictionary at the respective key. Setting
 referenced values is delayed until all nodes have been parsed. This
 allows forward references. In fact, a document may in general have
 several categories of ids, e.g. the attributes 'id' and 'domain:id'.
 Thus we want to have a separate keychain for each category. When we call
 #key: or #reference:, the argument is the name of a keychain.
 <ex>
        <list>
                <geo ref="1"/>
                <geo ref="2"/>
        </list>
        <geo id="1">
                <comment value="1"/>
                <rect>
                        <pos x="2" y="3"/>
                        <width>4</width>
                        <height>5</height>
                </rect>
        </geo>
        <geo id="2">
                <comment value="2"/>
                <rect>
                        <pos x="6" y="7"/>
                        <width>8</width>
                        <height>9</height>
                </rect>
        </geo>
        <comments>
                <comment cid="1">First Rectangle</comment>
                <comment cid="2">Second Rectangle</comment>
        </comments>
 </ex>
 Now consider:
 rect := builder defineElement: 'Rect' class: 'Rectangle'.
 rect
        mapPath: ('pos' /@ 'x') toType: 'Int';
        "... and so on"
 (rect mapPath: (ParentAxis /@ 'id') toType: 'Int')
        key: 'geo-keychain'.
 (rect mapPath: (ParentAxis / 'comment' /@ 'value') toType: 'Int')
        reference: 'comment-keychain'.
 comment := builder defineCData: 'Comment'.
 (comment mapPath: (AttributeAxis ? 'cid') toType: 'Int')
        key: 'comment-keychain'.
 doc := builder defineElement: 'doc' class: 'Set'.
 (doc mapPath: ('ex' / 'list' / 'geo' /@ 'ref') toType: 'Int')
        reference: 'geo-keychain';
        setter: #add:.
 (doc mapPath: ('ex' / 'geo' / 'rect') toType: 'Rect')
        transient.
 (doc mapPath: ('ex' / 'comments' / 'comment') toType: 'Comment')
        transient.
 Ignoring my potential typos, we get:
 (Set new)
        add: ((Rectangle new "...") comment: 'First Rectangle');
        add: ((Rectangle new "...") comment: 'Second Rectangle').
 Please note #transient in the doc's definition. This setting is used to
 parse the matched nodes but without setting the created objects in their
 parent. When configuring an id via #key:, the mapping is transient by
 default, since we rarely want to preserve the xml ids.
 Although this example is a bit bigger, I think it illustrates how
 SimpleXO manages to map a complex xml straight to a much simpler object
 tree.
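
 For readers who want to see the keychain mechanism spelled out, here is
 a self-contained Python sketch of deferred id resolution over a
 shortened version of the example document. It is not how SimpleXO is
 implemented; the functions key, reference and resolve are assumptions
 that merely echo #key: and #reference:, and a list stands in for the Set
 so that the result is easy to inspect:

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

XML = """
<ex>
  <list><geo ref="1"/><geo ref="2"/></list>
  <geo id="1"><comment value="1"/><rect><pos x="2" y="3"/></rect></geo>
  <geo id="2"><comment value="2"/><rect><pos x="6" y="7"/></rect></geo>
  <comments>
    <comment cid="1">First Rectangle</comment>
    <comment cid="2">Second Rectangle</comment>
  </comments>
</ex>
"""

keychains = defaultdict(dict)  # keychain name -> {id: object}
pending = []                   # deferred (keychain, id, setter) entries

def key(chain, ident, obj):
    keychains[chain][ident] = obj

def reference(chain, ident, setter):
    # Only record the reference; the target may not be parsed yet.
    pending.append((chain, ident, setter))

def resolve():
    # Called once after the whole document has been parsed.
    for chain, ident, setter in pending:
        setter(keychains[chain][ident])
    pending.clear()

root = ET.fromstring(XML)
doc = []

# Forward references: the list comes before the geo elements it points to.
for ref in root.findall("list/geo"):
    reference("geo-keychain", int(ref.get("ref")), doc.append)

for geo in root.findall("geo"):
    pos = geo.find("rect/pos")
    rect = {"x": int(pos.get("x")), "y": int(pos.get("y"))}
    key("geo-keychain", int(geo.get("id")), rect)
    reference(
        "comment-keychain",
        int(geo.find("comment").get("value")),
        lambda text, rect=rect: rect.__setitem__("comment", text),
    )

for comment in root.findall("comments/comment"):
    key("comment-keychain", int(comment.get("cid")), comment.text)

resolve()
```

 The ids themselves never end up in the resulting objects, which matches
 the transient default described above.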
 Hope this gives you further insights!
 Best regards,
 Steffen
 PS: Using the external DSL, the mapping can be written as follows:
 'element Rect {
        class: Rectangle
        pos/@x >> Int
 #... and so on
        ../@id >> Int (key: geo-keychain)
        ../comment/@value >> Int (ref: comment-keychain)
 }
 cdata Comment {
        @cid >> Int (key: comment-keychain)
        }
 root element Doc {
        class: Set
        ex/list/geo/@ref >> Int (ref: geo-keychain setter: #add:)
        ex/geo/rect >> Rect (transient)
        ex/comments/comment >> Comment (transient)
        }'
 _______________________________________________
 Moose-dev mailing list
 Moose-dev(a)iam.unibe.ch
 https://www.iam.unibe.ch/mailman/listinfo/moose-dev
