On 01.11.2011 at 18:58, Steffen Märcker wrote:
Hi,
I am currently working on an XML to object mapping library. I came up with
an early version for a project, since I had to map weird/complex/poor XML
files. I'm in contact with Stéphane Ducasse about a planned Pharo port.
He suggested asking here for your opinions and ideas on this DSL/library.
SimpleXO consists of two parts, a Smalltalk Builder API that constructs a
parser object and a non-Smalltalk syntax meant for external binding
configuration. This post is on the API.
SimpleXO uses two concepts: types and mappings. A type describes how to
construct an object. It is configured with a class, a constructor and
mappings. A mapping defines which type a set of nodes is mapped to. The
nodes are given by an expression similar to XPath.
Short Example:
<geo id="1">
<rect>
<pos x="2" y="3" />
<width>4</width>
<height>5</height>
</rect>
</geo>
"Smalltalk'ish way:"
builder := SimpleXOParserBuilder new.
(builder defineElement: 'Rect')
class: 'Rectangle'; "given as string to postpone resolving"
mapPath: ('pos' /@ 'x') toType: 'Int';
mapPath: ('pos' /@ 'y') toType: 'Int';
mapPath: ('width') toType: 'Int';
mapPath: ('height') toType: 'Int';
mapPath: (AnyNodeTest /@ 'id') toType: 'Int'.
(builder defineCData: 'Int')
class: 'Integer';
constructor: #fromString:.
parser := builder buildParser: 'rect'.
parser mapNode: xmlNode.
"this gives the following object:"
(Rectangle new)
x: 2; y: 3;
width: 4; height: 5.
Actually we see two kinds of types: element and cdata. The only difference is that
elements are constructed using a unary constructor and cdata with a one-argument message
that is sent with the string-value of the current node. Of course, SimpleXO supports id
resolving, collecting values and tokenizing attribute values. You can find further
information on
https://wiki.aleturo.com/alpha/simplexo:start
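A minimal sketch of the two kinds of types described above, written in Python with only the standard library so that it runs without SimpleXO. It is not SimpleXO's API: `Rect`, `build_cdata`, `build_element` and `RECT_MAPPINGS` are hypothetical names chosen for illustration, and plain Python functions stand in for the XPath-like expressions.

```python
import xml.etree.ElementTree as ET

class Rect:
    """Hypothetical stand-in for the Rectangle class in the example."""
    def __init__(self):
        self.x = self.y = self.width = self.height = None

def build_cdata(constructor, string_value):
    # cdata type: a one-argument constructor receives the node's string value.
    return constructor(string_value)

def build_element(cls, mappings, node):
    # element type: a no-argument constructor, then one assignment per mapping.
    obj = cls()
    for field, select, constructor in mappings:
        setattr(obj, field, build_cdata(constructor, select(node)))
    return obj

# Each mapping: (field on the object, selector returning a string, cdata constructor).
RECT_MAPPINGS = [
    ("x", lambda n: n.find("pos").get("x"), int),
    ("y", lambda n: n.find("pos").get("y"), int),
    ("width", lambda n: n.findtext("width"), int),
    ("height", lambda n: n.findtext("height"), int),
]

XML = '<geo id="1"><rect><pos x="2" y="3" /><width>4</width><height>5</height></rect></geo>'
rect = build_element(Rect, RECT_MAPPINGS, ET.fromstring(XML).find("rect"))
print(rect.x, rect.y, rect.width, rect.height)  # prints: 2 3 4 5
```

The only difference between the two builders is the one named in the text: the element builder calls the class with no arguments and then applies mappings, while the cdata builder passes the string value straight to the constructor.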
I don't know SimpleXO, but something came to mind:
I would be really careful to get the mental mapping right in the first place before mapping
the XML :) Your coarse dimensions are element and cdata. Cdata is associated with
primitive types. But then in your "width" mapping you map an element to a native
type. That only makes sense if it is just a convenient shorthand.
Basically, I would rather think in terms of the schema types SimpleType and ComplexType instead of
element and cdata, or in terms of primitive and complex/composed types. An element in XML can be
either a SimpleType or a ComplexType.
If you then map SimpleType to primitive classes (e.g. Integer) and ComplexType to complex
classes (e.g. Rectangle) then you might think that the "width" case is a
problem. Well, it isn't really a problem. What you actually map is not
width -> Int
but
width / text() -> Int
text() retrieves the text nodes from an element. And voilà, your mapping is consistent again.
To make your example above valid again, you would need to add auto coercion, meaning your
mapping is still SimpleType -> primitive and ComplexType -> complex type. So
width -> Int
=> ComplexType -> primitive
*coerce width to SimpleType by adding /text()*
=> width/text() -> Int
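The coercion step above can be sketched as follows, again in Python rather than Smalltalk so it runs on its own. This is an illustration under assumptions, not how SimpleXO does it: `PRIMITIVES`, `text_of` and `map_to_primitive` are hypothetical names, and ElementTree's `.text` is only a rough stand-in for `text()` because it returns just the text before the first child element.

```python
import xml.etree.ElementTree as ET

PRIMITIVES = {"Int": int}  # hypothetical registry of primitive type names

def text_of(element):
    """Rough stand-in for text(): the element's leading text, or '' if it has none."""
    return element.text or ""

def map_to_primitive(selected, type_name):
    """Map a selected node to a primitive, coercing an element to its text first."""
    if isinstance(selected, ET.Element):
        selected = text_of(selected)  # width  =>  width/text()
    return PRIMITIVES[type_name](selected.strip())

rect = ET.fromstring("<rect><pos x='2' y='3'/><width>4</width><height>5</height></rect>")
width = map_to_primitive(rect.find("width"), "Int")     # element, coerced to its text
x = map_to_primitive(rect.find("pos").get("x"), "Int")  # attribute value, already a string
print(width, x)  # prints: 4 2
```

With the coercion in one place, the user-facing mapping can stay `width -> Int` while the rule "simple maps to primitive" holds internally.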
If we solve the SimpleType -> SimpleType case, what about the ComplexType ->
ComplexType case? From your example I cannot tell whether all of this is supposed to support
recursion.
Now that we have fixed the mapping, we should be able to map
mapPath: ('width') toType: 'Rectangle';
Now we have the ComplexType element width mapping to the ComplexType Rectangle. All that
should be needed now is a type lookup that includes every type created via defineElement:
and defineCData:
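A shared lookup with recursion might look like the sketch below, written in Python for the sake of a runnable example. Everything here is an assumption made for illustration: the `TYPES` registry, its `kind`/`construct`/`mappings` keys, the `Geo` type and `map_node` are hypothetical, and results are plain dictionaries rather than real objects.

```python
import xml.etree.ElementTree as ET

# One registry for both kinds of definitions (all names hypothetical).
# A primitive entry holds a constructor taking a string; a complex entry
# holds a list of (field, child path, target type name) mappings.
TYPES = {
    "Int":  {"kind": "primitive", "construct": int},
    "Rect": {"kind": "complex", "mappings": [("width", "width", "Int"),
                                             ("height", "height", "Int")]},
    "Geo":  {"kind": "complex", "mappings": [("rect", "rect", "Rect")]},
}

def map_node(node, type_name):
    """Look up type_name in the shared registry and map node, recursing for complex types."""
    definition = TYPES[type_name]
    if definition["kind"] == "primitive":
        return definition["construct"]((node.text or "").strip())
    # Complex type: every mapping resolves its target through the same lookup.
    return {field: map_node(node.find(path), target)
            for field, path, target in definition["mappings"]}

geo = ET.fromstring('<geo id="1"><rect><width>4</width><height>5</height></rect></geo>')
result = map_node(geo, "Geo")
print(result)  # prints: {'rect': {'width': 4, 'height': 5}}
```

Because `Geo -> Rect` and `Rect -> Int` go through the same lookup, a complex-to-complex mapping needs no special case. The sketch does not handle missing child nodes or cyclic type definitions.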
hope this adds something,
Norbert