---- On Thu, 04 Mar 2010 12:35:39 -0800 Alexandre Bergel alexandre@bergel.eu wrote ----
Hi,
I exchanged a number of emails with Jaayer and Norbert regarding some improvements of XMLSupport and its port to Gemstone. It may be a bit difficult for people to follow this, but I think it is important to not discuss privately.
I already changed
XMLTokenizer>>nextName .... ^ self fastStreamStringContents: nameBuffer
to
XMLTokenizer>>nextName .... ^ (self fastStreamStringContents: nameBuffer) asSymbol
in the gemstone parser to be more consistent.
Have you noticed any slow down for this?
No I didn't do any tests. But if internally all names are symbols than I guess converting it while reading is the best way to do.
I added benchmark1 in XMLParserTest. Really simple way to measure progress (or slowdown). On my machine, I have: XMLParserTest new benchmark1 => 2097
Adding "(self fastStreamStringContents: nameBuffer) asSymbol" increase the bench to 2273
I don't believe this ;) you read them as string from the stream. If they are managed as symbols somehow they need to be converted. If not at this place then on some other. I would suspect that there are doubled calls to asSymbol. Could you check the sources?
There is indeed a slowdown. I am not sure where it comes from however. Executing twice "XMLParserTest new benchmark1" does not return the same result. Actually, it increases at each execution! I thought that a garbage collect before running the bench would help, does apparently it does not.
Calling asSymbol on a symbol should not be perceptible I believe.
Cheers, Alexandre
The most recent version is completely string-based. The performance of the symbol-based predecessors was erratic; it would be terribly slow at first in a clean image and improve only after saving the image at least once. Even then my tests indicated that a purely string-based version would be faster. This is likely due to the initial overhead of interning symbols and then the subsequent overhead of looking them up. I did honestly prefer the #symbol syntax for naming things, but considering that the things so-named and the names themselves all begin life (from the tokenizer) as strings, it makes more sense (and requires less code) to keep them that way. It is also more portable, as we no longer need assume that Symbol is a subclass of String. I don't think this will cause issues in Squeak/Pharo code, as both systems (for now) assume #name = 'name'. Hopefully it will not cause too much trouble for Norbert.
Alexandre, you should see an improvement in your benchmarks now.
anElement attributes class (I wrote species but that will fail in gemstone too I guess. Hell!)
I just committed with species. Let me know. This is easy to adjust.
Ok.
The gemstone XML Parser decides somehow to use IdentityDictionary internally. I think this should be allowed. I just changed Dictionary to IdentityDictionary so the = test reflects the right type. If you change it it is clear that it fails. Because pharo uses Dictionary internally. So you might see that it is not a question of using Dictionary or IdentityDictionary but a question of the wrongness of using =
How the XMLNodeTest should look like to accommodate your situation?
I need to recheck this. The problem is really that the XML Parser creatios instances of class Association but { #key->'value' } creates an instance of class SymbolAssociation. That means I would know how to fix the test but I want to understand the implications of all of this. I'll get to you if I know anything new.
Ok.
My mail to the gemstone list led to a ticket about removing class checks from Association. That would be easing the handling a lot.
Norbert
Alexandre
>> Here there is an assumption about the allAttributes collection >> while using = as comparsion. But there is also an assumption >> about the order of the content. I changed this to >> >> self assert: (firstPerson allAttributes includesAllOf: >> #(#'first-name' #'employee-number' #'family-name')). >> self assert: (firstPerson allAttributeAssociations asArray >> includesAllOf: {(#'first-name'->'Bob'). (#'employee-number'- >> >'A0000'). (#'family-name'->'Gates')}). > > Very right. My mistake. But wouldn't an asSortedCollection do > the thing? Do you not test the size of the array. > >> This is not the best way to do because the check is only in >> one direction but for this test it is ok. Somehow the second >> assert fails and I have to check what is going on here. > > Yeah, my mistake. Sorry. The elements may be differently > ordered. Would a asSortedCollection help? > > I have now granted you an access to the repository. You should > be able to directly commit in it. > > Jaayer, what is your Squeaksource account? > > Cheers, > Alexandre > >> >> >> >> On 28.02.2010, at 02:01, Alexandre Bergel wrote: >> >>>> thanks for now. I did a first merge attempt. It will be >>>> quite a bit of work. For me the xml parser is an important >>>> component. With the newest changes it became biased towards >>>> pharo. There are things like ClassTestCase, Unicode >>>> CharacterSet. These are for sure improvements/changes in >>>> pharo you like to use. But they make porting a lot more >>>> difficult. I would be glad if we could find some way to >>>> lower the porting barrier. The necessary class I could put >>>> in the squeak compat package in gemstone. But then the xml >>>> parser will depend on the squeak package which I don't like. >>> >>> >>> Hi Norbert, >>> >>> XMLParser effectively depends on Squeak specific classes. I >>> wrote a small script that identify the squeak classes used in >>> XML-Support. Here is the list: LanguageEnvironment, Unicode, >>> LocaleID, CharacterSet >>> >>> I guess that porting the whole multilingual support may not >>> be that easy. The tag xml:lang is used to select the proper >>> support. It should be easy for you to ignore it I guess. >>> >>> CharacterSet seems to be one that has to be ported. It is not >>> a big class. It depends on WideCharacterSet. I am not sure >>> whether this is useful in your case however. >>> >>> Cheers, >>> Alexandre >>> >>> -- >>> _,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;: >>> Alexandre Bergel http://www.bergel.eu >>> ^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;. >>> >>> >>> >>> >>> >> > > -- > _,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;: > Alexandre Bergel http://www.bergel.eu > ^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;. > > > > >
-- _,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;: Alexandre Bergel http://www.bergel.eu ^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;.
-- _,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;: Alexandre Bergel http://www.bergel.eu ^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;.
-- _,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;: Alexandre Bergel http://www.bergel.eu ^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;.