[Moose-dev] Re: XML Parser / Pastell

5 Mar 2010


      ---- On Thu, 04 Mar 2010 12:35:39 -0800 Alexandre Bergel alexandre@bergel.eu wrote ----
...
Hi,
I exchanged a number of emails with Jaayer and Norbert regarding some 
improvements to XMLSupport and its port to Gemstone. 
It may be a bit difficult for people to follow this, but I think it is 
important not to discuss this privately.
...
...
...
...
...
I already changed
XMLTokenizer>>nextName 
.... 
^ self fastStreamStringContents: nameBuffer
to
XMLTokenizer>>nextName 
.... 
^ (self fastStreamStringContents: nameBuffer) asSymbol
in the gemstone parser to be more consistent.
Have you noticed any slowdown from this?
No, I didn't do any tests. But if internally all names are symbols, 
then I guess converting them while reading is the best way to do it.
I added benchmark1 in XMLParserTest. Really simple way to measure 
progress (or slowdown). 
On my machine, I have: 
XMLParserTest new benchmark1 
=> 2097
Adding "(self fastStreamStringContents: nameBuffer) asSymbol" 
increases the bench to 2273
I don't believe this ;) You read them as strings from the stream. If 
they are managed as symbols, they need to be converted somewhere. If 
not at this place, then at some other. I would suspect that there are 
duplicate calls to asSymbol. Could you check the sources?
There is indeed a slowdown. I am not sure where it comes from however. 
Executing "XMLParserTest new benchmark1" twice does not return the 
same result. Actually, it increases at each execution! I thought that 
a garbage collect before running the bench would help, but apparently 
it does not.
Calling asSymbol on a symbol should not be perceptible I believe.
Cheers, 
Alexandre
You should run those benchmarks longer, perhaps 600 times instead of 300, to get a more stable result. I loaded your most recent package into a clean image and got similar results to what you got, with the current non-converting version being slightly faster. However, in my development image (with all of the changes I have made since my last release), the converting version is slightly faster, and both versions are overall faster. I haven't been able to work much on the parsers and tokenizer yet, but it appears they are still largely string-based, so I am not sure if making changes like this is a good idea at this point.
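
For anyone who wants to repeat the comparison, here is a rough workspace sketch of the kind of measurement discussed above. It assumes that "XMLParserTest new benchmark1" answers the elapsed time as a number, as the figures quoted in this thread suggest; the repetition count of 5 is an arbitrary choice for illustration and is not taken from the package. The last two lines only illustrate why a second asSymbol on an existing symbol should cost next to nothing.

```smalltalk
"Workspace sketch: repeat the benchmark and look at the spread between runs."
| runs results |
runs := 5.    "assumed repetition count, not from the thread"
results := (1 to: runs) collect: [:i |
    Smalltalk garbageCollect.    "full GC before each run"
    XMLParserTest new benchmark1].
Transcript
    show: 'min: ', results min printString;
    show: ' max: ', results max printString;
    cr.

"asSymbol on a Symbol answers the receiver itself; only the String case needs a symbol-table lookup."
#name asSymbol == #name.     "true"
'name' asSymbol == #name.    "true"
```

If the minimum stays roughly constant while the maximum keeps growing from run to run, the drift is more likely to come from the image (memory growth, for example) than from the asSymbol change itself.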
