Dear all,
I want to process large XML files (typically event logs with more than 80 MB of data), and I'm not able to do that at the moment with the XML DOM parser (it says I reach the read limit after 3094 XML lines). I guess I will also have problems managing such a large file in memory after that.
Should I switch to an event-driven XML parser in order to avoid loading the whole XML file into memory? Do we have such a parser for Pharo?
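For reference, an event-driven (SAX-style) parser hands each element to a handler as it is read, instead of building a tree. The following is only a minimal sketch. It assumes the XMLParser package that provides XMLDOMParser also provides its superclass SAXHandler with the `startElement:attributes:` callback. The class name XESEventCounter, the package name and the sample XML are made up for illustration, and the class-creation message may differ between Pharo versions.

```smalltalk
| handlerClass xml handler |
"Hypothetical handler class: counts <event> elements without keeping them."
handlerClass := SAXHandler
    subclass: #XESEventCounter
    instanceVariableNames: 'eventCount'
    classVariableNames: ''
    package: 'XES-Sketch'.
handlerClass compile: 'eventCount
    ^ eventCount ifNil: [ 0 ]'.
handlerClass compile: 'startElement: aQualifiedName attributes: aDictionary
    aQualifiedName = ''event''
        ifTrue: [ eventCount := self eventCount + 1 ]'.

"Small made-up stand-in for a real event log."
xml := '<log><trace><event/><event/></trace><trace><event/></trace></log>'.
handler := handlerClass on: xml.
handler parseDocument.
handler eventCount
```

Only the counter lives in memory here, so the footprint stays flat regardless of how many events the log contains.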
I encountered a similar situation before, but there is a way to go beyond that limit.
I do not have the code at my disposal right now, but look at the XMLDOMParser constructor, and at some point you will see a hardcoded limit value. You should be able to pass another one in.
Cheers, Doru
Hi,
OK, I looked a bit, and here is a way to get it to work. It is not at all ideal, given that it loads the entire string before parsing it, but it will surely work with an XML file of your size:
contents := fileReference readStreamDo: [ :stream | stream contents ].
svnlog := (XMLDOMParser on: contents)
    documentReadLimit: contents size;
    parseDocument.
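Before pointing this at the real log, the same calls can be checked on a small inline document. This is only a sketch: it uses `on:`, `documentReadLimit:` and `parseDocument` from the snippet above, plus the DOM accessors `root`, `name` and `elements`. The sample XML is made up.

```smalltalk
| xml document root |
"Made-up sample; a real file would be read as in the snippet above."
xml := '<log><event id="1"/><event id="2"/></log>'.
document := (XMLDOMParser on: xml)
    documentReadLimit: xml size;
    parseDocument.
root := document root.
root name.            "name of the top-level element"
root elements size    "number of direct child elements"
```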
Cheers, Doru
-- www.tudorgirba.com
"Every thing has its own flow"
Serge, let us know how it goes. What you are trying to do is really important.
Alexandre
Thank you Doru for the help.
First case: a 74 MB XML file. I'm able to load the string into memory, but the XML parser crashes the image before reaching the end.
Second case: a 10 MB XML file. I'm able to load the string into memory and parse the XML file, but when I start to process it in order to create an object structure, the "Space is too low" panel appears.
I would like to try the XMLPullParser. I guess it works correctly, since I found it in the Configuration Browser.
Serge, you are using a Mac, aren't you? Have you tried increasing the memory available to the image?
Open the Info.plist file, contained in the VM folder, with a text editor. From what I found on the internet, 1880000000 is apparently the largest possible value. http://forum.world.st/OSX-squeak-crash-maximal-size-of-image-td2952312.html
Cheers, Alexandre
Thank you for the trick, Alex. We should put this information somewhere, maybe in a tuning chapter of the Pharo Enterprise book? Unfortunately, even with the trick, I was unable to parse a 10 MB XML file: Pharo's memory grows up to 1.9 GB and it crashes after that.
I will now try the pull parser.
I think you are doing something wrong. Could you pass me the XML file?
Doru
You can try it in a fresh Moose 5.0 image:
==================================================================
| graph builder contents document |

Gofer it
    url: 'http://smalltalkhub.com/mc/SergeStinckwich/Moose-XES/main';
    package: 'Moose-XES';
    load.
contents := ((GZipReadStream on: ' http://data.3tu.nl/repository/uuid:f5ea9bc6-536f-4744-9c6f-9eb45a907178/DATA...' asUrl retrieveContents) upToEnd) asString.
document := (XMLDOMParser on: contents)
    documentReadLimit: contents size;
    parseDocument.
graph := (XESAlphaAlgorithm on: (XESParser new parseDocument: document)) run.
builder := RTGraphBuilder new.
builder nodes
    if: [ :m | m type = #transition ];
    shape: (RTEllipse new size: 20) + RTLabel.
builder nodes
    if: [ :m | m type = #place ];
    shape: (RTBox new size: 20).
builder edges
    seed: graph edges;
    connectFrom: #from;
    connectTo: #to;
    useInLayout.
builder layout forceCharge: -350.
builder addAll: graph nodes.
builder open.
==================================================================
The problem is not parsing the XML file but generating the object tree after that.
Thank you.
I am not sure I understand.
The issue is not the XML loading at all. Instead, it seems to be in the code that produces the model out of the DOM tree.
In this case, why would you look at the XMLPullParser?
Doru
On Wed, Jun 11, 2014 at 4:22 PM, Tudor Girba tudor@tudorgirba.com wrote:
> I am not sure I understand.
> The issue is not the XML loading at all. Instead, it seems to be in the code that produces the model out of the DOM tree.
Yes, you are right.
> In this case, why would you look at the XMLPullParser?
Saving some memory space for my model ;-)
I see. But I think your problem is of a different nature, given that you said that it went to 1.9 GB :)
Doru
Hmm, you are right, I have to think about how to reduce the footprint of my model creation ... I don't have time to wait for Spur ;-)
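One possible direction, sketched under assumptions: keep only the fields the algorithm needs and let the DOM tree be garbage-collected afterwards. The code assumes XES-style events whose name is stored in a `string` child with `key="concept:name"`. The sample XML and the variable names are illustrative and are not taken from Moose-XES.

```smalltalk
| xml document activityNames |
"Made-up sample log in an XES-like layout."
xml := '<log><trace>
  <event><string key="concept:name" value="register"/></event>
  <event><string key="concept:name" value="pay"/></event>
  <event><string key="concept:name" value="register"/></event>
</trace></log>'.
document := (XMLDOMParser on: xml)
    documentReadLimit: xml size;
    parseDocument.

"Keep one Symbol per event instead of a full element object.
 Symbols are unique, so repeated activity names share one instance."
activityNames := OrderedCollection new.
(document allElementsNamed: 'event') do: [ :event |
    event elements do: [ :child |
        (child attributeAt: 'key') = 'concept:name'
            ifTrue: [ activityNames add: (child attributeAt: 'value') asSymbol ] ] ].

"Drop the only reference to the DOM so it can be reclaimed."
document := nil.
activityNames asArray
```

Whether this helps depends on what the model actually stores per event; it only illustrates avoiding a second full object tree next to the DOM.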
I am *very* surprised by this thread, given that I have reported several times on the Pharo mailing lists that XML DOM parsing is not the right solution for processing large XML files.
XML DOM parsing is not even adequate for files larger than 10 MB. Why do some people insist on using it?
The XMLPullParser is the right way to go.
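As a rough illustration of the pull style, the loop below asks the parser for one event at a time and keeps nothing but a counter. Treat it as a sketch, not a reference: the selectors `parse:`, `next`, `isStartTag:` and `isEndDocument` are assumptions about the XMLPullParser package and should be checked against the class in your image, and the sample XML is made up.

```smalltalk
| parser event eventCount |
"Made-up sample log; the selectors used below are assumptions to verify."
parser := XMLPullParser parse: '<log><trace><event/><event/></trace></log>'.
eventCount := 0.
"Pull events one by one until the end of the document is reached."
[ (event := parser next) isEndDocument ] whileFalse: [
    (event isStartTag: 'event')
        ifTrue: [ eventCount := eventCount + 1 ] ].
eventCount
```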
Cheers,
Hernán
Hi,
I would distinguish between use cases.
XMLDOM is extremely convenient for prototyping, in particular in conjunction with GTInspector. I do not have to leave the inspection context to write code somewhere else and come back at a later time. Instead, I focus on extracting what I want. I have worked like this with XML files of ~100 MB without any problems, other than the funny default size limit that you have to override explicitly.
However, it is true that I would not recommend it for long-term usage.
Cheers, Doru
Moose-dev mailing list Moose-dev@iam.unibe.ch https://www.iam.unibe.ch/mailman/listinfo/moose-dev