----- Original Message -----
| From: "Stéphane Ducasse" stephane.ducasse@inria.fr
| To: "Moose-related development" moose-dev@iam.unibe.ch
| Sent: Tuesday, August 20, 2013 2:30:18 PM
| Subject: [Moose-dev] Re: persisting moose models
|
| How does GS manage pointers to saved objects?
GemStone uses an object table, so every object reference in the body of an object on disk is via an oop (unique object id) reference instead of a direct pointer.
In the VM these oop references are turned into memory pointers on demand as instance variables are referenced. The VM caches persistent objects and can flush unmodified persistent objects under memory pressure. Dirty persistent objects and non-persistent objects are stored in a separate memory space ... the only way to flush these objects is to do a commit ... in practice you end up with a working set covering a portion of the entire object graph, with new object references faulted in on demand and old objects flushed from memory ...
So you can operate on a million-element collection without having the whole collection in memory at any one time. A large object (like a million-element collection) is broken up into a tree of large object nodes (~2000 oops per node) ...
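To make that concrete, here's a minimal sketch in GemStone Smalltalk. The faulting itself is automatic and needs no special code; `hugeCollection` and `isImportant` are placeholders for illustration, not GemStone API:

```smalltalk
"Sketch: iterate a persistent million-element collection.
 Each element is faulted into memory via its oop on first touch,
 and clean (unmodified) objects can be flushed again under memory
 pressure, so the in-memory working set stays bounded even though
 the collection itself is far larger than memory."
| count |
count := 0.
hugeCollection do: [:each |
    each isImportant ifTrue: [count := count + 1]].
```

From the programmer's point of view this is ordinary collection iteration; the object table indirection is what makes it work without loading the whole tree of large object nodes.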
| Because I could see a scenario (à la Marea and Fuel) where we store
| packages (kind of roots of FAMIX graphs) into GS (but it could be
| Fuel). The problem is how to handle pointers pointing from one of
| these packages to another one that is not loaded.
The "natural" way to store things in GemStone would be as a complete object graph ... you've got live objects and behavior that you can leverage, so one could perform queries like the ones I've mentioned to dynamically define an object subgraph that is then shipped to Pharo (using Fuel?) ... we can use oops to reference the "external objects" at the boundaries of the object subgraph.
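One way this might look, as a sketch only: every name below is hypothetical except `asOop`, which is how GemStone exposes an object's unique id, and the boundary handling is exactly the detail still to be worked out:

```smalltalk
"Sketch: export one package's FAMIX subgraph, replacing references
 that cross the package boundary with lightweight stubs that carry
 only the target's oop. The subgraph can then be serialized (e.g.
 with Fuel) and shipped to Pharo; a stub's oop is enough to fault
 the real object in from GemStone later, on demand."
| package subgraph |
package := mooseModel packageNamed: 'Kernel'.
subgraph := package entities collect: [:entity |
    entity copyReplacingExternalReferencesWith: [:ext |
        ExternalObjectStub oop: ext asOop]].
```

The design point is that the oop plays the same role across the wire that it plays on disk: a stable, pointer-free reference to an object that isn't currently in memory.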
This is the area where we'd need to work out the details ...
|
| Second problem: how do you prevent a query like allClasses from
| reloading everything into memory (which you cannot, because you do
| not have enough memory; otherwise everything would be in memory)?
If you have a query whose result set will be larger than memory, you make the result set itself persistent and do intermediate commits when memory fills up. Because we don't use memory pointers for persistent objects, the whole result set doesn't have to be in memory at the same time ...
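Concretely, that pattern might look like this. `System commitTransaction` and `UserGlobals` are real GemStone API; `allEntities` and the commit interval are placeholders:

```smalltalk
"Sketch: build a larger-than-memory result set by making it
 persistent up front and committing periodically. After each
 commit the already-committed results are clean and can be
 flushed from memory, so only the most recent slice stays resident."
| results |
results := OrderedCollection new.
UserGlobals at: #AllClassesResult put: results.  "reachable from a root -> persistent"
System commitTransaction.
allEntities do: [:each |
    each isClass ifTrue: [
        results add: each.
        results size \\ 10000 = 0 ifTrue: [System commitTransaction]]].
System commitTransaction.
```

The intermediate commits are what keep the dirty-object space (which cannot be flushed without a commit) from filling up mid-query.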
Dale