----- Original Message -----
| From: "Stéphane Ducasse" <stephane.ducasse(a)inria.fr>
| To: "Moose-related development" <moose-dev(a)iam.unibe.ch>
| Sent: Tuesday, August 20, 2013 2:30:18 PM
| Subject: [Moose-dev] Re: persisting moose models
|
| How does GS manage pointers to saved objects?
GemStone uses an object table, so every object reference in the body of an object on disk
is via an oop (unique object id) reference instead of a direct pointer.
In the vm these oop references are turned into a memory pointer on demand as an instance
variable is referenced. The vm caches persistent objects and can flush unmodified
persistent objects under memory pressure. Dirty persistent objects and non-persistent
objects are stored in a separate memory space ... the only way to flush these objects is
to do a commit ... in practice you end up with a working set of a portion of the entire
object graph with new object references faulted in on demand and old objects flushed from
memory...
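The oop-indirection-plus-faulting mechanism can be sketched in a few lines. This is a toy illustration in Python, not the GemStone implementation; all names (ObjectTable, fault_in, etc.) are invented:

```python
# Illustration only: a toy object table with on-demand faulting and
# flushing, not the real GemStone implementation. Names are invented.

class ObjectTable:
    def __init__(self, disk):
        self.disk = disk          # oop -> serialized state on "disk"
        self.cache = {}           # faulted-in (in-memory) objects
        self.dirty = set()        # modified oops; unflushable until commit

    def fault_in(self, oop):
        """Resolve an oop to an in-memory object, loading on demand."""
        if oop not in self.cache:
            self.cache[oop] = dict(self.disk[oop])  # "load" from disk
        return self.cache[oop]

    def mark_dirty(self, oop):
        self.dirty.add(oop)

    def flush_clean(self):
        """Under memory pressure, drop unmodified cached objects."""
        for oop in list(self.cache):
            if oop not in self.dirty:
                del self.cache[oop]

    def commit(self):
        """Write dirty objects back; only now can they be flushed."""
        for oop in self.dirty:
            self.disk[oop] = dict(self.cache[oop])
        self.dirty.clear()

# A persistent object's fields hold oops, not memory pointers:
disk = {1: {"name": "FAMIXClass", "package": 2},
        2: {"name": "moose-core"}}
table = ObjectTable(disk)
obj = table.fault_in(1)
pkg = table.fault_in(obj["package"])   # follow the oop reference on demand
```

The key point the sketch shows: since fields store oops rather than pointers, any clean object can be dropped from memory and re-faulted later without breaking references.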
So you can operate on a million-element collection without having the whole collection in
memory at any one time. Large objects (like a million-element collection) are broken up
into a tree of large object nodes (~2000 oops per node) ...
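A quick back-of-the-envelope on that tree layout, assuming a fanout of roughly 2000 oops per node as described (function names here are invented for illustration):

```python
import math

# Rough arithmetic for a large collection stored as a tree of nodes
# holding ~2000 oops each. Illustration only; names are invented.
FANOUT = 2000

def tree_depth(n_elements, fanout=FANOUT):
    """Levels of nodes needed to address n_elements."""
    return max(1, math.ceil(math.log(n_elements, fanout)))

def nodes_touched_per_access(n_elements, fanout=FANOUT):
    # One node per level must be faulted in to reach an element.
    return tree_depth(n_elements, fanout)

# A million-element collection fits in a 2-level tree (2000^2 = 4,000,000),
# so reading one element faults in ~2 nodes, not the whole collection.
print(tree_depth(1_000_000))   # 2
print(nodes_touched_per_access(1_000_000))
```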
| Because I could see a scenario (a la marea and fuel) where we store
| packages (kind of roots of famix graphs) into GS (but it could be
| Fuel)
| the problem is how to handle pointers that point from one of these
| packages to another one that is not loaded.
The "natural" way to store things in GemStone would be as a complete object
graph ... you've got live objects and behavior that you can leverage, so one could
run queries like the ones I've mentioned to dynamically define an object subgraph that is
then shipped to Pharo (using Fuel?) ... We can use oops to reference the "external
objects" at the boundaries of the object subgraph.
This is the area where we'd need to work out the details ...
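One way to picture the boundary idea: walk the subgraph from its roots, copy everything inside the boundary, and replace each reference that crosses the boundary with a stub carrying just the oop. This is only a sketch of the concept, not Fuel or GemStone code; export_subgraph and OopStub are invented names:

```python
# Sketch: shipping an object subgraph with oop stubs at the boundary.
# Illustration only; the real GemStone/Fuel mechanics would differ.

class OopStub:
    """Placeholder for an object left behind on the server."""
    def __init__(self, oop):
        self.oop = oop
    def __repr__(self):
        return f"<stub oop={self.oop}>"

def export_subgraph(graph, roots, in_boundary):
    """Copy objects reachable from roots; replace out-of-boundary
    references with OopStub instances. graph maps oop -> {field: oop}."""
    exported = {}
    todo = list(roots)
    while todo:
        oop = todo.pop()
        if oop in exported or not in_boundary(oop):
            continue
        copy = {}
        for field, ref in graph[oop].items():
            if in_boundary(ref):
                copy[field] = ref
                if ref not in exported:
                    todo.append(ref)
            else:
                copy[field] = OopStub(ref)  # external object at the boundary
        exported[oop] = copy
    return exported

# Two "packages": oops 1-2 belong to the exported package, 10 is external.
graph = {1: {"subclass": 2},
         2: {"superclassInOtherPackage": 10},
         10: {}}
result = export_subgraph(graph, [1], lambda oop: oop < 10)
```

On the receiving side, a stub could later be resolved against the server by its oop, which is the detail that would need working out.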
|
| Second problem: how do you prevent a query like allClasses from
| reloading everything into memory (which you cannot do, because you
| do not have enough memory; otherwise everything would already be in
| memory)?
If you have a query whose result set will be larger than memory, you make the result set
itself persistent and do intermediate commits when "memory fills up". Because we
don't use memory pointers for persistent objects, the whole result set doesn't have
to be in memory at the same time ...
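The intermediate-commit pattern looks roughly like this. A hedged sketch with a fake session object standing in for a GemStone transaction; every name here is invented:

```python
# Sketch of the "persistent result set with intermediate commits"
# pattern: accumulate results into a persistent collection and commit
# every N additions so clean objects can be flushed from memory.
# FakeSession and all other names are invented for illustration.

class FakeSession:
    """Stand-in for a database session with a commit() operation."""
    def __init__(self):
        self.commits = 0
        self.persisted = []
        self.pending = []
    def add(self, item):
        self.pending.append(item)
    def commit(self):
        self.persisted.extend(self.pending)
        self.pending = []
        self.commits += 1

def collect_with_intermediate_commits(session, items, predicate, batch=1000):
    """Append matching items to a persistent result set, committing
    every `batch` additions so memory never holds the whole set."""
    count = 0
    for item in items:
        if predicate(item):
            session.add(item)
            count += 1
            if count % batch == 0:
                session.commit()   # "memory fills up" -> make it flushable
    session.commit()               # final commit for the remainder
    return count

session = FakeSession()
n = collect_with_intermediate_commits(
        session, range(10_000), lambda x: x % 2 == 0, batch=1000)
```

After each intermediate commit the already-collected results are clean persistent objects, so the vm is free to flush them under memory pressure while the query keeps running.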
Dale