Hi Doru,
When thinking about the scalability of Moose, what scenarios do you have
in mind? Up to about half a terabyte, you can work entirely in main memory
on a single machine cost-effectively. The main limitation there is
the lack of a 64-bit VM and image. As far as I understand the access
patterns involved, a main-memory-based or distributed main-memory
solution is far preferable for actually analyzing systems. What do you
hope to achieve by going to disk? When we did the data conversion project,
we thought about partitioning over multiple images but finally managed
with partial loading.
Stephan