Hi Doru,
When you think about the scalability of Moose, what scenarios do you have in mind? Up to about half a terabyte, you can work entirely in main memory on a single machine cost-effectively. The main limitation there is the lack of a 64-bit VM and image. As far as I understand the access patterns involved, a main-memory-based or distributed main-memory solution is far preferable for actually analyzing systems. What do you hope to achieve by going to disk? When we did the data conversion project, we considered partitioning over multiple images, but in the end we managed with partial loading.
Stephan