Clustering


Toon Verwaest

5 Jul 2007

7:11 a.m.

Hi,

I am the student working with Gabriela on adapting the Koschke clustering algorithms to OO. I have implemented almost all of the basic algorithms (type-based is not really possible, and I have no idea yet whether same-expression can be adapted using FAMIX).

Most of the algorithms can be adapted quite straightforwardly, but running them fully automatically, as was originally the case, is sometimes out of the question. For that reason I have built a browsable interface around the basic steps of the clustering so that the user has more input. Its main task is to help the user wade through the mass of information that all the clustering techniques provide, as fast as possible.

Most of the algorithms aren't based on distance metrics, but a few are. I hadn't gotten around to implementing the hierarchical clustering on top of the basic steps yet, so those algorithms come in quite handy. I will check how I can incorporate them; thanks for that, Adrian.
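
For readers unfamiliar with the distance-based variant mentioned above, here is a minimal sketch of agglomerative (hierarchical) clustering. It is not the implementation discussed in this thread: the Jaccard distance, the single-linkage rule, the `threshold` parameter, and all names below are assumptions chosen for illustration.

```python
def jaccard_distance(a, b):
    """Distance between two feature sets: 0.0 if identical, 1.0 if disjoint."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def single_linkage(features, threshold):
    """Repeatedly merge the two closest clusters until the closest pair
    is farther apart than `threshold`.

    `features` maps an entity name to the set of things it references
    (a hypothetical stand-in for relationships taken from a model).
    """
    clusters = [frozenset([name]) for name in sorted(features)]

    def distance(c1, c2):
        # Single linkage: distance between the two closest members.
        return min(jaccard_distance(features[x], features[y])
                   for x in c1 for y in c2)

    while len(clusters) > 1:
        pairs = [(distance(c1, c2), i, j)
                 for i, c1 in enumerate(clusters)
                 for j, c2 in enumerate(clusters) if i < j]
        best, i, j = min(pairs)
        if best > threshold:
            break
        merged = clusters[i] | clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return set(clusters)
```

Calling `single_linkage({"A": {"x", "y"}, "B": {"x", "y", "z"}, "C": {"q"}}, 0.5)` groups `A` and `B` and leaves `C` on its own, because only `A` and `B` share most of their features.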

Further questions are welcome...

Best regards,

Toon


Stéphane Ducasse

5 Jul 2007

9:17 a.m.

Sounds excellent! Thanks for sharing that with us.

Stef

Moose-dev mailing list Moose-dev@iam.unibe.ch https://www.iam.unibe.ch/mailman/listinfo/moose-dev

Tudor Girba

10:17 a.m.

Hi,

Indeed, it sounds very nice. Perhaps you can send us some pictures you produced with Mondrian :).

As for types, they are in FAMIX, only the information is not available for Smalltalk. For Java case studies, you should have the information, but I just noticed that unfortunately iPlasma does not export type information for methods and attributes. We should fix that.

Cheers, Doru

-- www.iam.unibe.ch/~girba www.iam.unibe.ch/~girba/blog/

"Being happy is a matter of choice."

Gabriela Arevalo

3:04 p.m.

Hi guys,

> Indeed, it sounds very nice. Perhaps you can send us some pictures you produced with Mondrian :).

Ok. We will do it!

> As for types, they are in FAMIX, only the information is not available for Smalltalk. For Java case studies, you should have the information, but I just noticed that unfortunately iPlasma does not export type information for methods and attributes. We should fix that.

That can be future work ... :) with Toon or another student :). Toon has to finish soon (including his writing :) ).

My idea was just to take the Koschke algorithms, which were applied to procedural programs, and adapt them to object-oriented applications using purely OO mechanisms. From my viewpoint, using types can bring us several additional problems that will give us only noise in the beginning. That's the reason I wanted to test first on Smalltalk-based case studies. The Koschke algorithms use simple relationships between entities in a program. We want to extrapolate that to OO applications and see how adaptable they are.

I have to talk to Toon, because I am not sure how far from or close to the algorithms that Adrian implemented we are. I am sure that we are far from doing semantic clustering, because we read about it at the beginning of the master thesis.

Cheers,

Gabi

Toon Verwaest

10:35 p.m.

> Indeed, it sounds very nice. Perhaps you can send us some pictures you produced with Mondrian :).

The most spectacular ones are:

http://infogroep.be/~tverwaes/visualization.gif

which shows different clusters and the parts of the program's hierarchies represented by those clusters. Clusters have reasons for grouping elements together, and the boxes outside the box with the hierarchy represent those reasons. The bigger the reason box, the more classes it represents. The darker a class in the hierarchies, the more reasons support that class in the cluster. If a class is green, it means that a superclass as well as a subclass are present in the cluster, but not the class itself. I call these ghost classes, since they probably should be present in the cluster as well but are not clustered by the algorithms. The white boxes at the top with yellow rays are classes that are not in the cluster but are needed for the cluster to be usable (superclasses from which some classes in the cluster need to inherit). A gray ray means a cluster supports a class, while a blue ray means the whole subhierarchy is supported (a way to limit the number of rays).
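
The green-class rule described above can be sketched in a few lines. This is only an illustration, not the code behind the picture: it assumes single inheritance, it reads "superclass" and "subclass" transitively (the message does not say whether only direct relatives count), and the names `superclass_of` and `ghost_classes` are made up for the example.

```python
def ancestors(cls, superclass_of):
    """Yield the superclasses of `cls`, nearest first.

    `superclass_of` maps a class name to its direct superclass name
    (single inheritance assumed); root classes are simply absent.
    """
    seen = set()
    parent = superclass_of.get(cls)
    while parent is not None and parent not in seen:
        seen.add(parent)
        yield parent
        parent = superclass_of.get(parent)


def ghost_classes(cluster, superclass_of):
    """Return classes outside `cluster` that sit between two cluster
    members in the hierarchy (a subclass below, a superclass above)."""
    cluster = set(cluster)
    ghosts = set()
    for cls in cluster:
        skipped = []  # non-cluster classes passed while walking upward
        for ancestor in ancestors(cls, superclass_of):
            if ancestor in cluster:
                # Everything skipped so far has a clustered subclass
                # (cls) and a clustered superclass (ancestor).
                ghosts.update(skipped)
            else:
                skipped.append(ancestor)
    return ghosts
```

With a hypothetical hierarchy `Object <- Shape <- Polygon <- Square`, a cluster containing only `Shape` and `Square` yields `Polygon` as its single ghost class.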

http://infogroep.be/~tverwaes/visualization2.gif

something I am currently working on, which places clusters in the bigger picture. This specific picture clusters classes according to the namespace in which they were defined. The "best" layout available for something like this turned out to be horizontalDominance; that's the reason for the funky layout.

Greetz.
