2015-04-01 13:12 GMT+02:00 Nicolai Hess <nicolaihess@web.de>:
Hi Offray,

2015-03-31 18:17 GMT+02:00 Offray Vladimir Luna Cárdenas <offray@riseup.net>:
Hi,

Following the advice of Peter Uhnák on tag clouds and avatars, I made some progress on my intended visualization. If you run the code at [1] you will get something similar to [2] (the difference is that the screenshot shows the code running inside a Grafoscopio document instead of a simple playground).


[1] http://ws.stfx.eu/9G5PEGYFL1MW
[2] http://mutabit.com/deltas/repos.fossil/datapolis/doc/tip/Figures/personal-tagcloud.png

I will prioritize working on scraping and cleaning the data, leaving the position of the avatar for the end (hopefully Alexandre will read this and, in his attempt to make Roassal the best visualization engine in the universe and its users happier, will implement my suggestion at the end).

So in my attempt to clean the data I'm trying to process originalText (see [1]) and split it into single words. To do that, I copy the text, replace every occurrence of a punctuation character or parenthesis with a space, and then apply #splitOn: ' ' to the new string. I did this with the chunk of code at [3], but it seems inelegant, and trying to use cascades ending in #yourself didn't do the trick.

=[3]==========================

cookedText1 := originalText.
cookedText1 := cookedText1 copyReplaceAll: ',' with: ' '.
cookedText1 := cookedText1 copyReplaceAll: ';' with: ' '.
cookedText1 := cookedText1 copyReplaceAll: '.' with: ' '.
cookedText1 := cookedText1 copyReplaceAll: ':' with: ' '.
cookedText1 := cookedText1 copyReplaceAll: ')' with: ' '.
cookedText1 := cookedText1 copyReplaceAll: '(' with: ' '.
==============================

So here come my questions:

a) Is there a more elegant, Smalltalk-ish way to replace the code at [3], so that I get only words, no matter whether they are separated by spaces or punctuation marks, or start/end with parentheses?

Did you try RxMatcher? Probably much slower, but more flexible.
cookedText1 := (RxMatcher forString: '\w+') matchesIn: originalText.

Another way:

cookedText1 := originalText splitOn: [:x | x isLetter not].

and for removing empty, one-letter and uninteresting words:

cookedText1 := cookedText1 reject: [:k | k size < 2 or: [uninterestingWords includes: k asLowercase]].

and finally create a new space-delimited string:

cookedText2 := String streamContents: [:s | cookedText1 asStringOn: s delimiter: String space].
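
Put together, the three steps above could be tried in a playground roughly like this. It is only a sketch: the sample originalText and the contents of uninterestingWords are made up for illustration and are not the data from [1].

```smalltalk
| originalText uninterestingWords cookedText1 cookedText2 |

"Hypothetical sample input; replace with the real text from [1]."
originalText := 'Hello, world (again): the data; the words.'.

"Hypothetical stop-word list, kept in lowercase because the filter lowercases each word."
uninterestingWords := #('the' 'of' 'and').

"Split wherever a character is not a letter. Consecutive separators leave empty strings behind."
cookedText1 := originalText splitOn: [:x | x isLetter not].

"Drop empty or one-letter pieces and anything in the stop-word list."
cookedText1 := cookedText1 reject: [:k |
	k size < 2 or: [uninterestingWords includes: k asLowercase]].

"Join the remaining words with single spaces."
cookedText2 := String streamContents: [:s |
	cookedText1 asStringOn: s delimiter: String space].

cookedText2
```

Note that isLetter also splits on digits, so a token like 'web2' would be cut at the 2; the '\w+' matcher keeps digits and underscores inside words, which may or may not be what you want for a tag cloud.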


nicolai