New subject: Tagcloud around avatar part 2: Advances, questions and suggestions

31 Mar 2015


Hi,
Following Peter Uhnák's advice on tag clouds and avatars, I made some
progress on my intended visualization. If you run the code at [1] you
will get something similar to [2] (the difference is that the screenshot
shows the code running inside a Grafoscopio document instead of a plain
playground).
[1] http://ws.stfx.eu/9G5PEGYFL1MW
[2] http://mutabit.com/deltas/repos.fossil/datapolis/doc/tip/Figures/personal-ta...
I will prioritize scraping and cleaning the data, leaving the
positioning of the avatar for the end (hopefully Alexandre will read
this and, in his attempt to make Roassal the best visualization engine
in the universe and its users happier, will implement the suggestion at
the end of this message).
To clean the data, I'm trying to split originalText (see [1]) into
single words. I start by copying that text, replacing every occurrence
of punctuation characters and parentheses with spaces, and then applying
#splitOn: ' ' to the new string. I did this with the chunk of code at
[3], but it seems inelegant, and trying to use cascades ending in
#yourself didn't do the trick.
=[3]==========================
cookedText1 := originalText.
cookedText1 := cookedText1 copyReplaceAll: ',' with: ' '.
cookedText1 := cookedText1 copyReplaceAll: ';' with: ' '.
cookedText1 := cookedText1 copyReplaceAll: '.' with: ' '.
cookedText1 := cookedText1 copyReplaceAll: ':' with: ' '.
cookedText1 := cookedText1 copyReplaceAll: ')' with: ' '.
cookedText1 := cookedText1 copyReplaceAll: '(' with: ' '.
==============================
So here come my questions:
a) Is there a way to replace the code at [3] with something more elegant
and Smalltalk-ish, so that I get only words no matter whether they are
separated by spaces or punctuation marks, or start/end with parentheses?
b) Why do some uninteresting words, like the Spanish 'La' or 'Se', still
find their way into the final visualization even though I try to filter
them out with the code at [4]?
=[4]==========================
(cookedText1 splitOn: ' ') do: [:word |
    ((word size > 1) & (uninterestingWords includes: word asLowercase) not)
        ifTrue: [cookedText2 := cookedText2, word, ' ']].
==============================
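For reference, here is a minimal sketch of one direction that might
cover both a) and b). It assumes that String>>substrings: accepts a
string of separator characters in this Pharo image, and that
uninterestingWords holds lowercase strings. The sample text and stop
list are made up for illustration, and the names separators, words and
interesting are hypothetical (they are not in [1]). Adding line breaks
and tabs to the separators is only a guess at the cause of b): [4]
splits on spaces alone, so a word sitting next to a line break may stay
glued to its neighbour and never match the stop list.

```smalltalk
"Hypothetical sample data standing in for the variables used at [1]."
| originalText uninterestingWords separators words interesting cookedText2 |
originalText := 'La casa (grande); se vende. Datos: abiertos, libres'.
uninterestingWords := #('la' 'se' 'de').

"Word boundaries: the punctuation replaced at [3], plus whitespace
 (space, line breaks and tabs). The whitespace part is an assumption."
separators := ' ,;.:()' , String cr , String lf , String tab.

"substrings: splits on any of the given characters and skips empty pieces."
words := originalText substrings: separators.

"Keep words longer than one character that are not in the stop list."
interesting := words select: [ :word |
    word size > 1 and: [ (uninterestingWords includes: word asLowercase) not ] ].

"Join the surviving words back into one space-separated string."
cookedText2 := ' ' join: interesting.
```

If this works as assumed, the chain of #copyReplaceAll:with: sends at [3]
would no longer be needed, since the splitting and the punctuation
removal happen in a single step.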
And my suggestion:
Please consider supporting tag clouds with variable layouts and shapes.
Python has something similar, as shown in [5].
[5] http://sebastianraschka.com/Articles/2014_twitter_wordcloud.html
I look forward to your suggestions, and thanks for keeping
Pharo/Moose awesome!
Cheers,
Offray