this is a nice analysis.
Now I think that if different lib are needed for people then starting to build some of
them is important.
I know that Jannik also implemented and optimize tarjan and other graph algorithms for
moose so
cedric it would be probably a good start to check that and probably to make the moose
graph algo lib a separate entity. We (the moose people) also need a good graph lib.
Stef
1. Scalability
2. Data structures
3. Algorithms
4. Performance
5. Visualization
I think you need to at least consider all the above when creating a library.
In fact, you probably should break it up as many libraries do into several
components:
1. Common, useful data structures (Vertex, Edge, Adjacency List, Adjaceny
Matrix, etc.) that are optimized for your language - hopefully smalltalk
2. Computations / Measurements
3. Algorithms - layout, common graph problems, etc.
4. Persistence - database or otherwise, likely several supporting libs
specific to the persistence mechanism or implementing some kind of pattern
to be agnostic. The data structures should at least be designed in such a
way that they are persistence friendly to put in something like Gemstone,
Magma, Image, etc. See for example InfiniteGraph which has a Java interface
built on Objectivity.
5. Visualization - desktop, web, and static output.
The problem here is that creating a graph library is a huge undertaking. It
might not sound like it, but they can grow to epic proportions. From my
comments, you can see that it is really not the fault of any particular lib,
but rather the subject matter. You could spend a lifetime packing in
features. The real key is just to create a series of libs that work well
together and not constantly reinvent the wheel with node classes in each
library.
A secondary problem is doing graph analysis and even drawing graphs can take
a lot of horse power. Parallelism is a tough issue with graph libraries and
there's a multitude of approaches from pure threads to map/reduce to random
walking and stream processing. This is further compounded by persistence
where you need to start doing things like graph healing, partitioning, and
sharding to scale to massive levels and maintain performance.
I just want to mention to others that graphs are hugely useful in general to
solve a variety of problems from recommendations to meta programming. It's
not just pretty pictures and experiments with molecules. There's a lot of
potential in Smalltalk to do something since generally I feel Smalltalkers
aren't bound by or afraid of new approaches to old problems. Graph databases
in many cases could replace relational DBs for example and let your app do
all kinds of stuff you might never have thought of otherwise.
I could go on and on, but I'll stop myself here. Comments or thoughts?
Stéphane Ducasse wrote:
I've started looking at the exemples YossiDM gave to me and in particular
Lemon which was according to him his best experience. I found the model
quite clear and covering all what I expect for a generic graph lib
(directed, undirected, mapping concept, iterators, and algorithms of
course). Moreover and contrary to Boost, it's still developed in 2010.
To be more precise, here's what I expect for a generic graph lib in
smalltalk (note all in Lemon except visualization):
- data structure: directed graphs, undirected graphs, possible loop and
parallel edged, ..., trees (?)
- mapping: easily map objects, informations on nodes and/or edges (here,
don't know if I'd like subclassing nodes/eges instead...)
- iterators: efficient way to iterate over nodes and edges
- algorithms: basic algorithms implementation (bfs, dfs, ..., shortest
paths, ...), and plug-ability for specific ones...
- visualization: having an interactive graph visualization web/SVG and
eventually morphic (... graphviz, mondrian, .......)
then, I could use this for my research work...
- I need "belief" nodes with associated conditional beliefs tables
- I need to implement algorithms to propagate an information change
(change of a node state) in any nodes...
****mainly, I'd like to get junction trees from a graph [1] which are
rely useful for several domains in machine learning field ***
Actually, I don't know if I really need a graph lib as a simple
implementation directed to bayesian should be enough but it's the second
time I need graphs (last time was for planification) and I think that
would be great to have a nice and clean basic implementation.
Couldn't we start developing something similar to Lemon (regarding "API",
enitites, etc...) that would work for small scale project project in
smalltalk ?
It would be excellent.
Because now that you have a full time permanent position you can invest a
bit and in 2 years you can get something really sexy....
This is what we are doing all the time around pharo.
Yossi, what were the limitations you found with
Lemon ?
Cheers,
Cédrick
--
View this message in context:
http://forum.world.st/Graph-library-in-Smalltalk-Need-for-advices-tp3092747…
Sent from the Pharo Smalltalk mailing list archive at
Nabble.com.