Weaver is the first distributed transactional graph database; it achieves strong consistency by using Hyperdex distributed storage.
There is not much information on how it is implemented, and the performance benchmarks are also a bit vague, but the authors state that Weaver is several times faster than Titan and GraphLab. On a PageRank-like benchmark (what does that mean?) Weaver is slower.
It could be very interesting to try, except that setting up Hyperdex is a pain. One has to start 3 different daemons with different configs, there are a bunch of scripts to support it, and I do not know what will happen if one of them fails.
Weaver is still in the alpha stage, but it looks interesting.
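I do not know what exactly the "PageRank-like" benchmark mentioned above runs, but my guess is an iterative computation that touches every vertex and edge on each pass, which is a very different workload from short transactional lookups. Below is a minimal sketch of such a workload in plain Python, assuming a toy adjacency dict; the function name, the graph and the parameters are my own illustration and have nothing to do with Weaver's actual benchmark code.

```python
def pagerank(graph, damping=0.85, iterations=20):
    """Power-iteration PageRank over {node: [out-neighbours]}.

    Assumes every neighbour also appears as a key in `graph`.
    """
    nodes = list(graph)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Every node starts the round with the "teleport" share.
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node, neighbours in graph.items():
            if neighbours:
                # Split this node's rank evenly between its out-links.
                share = damping * rank[node] / len(neighbours)
                for target in neighbours:
                    new_rank[target] += share
            else:
                # Dangling node: spread its rank evenly over all nodes.
                share = damping * rank[node] / n
                for target in nodes:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical toy graph: "c" collects most of the links, "d" has none.
toy_graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(toy_graph)
```

Every iteration walks the whole graph, so a store tuned for small consistent transactions would not necessarily shine here.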
Facebook is a maze for developing data stores and access algorithms. Their scale allows, and actually requires, finding new ways of storing information.
Take the social graph data, for example: it is stored in MySQL and used to sit behind an aggressive memcached cache. Now Facebook uses TAO, a graph-aware store for the social graph with its own graph cache.
Facebook uses 3 levels of persistent storage. Hot data is stored in Haystack storage; its lowest level is the prototype I used for Eblob. I would place Elliptics storage with DHT, i.e. multiple servers storing one data replica, at this level of abstraction.
Second comes the so-called WARM data: data which is accessed 2 orders of magnitude less frequently than data at the HOT level. Warm data is stored in F4 storage; Facebook describes this split and its motivation in this article.
I have seen an Elliptics installation with one disk per group, i.e. without DHT, used as storage for data at this access level.
And the last is COLD storage, something weird and scientifically crazy like a robotic hand with a stack of Blu-ray discs.
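To make the HOT / WARM / COLD split above a bit more concrete, here is a minimal sketch of how an object could be routed to a tier by its access rate. The thresholds are invented: I only kept the one relation described above, that WARM starts two orders of magnitude below HOT. The names `Tier` and `pick_tier` are hypothetical and this is not how Facebook (or Elliptics) actually decides.

```python
from enum import Enum


class Tier(Enum):
    HOT = "hot"    # Haystack-like level
    WARM = "warm"  # F4-like level
    COLD = "cold"  # Blu-ray robot level


# Hypothetical thresholds, chosen only so that WARM starts two orders
# of magnitude below HOT; these are not Facebook's real numbers.
HOT_READS_PER_DAY = 100.0
WARM_READS_PER_DAY = HOT_READS_PER_DAY / 100.0


def pick_tier(reads_per_day: float) -> Tier:
    """Map an observed access rate to a storage tier."""
    if reads_per_day >= HOT_READS_PER_DAY:
        return Tier.HOT
    if reads_per_day >= WARM_READS_PER_DAY:
        return Tier.WARM
    return Tier.COLD
```

In a real system the decision would presumably also depend on object age and size, but the access rate alone is enough to show the idea.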
Let me dilute the posts about Elliptics distributed storage (argh, this brings us money, how can we stop talking about it?) with some high-level abstract talk.
I have an interesting idea in mind to play with, and to switch my brain into a different thinking model I decided to play a bit with functional programming.
I worked with LISP some time ago but was never really in the FP world, and Common LISP has a horrible standard library: it is virtually empty, so you have to implement just about everything. CLISP in particular had a rather poor set of external libraries too.
So I decided to learn Clojure a bit for fun: it has quite a spectacular standard library, a superb repository of external projects, and so many things from the Java world. I do not like Java, but Clojure looks interesting.
And it is faster than Python.
To play with my ideas (semantic relations and so on) I believe a graph database is the best fit, so I decided to work with Neo4j and use the Borneo connector. Is there a better choice? It turned out to be a non-trivial task for a newcomer like me.
If things work out, I will definitely need the larger data sets we have in Elliptics, so who knows what we will build on top of it :)
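To show why a graph looks like the natural shape for semantic relations, here is a tiny in-memory sketch: typed edges between concepts and a traversal that follows only one relation type. It is plain Python, not Neo4j or Borneo, and the class, the method names and the sample relations are all made up for illustration.

```python
from collections import defaultdict, deque


class RelationGraph:
    """Toy directed graph with labelled edges (hypothetical helper)."""

    def __init__(self):
        # node -> list of (relation, target) pairs
        self._edges = defaultdict(list)

    def relate(self, source, relation, target):
        self._edges[source].append((relation, target))

    def neighbours(self, source, relation=None):
        # With relation=None every outgoing edge is returned.
        return [
            target
            for rel, target in self._edges.get(source, [])
            if relation is None or rel == relation
        ]

    def reachable(self, start, relation):
        """Nodes reachable from `start` by following only `relation` edges."""
        seen, queue = set(), deque([start])
        while queue:
            node = queue.popleft()
            for target in self.neighbours(node, relation):
                if target not in seen:
                    seen.add(target)
                    queue.append(target)
        return seen


# Illustrative data only.
g = RelationGraph()
g.relate("clojure", "is_a", "lisp")
g.relate("lisp", "is_a", "functional language")
g.relate("clojure", "runs_on", "jvm")
related = g.reachable("clojure", "is_a")
```

A real graph database adds persistence, indexes and a query language on top, but the data model is essentially this.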