Documentation, automatic tests and 3.0 elliptics release
26% (38 votes)
SMTP/IMAP elliptics storage backend
7% (11 votes)
POHMELFS elliptics frontend
30% (45 votes)
SSL support and more advanced merge config resolve in elliptics network
3% (5 votes)
LISP (and eventually C) regexp and subsequent LR grammatics analyzer
3% (4 votes)
HTML validator (based on above code though)
1% (1 vote)
PAXOS library and distributed locking
18% (27 votes)
Write your own - vote or lose!
1% (2 votes)
More tight electronics projects (w1 sniffer, digital electronics, robotics)
10% (15 votes)
Total votes: 148
Jeff Garzik wrote a daemon that implements PAXOS:
http://git.kernel.org/?p=daemon/cld/cld.git;a=summary
The idea here is basically that a mere library is not good enough and it's better to hook to a daemon through an API that involves a "coarse lock" primitive. This way, your distributed service selects a master without actually running a PAXOS itself. It's not a bad idea for many circumstances, IMHO.
I played with CLD and it's not quite there yet, but it flops around at least and kinda works. Jeff works on it and accepts patches, as the changelog says.
Actually, I take it back. Looks like CLD doesn't implement anything yet, pure vapor.
CLD is now starting to work for clients, but it is not yet a distributed service itself.
We have patches to turn tabled (S3 clone, git://git.kernel.org/pub/scm/daemon/distsrv/tabled.git) into a replicated database. The process is the same for CLD, so it is a "flip a switch" type of operation.
-jgarzik
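The "coarse lock" approach to master selection discussed above can be illustrated with a small sketch. This is not CLD's API: `CoarseLockService`, `try_acquire`, `holder` and `elect_master` are hypothetical names, and the lock service here is a plain in-memory object, whereas a real one would be a networked (and ideally replicated) daemon. The only idea shown is that every node asks for the same named lock with a lease, and whoever holds it acts as master until the lease expires.

```python
import time


class CoarseLockService:
    """In-memory stand-in for a lock daemon (hypothetical, not CLD's API)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._locks = {}  # lock name -> (owner, lease expiry time)

    def try_acquire(self, name, owner, lease_seconds):
        """Grant or renew the lock if it is free, expired, or already held by owner."""
        now = self._clock()
        entry = self._locks.get(name)
        if entry is None or entry[1] <= now or entry[0] == owner:
            self._locks[name] = (owner, now + lease_seconds)
            return True
        return False

    def holder(self, name):
        """Return the current owner, or None if the lock is free or its lease expired."""
        entry = self._locks.get(name)
        if entry is None or entry[1] <= self._clock():
            return None
        return entry[0]


def elect_master(service, node_ids, lease_seconds=10.0):
    """Every node tries to take the same lock; the holder is the master."""
    for node in node_ids:
        service.try_acquire("master", node, lease_seconds)
    return service.holder("master")
```

A master that stops renewing its lease (for example because it crashed) loses the lock once the lease runs out, and the next node that calls `try_acquire` takes over. The nodes themselves never run a consensus protocol; that burden sits entirely with the lock service.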
Elliptics (and POHMELFS) HOWTO (like other tldp's HOWTO's).
and some brief explanation of how the example server works, but documentation should be extended for sure.
I vote for DST+mdadm HOWTO (or DST+mdadm+heartbeat to build an HA cluster ?)
You can setup DST node like shown in example, and then use the same mdadm commands as with usual block devices. Something like
# mdadm --create dst-raid --level=1 --chunk=128 --raid-devices=2 /dev/dst-connect-1 /dev/dst-connect-2
But what to do on failures ?
If I run that command line in one node (say A), I create an ext3 on top of that and start writing. Writes will be on the other node too (say B), great.
But if node A goes down, what should I do ? Should I run a similar mdadm command on node B, with one disk 'detached' (the one from node A) ?
And what if node A comes back up ? Would it not be a problem, because it also has the raid created ? How to handle these situations ?
What you pointed out is a problem which can not be solved on the storage layer - it just does not know who is active right now. Block device raid is a single-client solution only - it is not allowed to have multiple active nodes.
Usually this is solved on a higher layer, where there is reliable detection of the active/failed nodes, so only one of them is allowed to mount the filesystem and perform IO operations. At the storage layer itself it can not be solved.
Yeah, it makes sense :) And I was referring to a howto that shows how to do that.
Just an example of a "real" use of DST+mdadm, for example :). I mean a "complete" example, not just how to set up a dst node. And since integrating it with mdadm is an interesting case, I think a complete example of how to do that (correctly), taking care of all possible situations, might be interesting.
And btw, how does this compare to DRBD ? Have you done any comparison ? In features ? In performance/benchmarks ?
It's perfectly OK if you don't want to make such a howto or anything else. I don't want to disturb you, really. But as you never answered anything I thought perhaps you missed the comment I'm replying to.
continue with the good work, thanks alot :)
I just did not get to it yet. But you can help providing a draft or list of topics to be covered.
RAID from network block devices is useless if the devices are quite big. Synchronization may take so long that another drive is likely to fail before it completes.
1 Gbps is about the speed of a modern hard drive. And the network does not depend on access pattern, so it is pretty unlikely that the network will be the limiting factor here.
Another question is that it is a single-client solution only.
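To put the resync-time argument above into numbers, here is a rough back-of-the-envelope sketch. The 1 TB device size is an assumption picked only for illustration (the thread does not name a size), and the calculation ignores protocol overhead and how fast the disks themselves can stream data; `resync_hours` is a hypothetical helper.

```python
def resync_hours(device_bytes, link_bits_per_second, efficiency=1.0):
    """Hours needed to copy a whole device over the link at a given efficiency."""
    bytes_per_second = link_bits_per_second / 8 * efficiency
    return device_bytes / bytes_per_second / 3600


# Assumed example: 1 TB (10**12 bytes) over 1 Gbps (10**9 bits/s) at full line rate.
print(round(resync_hours(10**12, 10**9), 2))  # 2.22
```

So a full resync of such a device takes on the order of a couple of hours at line rate, and proportionally longer for bigger devices or slower effective throughput.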
In order of personal importance:
4 votes for the new elliptics release with documentation (includes new IO model) and automatic tests (write spreading is already implemented and committed into git tree)
2 votes for POHMELFS elliptics frontend
1 vote for SMTP/IMAP elliptics backend, LISP regexp and LR grammatics analyzer, PAXOS and distributed locks library
Without it, you can't implement reliable and consistent replication, which is critical for distributed storage. It's probably the hardest/funnest item in your list as well.
Completely agree, what is the use of a distributed storage if you can't trust what you see on the client. I've done some simple testing with 2.6.30-rc3 and the server from the latest git and I can create a file on one client and the other won't see it, and similar problems. That can't be a good thing.
It is strange that an object created on one client did not appear on another, it should be properly handled by the cache coherency protocol. Problems will arise when there are multiple servers and a different lock request order from those servers.
In the elliptics network there will be no locks for this kind of task at all - the transactional nature of the updates will only force the system to acquire a distributed lock when multiple nodes have to be updated in parallel when we use replication.
Forcing a cache flush on writer when readers are active will require some kind of centralized locking. By centralized I mean not only global per-cluster lock daemon(s), but maybe per-node state machine which will remember what objects were read/written from/to it.
Existing cache coherency mechanism implements the latter case, but I'm not sure it will scale well.
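The "per-node state machine" variant mentioned above can be pictured with a minimal sketch. This is not the POHMELFS server code: `CoherencyTable` and its methods are hypothetical names, and the sketch only shows the bookkeeping (who read and who wrote each object), not the network messages that would actually ask clients to flush or invalidate.

```python
from collections import defaultdict


class CoherencyTable:
    """Per-server record of which clients read or wrote each object (hypothetical)."""

    def __init__(self):
        self._readers = defaultdict(set)
        self._writers = defaultdict(set)

    def on_read(self, obj, client):
        """Register a reader; return other writers that must flush dirty data first."""
        to_flush = self._writers[obj] - {client}
        self._readers[obj].add(client)
        return sorted(to_flush)

    def on_write(self, obj, client):
        """Register a writer; return other readers whose cached copy becomes stale."""
        to_invalidate = self._readers[obj] - {client}
        self._writers[obj].add(client)
        return sorted(to_invalidate)

    def on_flushed(self, obj, client):
        """Forget a writer once its dirty data has reached the server."""
        self._writers[obj].discard(client)
```

The table grows with the number of objects times the number of clients touching them, which is one way to see why such a scheme may not scale well.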
The same problem may exist on NFS too:
NFS heavily depends on properly running statd and friends, so it effectively implements centralized non-scalable locking, which I'm trying to avoid, but the POSIX requirement forces one to implement some kind of event notification for the acquired locks.
Sorry for the late reply, but I was kind of disappointed that it didn't work, so I didn't check the site all that often anymore.
Here is the setting:
I have one server, a 2.6.28-11-generic Ubuntu, with 2 Debian-stable virtual machines with the kernel upgraded to 2.6.30-rc3.
I run fserver on the Ubuntu machine, and on the Debian-stable virtual machines I use cfg to set up a mount point and I point it at the server (obviously).
I did a really small test this time and it went wrong immediately, I'll see if I can find out if I'm running the wrong server or something. Because this was really trivial.
# ./fserver -a 172.22.0.1 -r export/
on clients:
# modprobe -v pohmelfs
# ./cfg -A show -i 2
# ./cfg -A add -i 2 172.22.0.1 -p 1025
# ./cfg -A show -i 2
# mount pohmelfs /mnt -t pohmel -v -o idx2
on one client:
# cd /mnt
# mkdir test
# cd test
# ls -lA
on the other client:
# cd /mnt
# cd test
# error: no such file or directory
on server:
# cd export
# ls -lA
empty directory
# mount pohmelfs /mnt -t pohmel -v -o idx2
You should set the idx=2 option, otherwise it will neither connect to the specified address nor create any directory there. Also
# cfg -A show -i 2
should show the specified address. Please check that the setup command completed without errors (i.e. its exit status is zero).
(ok, created an account)
When I take the latest server from git, it doesn't compile (I use 64-bit Ubuntu):
In file included from coherency.c:31:
../include/coherency.h:32: error: expected specifier-qualifier-list before ‘uint64_t’
../include/coherency.h:39: error: expected declaration specifiers or ‘...’ before ‘uint64_t’
../include/coherency.h:39: error: expected declaration specifiers or ‘...’ before ‘uint64_t’
coherency.c: In function ‘fserver_flush_writers_callback’:
coherency.c:50: error: ‘struct coherency_user’ has no member named ‘intent’
coherency.c:56: error: ‘struct coherency_user’ has no member named ‘owner’
coherency.c: In function ‘coherency_user_add’:
coherency.c:181: error: ‘struct coherency_user’ has no member named ‘ino’
coherency.c:182: error: ‘struct coherency_user’ has no member named ‘intent’
coherency.c:183: error: ‘struct coherency_user’ has no member named ‘intent’
coherency.c:184: error: ‘struct coherency_user’ has no member named ‘id’
coherency.c:196: error: ‘struct coherency_user’ has no member named ‘id’
coherency.c:197: error: ‘struct coherency_user’ has no member named ‘ino’
coherency.c:198: error: ‘struct coherency_user’ has no member named ‘intent’
coherency.c: At top level:
coherency.c:206: error: conflicting types for ‘coherency_add_object’
../include/coherency.h:39: error: previous declaration of ‘coherency_add_object’ was here
coherency.c: In function ‘coherency_broadcast_hash’:
coherency.c:294: error: ‘struct coherency_user’ has no member named ‘id’
coherency.c:296: error: ‘struct coherency_user’ has no member named ‘ino’
coherency.c:304: error: ‘struct coherency_user’ has no member named ‘id’
coherency.c:304: error: ‘struct coherency_user’ has no member named ‘intent’
make[1]: *** [coherency.o] Error 1
make[1]: Leaving directory `/home/leen/pohmelfs/pohmelfs-server.git/server'
make: *** [all-recursive] Error 1
Maybe it's just something I did (wrong). ;-)
Apparently stdint.h is missing from the coherency.h header. This patch should fix the problem, also in git.
I do think you did a good job with everything else zbr, it's just that I don't have a practical use for it in its current state.