In the meantime, the elliptics network got somewhat hardened node crash handling: all transactions that were not sent before a node crashed are resent to other nodes according to the updated routing table. This has only been lightly tested though, since real-life examples are a bit hard to catch.
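To make the resend step a bit more concrete, here is a toy sketch in Python. It is not the elliptics code: the ring-style `RoutingTable`, the integer node IDs and the `reroute_queued` helper are all assumptions made up for illustration. It only shows the idea of dropping a crashed node from the table and re-routing whatever was still queued for it.

```python
import bisect


class RoutingTable:
    """Toy routing table (an assumption): a key goes to the first node
    whose ID is >= the key, wrapping around to the smallest ID."""

    def __init__(self, node_ids):
        self.node_ids = sorted(node_ids)

    def route(self, key):
        if not self.node_ids:
            raise RuntimeError("no nodes left to route to")
        i = bisect.bisect_left(self.node_ids, key)
        return self.node_ids[i % len(self.node_ids)]

    def remove(self, node_id):
        self.node_ids.remove(node_id)


def reroute_queued(table, queued, failed_node):
    """Drop failed_node from the table and reassign the transactions
    that were queued for it but not sent yet.

    queued maps transaction key -> destination node ID and is updated
    in place; the returned dict holds only the entries that moved."""
    table.remove(failed_node)
    moved = {key: table.route(key)
             for key, node in queued.items() if node == failed_node}
    queued.update(moved)
    return moved


table = RoutingTable([100, 200, 300])
queued = {key: table.route(key) for key in (50, 150, 250)}
moved = reroute_queued(table, queued, failed_node=200)
print(moved)  # only key 150 was queued for node 200
```

Transactions queued for healthy nodes stay where they were; only the ones destined for the crashed node get a new destination.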
I will write a test script that writes to a node which forwards all requests to another one, and then crashes the second node. The test is expected to keep working, and all transactions should be present on one of the nodes. This is not always the case though, since a transaction can be received into the socket buffer but not yet processed by the remote node.
Ideally, the transaction tree should be scanned and those unacked transactions which were routed to the failed node should be resent. Right now there is no timed tree scanning though, and only queued requests are resent to a different node or processed locally.
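What such a scan could look like, as a minimal Python sketch under loud assumptions: a plain dict keyed by transaction ID stands in for the transaction tree, and the `Transaction` fields, `resend_unacked` and the `pick_node` callback are hypothetical names, not anything from elliptics. A real version would be driven by a timer and would actually put the data back on the wire.

```python
from dataclasses import dataclass


@dataclass
class Transaction:
    trans_id: int
    node: str        # node the transaction was routed to
    sent_at: float   # time of the last send attempt
    acked: bool = False


def resend_unacked(transactions, failed_node, pick_node, now):
    """Walk all transactions in ID order and re-route those that were
    sent to failed_node but never acknowledged.

    pick_node(trans_id) returns the new destination (hypothetical hook).
    Returns the list of transaction IDs that were resent."""
    resent = []
    for trans_id in sorted(transactions):
        t = transactions[trans_id]
        if t.acked or t.node != failed_node:
            continue
        t.node = pick_node(trans_id)
        t.sent_at = now
        resent.append(trans_id)
    return resent


transactions = {
    1: Transaction(1, "a", 0.0, acked=True),
    2: Transaction(2, "b", 0.0),              # unacked, on the failed node
    3: Transaction(3, "b", 1.0, acked=True),  # already acked, left alone
    4: Transaction(4, "a", 1.0),              # unacked, but node is alive
}
resent = resend_unacked(transactions, "b", lambda trans_id: "a", now=5.0)
print(resent)
```

Note that a transaction the failed node processed but never acked would be written twice this way, so the sketch assumes resending is harmless for the operation in question.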
Another interesting bit of newly implemented functionality is remote node statistics. It gathers load average, memory and filesystem stats for the specified node (sigh, as usual Linux and *BSD differ completely, and Solaris is not supported at all, since I only know how to get filesystem stats there, thanks to POSIX, and VM data is not updated). This can be used to detect bottlenecks and update the network configuration according to the received data.
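For the curious, the portable part of such a collector can be sketched in a few lines of Python using `os.getloadavg()` and `os.statvfs()`, both available on Unix-like systems. The `gather_node_stats` function and the dict keys are names I made up for this example, and memory stats are left out on purpose, since that is exactly the part that differs per platform.

```python
import os


def gather_node_stats(path="/"):
    """Collect load average and filesystem stats for the local node.

    path selects the filesystem to inspect. Memory (VM) stats are
    omitted because there is no portable call for them."""
    load1, load5, load15 = os.getloadavg()
    fs = os.statvfs(path)
    return {
        "loadavg": (load1, load5, load15),
        # f_frsize is the fragment size that the block counts are in
        "fs_total_bytes": fs.f_blocks * fs.f_frsize,
        "fs_avail_bytes": fs.f_bavail * fs.f_frsize,
        "fs_files_free": fs.f_ffree,
    }


stats = gather_node_stats("/")
for name, value in stats.items():
    print(name, value)
```

A node would send a structure like this back on request, and the requester could compare the numbers across nodes to spot the overloaded or nearly full ones.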
All those bits are simple enough, but the most interesting part is of course update notifications, which are scheduled next. Stay tuned!