Automatic recovery in the elliptics network

Eventually consistent systems scale very well: their performance is orders of magnitude higher than that of synchronous systems, and their design allows true horizontal scalability without added complexity and bugs.

But this comes at a price: data recovery is postponed, sometimes for quite a long period. All distributed systems provide some form of data replication, so delayed recovery is usually not a problem.
Still, having a complete set of fully recovered replicas is not only safer, it also gives better performance, since clients can read data in parallel from different replicas (in systems that provide this functionality, as elliptics does).

To close the gap between data recovery and current data requests, we implemented on-demand recovery in elliptics. It is simply a writeback into the storage: when we read an object and detect that it is missing from one or more replicas, we write the data back to those replicas.

The writeback preserves the timestamps of the records. On-demand recovery does not remove the need for regular data checks (with data recovery if needed); it only speeds up recovery for the most actively used objects.
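
The read-then-writeback idea can be illustrated with a minimal sketch. This is not the elliptics API: replicas are modeled as plain in-memory dicts, and the names `Record`, `Replica` and `read_with_recovery` are hypothetical, chosen only for this example. It also assumes that the copy with the newest timestamp is the one to return, and it only repairs replicas where the object is missing entirely.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Record:
    data: bytes
    timestamp: float  # original write time, kept unchanged during recovery


# Hypothetical stand-in for a replica: a dict from object key to Record.
Replica = Dict[str, Record]


def read_with_recovery(replicas: List[Replica], key: str) -> Optional[Record]:
    """Read `key` and write it back to every replica that lacks it."""
    found = [replica[key] for replica in replicas if key in replica]
    if not found:
        return None  # no replica holds the object, nothing to recover from

    # Assumption: the copy with the newest timestamp is the one to serve.
    latest = max(found, key=lambda record: record.timestamp)

    for replica in replicas:
        if key not in replica:
            # Store the same Record, so the original timestamp is preserved
            # instead of being replaced by the time of the recovery.
            replica[key] = latest
    return latest
```

Calling `read_with_recovery([{"k": Record(b"v", 100.0)}, {}, {}], "k")` returns the record and leaves all three dicts holding it with its original timestamp. Objects that are never read are not touched by this path, which is why a regular background check is still needed.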