Things seem to be moving: there is a perfect task for this solution, which I can try to hook into this year. New Year is the deadline of all deadlines, so there is about a month and a half for the task.
The task is actually quite simple: there is a huge library of files that does not fit on a single storage machine. It is not that large for starters, about 5-10 TB of data, but the next step is to pull in close to 200 TB. The goal is to allow on-demand reads; existing files are never updated, and only new ones will be added over time. I expect millions of reads per day.
Files should be spread over multiple machines for read balancing, and there should be multiple copies of each file for redundancy. The system should handle failures transparently (storage machines will be spread over multiple data centers). The main requirement is to allow fetching files over direct links, i.e. the elliptics network provides the data location and an ordinary HTTP server serves the files.
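To make the requirements above a bit more concrete, here is a minimal Python sketch of the general idea: hash a file ID onto a ring of storage nodes, pick copies in different data centers, and turn the locations into direct HTTP links. This is not the elliptics API or its actual placement algorithm; the `Ring` class, the node names, the `example.net` hosts and the URL layout are all made up for illustration.

```python
import hashlib
from bisect import bisect_right
from urllib.parse import quote


class Ring:
    """Toy consistent-hash ring (hypothetical, not the elliptics API)."""

    def __init__(self, nodes, vnodes=64):
        # nodes: node name -> (data center, HTTP host); all values are made up
        self.nodes = nodes
        # Each node gets several points on the ring to even out the load
        self.points = sorted(
            (self._hash(f"{name}#{i}"), name)
            for name in nodes
            for i in range(vnodes)
        )
        self.keys = [point for point, _ in self.points]

    @staticmethod
    def _hash(text):
        digest = hashlib.sha256(text.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def locate(self, file_id, copies=2):
        """Return up to `copies` node names, each in a different data center."""
        start = bisect_right(self.keys, self._hash(file_id))
        chosen, seen_dcs = [], set()
        for step in range(len(self.points)):
            name = self.points[(start + step) % len(self.points)][1]
            dc = self.nodes[name][0]
            if dc not in seen_dcs:
                chosen.append(name)
                seen_dcs.add(dc)
                if len(chosen) == copies:
                    break
        return chosen

    def direct_links(self, file_id, copies=2, down=()):
        """Build direct HTTP links, skipping nodes known to be down."""
        return [
            f"http://{self.nodes[name][1]}/{quote(file_id)}"
            for name in self.locate(file_id, copies)
            if name not in down
        ]


# Hypothetical cluster: four storage machines in two data centers
NODES = {
    "s1": ("dc-a", "s1.example.net"),
    "s2": ("dc-a", "s2.example.net"),
    "s3": ("dc-b", "s3.example.net"),
    "s4": ("dc-b", "s4.example.net"),
}

ring = Ring(NODES)
print(ring.direct_links("library/book-0001.pdf"))
```

A client would try the first link and fall back to the next one if a machine or a whole data center is unavailable, which is the kind of transparent failure handling the task asks for.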
While I was writing this entry, another cool task (re)appeared: clusterize some very popular monitoring system, which to date does not scale very well to the existing number of notification writers (about 200k small writes per second per small cluster). I need to provide fault-tolerant storage that can sustain this load and allows simple horizontal scaling on demand.
Existing performance numbers show that the elliptics network can easily handle all those tasks, but some obscure numbers produced by the project author are usually not enough for those who deploy a new system. As in any other business, people are not eager to try something new. New, shiny and likely buggy...
Well, let's show what we can do. I will post the results and system setups here.
> clusterize some very popular monitoring system
Oh, so now you'll be hacking on Zabbix instead of Max? :)
Dude, I write distributed file systems and regular expression parsers in Lisp, I don't hack on Zabbix :)