The POHMELFS kernel client, in its current state and by design, is a parallel network client to the distributed filesystem itself (called the elliptics network in the design notes). So you can think of it as a parallel NFS (but with a fast protocol), where parallel means balancing reads across multiple nodes and writing redundantly to all of them. POHMELFS heavily utilizes a local coherent data and metadata cache, so it is also very high-performance. But that's it: a simple client which is able (with the server's help, of course) to maintain a coherent cache of data and metadata on multiple clients working with the same servers.
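To make the balancing part concrete, here is a minimal sketch of how such a client could spread reads and mirror writes over its connected servers. The structures and the send_io_command() helper are hypothetical, for illustration only, and are not the actual POHMELFS internals.

/*
 * Hypothetical sketch of client-side balancing: reads are spread
 * round-robin over the connected servers, writes are mirrored to all
 * of them.  Illustrative only, not the real POHMELFS code.
 */
#include <stddef.h>
#include <sys/types.h>

struct client_node {
	int	fd;	/* connected socket to this server */
	int	alive;	/* cleared when the node fails */
};

struct client_state {
	struct client_node	*nodes;
	unsigned int		num_nodes;
	unsigned int		next;	/* round-robin read cursor */
};

/* Hypothetical transport helper: ships one IO command to a node. */
static int send_io_command(struct client_node *n, const void *buf,
			   size_t size, off_t offset);

/* Read balancing: pick the next live node in round-robin order. */
static struct client_node *pick_read_node(struct client_state *st)
{
	unsigned int i;

	for (i = 0; i < st->num_nodes; ++i) {
		struct client_node *n =
			&st->nodes[st->next++ % st->num_nodes];

		if (n->alive)
			return n;
	}
	return NULL;
}

/* Redundant writing: the same data goes to every live node. */
static int write_to_all(struct client_state *st, const void *buf,
			size_t size, off_t offset)
{
	unsigned int i;
	int err = 0;

	for (i = 0; i < st->num_nodes; ++i)
		if (st->nodes[i].alive &&
		    send_io_command(&st->nodes[i], buf, size, offset))
			err = -1;
	return err;
}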
So, right now it is effectively a way to create data servers, where the client sees only a set of identical nodes and balances operations between them when appropriate. The server itself works with a local filesystem, which can be built on top of whatever RAID combinations of disks or DST nodes you like. So effectively the storage behind a mounted POHMELFS node can be extended by growing the local filesystem on the server.
That's the current state of POHMELFS. The next step in this direction is to extend the server only (modulo some new client commands if needed, and proper statistics) and add distributed facilities. By design there will be a cloud of servers, each of which has its own ID, and data will be distributed over this cloud, which has the working title "elliptics network". The name somewhat reflects the main ID distribution algorithm. The cloud will not have any dedicated metadata servers, and every node will have exactly the same priority and usage case. The POHMELFS kernel client (and thus the usual user working with a mounted POHMELFS node) connects to an arbitrary server in the cloud and asks for the needed data; the request is transformed into the elliptics network format and forwarded to the server which hosts the data, the data is returned to the client, and the client does not even know it was stored elsewhere. This makes it possible to extend the server space indefinitely by adding new nodes, which will automatically join the network and will not require any work from the client or the administrator. Redundancy is maintained by having multiple nodes with the same ID, so the client will balance reads between them and write to them simultaneously. The node join protocol will maintain coherency between updated and old data.
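To illustrate the ID distribution idea, here is a sketch in the consistent-hashing style, where an object is hashed to an ID and routed to the first node clockwise on the ID ring. This is one common way to do such ID-based distribution, not necessarily the exact elliptics network algorithm, and the hash is a stand-in for whatever real ID transform gets used.

/*
 * Hedged sketch of ID-based distribution on a ring: each server owns
 * an ID, an object is hashed to an ID and routed to the first server
 * whose ID is >= the object's, wrapping around.  Illustrative only.
 */
#include <stdint.h>

struct cloud_node {
	uint32_t	id;	/* node's position on the ID ring */
	/* connection state would live here */
};

/* Toy FNV-1a hash standing in for the real ID transform. */
static uint32_t object_id(const char *name)
{
	uint32_t h = 2166136261u;

	while (*name) {
		h ^= (unsigned char)*name++;
		h *= 16777619u;
	}
	return h;
}

/* Route to the first node clockwise from the object's ID. */
static struct cloud_node *route(struct cloud_node *nodes,
				unsigned int num, const char *name)
{
	uint32_t id = object_id(name);
	struct cloud_node *best = NULL, *min = NULL;
	unsigned int i;

	for (i = 0; i < num; ++i) {
		if (!min || nodes[i].id < min->id)
			min = &nodes[i];
		if (nodes[i].id >= id && (!best || nodes[i].id < best->id))
			best = &nodes[i];
	}
	return best ? best : min;	/* wrap around the ring */
}

With a scheme like this a new node simply claims a position on the ring and takes over part of the ID space, which is why no client or administrator work is needed when the cloud grows.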
A proof-of-concept implementation is scheduled for the next month or so; this should be a working (but simple enough, for a start) library which can be used by other applications. Then I will integrate it with the existing POHMELFS server. These are optimistic timings though, since it depends on how many bugs will be found in all the projects I maintain :)
DST is a network block device. It has a dumb protocol which allows two machines to be connected and read/write commands to be exchanged between them (each command is effectively a pointer to where the data should be stored plus its size). So there is always one machine which exports some storage and one which connects to it. There are no locks, no protection against parallel usage, nothing. Just plain IO commands. A system can open several connections to remote nodes and thus will have multiple block devices which appear as local disks. The administrator can combine those block devices into a single one via device mapper/LVM, or mount btrfs on top of multiple nodes, and then export the result via POHMELFS to some clients or work with it locally.
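For illustration, a DST-style command on the wire could look like the structure below. The field names and widths are made up for the example and are not taken from the actual DST protocol.

/*
 * Hypothetical on-the-wire layout of a DST-style IO command: just an
 * opcode, an offset into the exported storage, and a payload size.
 * Illustrative only, not DST's real command format.
 */
#include <stdint.h>

enum dst_example_op {
	DST_EXAMPLE_READ	= 1,
	DST_EXAMPLE_WRITE	= 2,
};

struct dst_example_cmd {
	uint32_t	op;	/* DST_EXAMPLE_READ or DST_EXAMPLE_WRITE */
	uint64_t	offset;	/* where on the remote storage */
	uint64_t	size;	/* bytes that follow (write) or are
				 * requested (read) */
} __attribute__((packed));

Since there is no locking at this level, any coherency has to come from whatever is layered on top, which is exactly the split between DST and POHMELFS described above.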
I consider DST a completed project.
I did not write a more detailed feature description of POHMELFS and DST here, or of how they are used in failover cases or for data integrity; the design notes are always available from the appropriate homepages.