Here we go, the latest POHMELFS against drivers/staging DST and in-kernel async NFS.
Server hardware: 4-way Xeon (2 physical CPUs + 2 HT CPUs) server with 1 GB of RAM (it actually has 8 GB, but high-mem was disabled), SCSI disk, default XFS on a 300 GB partition.
Client hardware: 4-way Core2 Xeon with 4 GB of RAM (again, no appropriate high-mem option).
Gigabit ethernet, in-kernel async NFS. 2.6.29-rc1 kernel.
Iozone tests for POHMELFS, NFS and XFS.






As we can see, POHMELFS read and write performance is way ahead of NFS, but random read is noticeably slower.
Bonnie++ benchmark for POHMELFS, NFS and DST.

Bonnie was not able to calculate object creation/removal time for POHMELFS, since with the local writeback data cache this is very fast compared to the write-through NFS case.
So, POHMELFS operates fast, even in its basic network filesystem mode. The exception is random read performance, which is not something we can be proud of :)
But I will work on this, and likely will start with read-ahead games on the server.
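As a rough illustration of what a read-ahead experiment in a userspace server could look like, here is a minimal sketch. It is not taken from the POHMELFS server: the helper names `hint_readahead` and `serve_read` and the `window` parameter are hypothetical, and the only real API assumed is `posix_fadvise()` as exposed by Python's `os` module on Linux. Whether such a hint helps a truly random read pattern would have to be measured.

```python
import os


def hint_readahead(fd, offset, length):
    """Ask the kernel to start pulling [offset, offset + length) into the
    page cache. Returns True if the hint was issued, False otherwise."""
    if not hasattr(os, "posix_fadvise"):
        return False  # platform without posix_fadvise()
    try:
        os.posix_fadvise(fd, offset, length, os.POSIX_FADV_WILLNEED)
    except OSError:
        return False  # the hint is advisory, so a failure is not fatal
    return True


def serve_read(fd, offset, size, window=4):
    """Hypothetical server-side read handler: return the requested bytes,
    then hint that the next `window` requests' worth of data is wanted."""
    data = os.pread(fd, size, offset)
    hint_readahead(fd, offset + len(data), size * window)
    return data
```

The hint is asynchronous, so the reply to the client is not delayed by it; the next request only benefits if it falls inside the hinted range.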
In contrast, dbench will not run very well on POHMELFS currently, since its rename operation is synchronous and rather slow (it forces an inode sync to the server). After I switched to the system's dcache, there are still untested areas which I am working on, so it is not yet pushed to drivers/staging, but it will be there quite soon.
Have you considered authoring POHMELFS stubs for other platforms? (An OpenSolaris client, for example, would let me try out POHMELFS by changing only server platforms, and leaving [hundreds of] clients alone.)
- Айзея
I do not know the Solaris VFS well enough to implement a filesystem, so I do not plan to add client support for it now.
The server, in contrast, runs in userspace and thus can be easily ported to Solaris, BSD or even Windows.
It is a little ironic to me that the server is a user space application.
At any rate, Linux does not have an application container platform which is as convenient to provision and allows for such thorough utilization of hardware as Solaris zones, so I suppose we will not have the opportunity to try your software.
Good luck to you.
As you noticed, even a userspace server allows noticeably higher performance than the kernel NFS one. It is not a matter of server implementation, but of the network and cache management protocol.
Moreover, modern CPUs are so fast and IO buses are so slow (in comparison, and in growth rate) that doing IO processing in the kernel or in userspace does not really differ much. With Linux's splice() call the copy can be eliminated completely, but I first tried to write code that is as portable as possible, so it is not used so far.
Hi, DST looks very interesting. My question relates to network protocols: can DST use the ATA over Ethernet protocol to cut down on TCP/IP processing and latency, and maybe improve throughput?
Maybe a test to prove this would be nodes using a memory ramdisk as storage and checking throughput etc.,
eliminating disk access as a bottleneck; it might have better specs than SSDs.
Anyway, DST is a very good project.
Thank you.
DST uses a quite generic and simple protocol, and it could be done the way ATA-over-Ethernet works, but this heavily limits its usage to the local network only. DST can work on top of any transport layer protocol, like UDP, to eliminate TCP issues, but I have not tested it heavily.
That would take some convincing for people to use for a networked FS.
There must be some reason NFS does write-through.
Not exactly write-through, but something like that. A page may be flushed to the server at write begin/end time, and not particularly when writeback happens.
This was likely implemented to avoid implementing a cache coherency protocol.
POHMELFS uses a pure writeback cache, which is flushed to the server either because of a local system requirement or because of a cache coherency request.
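To make the difference concrete, here is a toy model of the two policies. It is only a sketch under stated assumptions: the classes `Server`, `WritebackCache` and `WriteThroughCache`, the `max_dirty` threshold standing in for a "local system requirement", and the `on_coherency_request` hook are all hypothetical names for illustration and do not reflect the actual POHMELFS or NFS code.

```python
class Server:
    """Stand-in for the remote side: stores pages and counts writes."""

    def __init__(self):
        self.pages = {}
        self.writes = 0

    def store(self, index, data):
        self.pages[index] = data
        self.writes += 1


class WritebackCache:
    """Writes stay local until local pressure or a coherency request."""

    def __init__(self, server, max_dirty=8):
        self.server = server
        self.max_dirty = max_dirty  # assumed stand-in for memory pressure
        self.pages = {}
        self.dirty = set()

    def write(self, index, data):
        self.pages[index] = data
        self.dirty.add(index)
        if len(self.dirty) > self.max_dirty:
            self.flush()  # local system requirement

    def flush(self):
        for index in sorted(self.dirty):
            self.server.store(index, self.pages[index])
        self.dirty.clear()

    def on_coherency_request(self, index):
        """Another client wants the page: push it if dirty, drop our copy."""
        if index in self.dirty:
            self.server.store(index, self.pages[index])
            self.dirty.discard(index)
        self.pages.pop(index, None)


class WriteThroughCache:
    """Every write goes to the server immediately."""

    def __init__(self, server):
        self.server = server
        self.pages = {}

    def write(self, index, data):
        self.pages[index] = data
        self.server.store(index, data)


def demo():
    wb_server, wt_server = Server(), Server()
    wb, wt = WritebackCache(wb_server), WriteThroughCache(wt_server)
    for i in range(5):  # overwrite the same page five times
        wb.write(0, b"v%d" % i)
        wt.write(0, b"v%d" % i)
    before = wb_server.writes
    wb.on_coherency_request(0)  # someone else asks for page 0
    return before, wb_server.writes, wt_server.writes


if __name__ == "__main__":
    print(demo())
```

In this toy run the writeback client sends nothing until the coherency request arrives and then sends only the final version of the page, while the write-through client sends every intermediate version. The price is that the writeback side needs some coherency protocol so that other clients can demand the data.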