Let's see in detail how to implement read balancing between multiple nodes, create IO groups, set up write-only backup nodes and manage node IO priorities.
As described yesterday, IO priority is a feature that controls the read order and the relative IO importance of a node. By 'read' I mean all operations which do not modify the content of the remote object: reading itself, directory listing, fetching (extended) attributes and so on.
So, assigning a network state a higher IO priority means we prefer to fetch data from the given server. Reading from a single server only usually puts a huge load on that machine while the others sit idle. So we can add multiple servers with the same IO priority, and the system will balance requests between the servers that share the highest priority. POHMELFS uses a round-robin algorithm among the machines with the highest IO priority. If there is at least one node with a higher priority, it will be used for all read requests.
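The selection rule described above can be illustrated with a small Python sketch. This is not POHMELFS code; the node names and the `make_read_selector` helper are made up for illustration, and the sketch only assumes what the text states: reads go round-robin to the nodes sharing the highest priority.

```python
from itertools import cycle

def make_read_selector(nodes):
    """nodes: list of (name, priority) tuples (hypothetical representation).

    Returns a function that yields the next read target, cycling only
    over the nodes that share the highest priority.
    """
    top = max(prio for _, prio in nodes)
    candidates = [name for name, prio in nodes if prio == top]
    rr = cycle(candidates)
    return lambda: next(rr)

# Two nodes share the highest priority, the third is never picked for reads.
nodes = [("node-a", 500), ("node-b", 500), ("node-c", 250)]
pick = make_read_selector(nodes)
picks = [pick() for _ in range(4)]
print(picks)  # ['node-a', 'node-b', 'node-a', 'node-b']
```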
Adding and modifying IO priorities and permissions is a rather simple task:
# cfg -A add -a devfs1 -p 1026 -i 0 -P 250 -I 3
# cfg -A add -a devfs2 -p 1026 -i 0 -P 250 -I 3
# cfg -A add -a devfs3 -p 1026 -i 0 -P 250 -I 3
...
# cfg -A modify -a devfs1 -p 1026 -i 0 -P 500 -I 3
# cfg -A modify -a devfs2 -p 1026 -i 0 -P 500 -I 3
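The commands above first add three nodes at the devfs{1,2,3} addresses with priority 250 (-P option), and then raise the priority of devfs1 and devfs2 to 500.
The -I switch provides the IO permission mask (1 - read, 2 - write; the values can be ORed, so 3 means read-write).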
So, after the above steps are completed, we have two IO 'groups': the first one has IO priority 500 and contains the devfs1 and devfs2 servers; the second one has IO priority 250 and contains the devfs3 machine.
In this case every read request will be sent to either devfs1 or devfs2. We can monitor the state of the connections via the /proc/$PID/mountstats file (addresses were replaced with names for easier reading).
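To make the notion of a 'group' concrete, here is a rough Python sketch that buckets nodes by priority. The `io_groups` helper and the tuple layout are assumptions for illustration only; POHMELFS itself has no such function.

```python
from collections import defaultdict

def io_groups(nodes):
    """Group (name, priority) pairs by priority, highest priority first."""
    groups = defaultdict(list)
    for name, prio in nodes:
        groups[prio].append(name)
    # Priorities are unique dict keys, so sorting compares integers only.
    return sorted(groups.items(), reverse=True)

groups = io_groups([("devfs1", 500), ("devfs2", 500), ("devfs3", 250)])
for prio, members in groups:
    print(prio, members)
```

The first entry in the result is the group that currently serves reads.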
# cat /proc/1/mountstats
...
device none mounted on /mnt with fstype pohmel
idx addr(:port) socket_type protocol active priority permissions
0 devfs1:1026 1 6 1 500 3
0 devfs2:1026 1 6 0 500 3
0 devfs3:1026 1 6 1 250 3
The column before the priority (500 or 250) shows whether the given connection is active (1) or broken (0). When a connection breaks, we can count the active servers with the highest priority and, if that number is less than some threshold, lower their priority. This 'moves' the group of machines with the second highest priority to the first place, and data will be read from there.
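A monitoring script along these lines might look like the following Python sketch. It parses rows in the format shown in the mountstats output above and reports which nodes of the top group should be demoted. The `min_active` threshold and both helper names are hypothetical; the actual priority change would still be done with the configuration utility.

```python
def parse_state(line):
    """Parse one row: idx addr(:port) socket_type protocol active priority permissions."""
    idx, addr, sock_type, proto, active, prio, perm = line.split()
    return {"addr": addr, "active": active == "1",
            "priority": int(prio), "perm": int(perm)}

def nodes_to_demote(states, min_active):
    """Return addresses of the top-priority group if too few of them are active."""
    top = max(s["priority"] for s in states)
    group = [s for s in states if s["priority"] == top]
    alive = sum(s["active"] for s in group)
    return [s["addr"] for s in group] if alive < min_active else []

rows = [
    "0 devfs1:1026 1 6 1 500 3",
    "0 devfs2:1026 1 6 0 500 3",
    "0 devfs3:1026 1 6 1 250 3",
]
states = [parse_state(r) for r in rows]
demote = nodes_to_demote(states, min_active=2)  # min_active is an assumed policy knob
print(demote)
```

With the sample rows only one of the two priority-500 nodes is active, so both are reported as candidates for a lower priority.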
Let's look at IO permission masks.
Only two bits are used, read and write, so it is possible to create read-write, read-only and write-only connections. While the first two are obvious, write-only may look somewhat questionable, but it is quite useful as a backup solution: the node receives all writes but is never used for read requests, which minimizes its load.
By default the configuration utility uses priority 0 and the read-write mask.
Writes are always sent to all nodes which have the write permission bit set; priority here only determines the order in which packets are sent to the nodes (servers may of course receive data in a different order if multiple network paths are used). A write transaction completes only when all nodes have acknowledged the request, or when it fails with an error after a timeout and a number of resends.
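As a quick illustration, the mask values given for the -I switch (1 for read, 2 for write) can be decoded like this in Python. The `describe` helper is only a sketch for this post, not part of any POHMELFS tool.

```python
READ, WRITE = 1, 2  # bit values as given for the -I switch

def describe(mask):
    """Turn an IO permission mask into a human-readable connection type."""
    can_read = bool(mask & READ)
    can_write = bool(mask & WRITE)
    if can_read and can_write:
        return "read-write"
    if can_read:
        return "read-only"
    if can_write:
        return "write-only"
    return "none"

for mask in (3, 1, 2):
    print(mask, describe(mask))
```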
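The write path can be modelled roughly as follows. This Python sketch assumes that a higher priority means the node is sent the packet earlier, which the text does not state explicitly, and it ignores timeouts and resends. The node names "backup" and "ro" and both helpers are hypothetical.

```python
READ, WRITE = 1, 2

def write_targets(nodes):
    """nodes: (name, priority, mask) tuples.

    Returns the names of all writable nodes, ordered by priority
    (highest first is an assumption made for this sketch).
    """
    writable = [n for n in nodes if n[2] & WRITE]
    return [n[0] for n in sorted(writable, key=lambda n: n[1], reverse=True)]

def transaction_done(targets, acked):
    """A write completes only when every target has acknowledged it."""
    return set(targets) <= set(acked)

nodes = [
    ("devfs3", 250, READ | WRITE),
    ("devfs1", 500, READ | WRITE),
    ("backup", 100, WRITE),   # write-only backup node
    ("ro", 900, READ),        # read-only node, never receives writes
]
targets = write_targets(nodes)
print(targets)
print(transaction_done(targets, ["devfs1", "devfs3"]))
```

Note that the write-only node is part of every write, while the read-only node is skipped regardless of its priority.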