The badness of postponing work

Let’s consider the synchronous case, where freeing an object does not only return memory to the cache but also performs some additional state machine changes, like decrementing reference counters and so on. Postponing such work is unlikely to be a good solution: when we work on the fast path, we want the whole sequence to complete quickly, not to be split into fast and slow parts. But even pure memory freeing, i.e. returning memory to the cache, can lead to a very noticeable performance degradation when it is postponed to an RCU callback.

Thinking about better and faster skb processing for netchannels, I created a simple patch which just postpones skb freeing (kfree_skbmem(), i.e. purely releasing memory back to the skb cache) to an RCU callback invoked from __kfree_skb(). This leads to the following performance degradation (receiving small packets):

[Figure: kfree_skb() RCU speed degradation]

Receiving is about 2.5 times slower, although CPU usage is lower too. The slowdown is likely due to the increased work of the RCU tasklet and the increased number of context switches.
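The patch itself is not reproduced here. As a rough illustration of what postponing a free means for the cache, here is a minimal userspace sketch in C. It is not kernel code and not the patch: `struct buf`, `struct cache`, `struct deferred`, `deferred_free()`, `grace_period_end()` and `count_cache_misses()` are all hypothetical names made up for this model, and the "grace period" is simply assumed to end after a fixed number of deferred frees. It shows only one side effect of deferral, namely that a freed buffer cannot be reused until the callbacks run. It does not model the RCU tasklet or the context switches mentioned above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for an skb: just a link and some payload. */
struct buf {
    struct buf *next;   /* links buffers in the cache or the deferred list */
    char payload[64];
};

/* Hypothetical object cache (a toy stand-in for the skb cache). */
struct cache {
    struct buf *free_list;
    size_t misses;      /* allocations that had to fall back to malloc() */
};

/* Buffers whose freeing was postponed until the modelled grace period ends. */
struct deferred {
    struct buf *pending;
    size_t npending;
};

struct buf *cache_alloc(struct cache *c)
{
    struct buf *b = c->free_list;

    if (b) {                       /* fast path: reuse a cached buffer */
        c->free_list = b->next;
        return b;
    }
    c->misses++;                   /* slow path: cache is empty */
    b = malloc(sizeof(*b));
    assert(b != NULL);
    return b;
}

/* Immediate free: the buffer is reusable by the very next allocation. */
void cache_free(struct cache *c, struct buf *b)
{
    b->next = c->free_list;
    c->free_list = b;
}

/* Postponed free: only queue the buffer, the cache does not see it yet. */
void deferred_free(struct deferred *d, struct buf *b)
{
    b->next = d->pending;
    d->pending = b;
    d->npending++;
}

/* Modelled end of a grace period: run all postponed frees. */
void grace_period_end(struct deferred *d, struct cache *c)
{
    while (d->pending) {
        struct buf *b = d->pending;

        d->pending = b->next;
        cache_free(c, b);
    }
    d->npending = 0;
}

/*
 * Run n alloc/free cycles and return the number of cache misses.
 * batch == 0: free immediately.
 * batch  > 0: postpone frees and flush them after every `batch` of them
 *             (an assumption of this model, not real RCU behaviour).
 */
size_t count_cache_misses(size_t n, size_t batch)
{
    struct cache c = { NULL, 0 };
    struct deferred d = { NULL, 0 };
    size_t i;

    for (i = 0; i < n; i++) {
        struct buf *b = cache_alloc(&c);

        if (batch == 0) {
            cache_free(&c, b);
        } else {
            deferred_free(&d, b);
            if (d.npending == batch)
                grace_period_end(&d, &c);
        }
    }
    grace_period_end(&d, &c);

    /* Release everything back to the system before returning. */
    while (c.free_list) {
        struct buf *b = c.free_list;

        c.free_list = b->next;
        free(b);
    }
    return c.misses;
}
```

In this model, `count_cache_misses(1000, 0)` returns 1, because a single buffer is recycled over and over. `count_cache_misses(1000, 100)` returns 100, because a full batch of buffers must be allocated before any postponed free reaches the cache.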

As a conclusion: using RCU-protected lists of skbs for sockets will lead to major performance degradation.
As a second conclusion: RCU is not a good solution for workloads which are sensitive to delays.