Tag Archives: monitoring

React monitoring tool released

During last months we’ve been actively developing our monitoring tools and now we are ready to present you React!

React(Real-time Call Tree) is a library for measuring time consumption of different parts of your program. You can think of it as a real-time callgrind with very small overhead and user-defined call branches. Simple and minimalistic API allows you to collect per-thread traces of your system’s workflow.

Trace is represented as JSON that contains call tree description(actions) and arbitrary user annotations:

{
    "id": "271c32e9c21d156eb9f1bea57f6ae4f1b1de3b7fd9cee2d9cca7b4c242d26c31",
    "complete": true,
    "actions": [
        {
            "name": "READ",
            "start_time": 0,
            "stop_time": 1241,
            "actions": [
                {
                    "name": "FIND",
                    "start_time": 3,
                    "stop_time": 68
                },
                {
                    "name": "LOAD FROM DISK",
                    "start_time": 69,
                    "stop_time": 1241,
                    "actions": [
                        {
                            "name": "READ FROM DISK",
                            "start_time": 70,
                            "stop_time": 1130
                        },
                        {
                            "name": "PUT INTO CACHE",
                            "start_time": 1132,
                            "stop_time": 1240
                        }
                    ]
                }
            ]
        }
    ]
}

This kind of trace can be very informative, on top of it you can build complex analysis tools that on the one hand can trace specific user request and on the other hand can measure performance of some specific action in overall across all requests.

Also, for human readable representation of react traces we’ve build simple web-based visualization instrument. React trace React has already proved himself in practice. Under high load our caching system performance had degraded dramatically after cache overflow. Call tree statistics showed that during cache operations most of the time was consumed in function that was resizing cache pages. After careful examination, we optimized that function and immediately, performance significantly increased.

Now React is used in Elliptics and Eblob and we intent to use it in other Reverbrain products.

For more details check out documentation and official repository.

Elliptics monitoring spoiler alert

Here is Elliptics per-cmd monitoring

Cache/eblob command timing

Monitoring includes a very detailed statistics about the most interesting bits of the storage. Above picture shows write command execution flow (whole command and how time was spent within) – one can turn it on for own command if special trace bit has been set.

Other monitoring parts include queues (network and processing), cache, eblob (command timings, data stats, counters), VFS and other stats.
More details on how to use that data in json format is coming.