Measures reported by RabMQChannelTest
A connection is a TCP connection between your application and the RabbitMQ broker. A channel is a virtual connection inside a connection. In other words, a channel multiplexes a TCP connection. Typically, each process only creates one TCP connection, and uses multiple channels in that connection for different threads. When you are publishing or consuming messages from a queue, it's all done over a channel.
Because unacknowledged messages consume resources, RabbitMQ lets you limit the number of unacknowledged messages a channel can hold at any point in time, using a Prefetch count configuration. Depending on the unacknowledged message traffic on your channels and the number of consumers for those messages, you may need to fine-tune this Prefetch count from time to time. By setting the global flag to true, you can also configure a Prefetch count that is shared across all consumers on a channel.
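The two scopes of the Prefetch count can be illustrated with a small toy model. The sketch below is not a RabbitMQ client and does not talk to a broker; the class `ChannelPrefetchModel`, the `fill` helper, and the limits of 15 and 10 used in the usage note are assumptions chosen purely for illustration. It only shows how a per-consumer limit and a per-channel (global) limit would each cap the number of unacknowledged messages.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ChannelPrefetchModel:
    """Toy model of prefetch limits on one channel (hypothetical, no broker involved)."""

    per_consumer_limit: Optional[int] = None  # None means unlimited
    per_channel_limit: Optional[int] = None   # shared by all consumers on the channel
    unacked: Dict[str, int] = field(default_factory=dict)

    def can_deliver(self, consumer: str) -> bool:
        """Return True if neither limit would be exceeded by one more delivery."""
        own = self.unacked.get(consumer, 0)
        total = sum(self.unacked.values())
        if self.per_consumer_limit is not None and own >= self.per_consumer_limit:
            return False
        if self.per_channel_limit is not None and total >= self.per_channel_limit:
            return False
        return True

    def deliver(self, consumer: str) -> bool:
        """Deliver one message if allowed; report whether it was delivered."""
        if not self.can_deliver(consumer):
            return False
        self.unacked[consumer] = self.unacked.get(consumer, 0) + 1
        return True

    def ack(self, consumer: str) -> None:
        """Acknowledge one outstanding message, freeing one slot."""
        if self.unacked.get(consumer, 0) > 0:
            self.unacked[consumer] -= 1


def fill(model: ChannelPrefetchModel, consumers: List[str], attempts: int = 100) -> Dict[str, int]:
    """Offer messages round-robin and return the unacknowledged count per consumer."""
    for i in range(attempts):
        model.deliver(consumers[i % len(consumers)])
    return dict(model.unacked)
```

For instance, calling `fill(ChannelPrefetchModel(per_consumer_limit=10, per_channel_limit=15), ["c1", "c2"])` leaves the two consumers holding 15 unacknowledged messages between them, with neither holding more than 10. Dropping the per-channel limit lets each consumer hold 10 on its own.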
The RabMQChannelTest test reports useful metrics on the message traffic and message prefetching on a channel. This way, the test provides administrators with effective pointers on how to tweak the Prefetch count setting and optimize RabbitMQ performance.
Outputs of the test: One set of results for each channel on every node in the target cluster.
First-level descriptor: Node name
Second-level descriptor: Channel name
The measures made by this test are as follows:
| Measurement |
Description |
Measurement Unit |
Interpretation |
| count |
Indicates the number of channels connected.
For the Summary descriptor, this measure will indicate the total number of channels connected. |
Number |
Use the detailed diagnosis of this measure to view the top-10 (by default) channels, in terms of their reduction count. |
| diffPrefetch |
Indicates the rate at which each consumer on this channel prefetched messages.
For the Summary descriptor, this measure will indicate the rate at which the consumers across all channels on all nodes prefetched messages. |
Messages/Sec |
If each consumer on a channel consumes messages at a steady rate at all times (i.e., if the value of this measure does not change much over time), this could be due to a per-consumer Prefetch setting.
Prefetch allows you to limit the number of unacknowledged messages for a channel and/or a consumer. Once the number reaches the configured count, RabbitMQ will stop delivering more messages on the channel and/or to that consumer unless at least one of the outstanding messages is acknowledged.
With the default Prefetch setting, which gives consumers an unlimited buffer, Rabbit will push all messages in a queue to a consumer as fast as the network and the consumer allow. Each consumer will balloon in memory as it buffers all the messages in its own RAM. The queue may appear empty if you ask Rabbit, but there may be millions of unacknowledged messages sitting in the consumers, waiting to be processed by the client application. If you add a new consumer, there are no messages left in the queue to be sent to it. Messages are simply buffered in the existing consumer, and may stay there for a long time, even if other consumers become available that could process them sooner. As a result, the default Prefetch setting can lead to poor performance. The goal is to keep the consumers saturated with work, but to minimise the client's buffer size so that more messages stay in Rabbit's queue and are thus available for new consumers, or can be sent out to consumers as they become free.
Let's say it takes 50ms for Rabbit to take a message from a queue, put it on the network and for it to arrive at the consumer. It takes 4ms for the client to process the message. Once the consumer has processed the message, it sends an ack back to Rabbit, which takes a further 50ms to be sent to and processed by Rabbit. So we have a total round trip time of 104ms. If we have a prefetch setting of 1 message then Rabbit will not send out the next message until after this round trip completes. Thus the client will be busy for only 4ms of every 104ms, or 3.8% of the time. The goal is to keep the client busy 100% of the time.
Here are some guidelines for setting the correct prefetch value:
If you have a single consumer or only a few consumers processing messages quickly, we recommend prefetching many messages at once. Try to keep your client as busy as possible. If the processing time is roughly constant and network behavior remains the same, you can divide the total round trip time by the processing time on the client for each message to get an estimated prefetch value. With the figures above, 104ms / 4ms gives a prefetch value of 26.
If you have many consumers and a short processing time, we recommend a lower prefetch value than for a single consumer or a few consumers. A value that is too low will leave the consumers idle much of the time, since they need to wait for messages to arrive. A value that is too high may keep one consumer busy while the other consumers sit idle.
If you have many consumers and/or a long processing time, we recommend setting the prefetch count to 1 so that messages are evenly distributed among all your workers.
Please note that if your client auto-acks messages, the prefetch value will have no effect.
|
| diffGlobalPrefetch |
Indicates the rate at which this channel prefetched messages across all its consumers.
For the Summary descriptor, this measure will indicate the rate at which messages were prefetched across all channels and nodes. |
Messages/Sec |
If this measure reports a non-zero value, it could indicate that the global Prefetch count configuration is active.
By default, the Prefetch count configuration applies to each new consumer on a channel. If required, you can set the global flag in the basic.qos method to true, so that the Prefetch count configuration applies per channel (i.e., across all consumers of a channel).
For instance, say that there are two consumers of a channel, with a global prefetch count of 15. In this case, both these consumers will only ever have 15 unacknowledged messages between them.
Note that a per-channel Prefetch count and a per-consumer Prefetch count can even co-exist. For instance, say that there are two consumers of a channel. Also, assume that the per-channel Prefetch count is 15 and the per-consumer Prefetch count is 10. In this case, the two consumers will only ever have 15 unacknowledged messages between them, with a maximum of 10 messages for each consumer. This will be slower than the above example, due to the additional overhead of coordinating between the channel and the queues to enforce the global limit.
A global (per-channel) Prefetch setting is not always ideal, because a single channel may consume from multiple queues, which requires coordination between the channel and the queue(s) for every message sent to ensure that the limit is not exceeded. This coordination can be slow on a single machine, and very slow when consuming across a cluster. With a per-consumer setting, there is no need for this coordination. |
| diffMsgUnacknowledged |
Indicates the rate of unacknowledged messages on this channel.
For the Summary descriptor, this measure will indicate the rate of unacknowledged messages across all channels and nodes. |
Messages/Sec |
Ideally, the channel should hold very few unacknowledged messages, as such messages are resource-hungry.
If the value of this measure keeps varying over time, it could imply that the default Prefetch count setting is at play. In this case, use the guidelines discussed in the Interpretation column of the diffPrefetch and diffGlobalPrefetch measures to fine-tune the Prefetch count configuration and limit the unacknowledged messages on a channel. |
| diffMsgUnconfirmed |
Indicates the rate of unconfirmed messages on this channel.
For the Summary descriptor, this measure will indicate the rate of unconfirmed messages across all channels and nodes. |
Messages/Sec |
Unconfirmed messages are published messages for which the broker has not yet sent a receipt confirmation to the producer. |
| diffMsgUncommitted |
Indicates the rate of uncommitted messages on this channel.
For the Summary descriptor, this measure will indicate the rate of uncommitted messages across all channels and nodes. |
Messages/Sec |
Uncommitted messages are those received on the channel in transactions that are not yet committed. |
| diffAcksUncommitted |
Indicates the rate of uncommitted acknowledgements on this channel.
For the Summary descriptor, this measure will indicate the rate of uncommitted acknowledgements across all channels and nodes. |
Messages/Sec |
Uncommitted acknowledgements are acknowledgements that are received by the node for transactions that are not yet committed. |
| diffReductions |
Indicates the rate at which reductions take place on this channel.
For the Summary descriptor, this measure will indicate the rate of reductions across all channels and nodes. |
Reductions/Sec |
A reduction is a unit of work in the Erlang runtime. Each process has a reduction counter that is normally incremented by one for each function call. The counter is used to preempt processes: a process is context-switched out when its counter reaches the maximum number of reductions. For example, in Erlang/OTP R12B this maximum was 2000 reductions.
The value of this measure therefore reflects the rate at which a channel makes function calls, which makes it a good indicator of the workload generated by a channel on a particular node. To understand the workload of the cluster as a whole, use the value this measure reports for the Summary descriptor. |
|
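The round-trip example in the Interpretation of the diffPrefetch measure can be worked through in a few lines. This is a minimal sketch under simplifying assumptions: delivery, processing, and acknowledgement times are constant, and a consumer with a prefetch of n processes n messages per round trip. The function names `consumer_utilisation` and `estimate_prefetch` are hypothetical and are not part of RabbitMQ or of this test.

```python
import math


def consumer_utilisation(delivery_ms: float, processing_ms: float,
                         ack_ms: float, prefetch: int = 1) -> float:
    """Fraction of time the client is busy, under the constant-time assumption."""
    round_trip_ms = delivery_ms + processing_ms + ack_ms
    # The client works on `prefetch` messages per round trip, capped at 100% busy.
    return min(1.0, prefetch * processing_ms / round_trip_ms)


def estimate_prefetch(delivery_ms: float, processing_ms: float, ack_ms: float) -> int:
    """Estimated prefetch value: total round trip time / processing time, rounded up."""
    round_trip_ms = delivery_ms + processing_ms + ack_ms
    return math.ceil(round_trip_ms / processing_ms)


if __name__ == "__main__":
    # Figures from the example: 50ms delivery, 4ms processing, 50ms acknowledgement.
    busy = consumer_utilisation(50, 4, 50, prefetch=1)
    print(f"Utilisation with prefetch=1: {busy:.1%}")
    print(f"Estimated prefetch value: {estimate_prefetch(50, 4, 50)}")
```

With the figures from the example, the estimate is 104 / 4 = 26. Treat this as a starting point only; real processing times and network latency vary, so the value would normally be adjusted against the unacknowledged message rate the test reports.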