eG Monitoring
 

Measures reported by Exc2013DBStatTest

If a user experiences slow downs when accessing his/her Exchange mailbox, nine out of 10 times the reason would be the underlying ESE database. Various factors affect database performance. An under-sized / poorly configured database cache, latency in I/O processing, delay in reading/replaying log files, a session overload, can all cause an ESE database’s performance to degrade. To ensure that user productivity is not impacted by poor database performance, administrators should closely monitor each aspect of the performance of every ESE database on the Exchange server, capture anomalies before users notice, and resolve them before they affect the user experience with his/her mailbox. This is where the Exc2013DBStatTest test helps.

This test auto-discovers the ESE databases on the Mailbox server, and for each database so discovered, reports the following:

  • How effectively is the database using its cache?
  • Is the cache sized right?
  • How quickly is the database processing read/write requests to it?
  • Are log files read and replayed rapidly?
  • Are too many sessions to the database idle?

Using the insights provided by this test, administrators can accurately identify slow/overloaded databases, pinpoint why the database performance is bottlenecked, and clear the bottleneck well before users notice and complain.

The measures made by this test are as follows:

Measurement Description Measurement Unit Interpretation
Db_cac_dehy Indicates the percentage of pages in the cache of this database that are currently compressed. Percent In order to conserve the memory resources of the Exchange Mailbox server, the ‘Back Pressure’ feature of the server automatically performs ‘message dehydration’ if the server's memory usage crosses a configured threshold (by default, this is 94%). Message dehydration is the act of removing unnecessary elements of queued messages – i.e., compressing messages - that are cached in memory. Typically, complete messages are cached in memory for enhanced performance. Removal of the MIME content of queued messages from memory reduces the memory that is used.

A high value for this measure therefore indicates that messages in cache have been compressed to a large extent, thus freeing up a significant amount of system memory. On the other hand, a high value for this measure can also increase the latency of mailbox operations, because, in the absence of complete messages in the cache, the server will be forced to read the messages directly from the message queue database.

To avoid this, the mailbox server has to be sized with adequate memory and the memory usage threshold of the Backprssure feature should also be set high enough.

Db_cac_hit Indicates the percentage of page requests to this database that were fulfilled by the database cache without causing a file operation. Percent If this percentage is too low, it could be because the database cache size may be too small. You may want to consider increasing the cache size, so as to reduce the processing overheads that may be incurred on account of direct database accesses.
Db_cac_size Indicates the amount of system memory, in megabytes, used by the database cache manager to hold commonly used information from the database file(s) to prevent file operations. MB If the database cache size seems to be too small for optimal performance, and there is very little available memory on the system, an increase of memory in the system may increase performance. If there is a large amount of available memory on the system, and the database cache size is not growing beyond a certain point, the database cache size may be capped at an artificially low limit. An increase in this limit may increase performance.
Db_csc_siz_eff Indicates the amount of system memory, in megabytes, that, hypothetically, would be used by this database cache manager if all used dehydrated/compressed database cache buffers were rehydrated/uncompressed. MB  
Db_cac_siz_res Indicates the amount of system memory (in megabytes) used by this database cache that is currently part of the working set of the process. MB If the value of this measure is ever significantly smaller than that of the Db_cac_size measure then the operating system has chosen to reclaim that system memory for use in other parts of the system. The database cache will recover from this event but if this is a common occurrence then it can lead to significant performance problems.
Db_page_fal Indicates the rate of page faults that cannot be serviced because there are no pages available for allocation from the database cache. Stalls/Sec If the value of this measure is nonzero most of the time, the clean threshold may be too low.
Db_read_lat Indicates the average time taken by this database to perform read operations. Secs A low value is desired for this measure. A consistent increase in this value is indicative of a bottleneck when reading from the database. Compare the value of this measure across databases to know which database is the slowest when reading and why.
Db_read_sec Indicates the rate at which this database completes read operations. Reads/Sec A consistent drop in this value can indicate a reduction in the number of read requests. It can also indicate a reading bottleneck. In the case of the latter, compare the value of this measure across databases to know which database is the slowest when processing read requests.
Db_write_lat Indicates the average time taken by this database to perform write operations. Secs A high value is a cause for concern as it impacts how quickly transaction recovery is performed in the event of a transaction failure. If the contents of the transaction log file are not written to the database file quickly, transaction recovery will be delayed.
Db_write_sec Indicates the rate at which this database completes write operations. Writes/Sec A consistent drop in this value can indicate a reduction in the number of write requests. It can also indicate a writing bottleneck. In the case of the latter, compare the value of this measure across databases to know which database is the slowest when processing write requests.
Log_read_lat Indicates the average time taken by this database to read a log file. Secs A high value is a cause for concern as it impacts how quickly transaction recovery is performed in the event of a transaction failure. If the transaction log file is read slowly, then it will delay the replay of the unwritten transactions and the writing of the transactions to the database file. This in turn will significantly slow down transaction recovery.
Log_read_sec Indicates the rate at which this database completes log file read operations. Reads/Sec A consistent drop in the value of this measure is indicative of a slowdown when reading from a log file.
Log_write_lat Indicates the average time taken by this database to write from a log file. Secs A high value is a cause for concern as it impacts how quickly transaction recovery is performed in the event of a transaction failure. If the contents of the transaction log file are not written to the database file quickly, transaction recovery will be delayed.
Log_write_sec Indicates the rate at which this database completes log file write operations. Reads/Sec A consistent drop in the value of this measure is indicative of a slowdown when writing from a log file.
Ses_per_used Indicates the percentage of database sessions currently open for use by client threads. Percent A high value is desired for this measure. A very low value could indicate that too many open sessions are idle.
Ses_in_use Indicates the number of database sessions currently open for use by client threads. Number