|
Measures reported by RUMCityTest
The RUMCityTest provides an aggregate view of the experience of users from different cities. If the users of your web site/web application are spread across different cities, you can use this test to know:
Which cities your users are coming from;
In which cities your web site/web application is most popular;
What the experience of users from each of these cities is.
Moreover, if multiple users of a web site/web application see slow page loads or error responses at around the same time, administrators can use this test to figure out whether the problem is specific to a particular city. If so, the test further pinpoints exactly which city is impacted and why the experience of users from that city is poor - is the front end used by the users in this city inefficient? Is the WAN connection between the browser clients in this city and the backend server hosting the web site/web application slow? Or is the problem with the backend? Detailed diagnostics provided by the test also point to the specific pages that are slow or have encountered JavaScript errors.
Note:
By default, the eG agent can send a maximum of 50 million characters to the eG manager when reporting detailed diagnostics for a test for a single measurement period. If a test exceeds this limit during a measurement period, the detailed diagnostics reported by that test will be automatically truncated and the additional characters dropped. A message to this effect will also be logged in the eG agent's error log. If such errors are logged frequently for a particular test, you may want to consider increasing the character limit for the detailed metrics collected by that test. For this purpose, do the following:
Edit the eg_tests.ini file in the <EG_INSTALL_DIR>\manager\config directory of the eG manager installation.
Go to the [MAX_DD_UPLOAD_LENGTH] section of the file.
In this section, look for the parameter that corresponds to the <Internal_test_name> of the test for which the character limit has to be increased.
Once you find the parameter, set the value of that parameter to a number of your choice.
Finally, save the file.
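For instance, after the edit, the relevant section of eg_tests.ini might look like this (the internal test name and limit shown here are placeholders; substitute the actual <Internal_test_name> of your test and a value of your choice):

```ini
[MAX_DD_UPLOAD_LENGTH]
RUMCityTest=100000000
```

This raises the limit for that test from the 50-million-character default.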
The measures made by this test are as follows:
| Measurement |
Description |
Measurement Unit |
Interpretation |
| Page_Requests |
Indicates the total number of times pages were viewed by users from this city. |
Number |
This is a good measure of the traffic from a specific city.
Sudden, significant spikes in the page view count could be a cause for concern, as they could be owing to a malicious virus attack or an unscrupulous attempt to hack your web site/web application. |
| Apdex_Score |
Indicates the Apdex score of the web site/web application based on the experience of users from this city. |
Number |
Apdex (Application Performance Index) is an open standard developed by an alliance of companies. It defines a standard method for reporting and comparing the performance of software applications in computing. Its purpose is to convert measurements into insights about user satisfaction, by specifying a uniform way to analyze and report on the degree to which measured performance meets user expectations.
The Apdex method converts many measurements into one number on a uniform scale of 0-to-1 (0 = no users satisfied, 1 = all users satisfied). The resulting Apdex score is a numerical measure of user satisfaction with the performance of enterprise applications. This metric can be used to report on any source of end-user performance measurements for which a performance objective has been defined.
The Apdex formula is:
Apdex(T) = (Satisfied Count + Tolerating Count / 2) / Total Samples
This is nothing but the number of satisfied samples plus half of the tolerating samples plus none of the frustrated samples, divided by all the samples.
A score of 1.0 means all responses were satisfactory. A score of 0.0 means none of the responses were satisfactory. Tolerating responses half satisfy a user. For example, if all responses are tolerating, then the Apdex score would be 0.50.
Ideally therefore, the value of this measure should be 1.0. A value less than 1.0 indicates that the experience of users from this city has been less than satisfactory. |
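The formula above can be sketched as a small helper (a hypothetical illustration, not part of eG RUM):

```javascript
// Compute the Apdex score from sample counts:
// satisfied = responses within the target time T,
// tolerating = responses between T and 4T,
// frustrated = responses over 4T.
function apdexScore(satisfied, tolerating, frustrated) {
  const total = satisfied + tolerating + frustrated;
  if (total === 0) return 1; // no samples, nothing to penalize
  return (satisfied + tolerating / 2) / total;
}

// Example: 60 satisfied, 30 tolerating, 10 frustrated page views
console.log(apdexScore(60, 30, 10)); // 0.75
```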
| Avg_Page_Load_Time |
Indicates the average time that pages accessed by users from this city took to load completely on the browser. |
ms |
This is the average interval between the time that a user initiates a request and the completion of the page load of the response in the user's browser. In the context of an Ajax request, it ends when the response has been completely processed.
By comparing the value of this measure across cities, you will be able to tell if the page load time is significantly higher for any one city - this is the first sign that the problem could be city-specific. Next, check the detailed diagnosis of this measure for this city to find out which pages are slow. Then, look up the detailed measures for the other cities to ascertain whether users from those cities also experienced slowness when accessing the same pages. If not, it is a sure sign that the problem is specific to the city.
So, proceed to compare the values of the Avg_Front_End_Time, Avg_Network_Time, and Avg_Response_Avail_Time measures for that city, to know why the users in that city are seeing slowness - is it the front end? the network? or the backend?
If the Avg_Front_End_Time is the highest, then the problem is with the front end used by the users in that city. If the Avg_Network_Time is the highest, then the network connection between the user's browser and the web site/web application is contributing to the slowness. A very high Avg_Response_Avail_Time, on the other hand, reveals that the problem may have nothing to do with the city, but with the server that is hosting the web site/web application. |
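The comparison described above can be sketched as a small helper (hypothetical, not part of eG RUM) that names the dominant contributor to page load time:

```javascript
// Given the three average timing components for a city (all in ms),
// return the one contributing most to the observed slowness.
function dominantBottleneck(frontEndTime, networkTime, responseAvailTime) {
  const parts = {
    "front end": frontEndTime,
    "network": networkTime,
    "backend": responseAvailTime,
  };
  // Pick the component with the largest value.
  return Object.keys(parts).reduce((a, b) => (parts[a] >= parts[b] ? a : b));
}

console.log(dominantBottleneck(120, 800, 150)); // "network"
```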
| Unique_User_session |
Indicates the number of distinct users who are currently accessing the web site/web application from this city. |
Number |
|
| Request_Per_Minute |
Indicates the number of times the pages were viewed per minute by users from this city. |
Number |
An unusually high value for this measure may require investigation. |
| Percentage_Normal |
Indicates the percentage of page views that delivered a satisfactory experience to users from this city. |
Percent |
The value of this measure indicates the percentage of page views in which users from this city have neither experienced any slowness nor encountered any JavaScript errors.
Ideally, the value of this measure should be 100%. A value that is slightly less than 100% indicates that the user experience has not been up to the mark. A value less than 50% is indicative of a serious problem, where most of the page views are either slow or have encountered JavaScript errors. Under such circumstances, to know what exactly is affecting the experience of users, compare the value of the Percentage_Slow measure with that of the Percentage_Error measure for that city. This will reveal the reason for the poor user experience - slow pages? or JavaScript errors?
If slow pages are the problem, use the detailed diagnosis of the Slow_Requests measure to know which pages are slow and where these pages are losing time - in the front end? the network? or the backend?
If JavaScript errors are the problem, use the detailed diagnosis of the Percentage_Error measure to know what errors occurred in which pages. |
| Percentage_Slow |
Indicates the percentage of pages that were slow in loading when accessed by users in this city. |
Percent |
Ideally, the value of this measure should be 0. A value over 50% implies that you are in a spot of bother, with over half of the page views being slow. Use the detailed diagnosis of the Slow_Requests measure to identify the slow pages and isolate the root-cause of the slowness - is it the front end? the network? or the backend? |
| Percentage_Error |
Indicates the percentage of page views that have encountered JavaScript errors. |
Percent |
Ideally, the value of this measure should be 0. A value over 50% implies that you are in a spot of bother, with over half of the page views experiencing JavaScript errors. Use the detailed diagnosis of this measure to identify the error pages and to know what JavaScript error has occurred in which page. This will greatly aid troubleshooting! |
| Satisfied_Requests |
Indicates the number of times pages were viewed by users in this city without any slowness. |
Number |
A page view is considered to be slow when the average time taken to load that page exceeds the SLOW TRANSACTION CUTOFF configured for this test. If this SLOW TRANSACTION CUTOFF is not exceeded, then the page view is deemed to be ‘satisfactory’. To know which page views are satisfactory, use the detailed diagnosis of this measure.
Ideally, the value of this measure should be the same as that of the Page_Requests measure. If not, it indicates that one or more page views are slow - i.e., have violated the SLOW TRANSACTION CUTOFF.
If the value of this measure is much lower than the values of the Tolerated_Requests and Frustrated_Requests measures, it is a clear indicator that the experience of users in this city has been below par. In such a case, use the detailed diagnosis of the Tolerated_Requests and Frustrated_Requests measures to know which pages are slow and why. |
| Slow_Requests |
Indicates the number of page views that were slow when accessed from this city. |
Number |
A page view is considered to be slow when the average time taken to load that page exceeds the SLOW TRANSACTION CUTOFF configured for this test.
Ideally, a page should load quickly. The value 0 is hence desired for this measure. If the value of this measure is high, it indicates that users frequently experienced slowness when accessing pages in the web site/web application. To know which page views are slow and why, use the detailed diagnosis of this measure. |
| Error_Requests |
Indicates the number of times JavaScript errors occurred when accessing pages from this city. |
Number |
Ideally, the value of this measure should be 0. A high value indicates that many JavaScript errors are occurring when viewing pages in the web site/web application. Use the detailed diagnosis of this measure to identify the error pages and to know what Javascript error has occurred in which page. This will greatly aid troubleshooting! |
| Tolerated_Requests |
Indicates the number of tolerating page views experienced by users in this city. |
Number |
If the Avg_Page_Load_Time of a page exceeds the SLOW TRANSACTION CUTOFF configured for this test, but is less than 4 times the SLOW TRANSACTION CUTOFF (i.e., < 4 * SLOW TRANSACTION CUTOFF), then such a page view is considered to be a tolerating page view.
Ideally, the value of this measure should be 0. A value higher than that of the Satisfied_Requests measure is a cause for concern, as it implies that the overall experience of the users in this city is less than satisfactory. To know which pages are contributing to this sub-par experience, use the detailed diagnosis of this measure. The detailed metrics will also enable you to accurately isolate what is causing the tolerating page views - a problem with the front end? network? or backend? |
| Frustrated_Requests |
Indicates the number of frustrated page views experienced by users in this city. |
Number |
If the Avg_Page_Load_Time of a page is over 4 times the SLOW TRANSACTION CUTOFF configured for this test (i.e., > 4 * SLOW TRANSACTION CUTOFF), then such a page view is considered to be a frustrated page view.
Ideally, the value of this measure should be 0. A value higher than that of the Satisfied_Requests measure is a cause for concern, as it implies that the experience of users in this city has been less than satisfactory. To know which pages are contributing to this sub-par experience, use the detailed diagnosis of this measure. The detailed metrics will also enable you to accurately isolate what is causing the frustrated page views - a problem with the front end? network? or backend? |
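The satisfied/tolerating/frustrated banding around the SLOW TRANSACTION CUTOFF described above can be sketched as follows (a hypothetical helper, not part of eG RUM; the exact boundary behavior at the cutoff may differ in the product):

```javascript
// Classify one page view by its load time (ms) against the
// configured SLOW TRANSACTION CUTOFF (ms).
function classifyPageView(loadTimeMs, cutoffMs) {
  if (loadTimeMs <= cutoffMs) return "satisfied";       // within the cutoff
  if (loadTimeMs <= 4 * cutoffMs) return "tolerating";  // up to 4x the cutoff
  return "frustrated";                                  // over 4x the cutoff
}

// With a 2000 ms cutoff:
console.log(classifyPageView(1500, 2000)); // "satisfied"
console.log(classifyPageView(5000, 2000)); // "tolerating"
console.log(classifyPageView(9000, 2000)); // "frustrated"
```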
| Avg_Page_Rendering_Time |
Indicates the time taken to complete the download of remaining resources, including images, and to finish rendering the pages in the browser for users in this city. |
ms |
A high value of this measure indicates that page rendering is taking too long. This can be attributed to a suboptimal HTML document architecture, complex CSS (e.g., deeply nested rules, slow selectors, complicated effects such as rounded borders, gradients, etc.), and large images.
If the Avg_Page_Load_Time measure reports an abnormally high value, then you may want to compare the value of this measure with that of the Avg_Brow_Init_Req_Time, Avg_Network_Time, Avg_Response_Avail_Time, and Avg_Dom_Ready_Time measures to nail the exact source of the bottleneck. |
| Avg_Dom_Ready_Time |
Indicates the time taken by the browser to make the complete HTML document (DOM) available for JavaScript to apply rendering logic on the pages accessed by users in this city. |
ms |
This is the time spent between the responseStart event and the domContentLoadedEventStart event. The domContentLoadedEventStart event is typically fired just before the domContentLoaded event, which fires just after the browser has finished downloading and parsing all the scripts that had the defer attribute set and no async attribute.
In summary, the Avg_Dom_Ready_Time measure is the sum of the values of the Avg_Dom_Download_Time and Avg_Dom_proc_Time measures.
If content downloading takes unusually long for a request, then you must compare the values of the Avg_Dom_Download_Time and Avg_Dom_proc_Time measures to figure out what is causing the delay - is it the poor responsiveness of the server, cache, or local resource? or did DOM processing take too long? |
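The relationship between these three DOM measures can be sketched from Navigation Timing timestamps (the field names follow the Navigation Timing API; the values here are synthetic, for illustration only):

```javascript
// Break down DOM readiness from Navigation Timing timestamps (ms).
function domTimings(t) {
  const domReadyTime = t.domContentLoadedEventStart - t.responseStart;
  const domDownloadTime = t.responseEnd - t.responseStart;
  const domProcessingTime = t.domContentLoadedEventStart - t.responseEnd;
  return { domReadyTime, domDownloadTime, domProcessingTime };
}

// Synthetic timestamps for illustration:
const t = { responseStart: 300, responseEnd: 450, domContentLoadedEventStart: 900 };
console.log(domTimings(t));
// { domReadyTime: 600, domDownloadTime: 150, domProcessingTime: 450 }
```

Note that domReadyTime is exactly the sum of the download and processing components, matching the description above.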
| Avg_Dom_Download_Time |
Indicates the time taken to download the complete HTML document for requests received from users in this city. |
ms |
The value of this measure is the time that elapsed between the responseStart and responseEnd events.
The higher the download time of the document, the longer it will take to make the document available for page rendering. As a result, the overall user experience will be affected! This is why a low value is desired for this measure at all times. |
| Avg_Dom_proc_Time |
Indicates the time taken by the browser to build the Document Object Model (DOM) for the pages requested by the users in this city and make it available for JavaScript to apply rendering logic. |
ms |
An unusually high value for this measure is a clear indicator that DOM building is taking longer than normal. In consequence, page rendering will be delayed, thus adversely impacting the experience of users accessing pages from this city. Ideally therefore, the value of this measure should be low. |
| Avg_Response_Avail_Time |
Indicates the interval between the start of processing of a request from a user in this city and when the response begins to be received. |
ms |
The Avg_Response_Avail_Time is the time spent between the requestStart event and responseStart event.
Ideally, a low value is desired for this measure, as high values will certainly hurt the Apdex_Score of the web site/web application.
The key factor that can influence the value of this measure is the request processing ability of the web server/web application server that is hosting the web site/web application being monitored.
Any slowdown in the backend web server/web application server - caused by inadequate processing power or improper configuration of the backend server - can significantly delay request processing by the server. As a result, the Avg_Response_Avail_Time will increase, leaving users with an unsatisfactory experience of the web site/web application.
Note:
This test uses the Navigation Timing API to measure web site performance. The Navigation Timing API typically exposes several properties that offer information about the time at which different page load events happen - eg., the requestStart event, the responseStart event, etc. This test uses the time stamps provided by the Navigation Timing API to compute and report the duration of page load events, so you can accurately identify where page loading is bottlenecked.
The Navigation Timing API on Internet Explorer (IE) v11 reports an incorrect time stamp for the requestStart event of the page loading process. As a result, for page view requests initiated from IE 11 browsers alone, eG RUM will report incorrect values for this measure.
This issue was noticed in IE 11 in April 2019. It is recommended that you track hot fixes/patches released by Microsoft post April 2019, study the release notes of such fixes/patches, and determine if this bug has been fixed in any. If so, then you are advised to apply that fix/patch on IE 11 to resolve the issue.
Until then, we recommend that you use the following workaround to accurately measure the Average server time of a page view request.
Deploy eG BTM (Java/.NET, as the case may be) on the backend server hosting the target web site/web application.
Use eG BTM to trace the path of transactions to the target web site/web application and enable the (Java or .NET) Business Transactions test to capture metrics on transaction performance.
Next, use the detailed diagnostics reported by eG RUM to identify the page view requests coming from the IE 11 browser. Make a note of the values in the Request time, URL and Query Params columns of detailed diagnosis.
Then, search the detailed diagnostics of the (Java or .NET) Business Transactions test for transactions with the same URL, Request time, and Query Params as reported by eG RUM.
The response time that eG BTM reports for each of these transactions is the Avg_Response_Avail_Time of those transactions.
Note that this workaround applies only for those transaction URLs that are captured and reported as part of detailed diagnostics. |
| Avg_Network_Time |
Indicates the time elapsed between when a user in this city initiates a request to the web site/web application and when the browser starts fetching the response document. |
ms |
The time spent between navigationStart and requestStart makes up the Avg_Network_Time. This includes the time to perform DNS lookups and the time to establish a TCP connection with the server. In other words, the value of this measure is nothing but the sum of the Avg_DNS_Time and Avg_TCP_Time measures.
Ideally, the value of this measure should be low. A very high value will often end up delaying page loading and damaging the quality of the web site service. In the event that the server connection time is high therefore, simply compare the values of the Avg_DNS_Time, and Avg_TCP_Time measures to know to what this delay can be attributed - a delay in domain name resolution? Or a poor network connection to the server?
Note:
This test uses the Navigation Timing API to measure web site performance. The Navigation Timing API typically exposes several properties that offer information about the time at which different page load events happen - eg., the requestStart event, the responseStart event, etc. This test uses the time stamps provided by the Navigation Timing API to compute and report the duration of page load events, so you can accurately identify where page loading is bottlenecked.
The Navigation Timing API on Internet Explorer (IE) v11 reports an incorrect time stamp for the requestStart event of the page loading process. As a result, for page view requests initiated from IE 11 browsers alone, eG RUM will report incorrect values for this measure.
This issue was noticed in IE 11 in April 2019. It is recommended that you track hot fixes/patches released by Microsoft post April 2019, study the release notes of such fixes/patches, and determine if this bug has been fixed in any. If so, then you are advised to apply that fix/patch on IE 11 to resolve the issue. |
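The composition of Avg_Network_Time described above can be sketched from Navigation Timing timestamps (field names follow the Navigation Timing API; the values are synthetic, chosen so that DNS and TCP account for the whole interval):

```javascript
// Network time = navigationStart .. requestStart, made up of
// DNS lookup and TCP connection establishment.
function networkTimings(t) {
  const dnsTime = t.domainLookupEnd - t.domainLookupStart;
  const tcpTime = t.connectEnd - t.connectStart;
  const networkTime = t.requestStart - t.navigationStart;
  return { dnsTime, tcpTime, networkTime };
}

const t = {
  navigationStart: 0, domainLookupStart: 0, domainLookupEnd: 30,
  connectStart: 30, connectEnd: 110, requestStart: 110,
};
console.log(networkTimings(t));
// { dnsTime: 30, tcpTime: 80, networkTime: 110 }
```

Comparing dnsTime and tcpTime in this breakdown tells you whether the delay is in name resolution or in the connection to the server.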
| Avg_DNS_Time |
Indicates the time taken by the browser used by a user in this city to perform the domain lookup for connecting to the web site/web application. |
ms |
A high value for this measure indicates slow DNS lookups, which will also inflate the Avg_Network_Time and Avg_Page_Load_Time of the web site/web application. This naturally will have a disastrous effect on user experience. |
| Avg_TCP_Time |
Indicates the time taken by the browser used by a user in this city to establish a TCP connection with the server. |
ms |
A bad network connection between the browser client and the server can delay TCP connections to the server. As a result, the Avg_Network_Time too will increase, thus impacting page load time and overall user experience with the web site/web application. |
| Avg_Brow_Init_Req_Time |
Indicates the interval from when a request was initiated by a user in this city, through any redirects it followed, to when it was processed in the AppCache. |
ms |
This measure is the time spent between the navigationStart event and the domainLookupStart event. It also includes the time spent by the browser waiting for one event to end and the next to begin. In short, this measure is the sum of the Avg_App_Cache_Time, Avg_Redirection_Time, and Avg_Browser_Wait_Time measures. This means that if a request takes too long to follow a redirection, or if the AppCache takes too long to process the request, or if the request waits too long on the browser for previous requests to complete, then the value of this measure will increase significantly. This in turn will impact user experience and, consequently, the Apdex score.
This is why, if this measure reports an abnormal value for a city, it is important that you compare the values of the Avg_App_Cache_Time, Avg_Redirection_Time, and Avg_Browser_Wait_Time measures for the same city, to figure out where the request spent the maximum time - in redirection? in the AppCache? or while waiting on the browser? |
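The composition of this measure can be sketched from Navigation Timing timestamps (a hypothetical illustration with synthetic values; attributing the fetchStart-to-domainLookupStart span to the AppCache follows the Navigation Timing processing model, and eG's exact split may differ):

```javascript
// Browser initial request time = navigationStart .. domainLookupStart,
// split into redirect, AppCache, and browser wait components.
function browserInitTimings(t) {
  const redirectTime = t.redirectEnd - t.redirectStart;
  const appCacheTime = t.domainLookupStart - t.fetchStart;
  const browInitReqTime = t.domainLookupStart - t.navigationStart;
  // Whatever remains is time the request spent waiting on the browser.
  const browserWaitTime = browInitReqTime - redirectTime - appCacheTime;
  return { redirectTime, appCacheTime, browserWaitTime, browInitReqTime };
}

const t = {
  navigationStart: 0, redirectStart: 5, redirectEnd: 45,
  fetchStart: 50, domainLookupStart: 70,
};
console.log(browserInitTimings(t));
// { redirectTime: 40, appCacheTime: 20, browserWaitTime: 10, browInitReqTime: 70 }
```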
| Avg_SSL_Handshake_Time |
Indicates the time taken by requests from this city to complete the SSL handshake. |
ms |
An SSL handshake happens when a browser makes a secure request for content, also known as an encrypted HTTPS connection. The user's browser and server negotiate encrypted keys and certificates to establish a secure connection between each other. Because this SSL negotiation requires exchanges between the browser and your server, it increases the time spent by the request on the network (i.e., it adds to the value of the Avg_Network_Time measure). This in turn increases page load time.
In fact, the SSL handshake, along with DNS lookup and the TCP handshake, can add three round trips to the page load time.
One mitigation is connection reuse: TLS session resumption (session caching) can reduce SSL setup to a single round trip, and HTTP/2 serves many requests over one connection, so the handshake cost is incurred only once per connection. |
| Avg_App_Cache_Time |
Indicates the time taken to check whether or not the requests from this city can be serviced by the AppCache. |
ms |
HTML5 provides an application caching mechanism that lets web-based applications run offline. Developers can use the Application Cache (AppCache) interface to specify resources that the browser should cache and make available to offline users. Applications that are cached load and work correctly even if users click the refresh button when they are offline. Using an application cache gives an application the following benefits:
Offline browsing: users can navigate a site even when they are offline.
Speed: cached resources are local, and therefore load faster.
Reduced server load: the browser only downloads resources that have changed from the server.
To enable the application cache for an application, you must include the manifest attribute in the <html> element in your application's pages. The manifest attribute references a cache manifest file, which is a text file that lists the resources (files) that the browser should cache for your application. The browser does not cache pages that do not contain the manifest attribute, unless such pages are explicitly listed in the manifest file itself. You do not need to list all the pages you want cached in the manifest file; the browser implicitly adds every page that the user visits and that has the manifest attribute set to the application cache.
When the browser visits a document that includes the manifest attribute, if no application cache exists, the browser loads the document and then fetches all the entries listed in the manifest file, creating the first version of the application cache.
Subsequent visits to that document cause the browser to load the document and other assets specified in the manifest file from the application cache.
If the manifest file has changed, all the files listed in the manifest - as well as those already added to the cache - are fetched into a temporary cache.
Once all the files have been successfully retrieved, they are moved into the real offline cache automatically. Since the document has already been loaded into the browser from the cache, the updated document will not be rendered until the document is reloaded (either manually or programmatically).
A high value of this measure signifies that requests are spending too much time in the AppCache. This also introduces page loading latencies, which have an adverse effect on user-perceived performance of a web site/application. Common reasons for AppCaching issues and their practical solutions are detailed below:
If the media type is not set, then AppCache will not work. To avoid this, make sure that the manifest file is always served under the correct media type of text/cache-manifest.
If the manifest file is not served to the web browser from the same origin as the host page, the manifest file will fail to load. To avoid this, make sure that the manifest file is always served from the same origin as the host page. However, note that the manifest file can hold reference to resources to be cached from other domains.
The relative URLs that you mention in the manifest are relative to the manifest file, and not to the document where you reference the manifest file. If you make this error when the manifest and the reference are not in the same path, the resources will fail to load and, in turn, the manifest file will not be loaded. This will stall application caching.
Any change made to the manifest file will cause the entire set of files to be downloaded again. Moreover, if a manifest file is added to an HTML file, it forces all resources to be downloaded synchronously as soon as the manifest file is downloaded. As a result, resources that may not yet be required, such as JavaScript or an image below the fold, will be downloaded at the start of the page. This can increase page load time significantly. The solution is to load the Application Cache from a simple HTML file loaded in an iframe. This not only avoids caching dynamic HTML but also allows the Application Cache to be downloaded asynchronously after the page load has completed.
|
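The manifest attribute described above can be illustrated with a minimal page (the file name example.appcache is hypothetical; the manifest file itself must be served with the text/cache-manifest media type, as noted earlier):

```html
<!-- The manifest attribute references the cache manifest file,
     a plain-text list of resources the browser should cache. -->
<html manifest="example.appcache">
  <head><title>Example</title></head>
  <body>...</body>
</html>
```

Note that the Application Cache has since been deprecated in modern browsers in favor of Service Workers, so this mechanism is mainly relevant when monitoring older applications.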
| Avg_Redirection_Time |
Indicates the time that requests from this city spent in redirection before fetching the pages. |
ms |
This measure is the elapsed time between the redirectStart and redirectEnd events.
URL redirection, also known as URL forwarding, is a technique to give a page, a form, or a whole web site/application, more than one URL address. Usually, web site administrators use URL redirection to:
Redirect users to the mobile version of the site
Redirect users to secured pages
Redirect users to the latest version of the resource/content
Redirect users to pages specific to their geo location
Redirect users to canonical URLs
Though redirects are useful, they have to be kept at a minimum, as each redirect on a page adds latency to the overall page load time. This is because, when a user enters a domain into the browser and hits enter, the DNS resolution process is triggered and the domain is resolved to its corresponding IP address in a few milliseconds. If the landing page has another redirect, then the browser repeats the entire DNS resolution process once again to guide the user to the correct web page. The multiple redirect requests are taxing on the browser resources and slow down the page load.
Web page load time is also affected by internal redirects; for example, if the page tries to load content from a URL that has been redirected to newer or updated content, then the browser must create additional requests to fetch the valid content. These redirects result in additional round trips between the browser and the web server which pushes the load time higher; the perceived performance is degraded every time the browser encounters a redundant redirect.
Web site/application performance is also impacted if redirects are not implemented correctly. Some of the common redirect errors are:
Multiple redirects: The higher the number of redirects on a page, the higher is its page load time.
Invalid redirects: There are often instances where the web site administrator sets up bulk redirects without verifying the validity of the redirects. The site may also have old redirects that were never cleaned up. This can cause several issues on the site like broken links and 404s.
Redirect loop: When there are several redirects on the page that are linked to each other, it creates a chain of redirects which may loop back to the same URL that initiated the redirect. This results in a redirect loop error and the user will not be able to access the site.
Therefore, if you find that the value of the Avg_Page_Load_Time measure is abnormally high owing to an unusually high value for the Avg_Redirection_Time measure, then make sure you follow the best practices outlined below while implementing redirects, so you can significantly reduce page load time and improve user experience:
Avoid redundant redirects: It's recommended to avoid redirects where possible and to use this method only when absolutely needed. This will cut down unnecessary overhead and improve the perceived performance of the page.
Avoid chained redirects: When a URL is linked to another URL, this creates a chained redirect. Each URL added to the chain adds latency to the page. Chained redirects have a negative impact not only on page speed, but also on SEO.
Clean up redirects: You may have hundreds of redirects on your web site, and they could be one of the main factors affecting page speed. Old redirects may conflict with new URLs, and stale backlinks can cause odd errors on the page. It is recommended to verify all the redirects you have set up and to remove the ones that are no longer needed. Retain the old links that have major referral traffic, while those that are rarely accessed can be removed. This exercise will help improve page speed significantly.
|
| Avg_Browser_Wait_Time |
Indicates the time that requests from this city spent on the browser, waiting for another request to complete. |
ms |
This is the sum of the time between every two consecutive events, from the navigationStart event to the requestStart event.
Typically, web browsers limit the number of active connections for each domain. Most modern browsers (eg., Chrome) support only six simultaneous requests/connections. In this case therefore, when the seventh request comes in, that request waits on the browser until the six requests sent previously are processed. The waiting time of the seventh request is the browser wait time.
High browser wait time can prolong the browser's initial request time, thus adversely impacting the overall responsiveness of the web site/application. This is why, if the Avg_Brow_Init_Req_Time measure reports an abnormally high value, you will have to compare the values of the Avg_App_Cache_Time, Avg_Redirection_Time, and Avg_Browser_Wait_Time measures to determine whether or not the initial request delay observed on the browser is because requests have been waiting on the browser for too long.
Some of the means by which you can reduce browser waits are briefly discussed below:
Browsers such as Mozilla Firefox support up to 10 parallel requests. You may want to recommend such browsers to your web site/web application users, so that more requests are processed and fewer requests are queued on the browser, thus reducing browser wait time.
Web site/application developers can try domain sharding - i.e., splitting content across multiple domains. Typically, when a user connects to a web page, his or her browser scans the resulting HTML for resources to download. Normally these resources are supplied by a single domain - the domain providing the web page or a domain created specifically for resources. With domain sharding, the user's browser connects to two or more different domains to simultaneously download the resources needed to render the web page. This allows the web site/application to be delivered faster to users, as they do not have to wait for the previous set of requests to end before beginning the next set.
|
|