The VM Dashboard
In large virtualized infrastructures today, business-critical applications are often deployed on tens of virtual machines configured on a variety of virtualization platforms (e.g., VMware ESX, Citrix XenServer, Solaris LDoms, etc.), so as to optimize space and resource usage. In such environments, excessive resource usage by a single VM or a resource pool on a virtual host can significantly reduce the resources available to other VMs, thereby affecting the performance of the applications executing on those VMs. To ensure high uptime for their key applications, administrators need to track, in real time, the resource usage across VMs and physical servers regardless of the underlying virtualization platform, quickly detect abnormal usage patterns, accurately identify the VM(s) responsible, and promptly initiate corrective action. Likewise, in environments where centralized management tools such as VMware vCenter are used, administrators also need to keep tabs on the availability and overall health of these tools, so as to ensure that any performance degradation a tool experiences does not impact the ESX servers it manages.
eG Enterprise provides a single, central VM Dashboard: an integrated interface from which administrators can compare resource usage across physical servers and across the VMs on each physical server. The dashboard offers real-time insights into the health of the physical servers, the status of the VMs, and how each VM currently uses the allocated and physical resources available to it.
Using this dashboard, administrators can:
- Understand, at a single glance, the composition of the managed virtualized environment - in other words, using the dashboard, you can easily view the complete hierarchical structure of your virtualized environment; for instance, for a VMware environment, the dashboard will help you identify the number of vCenter servers that are being managed, the number of datacenters configured on each vCenter, the clusters (if any) that exist within the datacenters, the ESX servers being managed by every datacenter, the resource pools on each ESX server, and the VMs that each resource pool comprises;
- Detect, at a glance, excessive resource usage by any VM, cluster, resource pool, or physical server across the environment, regardless of the virtualization technology in use, and quickly diagnose the root cause of the resource drain with the help of efficient drill-down features;
- Accurately identify resource-intensive VMs/resource pools/clusters/physical servers;
- Promptly detect unavailable datastores;
- Instantly spot a powered-off VM anywhere in the environment;
- Know which VM is currently operating on which physical server, and thus keep tabs on VMotion/XenMotion (as the case may be) activities;
- View a consolidated list of issues currently encountered by physical servers and virtual machines across the environment and also per virtual component, so as to ease the troubleshooting efforts of a dedicated help-desk;
- Quickly identify and troubleshoot issues with vCenter servers (if any) in the environment.
To access the VM Dashboard, follow the menu sequence: Components -> Virtual -> Dashboard in the eG monitoring console. The VM Dashboard page then appears.
The VM Dashboard comprises two panels. The left panel contains a tree-structure with a default global node named Zones. All the zones/farms in the target environment that have been configured with one/more virtual hosts (e.g., VMware ESX servers, Citrix XenServers, etc.) or VMware vCenter servers will be the sub-nodes of the Zones node. If you expand a particular zone node in the tree, you will find that the virtual component-types that have been added to that zone appear as its sub-nodes. If you expand any node under the zone node that corresponds to a particular virtual server-type (e.g., VMware vSphere ESX, VMware vSphere VDI, Citrix XenServer, etc.), the resulting tree will typically comprise the virtual hosts of that type that are included in the zone. Expanding a virtual host node will reveal the VMs executing on that host and the resource pools (if any) configured on that host.
In addition to sub-nodes representing a virtual server-type, if a VMware vCenter component has been added to the zone, then a VMware vCenter sub-node will also appear under the corresponding zone node. If you expand the node representing the VMware vCenter component-type under a zone, then all the managed vCenter servers will appear as its sub-nodes. Expanding a particular vCenter server node will reveal the folders (if any) configured on that vCenter server; similarly, expanding a particular folder node will reveal the datacenters (if any) that are being managed by that vCenter server. To view the virtual hosts/clusters configured within a datacenter, you will have to expand the corresponding datacenter node. While the cluster tree will contain the virtual hosts within the cluster as its sub-nodes, expanding a virtual host tree will reveal the VMs and resource pools on that virtual host. You can also view the virtual host tree by expanding any node that corresponds to a virtual host type under a particular zone node. The state of a node in the tree depends upon the current state of its sub-nodes. Independent virtual hosts/vCenter servers that are not part of any existing zone will be automatically grouped under the Default zone tree.
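The way a node's state follows from its sub-nodes can be pictured with a small sketch. The Python below is illustrative only: the Node class, the state names, the sample node names, and the rule that a parent takes the most severe state found among its sub-nodes are all assumptions made for this example, not a description of how eG Enterprise computes node state.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical severity order, least to most severe (assumed for illustration).
SEVERITY = ["normal", "minor", "major", "critical"]


@dataclass
class Node:
    name: str
    own_state: str = "normal"
    children: List["Node"] = field(default_factory=list)

    def state(self) -> str:
        """Return the most severe state among this node and all of its sub-nodes."""
        states = [self.own_state] + [child.state() for child in self.children]
        return max(states, key=SEVERITY.index)


# Zones -> zone -> virtual server-type -> virtual host -> VMs (sample names are made up)
vm_a = Node("vm-a")
vm_b = Node("vm-b", own_state="major")
host = Node("esx-host-1", children=[vm_a, vm_b])
server_type = Node("VMware vSphere ESX", children=[host])
zone = Node("zone-east", children=[server_type])
root = Node("Zones", children=[zone])

print(root.state())  # major
```

In this sketch, a single VM in a major state is enough to mark its host, its zone, and the global Zones node as major.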
By default, the tree-structure lists all virtual hosts, clusters, and virtual machines. Accordingly, when you click on the button at the top-right corner of the left panel, both the Hosts And Clusters and Show Virtual Machines options will be chosen by default in the Display Settings section that appears. To view only the VMs in the tree, select the Show Virtual Machines check box, but deselect the Hosts And Clusters check box. To hide VMs from the tree-view, select the Hosts And Clusters check box, but leave the Show Virtual Machines check box deselected.
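The effect of the two Display Settings check boxes can be summarized in a short sketch. The Python below is only an illustration: the function name, the (name, kind) representation of tree entries, and the sample names are assumptions made for this example.

```python
from typing import List, Tuple

# Hypothetical tree entries as (name, kind); kind is "cluster", "host", or "vm".
TREE_ENTRIES: List[Tuple[str, str]] = [
    ("cluster-1", "cluster"),
    ("esx-host-1", "host"),
    ("vm-a", "vm"),
    ("vm-b", "vm"),
]


def visible_entries(
    entries: List[Tuple[str, str]],
    hosts_and_clusters: bool = True,
    show_virtual_machines: bool = True,
) -> List[str]:
    """Return the names that remain in the tree for the given check box settings."""
    shown = []
    for name, kind in entries:
        if kind in ("host", "cluster") and hosts_and_clusters:
            shown.append(name)
        elif kind == "vm" and show_virtual_machines:
            shown.append(name)
    return shown


# Both options selected (the default): everything is listed.
print(visible_entries(TREE_ENTRIES))
# Only Show Virtual Machines selected: VMs alone are listed.
print(visible_entries(TREE_ENTRIES, hosts_and_clusters=False))
# Only Hosts And Clusters selected: VMs are hidden.
print(visible_entries(TREE_ENTRIES, show_virtual_machines=False))
```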
The right panel is a context-sensitive panel, the contents of which will vary according to the node chosen from the left panel. By default, this panel provides four tab pages - a Summary tab page that provides a quick summary of performance and problem information pertaining to the node chosen from the tree, a VMs tab page that provides current status updates related to virtual machines, a Hosts tab page that displays virtual host-specific metrics gathered in real time, and a Current Events tab page that lists the problems currently experienced by virtual hosts and guests. By default, all the tab pages provide information pertaining to the managed VMware vSphere ESX servers in the environment. Accordingly, the VMware vSphere ESX option is chosen from the drop-down list at the top-right corner of the right panel. You can view details for a different virtual component type by selecting a different option from this drop-down list.
Note:
In the case of a user who is associated with one/more VMs and no other infrastructure element, the VM Dashboard that appears when the Virtual -> Dashboard menu sequence is followed will, by default, display the VMs and Current Events tab pages alone.
Note:
As stated earlier, by default, the tab pages in the right panel provide details related to VMware vSphere ESX servers. This default setting can be overridden in the following manner:
- Click on the button at the top-right corner of the right panel. The Configuration Settings page then appears.
- By default, the VMware vSphere ESX option is chosen from the DefaultType list, indicating that the details in the right panel pertain to the managed VMware vSphere ESX components by default. To change the default setting, you will have to select a different option from this list.
- Then, click the Update button.
Similarly, every tab page, by default, displays the details of the top 10 VMs/virtual hosts only. This is indicated by the default value 10 chosen from the Limit to list at the top-right corner of each tab page. To view more (or fewer) records in the tab pages, select a different value from the Limit to list. Alternatively, you can override the default value 10, so that the tab pages display more or fewer records by default. To achieve this, click the button in the right panel, and set the Limitations parameter that then appears to any value other than 10.
The Summary Tab Page
The Summary tab page serves as a single, central interface that combines 'problem and performance information' related to virtual infrastructures. Using this tab page, administrators can perform the following with ease:
- Oversee, by a mere glance, the composition of and the all-round performance of the virtual infrastructure as a whole or of the virtual infrastructure component chosen from the tree;
- View a consolidated list of current alarms pertaining to the node chosen from the tree, and instantly identify problem-prone virtual infrastructure elements;
- Receive real-time updates of the resource usage of physical servers and virtual machines, and instantly identify the hosts/guests experiencing a resource contention;
- Easily analyze and accurately detect disconcerting trends in the resource usage of the physical servers and virtual machines.
- If the global Zones node is chosen from the tree, the Summary tab page in the right panel will, by default, provide a quick overview of the composition and performance of the monitored VMware vSphere ESX servers spread across all the managed zones in the environment. To view the performance summary of a different virtualization platform, select a different option from the drop-down list in the right, top corner of the Summary tab page. By default, VMware vSphere ESX is chosen from this list.
- For the default VMware vSphere ESX infrastructure, this tab page provides an Infrastructure Overview section that summarizes the key elements of the virtual infrastructure - i.e., the number of managed physical servers in the target virtual environment, the total number of VMs on the physical servers, the number of VMs that are currently powered on, and the number of alarms currently open on the infrastructure elements. For more details about the target environment, simply move your mouse pointer over any value displayed in the Infrastructure Overview section. For instance, to know the names of the physical servers that are being managed in the environment, move the mouse pointer over the value corresponding to Physical Servers in the Infrastructure Overview section. A pop-up will then appear listing the names of the physical VMware vSphere ESX servers managed in the environment. In the same way, you can view the names of the virtual machines executing on the physical servers, the names of the powered-on VMs, and the list of current alarms pertaining to the environment. Besides helping you identify VMs that are currently powered off, the Infrastructure Overview also enables you to determine the number and nature of the unresolved problems in the environment.
- Now that you know the names of the physical servers, you might want to analyze the current resource usage of each of these servers to ascertain whether they are experiencing any resource shortages or not. For that, click on the Physical Servers label in the Infrastructure Overview section. This leads you straight to the Hosts tab page of the global Zones node, which displays the physical servers, their current state, and also the resource usage metrics pertaining to each server.
- Similarly, you can click on the Total VMs label in the Infrastructure Overview section to switch to the VMs tab page of the Zones node, focus on the performance of the individual VMs, and identify the VM that could be consuming resources excessively.
- While clicking on the Powered On VMs label in the Infrastructure Overview section takes you to the VMs tab page and allows you to analyze the resource usage of powered-on VMs alone, clicking on the Current Alarms label leads you straight to the Current Events tab page, where you can view the complete list of problems the target virtualized environment is currently experiencing.
- If you want to focus on each problem closely, you can use the Current Alarms section adjacent to the Infrastructure Overview section in the Summary tab page. Use the arrow buttons (<< and >>) above an alarm to move to the previous or next alarm. Move your mouse pointer over an alarm to know which test has reported the problem.
- The Physical Servers section below the Infrastructure Overview section enables you to determine how problem-prone the physical servers in your environment are, by revealing the number of critical, major, and minor issues that are currently unresolved for each physical server. By moving your mouse pointer over an alarm priority corresponding to a physical server, you can view the details of current alarms of that priority.
- By clicking on a physical server in the Physical Servers section, you can zoom into the layer model of that server; this will indicate all the layers that have been affected by problems. From the color-coding of the layers, you can easily infer from which layer the problem originated. Click on that layer to view the problem tests, and then, click on a problem test to view the measures that have reported anomalies.
- The Virtual Machines section enables you to determine how problem-prone the virtual machines in your environment are, by revealing the number of critical, major, and minor issues that currently remain unresolved for each virtual machine. By moving your mouse pointer over an alarm priority corresponding to a virtual machine, you can view the details of current alarms of that priority.
- To know the exact layer where the problem occurred and the test that reported the problem, click on any virtual machine in the Virtual Machines section. A view indicating the same will then appear.
- In large virtualized environments comprising a multitude of virtual hosts that are configured with tens of VMs, it is often difficult for administrators to instantly and accurately locate resource-intensive physical servers and/or virtual machines. Similarly, identifying the physical servers with too many VMs is also a difficult task. To help administrators, the Summary tab page provides two sections - one each for physical servers and VMs - which can be configured to list the top physical servers and VMs (respectively) in the environment, in terms of resource consumption. By default, both sections reveal the top consumers of physical CPU resources, starting with the leading consumer. Accordingly, the Physical CPU utilization measure is the default selection in both the Top Servers by and the Top VMs by lists. To view the top consumers of another resource, select a different measure from these lists.
- In the Top Servers by list, in addition to the Physical CPU utilization measure, the following measures are available for selection by default: Used physical memory, Used space, Registered guests, and VM power-on state. If need be, you can override this default setting, so that new measures can be added to the list and one/more existing measures can be removed from the list. To do this, follow the steps given below:
- Click on the button at the top-right corner of the Summary tab page.
- The Configuration Settings page then appears. To add a new measure to the Top Servers by list in the Summary tab page, first select the Summary option from the Add/Delete measures in section.
- Next, set VMHost as the Type, and then select Add from Add/Delete to add a new measure to the Top Servers by list.
- Then, pick the Component to which the new measure pertains.
- Select the Test that reports the measure of interest.
- Select the Measure to be added, and provide a Display Name for the measure.
- To add the measure, click the Update button. Doing so ensures that the Display Name specified appears as an option in the Top Servers by list in the Summary tab page.
- Similarly, you can remove a measure from the Top Servers by list. For this purpose, set the Add/Delete flag to Delete, select the Component to which the measure to be deleted pertains, select the Test reporting the measure, select the Measure to be deleted, and finally, click the Update button.
In the Top VMs by list, on the other hand, in addition to the Physical CPU utilization measure, the following measures are available for selection by default: Memory usage, Disk capacity, and Percent disk usage. If need be, you can override this default setting, so that new measures can be added to the list and one/more existing measures can be removed from the list. To do this, follow the steps given below:
- Click on the button at the top-right corner of the Summary tab page.
- The Configuration Settings page then appears. To add a new measure to the Top VMs by list in the Summary tab page, first select the Summary option from the Add/Delete measures in section.
- Next, set VMGuest as the Type, and then select Add from Add/Delete to add a new measure to the Top VMs by list.
- Then, pick the Component to which the new measure pertains.
- Select the Test that reports the measure of interest.
- Select the Measure to be added, and provide a Display Name for the measure.
- To add the measure, click the Update button. Doing so ensures that the Display Name specified in the Configuration Settings page appears as an option in the Top VMs by list in the Summary tab page.
- Similarly, you can remove a measure from the Top VMs by list. For this purpose, set the Add/Delete flag to Delete, select the Component to which the measure to be deleted pertains, select the Test reporting the measure, select the Measure to be deleted, and finally, click the Update button.
- Also, by default, both lists will display the top-5 resource consumers only. This default setting can be overridden by following the steps given below:
- Click on the button at the top-right corner of the Summary tab page.
- The Configuration Settings page then appears. By default, the value 5 is displayed in the Top value in summary text box, indicating that the Summary tab page displays the top-5 resource consumers by default.
- Override this default setting by specifying a different number in the Top value in summary text box.
- Then, click the Update button in the Configuration Settings page.
- Let us now focus on the Top Servers by section alone. Against every physical server displayed in the Top Servers by section, the percentage of the chosen resource currently utilized by that server will be displayed, followed by a miniature graph tracking the usage of that resource over a period of time. If you click on a physical server in this section - say, the server that is the leading consumer of physical CPU resources - the layer model of that server will appear. From the layer model, you can navigate to the test and the measure reporting the physical CPU usage of the server, perform further analysis, and accurately identify which processor supported by the server has contributed to the excessive resource usage.
- If you click on the miniature graph that corresponds to a physical server, the graph will expand. By default, the expanded graph tracks the variations in the measure selected from the Top Servers by list during the last hour, so that you can observe how the physical server has been using the chosen resource over that period. You can change this default period by choosing a different Timeline for the graph. This analysis will enable you to effectively study usage trends, and accurately detect the exact time at which the physical server began experiencing spikes in resource usage.
- Let us now shift our focus to the Top VMs by section in the Summary tab page. Against every virtual machine displayed in the Top VMs by section, the percentage of the chosen resource currently utilized by that VM will be displayed, followed by a miniature graph tracking the usage of that resource over a period of time. If you click on a VM in this section - say, the VM that is the leading consumer of physical CPU resources - the layer model of the physical server on which the VM is executing will appear. By default, the test that reports the physical CPU usage of the VM in question will be selected in the layer model, and all the measures reported by that test for the chosen VM will also be displayed. Using these metrics, you can effectively assess the overall resource usage of that VM.
- If you click on the miniature graph that corresponds to a VM in the Top VMs by section, the graph will expand. By default, the expanded graph tracks the variations in the measure selected from the Top VMs by list during the last hour, so that you can observe how the VM has been using the chosen resource over that period. You can change this default period by choosing a different Timeline for the graph. This analysis will enable you to effectively study usage trends, and accurately detect the exact time at which the VM began exhibiting unhealthy resource usage trends.
- If the node representing a particular zone is chosen from the tree-structure in the left panel, the contents of the Summary tab page will change accordingly. Besides revealing the number of physical servers, VMs, and powered-on VMs in the chosen zone, the Infrastructure Overview section also presents a macro view of the health of the zone by indicating the number of unresolved problems in the zone. The Current Alarms section will enable you to view these unresolved problems one after another. To know which physical servers in the zone are responsible for these problems, use the Physical Servers section, which lists the names of the servers along with the number and severity (critical/major/minor) of problems (if any) each server is associated with. Similarly, the Virtual Machines section, in addition to displaying the names of the VMs that are executing on the physical servers included in the zone, also reveals the problematic VMs by indicating the number and severity of problems (if any) that each VM is currently experiencing. Besides the above, the Summary tab page for a particular zone will also enable you to accurately identify the resource-intensive physical servers and VMs in the zone. The Top Servers by section of this tab page displays the top-5 (by default) resource-hungry physical servers; this enables you to quickly identify the server in the zone that is most resource-intensive. The Top VMs by section displays the top-5 (by default) resource-hungry VMs, and thus enables you to identify the most resource-intensive VM in the zone.
- If a particular virtual server-type (e.g., VMware vSphere ESX, Citrix XenServer, etc.) under a zone is chosen from the tree-structure, then the contents of the Summary tab page will change.
- The Infrastructure Overview section in this case will provide a quick performance summary of only those servers in the zone that are of the type chosen from the tree-structure. In other words, if the VMware vSphere ESX node is chosen from the tree-structure, then the Infrastructure Overview section will display the number of ESX servers in the zone, the number of VMs and powered-on VMs executing on the ESX servers, and the current alarms related to these ESX servers. You can move your mouse pointer over any number displayed in this section, to view the corresponding details.
- To pay individual attention to each alarm, view them one after another in the Current Alarms section. You may then want to know which ESX servers in the zone are the most problematic, and also determine whether any of the VMs on the ESX servers have contributed to these problems. For this purpose, you can use the Physical Servers and Virtual Machines sections, which list the problem virtual hosts and guests in the zone, and also indicate the number and type of problems currently encountered by each of the displayed hosts and guests. By moving your mouse pointer over an alarm priority corresponding to a physical server/VM, you can take a quick look at the alarms of that priority that are currently open on that physical server/VM.
- Also, you can receive real-time updates on the resource utilization levels of these hosts and guests from the Top Servers by and Top VMs by sections; these sections, by default, display the top-5 physical servers and VMs in terms of Physical CPU utilization. To view the top consumers of a different resource, you can select a different measure from the list box available in both the sections. You can even add a new measure to the list or remove one/more of the existing measures. Similarly, to view more physical servers and VMs in these sections, you can change the default value 5 to a different number. Using the information provided by these sections, you can determine the current resource usage of each displayed physical server and VM, and view graphs that enable you to effectively assess the resource usage trends of these components over a broader period of time. Moreover, resource-intensive hosts and guests can be rapidly identified, sporadic/consistent surges in resource utilization by these hosts and guests can be promptly detected, and any potential resource contention can be diagnosed and averted before it is too late.
- If a zone consists of a VMware vCenter server, then the node that corresponds to this zone in the tree-structure, when expanded, will also reveal a VMware vCenter sub-node. If this sub-node is clicked, the contents of the Summary tab page in the right panel will change.
- The Infrastructure Overview section reveals the number of physical servers and VMs managed by all vCenter servers in a zone. Move your mouse pointer over each of these numbers to know the names of the ESX servers and VMs, and also that of the vCenter server managing them. Also, the number of unresolved issues related to these physical servers and VMs, and the number of performance degradations currently experienced by the managed vCenter servers themselves, will be added and displayed as the total number of Current Alarms in this section; this will provide you with a fair idea of how healthy the virtualized environment managed by vCenter is. To view the complete list of current alarms, move your mouse pointer over the number of current alarms. You can even focus on every performance issue individually by browsing the alarms, one after another, using the Current Alarms section in the Summary tab page.
To know which physical server and which VM have contributed the most to the problems list, use the Physical Servers and Virtual Machines sections; these sections display the problem-prone ESX servers and VMs (respectively) across vCenter servers in a zone, and indicate how many problems of what severity are currently affecting each of the ESX servers and VMs. Move your mouse pointer over a problem severity corresponding to an ESX server or VM to view the details of the related alarms.
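The per-component breakdown of problems by severity described above amounts to a simple tally. The following Python is a minimal sketch under stated assumptions: the alarm list, the component names, and the function name are hypothetical sample data and helpers, and nothing here reflects how eG Enterprise stores or queries alarms.

```python
from collections import Counter, defaultdict

# Hypothetical open alarms as (component name, priority) pairs.
alarms = [
    ("esx-host-1", "critical"),
    ("esx-host-1", "minor"),
    ("esx-host-2", "major"),
    ("esx-host-1", "minor"),
]


def count_by_priority(open_alarms):
    """Tally open alarms per component, broken down by priority."""
    counts = defaultdict(Counter)
    for component, priority in open_alarms:
        counts[component][priority] += 1
    return counts


summary = count_by_priority(alarms)
for component in sorted(summary):
    tally = summary[component]
    # A Counter returns 0 for priorities that have no open alarms.
    print(component, tally["critical"], tally["major"], tally["minor"])
```

Sorting the components by their critical count, then major, then minor, would be one way to surface the most problem-prone server or VM first.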
Besides, you can quickly identify the most resource-hungry ESX servers and VMs across vCenter servers in a zone, using the Top Servers by and Top VMs by sections in the Summary tab page. By default, these sections display the top-5 physical servers and VMs in terms of Physical CPU utilization. To view the top consumers of a different resource, you can select a different measure from the list box available in both the sections. You can even add a new measure to the list or remove one/more of the existing measures. Similarly, to view more physical servers and VMs in these sections, you can change the default value 5 to a different number. Using the information provided by these sections, you can determine the current resource usage of each displayed physical server and VM, and thus identify the physical server or VM that is consuming resources excessively. Also, by clicking on the miniature graph alongside a physical server or VM, you can expand the graph and effectively analyze the ups and downs in resource usage of the corresponding physical server or VM over time. This way, you can accurately determine whether the increase in resource usage (if any) occurred suddenly, or whether an upward trend in resource usage began earlier on.
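The question of when an increase in resource usage began can also be answered programmatically if the readings behind a graph are available. The sketch below is an illustration only: the sample readings, the 70% threshold, and the function name are assumptions, and the dashboard itself is not known to expose such a function.

```python
from datetime import datetime, timedelta
from typing import List, Optional, Tuple

Sample = Tuple[datetime, float]


def current_rise_start(samples: List[Sample], threshold: float) -> Optional[datetime]:
    """Return when the latest unbroken run of readings above the threshold began.

    Returns None if the most recent reading is not above the threshold.
    Samples are assumed to be sorted by time, oldest first.
    """
    start = None
    for timestamp, value in reversed(samples):
        if value <= threshold:
            break
        start = timestamp
    return start


# Hypothetical CPU utilization readings taken every 5 minutes.
t0 = datetime(2024, 1, 1, 10, 0)
readings = [20.0, 25.0, 22.0, 80.0, 85.0, 90.0]
samples = [(t0 + timedelta(minutes=5 * i), v) for i, v in enumerate(readings)]

started = current_rise_start(samples, threshold=70.0)
print(started)  # 2024-01-01 10:15:00
```

A start time close to the latest reading suggests a sudden increase, while a much earlier start time suggests that the upward trend began some time ago.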
- If you click on a particular vCenter server under the VMware vCenter node, then the resulting Summary tab page will provide an overview of the performance of that vCenter server alone.
- Use the Infrastructure Overview section of the Summary tab page to know how many ESX servers and VMs are managed by the chosen vCenter server. Also, determine how healthy the vCenter server and the virtualized environment it manages are by viewing the number of Current Alarms in the Infrastructure Overview section. Move your mouse pointer over any number in this section to view the corresponding details.
- Adjacent to the Infrastructure Overview section, you will find a vCenter Health section that reports, in real time, the availability and responsiveness of the vCenter server, the load on the server in terms of current sessions to vCenter, and the current license usage of the server. Using the metrics reported by this section, you can proactively detect the non-availability or a slowdown of the vCenter server, a server overload, or excessive license usage by the server. Against every value displayed, a miniature graph is available, tracking the time-of-day variations in the values of the corresponding measure. Expand the graph by clicking on it. The expanded graph, by default, reveals how the corresponding measure has performed during the last hour. You can plot the graph for a broader period by choosing a different Timeline for the graph. Using this graph, you can easily analyze the performance of the vCenter server over time.
- The Graphs section in the Summary tab page provides graphs that enable you to assess the CPU, disk, and memory usage of the vCenter server, over a default period of 6 hours. Resource usage trends can be accurately deduced from these graphs, and probable resource crunches can be proactively detected and averted. To zoom into a particular graph, click on it. The graph then expands, so that you can study it clearly and make sound inferences. You can even change the Timeline of the graph, so that you can generate resource usage graphs for longer time periods, and perform more effective analysis.
- If vCenter manages the target virtualized environment as folders, then expanding a vCenter server node in the tree will reveal the folders managed by that vCenter. If you click on a folder node in the tree, the Summary tab page will change to provide an overview of the composition and current state of that folder.
- The Infrastructure Overview section displays the number of physical servers, VMs, datastores, and datacenters managed by the chosen folder. To know the names of these elements, move your mouse pointer over the corresponding element count. In addition, the section also displays the number of Current Alarms related to the elements managed by the folder, and thus enables you to quickly assess the overall health of the folder. If you want to take a close look at each of the current alarms, use the Current Alarms section, and browse the alarms one after another.
In order to figure out which physical servers and VMs are responsible for these alarms, use the Physical Servers and Virtual Machines sections. These sections list the problem-prone physical servers and virtual machines (as the case may be) in the folder, along with the number and type of problems that each physical server/virtual machine is currently experiencing. Move your mouse pointer over an alarm priority corresponding to a physical server/VM to view the details of alarms of that priority that are currently open on that physical server/VM.
To track the resource usage of physical servers and VMs within a folder and to identify those physical servers and VMs that are consuming resources excessively, you can use the Top Servers by and Top VMs by sections. By default, both these sections list the top-5 physical servers and VMs (as the case may be) in terms of Physical CPU utilization. To identify the top-5 consumers of a different resource, you can select a different measure from the list box available in both the sections. You can even add more measures to these list boxes for selection. Likewise, you can view a larger number of leading resource consumers in these sections by changing the default value 5 to another number.
- If a folder node exists, then expanding this node in the tree, will reveal the datacenters included in that folder. If no folders exist, then you would have to expand the vCenter server node in the tree to view the datacenter sub-nodes. If you click on the node representing a datacenter in the tree-structure, then the contents of the Summary tab page will change to provide an overview of the composition and health of that datacenter.
- The Infrastructure Overview section displays the number of physical servers, VMs, and datastores managed by the chosen datacenter. To know the names of these elements, move your mouse pointer over the corresponding element count. In addition, the section also displays the number of Current Alarms related to the elements managed by the datacenter, and thus enables you to quickly assess the overall health of the datacenter. If you want to take a close look at each of the current alarms, use the Current Alarms section, and browse the alarms one after another.
In order to figure out which physical servers and VMs are responsible for these alarms, use the Physical Servers and Virtual Machines sections. These sections list the problem-prone physical servers and virtual machines (as the case may be) in the datacenter, along with the number and type of problems that each physical server/virtual machine is currently experiencing. Move your mouse pointer over an alarm priority corresponding to a physical server/VM to view the details of alarms of that priority that are currently open on that physical server/VM.
To track the resource usage of physical servers and VMs within a datacenter and to identify those physical servers and VMs that are consuming resources excessively, you can use the Top Servers by and Top VMs by sections. By default, both these sections list the top-5 physical servers and VMs (as the case may be) in terms of Physical CPU utilization. To identify the top-5 consumers of a different resource, you can select a different measure from the list box available in both the sections. You can even add more measures to these list boxes for selection. Likewise, you can view a larger number of leading resource consumers in these sections by changing the default value 5 to another number.
- To receive an overview of the performance of a cluster within a datacenter, you will have to click on the cluster sub-node under the datacenter node in the tree-structure. The Summary tab page will then change.
- For a particular cluster, the Summary tab page will provide a quick summary of the performance of that cluster, so as to enable you to gauge how healthy the cluster is. The Infrastructure Overview section of the Summary tab page displays the number of physical servers, VMs, powered-on VMs, and resource pools that the chosen cluster consists of. In addition, the Current Alarms information in this section displays the number of issues related to the cluster that are still unresolved; this serves as an effective indicator of the overall health of the cluster. Move your mouse pointer over any number in this section to view the corresponding details.
To be alerted in real-time to abnormalities in resource usage by the cluster, you can use the Resources section. This section reports how well the cluster is currently using the physical memory and CPU resources available to it; alongside every usage value displayed in this section, you will find a miniature graph. Click on the graph to expand it, and view the time-of-day variations in resource usage during the last 1 hour (by default). You can even change the Timeline of the graph for analyzing usage patterns over a longer period of time. This graph helps you quickly detect disturbing trends in the usage of physical resources, and initiate corrective actions at the earliest.
The Graphs section in the Summary tab page displays a series of graphs that depict how the cluster has been using the physical memory and CPU resources available to it over a default period of 6 hours. You can magnify any of these graphs by clicking on it. You can even change the Timeline of the expanded graph, to facilitate more effective analysis of the resource usage. Using these graphs, you can determine how resource-intensive the cluster has been.
- To figure out, from a mere glance, the current state of an ESX host included in a cluster, click on the sub-node representing an ESX host under the cluster node. The Summary tab page will then change.
- For an ESX host, the Infrastructure Overview section reveals the number of VMs configured on the ESX host, the number of powered-on VMs, and the number of datastores used by the ESX host. By moving your mouse pointer over any of these numbers, you can view the corresponding details - i.e., the names of VMs/datastores, as the case may be. This way, you can rapidly identify the VMs that are currently powered-off. Moreover, the section also displays the number of Current Alarms related to the ESX host chosen from the tree, and thus enables you to quickly judge the current health of the ESX host. For complete details of the current alarms, move your mouse pointer over the number of Current Alarms.
The Resources section reports the percentage/amount of physical CPU, memory, and disk resources that the ESX host chosen is currently using. Sudden spikes in resource consumption by the ESX host can be promptly detected by closely observing the change in the resource usage levels reported by this section. Every value displayed in this section is accompanied by a miniature graph, which when clicked, expands to reveal how well the corresponding resource usage metric has performed during the last 1 hour (by default). To analyze the behavior of the said measure over a longer period of time, you can change the Timeline of the expanded graph. In the event of excessive resource usage by the ESX host, you can use this graph to figure out when exactly the upward trend in resource consumption began, and then, proceed to investigate the reasons for the same.
In addition to the Resources section, a Graphs section is also available in the Summary tab page. This section provides a series of graphs, which track the physical CPU, memory, and disk resources used by the ESX host during the last 6 hours (by default). Click on a graph in this section to expand it. Using the expanded graph, you can even change the Timeline of the graph, so that you can observe usage patterns over longer periods of time.
- If resource pools are configured on a cluster, then each resource pool will appear as a sub-node of that cluster node. When a resource pool sub-node is clicked, the Summary tab will change.
- The Infrastructure Overview section of the Summary tab page reveals the number of VMs, powered-on VMs, and current alarms in the resource pool clicked on. Move your mouse pointer over any number in this section to view the corresponding details. This information enables you to swiftly identify powered-off VMs and also assess the overall health of the resource pool.
The Resources section reports the availability and usage of critical resources allocated to the resource pool, in real-time. Any sudden increase in usage of a resource can be promptly detected using the values reported by this section. Adjacent to every value, a miniature graph is provided. To track the usage of a resource over time, click on the miniature graph. The graph expands to reveal how well the resource pool used the corresponding resource during the last 1 hour (by default). You can even change the Timeline of the graph to understand resource usage trends over longer time periods.
The Graphs section of the tab page provides time-of-day graphs that reveal the variations in the physical CPU and memory usage by the resource pool during the last 6 hours (by default). Analysis of the resource consumption of the pool over time reveals whether the pool has been using resources optimally or inefficiently. Clicking on a graph expands it. You can change the Timeline to analyze resource usage over broader time periods.
The VMs Tab Page
The VMs tab page provides VM-centric information such as the name of the discovered VMs, the physical server on which each VM executes, and the metrics indicating how every VM uses its allocated and physical resources. Besides revealing resource-hungry VMs, this tab page also brings to light improper resource allocations to VMs.
This tab page, by default, provides the details of those VMs that are executing on the managed VMware vSphere ESX servers in your environment. Similarly, the details of the top-10 VMs alone will be displayed in this tab page, by default. These default settings can however be overridden by following the steps given below:
- Click on the button at the top-right corner of the VMs tab page.
- The Configuration Settings page then appears.
- By default, the Limitations parameter is set to 10, indicating that the top-10 VMs are by default listed in the VMs tab page. You can override this default setting by changing the value of the Limitations parameter.
- Similarly, you will find that the DefaultType list is set to VMware vSphere ESX in the Summary tab page if the VMware vCenter node is clicked. This indicates that, by default, only those VMs that are executing on managed VMware vSphere ESX servers in the environment will be listed in the VMs tab page. To view details pertaining to the VMs on another virtualized component-type by default, select a different option from the DefaultType list.
- Finally, click the Update button in the Configuration Settings page.
The VMs listed in this tab page change according to the node chosen from the tree-structure in the left panel. This section brings out these differences.
- If the global Zones node is selected in the left panel, then the VMs tab page in the right panel will list the top 10 (by default) virtual machines that the eG agent auto-discovers from across all the managed virtual hosts of the chosen type.
Note:
By default, the VMs displayed in the VMs tab page are sorted in the order of their state - i.e., the powered-off VMs will top the list, followed by the powered-on VMs. Sometimes, administrators may want to hide the details of powered-off VMs from the VMs tab page, and instead view the resource usage metrics of the powered-on VMs alone. To enable this, follow the steps discussed below:
- Edit the eg_ui.ini file in the <EG_INSTALL_DIR>\manager\config directory.
- In the [VMDASHBOARD_DISPLAY] section of the file, you will find that the showPoweredOffVMs flag is set to Yes by default. This indicates that, by default, the VMs tab page will display powered-off VMs as well. Set this flag to No if you want the VMs tab page to only display powered-on VMs.
- Finally, save the file.
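The edit described in these steps can also be expressed as a small script. The following is only a minimal sketch, not an eG Enterprise utility: the helper name `set_ini_flag` is hypothetical, and the `key=value` line layout is an assumption, since the exact formatting of eg_ui.ini is not shown here. The function changes only the matching line inside the named section and leaves every other line untouched.

```python
def set_ini_flag(text: str, section: str, key: str, value: str) -> str:
    """Return the INI text with `key` set to `value` inside `[section]` only."""
    out = []
    in_section = False
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("[") and stripped.endswith("]"):
            # Track which section the following lines belong to.
            in_section = stripped[1:-1] == section
        elif in_section and "=" in stripped and stripped.split("=", 1)[0].strip() == key:
            # Assumed layout: key=value, with no spaces around the equals sign.
            line = f"{key}={value}"
        out.append(line)
    return "\n".join(out) + "\n"


# Illustrative fragment only; the real file contains many other settings.
sample = "[VMDASHBOARD_DISPLAY]\nshowPoweredOffVMs=Yes\n"
updated = set_ini_flag(sample, "VMDASHBOARD_DISPLAY", "showPoweredOffVMs", "No")
```

If you script the change this way, take a backup of the file first, and write the result back only after confirming that the rest of the file is unchanged.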
If state is the same across VMs, then the VMs are arranged in the order of their names. Against every VM listing, the state of the VM (whether powered on or off), the physical server on which the VM executes and the current resource usage metrics pertaining to each VM are displayed, so that administrators will be able to accurately identify powered-off VMs and resource-intensive VMs across all the managed physical servers (of the chosen type) in the environment, from just a quick glance. If required, you can sort the VM listing on the basis of any of the resource usage metrics. To change the sort order on-the-fly, click on the column head that represents the usage metric you want to sort on - for instance, to sort based on the CPU usage of the VMs, click on the column heading CPU used.
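The default ordering described above (powered-off VMs first, then by name) and the on-the-fly sort on a metric column can be illustrated with a short sketch. The `VmRow` type, its `cpu_used` field, and both function names are hypothetical and exist only for illustration; the direction of the column sort is not stated in this section, so descending order is an assumption.

```python
from dataclasses import dataclass


@dataclass
class VmRow:
    name: str
    powered_on: bool
    cpu_used: float  # hypothetical stand-in for the "CPU used" column


def default_order(rows):
    # False sorts before True, so powered-off VMs come first; ties fall back to the name.
    return sorted(rows, key=lambda r: (r.powered_on, r.name))


def sort_by_metric(rows, metric):
    # Assumption: a click on a column head lists the heaviest consumers first.
    return sorted(rows, key=lambda r: getattr(r, metric), reverse=True)


rows = [
    VmRow("web02", True, 40.0),
    VmRow("db01", False, 0.0),
    VmRow("app01", True, 75.5),
    VmRow("bak01", False, 0.0),
]
by_state = [r.name for r in default_order(rows)]
by_cpu = [r.name for r in sort_by_metric(rows, "cpu_used")]
```

In this sample, the default order places bak01 and db01 (both powered off) ahead of app01 and web02, while sorting on `cpu_used` brings app01 to the top.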
The resource usage metrics that accompany each VM displayed in this tab page are pre-configured in the eG Enterprise system.
If need be, you can alter this default measure list, so that more useful measures are displayed per VM or one/more unnecessary measures are removed from the display. To effect this change, follow the steps given below:
- Click on the button at the top-right corner of the VMs tab page.
- The Configuration Settings page then appears. To add a new measure to the VMs tab page, first select the Nonsummary option from the Add/Delete measures in section.
- Set VMGuest as the Type, and select the Add option from the Add/Delete section.
- Then, pick the virtualized Component type for which a measure is to be added to the VMs tab page.
- Select the Test that reports the measure.
- Choose the Measure.
- Provide a Display Name for the measure.
- Finally, click the Update button to save the changes.
- To remove an existing measure from the VMs tab page, select the Delete option from Add/Delete, select the Component type, pick the Test that reports the measure, pick the Measure, and click the Update button.
- To zoom into the performance of a "powered on" VM, simply click on the VM Name in the right panel. This will invoke the MEASURES FOR Server page, which displays all the performance metrics extracted from that VM in real-time. You can then cross-correlate across the various metrics, and quickly detect the root-cause of current/probable disturbances to the internal health of a VM.
Note:
If you click on the Name of a powered off VM in the VMs tab page, then Figure 4.124 will not appear. Instead, you will be led to the layer model page of the physical server on which the VM executes, which will allow you to verify whether the VM is indeed powered off or not.
- Below the VMs list, you will find a Graph section. This section will be available in every tab page in the right panel of the VM dashboard. For the VMs tab, the Test list in this section will be populated with all those tests that report metrics for each VM on the virtual hosts of the chosen type. These tests will be sorted in the order of the test names, and the top test in the sorted test list will be selected by default in the Test list box. The Measures list naturally, will be populated with those metrics that the default Test reports. Since this list too is sorted in the order of the measure names, the top measure in the sorted list will be chosen by default in the Measures list. Accordingly, for the global Zones node in the left panel, the graph that appears in this section traces the variations in the default measure across all VMs (on virtual hosts of the chosen type) during the default timeline of 1 hour. Using this graph, administrators can compare the performance of a particular measure across VMs, and accurately identify those VMs that are weak in a chosen performance arena. If need be, you can plot a comparison graph for a different Test-Measure pair, for a different timeline. To change the timeline, click on Timeline; a window will then pop out, allowing you to change the date and time.
- The default graph will be a 3D graph. If required, you can select the 2D option from the drop-down list in the Graph section to generate a 2D graph.
- A legend is provided at the end of every graph clearly indicating which VM is represented using which color in the graph. This is accompanied by the Avg, Max, and Min values that the chosen Measure has recorded for every VM during the chosen Timeline. To view the legend, you can scroll down the Graph section.
- To view the graph more clearly, you can enlarge it by clicking the button. The graph then zooms.
- You can even hide the Graph section by clicking on the button below the VMs list.
- You can then click on the button to restore the Graph section. Similarly, you can hide the tree-structure by clicking on the button next to the tree. This ensures that the right panel expands and fills the vacuum created by the tree.
- To restore the tree, click on the button.
- Now, let us see what happens if a particular zone is chosen from the tree-structure in the left panel. When this is done, the VMs list in the right panel will change to display the state and resource usage metrics related to the top-10 (by default) VMs that are executing on those physical servers (of the chosen type) that are included in the zone that is clicked on. This information helps administrators analyze how the performance of one/more VMs in a zone impacts the performance of the zone as a whole.
- If you then drill down a particular zone in the tree, you will be able to view the virtual component-types that form part of the zone, and their current state. If you click on a particular component-type in the tree, the VMs tab page in the right panel will allow you to view the state and usage metrics pertaining to the top-10 (by default) VMs executing on the virtual hosts of that type that are included in the corresponding zone.
- If the vCenter servers in your environment are being monitored as part of a zone, then expanding that zone's node in the tree would reveal the VMware vCenter component-type. When this component-type is clicked on, the VMs tab page will change to display the state and resource usage metrics related to the top-10 (by default) VMs that are executing on the virtual hosts managed by all the vCenter servers included in that zone.
- To know which vCenter servers have been added to a zone, just expand the VMware vCenter sub-node under the zone node in the tree. This will reveal the name and the current state of the vCenter servers in that zone. Clicking on a particular vCenter server in the tree will provide the complete details of the top-10 VMs (by default) executing on the virtual hosts managed by that vCenter component.
- If a vCenter server manages the virtualized environment as folders, then expanding the vCenter server node in the tree will reveal sub-nodes representing the folders configured on that vCenter server. Click on a folder to view the resource usage metrics related to the VMs executing on the ESX servers included in that folder.
- Expanding the node representing a vCenter server will reveal the datacenters that have been configured on that vCenter. If the datacenters are included in a folder, then expanding the folder sub-node under the vCenter server node will reveal the datacenters. Click on a datacenter to view the names of the VMs executing on the hosts that reside within that datacenter, and the resource usage of each VM.
- Typically, when you expand a datacenter node in the tree, all the physical ESX servers that are being managed by that vCenter server will appear.
Note:
While the vCenter server tree will list even those ESX servers that are not monitored by eG Enterprise, the tree will not indicate the current state of such servers; also, clicking on any such server will not display corresponding performance information in the tab pages in the right panel.
However, many vCenter installations manage clusters of ESX servers. If such clusters have been configured on any monitored vCenter server, then, in the Virtual Infrastructure tree, these clusters will appear as sub-nodes of that datacenter node. If you click on a cluster sub-node in the tree, the VMs tab page will reveal the state and performance information pertaining to the top-10 VMs (by default) that are executing on the ESX servers that are part of the cluster clicked on.
- To view the individual virtual hosts that are part of a zone, do any of the following:
- Expand the nodes representing the virtual component-types in the zone;
- If the zone consists of components of type VMware vCenter, expand the node representing the monitored vCenter server in your environment;
- If datacenters are configured on a monitored vCenter server, expand a datacenter sub-node under the vCenter server node;
- If clusters are configured within a datacenter, expand the cluster sub-node;
- If you then click on a virtual host in the tree, the VMs tab page will change to display the state and measures extracted from the top-10 (by default) virtual machines that are executing on the chosen virtual host alone.
- Similarly, if you expand the virtual host node in the tree-structure, you can view the name and state of the VMs that are executing on that virtual host. If you now click on a VM in the tree, the metrics extracted from that VM alone will be displayed in the VMs tab page.
- If resource pools are configured on a virtual host, then the resource pools also will appear as the sub-nodes of the virtual host-node. Clicking on a resource pool in the tree will reveal the current state and resource usage metrics related to the top-10 (by default) VMs present in that resource pool, in the VMs tab page.
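The way the VMs listing narrows as you move down the tree can be pictured as collecting every VM beneath the selected node. The sketch below is purely illustrative and says nothing about how eG Enterprise stores its tree: the `Node` type, the `kind` labels, and both function names are hypothetical. How the top entries are ranked before the limit is applied is not modelled here; the sketch simply truncates in tree order.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    name: str
    kind: str  # hypothetical labels: "zone", "vcenter", "datacenter", "cluster", "host", "pool", "vm"
    children: List["Node"] = field(default_factory=list)


def vms_under(node: Node) -> List[str]:
    """Collect the names of all VMs at or below the given tree node."""
    if node.kind == "vm":
        return [node.name]
    names: List[str] = []
    for child in node.children:
        names.extend(vms_under(child))
    return names


def vms_tab_listing(node: Node, limit: int = 10) -> List[str]:
    # The limit mirrors the default of 10 set by the Limitations parameter.
    return vms_under(node)[:limit]


pool = Node("pool1", "pool", [Node("vm-b", "vm"), Node("vm-c", "vm")])
host1 = Node("esx01", "host", [Node("vm-a", "vm"), pool])
host2 = Node("esx02", "host", [Node("vm-d", "vm")])
datacenter = Node("dc1", "datacenter", [Node("cl1", "cluster", [host1, host2])])
```

Selecting the datacenter in this sample yields all four VMs, selecting esx01 yields three, and selecting pool1 yields only the two VMs in that resource pool.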
The VC Tab Page
Promptly detect issues with vCenter and swiftly diagnose the root-cause of these issues, using the VC tab page offered by the VM dashboard. This tab page lists one/all vCenter servers in the environment (depending upon the node chosen from the tree), and reports the number of ESX servers managed, the resource usage, and the overall health of each vCenter server.
This section discusses how the contents of this tab page change with the node chosen from the tree.
- If the global Zones node is chosen from the tree, then the VC tab page will display the names of all vCenter servers included in all configured zones, and will report a default set of metrics per vCenter server, revealing the availability and resource usage of each server. Resource-intensive/unavailable vCenter servers in the target environment can thus be instantly identified.
- The default measure list that accompanies every vCenter listed in the VC tab page can be modified. For this purpose, follow the steps given below:
- Click on the button at the top-right corner of the tab page. The Configuration Settings page will appear.
- Select Nonsummary from the Add/Delete measures in section in the Configuration Settings page.
- Select Add from Add/Delete to add a new measure to the VC tab page.
- Then, set VMware vCenter as the Component type. Select the Test that reports the measure to be added, and then pick the Measure. Provide a Display Name for the new measure, and finally, click the Update button.
- To remove a measure that pre-exists in the VC tab page, select Delete from Add/Delete, select the Component type, pick the Test, choose the Measure, and click the Update button.
- In the event of a slowdown in the performance of a vCenter server, you can click on the name of that server in the VC tab page; this will lead you straight to the layer model of the vCenter server, which reveals the exact layer where the problem has originated. Click on the problem layer to view the problem test, and click on the problem test to view the measures.
- The VC tab page also embeds a time-of-day graph, which, by default, reveals the variations in the number of VMs in each resource pool configured within each cluster on vCenter, during the last 1 hour. Accordingly, ClusterResourcePools is chosen by default from the Test list, the default Measure is VMs in pool, and the default Timeline is 1 hour. You can select a different Test and Measure combination for the graph, and also define a different Timeline for the graph, if required. Using this graph, you can efficiently analyze performance trends and proactively detect performance issues.
- To view the zones configured with one/more virtual component-types, expand the global Zones node in the tree. The individual zones will appear as sub-nodes of the global Zones node. Selecting a particular zone will reveal the performance details related to each vCenter server within that zone in the VC tab page.
- Alternatively, you can select the VMware vCenter sub-node under a zone node to see how all the vCenter servers in a zone are performing.
- If you want to check whether a particular vCenter server within a zone is available or not, and if available, how well it is using the resources available to it, select the node representing a vCenter server under the VMware vCenter node in the tree structure.
The Hosts Tab Page
The Hosts tab, as mentioned already, provides insights into the current state and the extent to which resources are currently used by the managed virtual hosts in the environment. This tab page sheds light on resource-intensive hosts, and embeds efficient drill downs to discover the underlying cause for the high resource consumption of a host.
As stated earlier, this tab page, by default, provides the details of those managed virtual hosts that are of the type VMware vSphere ESX. Similarly, the details of the top-10 virtual hosts alone will be displayed in this tab page, by default. These default settings can however be overridden using the procedure already discussed.
Here again, the hosts displayed depend upon the node chosen from the tree-structure in the left panel. This section explains how the contents of this tab page change with context.
- If the global Zones node is selected in the left panel, then the Hosts tab page in the right panel will list the top 10 (by default) virtual hosts that the eG agent auto-discovers from across all the managed virtual hosts of the chosen type. This list is typically sorted by the current state of the hosts. If state is the same across hosts, then the hosts are arranged in the order of their host names. Against every virtual host, the state of the host, the total number of VMs configured on the host, the number of VMs powered-on, and a default set of metrics indicating the extent to which resources are currently utilized by the host, will be displayed. This default measure list can be modified by adding new measures to be displayed in the Hosts tab page or by removing one/more existing measures from the tab page. To achieve this, follow the steps given below:
- Besides revealing the VM load on each virtual host, this tab page also enables administrators to instantly figure out the following:
- Is any host experiencing performance issues currently?
- Is any VM on this host currently powered off?
- Are there any resource-hungry virtual hosts in the environment? If so, which ones are they?
- Below the host list, you will find a Graph section. This section will be available in every tab page in the right panel of the VM dashboard. The Test list in this section will be populated with all the tests related to the virtual hosts of the chosen type. Note that VM-related tests will not be available for selection in this list. These tests will be sorted in the order of the test names, and the top test in the sorted test list will be selected by default in the Test list box. The Measures list naturally, will be populated with those metrics that the default Test reports. Since this list too is sorted in the order of the measure names, the top measure in the sorted list will be chosen by default in the Measures list. Accordingly, for the global Zones node in the left panel, the graph that appears in this section traces the variations in the default measure across the top 10 (by default) virtual hosts (of the chosen type) in the environment during the default timeline of 1 hour. Using this graph, administrators can compare the performance of a particular measure across virtual hosts, and accurately identify those virtual hosts that are poor performers. If need be, you can plot a comparison graph for a different Test-Measure pair, for a different timeline. To change the timeline, click on Timeline; the window will then pop out, allowing you to change the date and time.
- The default graph will be a 3D graph. If required, you can select the 2D option from the drop-down list in the Graph section to generate a 2D graph.
- A legend is provided at the end of every graph clearly indicating which virtual host is represented using which color in the graph. This is accompanied by the Avg, Max, and Min values that the chosen Measure has recorded for every virtual host during the chosen Timeline. To view the legend, you can scroll down the Graph section. If need be, you can hide/unhide the Graph section or the Tree in the left panel using the buttons provided in the VM Dashboard.
Note:
By default, while plotting a graph for a descriptor-based test across virtual hosts, eG Enterprise aggregates the measure values across all descriptors for a host, and plots only a single value for each virtual host. Accordingly, the AggregateGraphs flag in the Configuration Settings window that appears when the button is clicked is set to Yes by default. Sometimes, administrators might want the graph to plot values per descriptor. In such a case, set the AggregateGraphs flag to No.
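The difference between the two AggregateGraphs settings can be sketched as follows. This is an illustration only: the function name `series_for_graph` and the sample data are hypothetical, and because this section does not state how the values are aggregated, the use of a simple average is an assumption.

```python
from collections import defaultdict
from statistics import mean


def series_for_graph(samples, aggregate=True):
    """Build the values to plot from (host, descriptor, value) samples."""
    if not aggregate:
        # AggregateGraphs = No: one plotted value per host/descriptor pair.
        return {f"{host}:{desc}": value for host, desc, value in samples}
    grouped = defaultdict(list)
    for host, _desc, value in samples:
        grouped[host].append(value)
    # AggregateGraphs = Yes: one value per host (averaging is an assumption).
    return {host: mean(values) for host, values in grouped.items()}


samples = [
    ("esx01", "disk0", 10.0),
    ("esx01", "disk1", 30.0),
    ("esx02", "disk0", 50.0),
]
per_host = series_for_graph(samples)
per_descriptor = series_for_graph(samples, aggregate=False)
```

With aggregation on, the sample produces two plotted series (one per host); with aggregation off, it produces three (one per descriptor of each host).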
- Now, let us see what happens if a particular zone is chosen from the tree-structure in the left panel. When this is done, the Hosts list in the right panel will change to display the state and resource usage metrics related to the top 10 virtual hosts that are operating within the zone that is clicked on. This information helps administrators analyze how the performance of one/more virtual hosts in a zone impacts the performance of the zone as a whole.
- If you then drill down a particular zone in the tree, you will be able to view the virtual component-types that form part of the zone, and their current state. If you click on a particular component-type in the tree, the Hosts tab page in the right panel will allow you to view the state and usage metrics pertaining to the top-10 (by default) virtual hosts of that type that are included in the corresponding zone.
- If the vCenter servers in your environment are being monitored as part of a zone, then expanding that zone's node in the tree would reveal the VMware vCenter component-type. When this component-type is clicked on, the Hosts tab page will change to display the state and resource usage metrics related to the top-10 (by default) virtual hosts managed by all vCenter servers included in that zone.
- To know which vCenter servers have been added to a zone, just expand the VMware vCenter sub-node under the zone node in the tree. This will reveal the name and the current state of the vCenter servers in that zone. Clicking on a particular vCenter server in the tree will provide the complete details of the top-10 hosts (by default) managed by that vCenter component.
- Typically, when you expand a vCenter server node in the tree, the physical ESX servers that are being managed by that vCenter server will appear.
Note:
While the vCenter server tree will list even those ESX servers that are not monitored by eG Enterprise, the tree will not indicate the current state of such servers; also, clicking on any such server will not display corresponding performance information in the tab pages in the right panel.
However, if folders have been configured on the vCenter server, then these folders will appear as sub-nodes of the vCenter server node. If you click on a folder, then the Hosts tab page will indicate how well the top-10 hosts (by default) within the folder are performing.
- If folders exist within a vCenter server, then expanding the folder node reveals sub-nodes representing the datacenters that have been configured on the vCenter server. If no folders exist, then expanding the vCenter server node will reveal the datacenter sub-nodes. If you click on a datacenter, then the Hosts tab page will indicate how well the top-10 hosts (by default) within that datacenter are performing.
- Many vCenter installations manage clusters of ESX servers. If such clusters have been configured on any monitored vCenter server, then, in the Virtual Infrastructure tree, these clusters will appear as sub-nodes of the datacenter node. If you click on a cluster sub-node in the tree, the Hosts tab page will reveal the state and performance information pertaining to the top-10 hosts (by default) that are part of the cluster clicked on.
To view the individual virtual hosts that are part of a zone, do any of the following:
- Expand the nodes representing the virtual component-types in the zone;
- If the zone consists of components of type VMware vCenter, expand the node representing the monitored vCenter server in your environment;
- If datacenters are configured on a monitored vCenter server, expand a datacenter sub-node under the vCenter server node;
- If clusters are configured within a datacenter, expand the cluster sub-node;
- If you then click on a virtual host in the tree, the Hosts tab page will change to display the state and measures extracted from the chosen virtual host alone.
- If a virtual host in the Hosts tab page is found to be in a critical state, simply click on it to zoom into the problems affecting its health. The layer model page then appears, revealing the layer model of the virtual host and clearly indicating the problem layer. Clicking on the problem layer reveals the problem test, and clicking on the problem test reveals the problem measure, which sheds light on the root-cause of the problem with the virtual host. To return to the VM dashboard, just click on the Back to Virtual Dashboard link at the top-right corner of the layer model page.
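The way the Hosts tab page follows the tree selection can be pictured as a simple tree walk: whichever node is clicked, the tab page is scoped to the hosts beneath it and capped at 10 by default. The Python sketch below is only an illustration of that idea, not eG Enterprise code. The `Node` class, the `top_hosts` function, the host names, and the `cpu_usage` ranking metric are all hypothetical; the text above does not state which metric orders the top-10 list.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Node:
    """One node of a (hypothetical) Virtual Infrastructure tree."""
    name: str
    kind: str  # e.g. "zone", "vcenter", "folder", "datacenter", "cluster", "host"
    metrics: Dict[str, float] = field(default_factory=dict)
    children: List["Node"] = field(default_factory=list)


def hosts_under(node: Node) -> List[Node]:
    """Collect every host at or below the selected node."""
    if node.kind == "host":
        return [node]
    found: List[Node] = []
    for child in node.children:
        found.extend(hosts_under(child))
    return found


def top_hosts(node: Node, metric: str, limit: int = 10) -> List[Node]:
    """Return up to `limit` hosts under `node`, highest `metric` first.

    The ranking metric is an assumption made for this sketch.
    """
    hosts = hosts_under(node)
    ranked = sorted(hosts, key=lambda h: h.metrics.get(metric, 0.0), reverse=True)
    return ranked[:limit]


# Hypothetical environment: one vCenter, one datacenter, one cluster.
esx1 = Node("esx1", "host", {"cpu_usage": 70.0})
esx2 = Node("esx2", "host", {"cpu_usage": 35.0})
esx3 = Node("esx3", "host", {"cpu_usage": 90.0})
cluster = Node("cluster-a", "cluster", children=[esx1, esx2])
datacenter = Node("dc-1", "datacenter", children=[cluster, esx3])
vcenter = Node("vc-1", "vcenter", children=[datacenter])

# Clicking the vCenter node scopes the list to all of its hosts;
# clicking the cluster node narrows it to the cluster members only.
vcenter_view = [h.name for h in top_hosts(vcenter, "cpu_usage")]
cluster_view = [h.name for h in top_hosts(cluster, "cpu_usage")]
```

Clicking a single host corresponds to calling `top_hosts` on the host node itself, which returns that host alone.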
The Datastores Tab Page
The Datastores tab page lists the datastores used by a particular ESX host or by one or more ESX hosts within a datacenter, and reports their availability and usage, so that unavailable datastores and those that are currently running out of space can be accurately identified, and the hosts affected by this anomaly can be easily isolated.
This section discusses how the contents of this tab page change according to the node chosen from the tree-structure.
- If the node representing a particular folder is chosen from the tree, then the Datastores tab page will display the top-10 (by default) datastores used by the ESX hosts managed by that folder. Against every datastore displayed here, the availability and space usage metrics pertaining to that datastore will be provided, along with the number of VMs and ESX servers that have been using the datastore. Using this information, you can swiftly isolate unavailable or over-utilized datastores, and the number of ESX servers and VMs that have been impacted by the anomaly.
- If the node representing a particular datacenter is chosen from the tree, then the Datastores tab page will display the top-10 (by default) datastores used by the ESX hosts managed by that datacenter. Against every datastore displayed here, the availability and space usage metrics pertaining to that datastore will be provided, along with the number of VMs and ESX servers that have been using the datastore. Using this information, you can swiftly isolate unavailable or over-utilized datastores, and the number of ESX servers and VMs that have been impacted by the anomaly.
- In addition to listing datastores, the Datastores tab page provides a time-of-day graph, which, by default, reveals the number of ESX servers that have been using each datastore during the last 1 hour. Accordingly, the default selection in the Measure list is ESX servers using the datastore, and the default Timeline is 1 hour. Typically, the Measure list contains all the measures reported by the default Test, which is the Datastores test. The options available in the Measure list box are sorted in the ascending order of the measure names, and, by default, the first measure in the sorted list will be displayed as the default Measure. The default graph clearly indicates how the workload on a datastore has varied during the last 1 hour. You can plot the graph for a different measure or timeline by choosing a different option from the Measure list, and by altering the timeline for the graph by clicking the right-arrow button that prefixes Timeline.
- If you click on a datastore in the Datastores tab page, you will be led straight to the layer model of the VMware vCenter server that manages that datastore. In the event of the non-availability or excessive usage of a datastore, you can use this model to instantly identify the layer that has been affected by the problem with the datastore. Click on the problem layer to view the problem test, and then click on the problem test to view the problem measure(s). This way, you can easily understand the nature of the problem with the datastore, and how it has impacted the state of the vCenter server.
- Expanding the datacenter node in the tree will enable you to view the ESX hosts that are managed by that datacenter. If a sub-node representing an ESX host is clicked in the tree, then the Datastores tab page will only display those datastores that are currently in use by the ESX host and the VMs on it.
- Here again, a time-of-day graph is available, but this graph, by default, reveals the physical disk capacity of each datastore during the last 1 hour. Accordingly, the default selection in the Measure list is Physical disk capacity, and the default Timeline is 1 hour. Typically, the Measure list contains all the measures reported by the default Test, which is the Datastores-Esx test. The options available in the Measure list box are sorted in the ascending order of the measure names, and, by default, the first measure in the sorted list will be displayed as the default Measure. You can plot the graph for a different measure or a timeline by choosing a different option from the Measure list, and by altering the timeline for the graph by clicking the right-arrow button that prefixes Timeline.
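The rule for the default graph measure described above (sort the measure names in ascending order, then pick the first) can be sketched in a few lines of Python. This is an illustration only. The function name `default_measure` is hypothetical, the comparison is assumed to be case-insensitive, and apart from "ESX servers using the datastore" the measure names in the example are made up for the sketch.

```python
from typing import List, Optional, Tuple


def default_measure(measures: List[str]) -> Tuple[Optional[str], List[str]]:
    """Sort measure names in ascending order and pick the first as the default.

    Case-insensitive ordering is an assumption of this sketch.
    """
    ordered = sorted(measures, key=str.casefold)
    return (ordered[0] if ordered else None), ordered


# Only the first name below comes from the text; the rest are hypothetical.
datastore_measures = [
    "Used space",
    "ESX servers using the datastore",
    "VMs using the datastore",
    "Free space",
]

chosen, measure_list = default_measure(datastore_measures)
```

With this ordering, "ESX servers using the datastore" sorts first and so becomes the default selection, which matches the default described for the Datastores test.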
The Current Events Tab Page
The Current Events tab page lists the problems that are currently affecting the performance of managed virtual hosts and virtual machines executing on them. This tab page enables administrators to focus on issues related to their virtualized environment alone, without being distracted by the "non-virtual" issues.
As stated earlier, this tab page, by default, lists the alarms pertaining to those virtual hosts that are of type VMware vSphere ESX.
Like the other tab pages, the Current Events tab page too changes with respect to the node chosen from the tree-structure. To see how, read on.
- If the global Zones node is selected in the left panel, then the Current Events tab page in the right panel will list the problems that are adversely impacting the performance of the virtual hosts of the chosen type across the environment. This list is typically sorted by event priority. If event priority is the same across events, then the events are arranged in the order of the names of the problem hosts. The details provided here include the name of the problem component, a brief description of the problem, and the time at which the problem was reported. This information enables administrators to understand how problem-prone their virtualized environment is, and also provides them with pointers to the root-cause of the problem.
- Now, let us see what happens if a particular zone is chosen from the tree-structure in the left panel. When this is done, the Current Events list in the right panel will change to display the problems related to those virtual hosts that are included in the zone that is clicked on. This information helps administrators identify those problems that are affecting the performance of a particular zone.
- If you then drill down a particular zone in the tree, you will be able to view the virtual component-types that form part of the zone, and their current state. If you click on a particular component-type in the tree, the Current Events tab page in the right panel will allow you to view the problems related to virtual hosts of that type that are included in the corresponding zone.
- If the vCenter servers in your environment are being monitored as part of a zone, then expanding that zone's node in the tree would reveal the VMware vCenter component-type. When this component-type is clicked on, the Current Events tab page will change to display the problem events pertaining to all the vCenter servers included in that zone, and those that correspond to the ESX servers and VMs managed by every vCenter server in the zone.
- To know which vCenter servers have been added to a zone, just expand the VMware vCenter sub-node under the zone node in the tree. This will reveal the name and the current state of the vCenter servers in that zone. Clicking on a particular vCenter server in the tree will display problem events pertaining to that vCenter component and those that relate to ESX servers and VMs managed by that vCenter, in the Current Events tab page.
- Typically, when you expand a vCenter server node in the tree, the physical ESX servers that are being managed by that vCenter server will appear.
Note:
While the vCenter server tree will list even those ESX servers that are not monitored by eG Enterprise, the tree will not indicate the current state of such servers; also, clicking on any such server will not display corresponding performance information in the tab pages in the right panel.
- However, in some environments, folders may be configured on a vCenter server, where every folder could contain one or more datacenters, ESX servers, clusters, datastores, and VMs. In such cases, expanding the vCenter server node will reveal the folders configured on that vCenter server as sub-nodes. Clicking on a folder sub-node will display, in the Current Events tab page, the complete list of problems currently affecting the performance of the datacenters, clusters, ESX servers, and VMs that are included in that folder.
- Expanding the folder node will reveal the datacenters within that folder. If no folders exist, then expanding a vCenter server node in the tree will display the datacenters that have been configured on that vCenter server. To know what problems are currently affecting the performance of a particular datacenter, click on the node representing that datacenter in the tree.
- Every datacenter, in turn, may manage clusters of ESX servers. If such clusters have been configured on any monitored vCenter server, then, in the Virtual Infrastructure tree, these clusters will appear as sub-nodes of that datacenter node. If you click on a cluster sub-node in the tree, the Current Events tab page will reveal the problem events pertaining to all managed ESX servers included in the cluster.
- To view the individual virtual hosts that are part of a zone, do any of the following:
- Expand the nodes representing the virtual component-types in the zone;
- If the zone consists of components of type VMware vCenter, expand the node representing the monitored vCenter server in your environment;
- If datacenters are configured on a monitored vCenter server, expand a datacenter sub-node under the vCenter server node;
- If clusters are configured within a datacenter, expand the cluster sub-node;
If you then click on a virtual host in the tree, the Current Events tab page will change to display problems affecting that virtual host.
- Similarly, if you expand the virtual host node in the tree-structure, you can view the name and state of those VMs that are executing on that virtual host. If you now click on a VM in the tree, the problems affecting that VM alone will be displayed in this tab page.
- If resource pools are configured on a virtual host, then the resource pools will also appear as sub-nodes of the virtual host-node. If you click on a resource pool in the tree, the corresponding Current Events tab page will list the current problems affecting the performance of one or more virtual machines included in the resource pool.
- To perform additional diagnostics on one of the problems listed in this tab page, you can click on the corresponding Graph icon. A graph of the problem measure for the last 1 hour (by default) will then appear revealing when exactly the problem occurred.
- Clicking on an event listed in this tab page will lead you to the layer model page of the problem component, using which you can quickly determine the problem layer, test, and measure.
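The ordering of the event list described earlier in this section (by event priority first, then by the name of the problem host when priorities are equal) amounts to a two-part sort key. The Python sketch below illustrates that ordering only. The `Event` class, the `order_events` function, the priority names in `PRIORITY_RANK`, and all sample events are hypothetical; the text above mentions only that events carry a priority, a component name, a description, and a report time.

```python
from dataclasses import dataclass
from typing import List

# Assumed priority levels, most severe first (not confirmed by the text above).
PRIORITY_RANK = {"critical": 0, "major": 1, "minor": 2}


@dataclass(frozen=True)
class Event:
    component: str    # name of the problem component
    description: str  # brief description of the problem
    priority: str     # one of the keys of PRIORITY_RANK
    reported_at: str  # time at which the problem was reported


def order_events(events: List[Event]) -> List[Event]:
    """Sort by priority first, then by problem host name for equal priorities."""
    return sorted(
        events,
        key=lambda e: (PRIORITY_RANK[e.priority], e.component.casefold()),
    )


# Hypothetical sample events.
events = [
    Event("esx-b", "High CPU usage", "major", "10:05"),
    Event("esx-c", "Datastore unavailable", "critical", "10:07"),
    Event("esx-a", "Host not reachable", "critical", "10:09"),
]

ordered_components = [e.component for e in order_events(events)]
```

Both critical events come first, ordered by host name, followed by the major event.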
The Resource Pools Tab Page
If resource pools have been configured on a virtual host, then the eG agent auto-discovers these pools and displays them as sub-nodes of a virtual host-node. The Resource Pools tab page appears only when a resource pool in the tree is clicked on. This tab page reveals the current configuration of the chosen resource pool, which includes the number of virtual machines on the resource pool, the number of running virtual machines, and the number of child resource pools. Besides the configuration, administrators can also use this tab page to determine the current state of each of the virtual machines and child resource pools under the chosen resource pool, and simultaneously analyze the resource usage by the pool.
Clicking on a VM listed against the Virtual Machines section of a resource pool leads the administrator directly to the layer model page of the virtual host on which that resource pool is configured, and automatically displays metrics that provide an "outside view" of that VM's performance. These metrics help administrators understand the impact the VM has on the physical resources of the virtual host.
Similarly, clicking on a particular child resource pool displayed against the Child Resource Pools section takes the administrator to the layer model page of the corresponding virtual host, thereby granting a sneak peek at the resource usage metrics of that child resource pool, and enabling the administrator to analyze how resource-intensive the child resource pool is.
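The configuration summary that the Resource Pools tab page reports (number of virtual machines, number of running virtual machines, and number of child resource pools) can be sketched as a small data structure. The Python below is a hypothetical model, not the eG agent's discovery logic. The `VM` and `ResourcePool` classes and the `pool_summary` function are invented for illustration, and the sketch assumes that only VMs placed directly in the chosen pool are counted, which the text above does not specify.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class VM:
    name: str
    powered_on: bool


@dataclass
class ResourcePool:
    name: str
    vms: List[VM] = field(default_factory=list)
    child_pools: List["ResourcePool"] = field(default_factory=list)


def pool_summary(pool: ResourcePool) -> Dict[str, int]:
    """Summarize a pool's configuration.

    Assumption: VMs inside child pools are not included in the counts.
    """
    return {
        "virtual_machines": len(pool.vms),
        "running_virtual_machines": sum(1 for vm in pool.vms if vm.powered_on),
        "child_resource_pools": len(pool.child_pools),
    }


# Hypothetical pool with three VMs (two running) and one child pool.
child = ResourcePool("test-pool", vms=[VM("vm-4", True)])
parent = ResourcePool(
    "prod-pool",
    vms=[VM("vm-1", True), VM("vm-2", False), VM("vm-3", True)],
    child_pools=[child],
)

summary = pool_summary(parent)
```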