eG Monitoring
 
Measures reported by PVSRamCacheTest

Provisioning Services lets administrators virtualize a hard disk or workload and then stream it back out to multiple devices. The workloads, which can be server or desktop, are imaged from a physical or virtual disk into Microsoft's virtual hard disk (VHD) format and treated as a golden master image called a vDisk. This master image is then streamed over the network from a Windows server running the stream service to multiple target devices that were PXE booted.

When a vDisk is in private mode, it can be edited. When a vDisk is in standard mode, it is read-only and no changes can be made to it. Instead, all disk write operations are redirected to what is referred to as a write-cache file. The device drivers redirect writes to the write-cache file and, when necessary, read newly written files from the write-cache file instead of from the server.

When using Citrix Provisioning Services with the vDisk in standard mode, you have a write-cache drive location that holds all the writes for the operating system. If the write-cache file fills up unexpectedly, the operating system will behave as if the drive ran out of space without any warning; in other words, it will blue screen. To avoid this, it is imperative to continuously track the usage of the write-cache, so that you are forewarned of a probable space crunch and can resize the write-cache file to accommodate subsequent writes.

This test helps administrators keep tabs on the usage of the write-cache of every target device that is connected to the Provisioning server, and sends out proactive alerts to administrators if it finds that a write-cache file is rapidly filling up. This way, the test aids in averting operating system crashes that may occur owing to lack of space in the write-cache. Moreover, in the process of monitoring the I/O activity on the Citrix PVS, the test also promptly captures I/O transaction failures and reports the number of times each target device had to retry an I/O transaction on the PVS. This will shed light on communication issues that may exist between the target device and the PVS.
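
The kind of check described above can be illustrated with a small sketch. This is not how the test itself is implemented; it only assumes that periodic readings of the write-cache usage percentage are available. The names `Sample`, `minutes_until_full` and `should_alert`, as well as the 80% threshold and 60-minute horizon, are hypothetical values chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Sample:
    minutes: float   # minutes elapsed since the first reading
    used_pct: float  # write-cache usage as a percentage (0-100)


def minutes_until_full(samples: Sequence[Sample]) -> Optional[float]:
    """Project when usage reaches 100% using a straight line between the
    first and last sample. Returns None if usage is not growing."""
    if len(samples) < 2:
        return None
    first, last = samples[0], samples[-1]
    elapsed = last.minutes - first.minutes
    if elapsed <= 0:
        return None
    rate = (last.used_pct - first.used_pct) / elapsed  # percent per minute
    if rate <= 0:
        return None
    return (100.0 - last.used_pct) / rate


def should_alert(samples: Sequence[Sample],
                 high_pct: float = 80.0,          # assumed threshold
                 horizon_minutes: float = 60.0    # assumed look-ahead window
                 ) -> bool:
    """Alert when usage is already high or is projected to fill soon."""
    if not samples:
        return False
    if samples[-1].used_pct >= high_pct:
        return True
    eta = minutes_until_full(samples)
    return eta is not None and eta <= horizon_minutes
```

For example, two readings of 40% and 50% taken ten minutes apart give a growth rate of one percentage point per minute, so the cache would be projected to fill in 50 minutes and `should_alert` would return `True` with the default horizon.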

The measures made by this test are as follows:

Measurement | Description | Measurement Unit | Interpretation
Ram_cache_used_pct | Indicates the percentage of space in the write-cache that is currently utilized. | Percent | A high value or a consistent increase in this value is a cause for concern, as it indicates that write-cache space is running out. You may have to allocate more space to the write-cache to avoid exhausting it completely. The optimum size of the write-cache drive depends on several factors:

  • Frequency of server reboots. The write-cache file is reset upon each server boot so the size only needs to be large enough to handle the volume between reboots.
  • Amount of free space available on the c: drive. The space that will be used for new files written to the c: drive is considered the free space available. This is a key value when determining the write-cache drive size.
  • Amount of data being saved to the c: drive. Data that is written to the c: drive during operation is stored automatically in the write-cache drive. New files are stored in the write-cache file and decrease the amount of available space. Replacements for existing files are also written to the write-cache file but only marginally affect the amount of free space. For instance, a service pack install on a standard-mode disk will result in the write-cache file holding all the updated files, with very little change in available space.
  • Size and location of the pagefile. When a local NTFS-formatted drive is found, Provisioning Services moves the Windows pagefile off of the c: drive to the first available NTFS drive, which is also the location of the write-cache file. Therefore, in the default configuration, the write-cache drive will end up holding both the write-cache file and the pagefile. To learn more about correctly sizing your pagefile, see Nick Rintalan's blog, “The Pagefile Done Right!”.
  • Location of the write-cache file. The location of the write-cache file is also a factor in determining its size. The write-cache file can be held on the target device's local disk, the target device's RAM, or on the streaming server.
    • Target device disk: If the write-cache file is held on the target device's disk, that disk could be local to the client, local to the hypervisor, network storage attached to the hypervisor, or SAN storage attached to the hypervisor.
    • Target device RAM: If the write-cache file is held in the target device's RAM, the response time will be faster, and in some cases the additional RAM is less expensive than SAN disk.
    • Streaming Server: If the write-cache file is on the server, no preset size is necessary. When using server-side write-cache file, the Provisioning Services streaming server must have enough disk space to hold the write-cache files for all target devices managed.
No_of_retries | Indicates the number of times this target device had to retry an I/O transaction on the Citrix PVS. | Number | The client's driver performs a vDisk I/O by sending a request to the Provisioning Server. If a transaction fails due to a no-reply timeout, the driver tries to send the I/O request again. This measure indicates the number of I/O requests that were resent by the client's driver. A high value indicates that I/O transactions are failing repeatedly; issues in network connectivity between the target device and PVS can cause such failures.
Write_cache_size | Indicates the current size of the write cache of this target device. | MB | When using Citrix Provisioning Services with the vDisk in standard mode, you have a write-cache drive location that holds all the writes for the operating system. If the write-cache file is not properly sized, it may fill up unexpectedly; in this case, the operating system will behave as if the drive ran out of space without any warning. In other words, it will blue screen.

The optimum size of the write-cache drive depends on several factors:

  • Frequency of server reboots. The write-cache file is reset upon each server boot so the size only needs to be large enough to handle the volume between reboots.
  • Amount of free space available on the c: drive. The space that will be used for new files written to the c: drive is considered the free space available. This is a key value when determining the write-cache drive size.
  • Amount of data being saved to the c: drive. Data that is written to the c: drive during operation is stored automatically in the write-cache drive. New files are stored in the write-cache file and decrease the amount of available space. Replacements for existing files are also written to the write-cache file but only marginally affect the amount of free space. For instance, a service pack install on a standard-mode disk will result in the write-cache file holding all the updated files, with very little change in available space.
  • Size and location of the pagefile. When a local NTFS-formatted drive is found, Provisioning Services moves the Windows pagefile off the c: drive to the first available NTFS drive, which is also the location of the write-cache file. Therefore, in the default configuration, the write-cache drive will end up holding both the write-cache file and the pagefile.
  • Location of the write-cache file. The location of the write-cache file is also a factor in determining its size. The write-cache file can be held on the target device's local disk, the target device's RAM, or on the streaming server.

    • Target device disk: If the write-cache file is held on the target device's disk, that disk could be local to the client, local to the hypervisor, network storage attached to the hypervisor, or SAN storage attached to the hypervisor.
    • Target device RAM: If the write-cache file is held in the target device's RAM, the response time will be faster, and in some cases the additional RAM is less expensive than SAN disk.
    • Streaming Server: If the write-cache file is on the server, no preset size is necessary. When using server-side write-cache file, the Provisioning Services streaming server must have enough disk space to hold the write-cache files for all target devices managed.
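
As a rough illustration of the server-side case, the disk space needed on the streaming server can be estimated by adding up the largest write-cache size observed for each target device. The sketch below is hypothetical: the function name, the input dictionary, and the 25% headroom are assumptions made for illustration and are not values taken from Provisioning Services or from this test.

```python
from typing import Mapping


def server_cache_space_needed_mb(peak_cache_mb: Mapping[str, float],
                                 headroom_fraction: float = 0.25) -> float:
    """Sum the peak write-cache size (MB) of every target device and add
    a safety margin. The 25% default margin is an assumed value."""
    if headroom_fraction < 0:
        raise ValueError("headroom_fraction must not be negative")
    total_mb = sum(peak_cache_mb.values())
    return total_mb * (1 + headroom_fraction)


# Hypothetical peak values, e.g. the highest Write_cache_size seen per device
peaks = {"target-01": 1000, "target-02": 2000}
needed = server_cache_space_needed_mb(peaks)
```

The peak values would typically come from observing each device under full load across a normal reboot cycle.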

Below are a few guidelines for right-sizing the client-side write-cache drive.

  • Write-cache drive = write-cache file + pagefile (if pagefile is stored on the write-cache drive)
  • Write-cache file size should be equal to the amount of free space left on the vDisk image. This will work in most situations, except those where servers receive large file updates immediately after booting. As a rule, your vDisk should not be getting updated while running in standard-mode.
  • Always account for the pagefile location and size. If it is configured to reside on the c: or d: drive, include it in all size calculations.
  • Set the pagefile to a predetermined size to make it easier to account for. When Windows manages the pagefile, its size starts at 1x RAM but can vary. Manually setting it to a known value provides a static number to use in calculations.
  • During the pilot, use server-side write caching to get an idea of the maximum size you might see a file reach between server reboots. Obviously, the server should have a full load and should be subject to the normal production reboot cycle for this to be of value.

In most situations, the recommended write-cache drive size will be the free space available on the vDisk image plus the pagefile size. For instance, if you have a 30GB Windows Server 2008 R2 vDisk with 16GB used (14GB free) and are running with an 8GB pagefile, it would be good practice to use a write-cache drive of 22GB, calculated as 14GB free space + 8GB for the pagefile. If space doesn't permit, you have a few options, not all of which may be available to you.

  • If the storage location for the write-cache drive supports thin provisioning, configure thin-provisioned drives for the write-cache drive to save space.
  • Use dynamic VHDs (instead of fixed VHDs), though this approach is generally only recommended for XenDesktop workloads. If you choose this approach, you will probably need to periodically reset the size of the dynamic VHD, which can be done with a PowerShell script.
  • Reboot the servers more frequently, which in turn will reduce the maximum size of the write-cache file.
  • Move the pagefile to a different drive or run without a pagefile.
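
The sizing rule described earlier (free space on the vDisk image plus the pagefile, when the pagefile lives on the write-cache drive) can be written as a short calculation. This is a minimal sketch; the function name and parameters are hypothetical and exist only to illustrate the arithmetic.

```python
def write_cache_drive_gb(vdisk_size_gb: float,
                         vdisk_used_gb: float,
                         pagefile_gb: float,
                         pagefile_on_cache_drive: bool = True) -> float:
    """Recommended write-cache drive size in GB:
    free space on the vDisk image, plus the pagefile if it is
    stored on the write-cache drive."""
    if vdisk_used_gb > vdisk_size_gb:
        raise ValueError("used space cannot exceed the vDisk size")
    free_gb = vdisk_size_gb - vdisk_used_gb
    return free_gb + (pagefile_gb if pagefile_on_cache_drive else 0)


# 30GB vDisk, 16GB used, 8GB pagefile on the write-cache drive
recommended = write_cache_drive_gb(30, 16, 8)   # 14 + 8 = 22
```

Passing `pagefile_on_cache_drive=False` models the option of moving the pagefile to a different drive, in which case the same vDisk needs only 14GB.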