Updates from QA Training

DRS Advanced Concepts

Having taught the vSphere 4 and now vSphere 5 courses for a few years, I thought it was time to elaborate on how DRS decides whether or not to migrate vms.


Andy Fox | 10 October 2012

The "Problem"

DRS "balances" virtual machine workloads between the hosts in your DRS-enabled cluster. To do this, it must first identify the resource requirements of your vms.

This is done initially by looking at the vm reservation, but over time DRS will also build a picture of what resources are used during peak activity and what is required when idle.

DRS uses filtering to avoid migrations that would make no difference, that would cost more in resource usage than they gain, or that could prove risky because the resource requirements of the vm are unknown or vary wildly.

MinGoodness Filtering

This metric is used to ensure that the vm is moved only if it will resolve resource imbalance.

Goodness = imbalance metric (before move) - imbalance metric (after move)

A move must meet the goodness threshold to be recommended, and any move candidates that cannot meet it are MinGoodness filtered.

For example, if all the vms are too small to make a significant difference to the imbalance, every candidate move is filtered.
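To make the goodness calculation concrete, here is a minimal sketch. It assumes the imbalance metric is the standard deviation of per-host load, and the function names, load values and threshold are all hypothetical, chosen only for illustration rather than taken from DRS itself.

```python
from statistics import pstdev


def imbalance(host_loads):
    """Assumed imbalance metric: standard deviation of per-host load."""
    return pstdev(host_loads)


def goodness(host_loads, src, dst, vm_load):
    """Imbalance before the move minus imbalance after the move."""
    after = list(host_loads)
    after[src] -= vm_load
    after[dst] += vm_load
    return imbalance(host_loads) - imbalance(after)


def passes_min_goodness(host_loads, src, dst, vm_load, threshold):
    """A candidate move survives only if its goodness meets the threshold."""
    return goodness(host_loads, src, dst, vm_load) >= threshold


# Hypothetical normalised loads for three hosts (illustrative values only).
loads = [0.9, 0.5, 0.4]

# A sizeable vm moved from the busiest host to the quietest one.
print(passes_min_goodness(loads, 0, 2, 0.2, threshold=0.05))

# A tiny vm on the same path barely changes the imbalance.
print(passes_min_goodness(loads, 0, 2, 0.01, threshold=0.05))
```

With these illustrative numbers the larger vm passes the filter and the tiny vm is filtered, which is the "vms are too small" case described above.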

CostBenefit Filtering

This metric is used to ensure that the vm is moved only if there is a resource benefit in doing so.

Benefit = Higher Resource Availability

Cost has two components:

  • Migration Cost, i.e. vMotion cpu + memory cost and vm slowdown during vMotion
  • Risk Cost, i.e. any benefit may not be sustained due to load variation


Candidates that cannot meet the criteria are CostBenefit filtered.
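The comparison can be sketched as follows. This assumes the benefit and both cost components are expressed in the same units and that a move passes when the benefit exceeds the combined cost. The class and field names are hypothetical, and the real DRS calculation is certainly more involved.

```python
from dataclasses import dataclass


@dataclass
class MoveEstimate:
    benefit: float         # expected gain from higher resource availability
    migration_cost: float  # vMotion cpu + memory cost and vm slowdown
    risk_cost: float       # allowance for the benefit not being sustained


def passes_cost_benefit(move: MoveEstimate) -> bool:
    """Assumed rule: keep the move only if benefit outweighs total cost."""
    return move.benefit > move.migration_cost + move.risk_cost


# Illustrative values only.
steady = MoveEstimate(benefit=10.0, migration_cost=3.0, risk_cost=2.0)
spiky = MoveEstimate(benefit=10.0, migration_cost=3.0, risk_cost=9.0)

print(passes_cost_benefit(steady))  # benefit outweighs the total cost
print(passes_cost_benefit(spiky))   # high risk cost cancels the benefit
```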

Handling Severe Imbalance

There may be situations where the cluster appears severely imbalanced, where perhaps all vms are located on one or a few of the hosts and the others have few or no vms.  This can be caused by one of the following:

  • Target too impractical, too many constraints
  • Filters too aggressive for certain inventories
  • Newly powered on hosts, or recently updated hosts (VUM - vSphere Update Manager)

In vSphere 5.1, DRS will automatically detect and address severe imbalance. The filters are automatically relaxed or dropped, and are reinstated when the condition is resolved.  This is also the default with vSphere 4.1 U3 and vSphere 5.0 U2.

The parameters that control this are:

SevereImbalanceRelaxMinGoodness=1

SevereImbalanceRelaxCostBenefit=1

FixSevereImbalanceOnly=1

You can modify the default behavior, but you should use caution!

FixSevereImbalanceOnly=0

SevereImbalanceDropCostBenefit=1

In older versions and updates, there are other parameters that can be set to influence DRS:

MinGoodness=0; allows any migration with goodness > 0

CostBenefit=0; considers expensive / ephemeral moves as well

Using these options can cause a large number of migrations, so don't forget to set them back to "1" afterwards!

Handling Multiple Metrics

DRS uses smart heuristics and tries not to "hurt" any metric.

Score (move) = W (cpu) * Score (cpu) + W (mem) * Score (mem)

Weights (W) are calculated dynamically based upon resource utilization.
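As a rough illustration of the weighted score, the sketch below assumes each weight is simply that resource's share of the combined cpu and memory utilization, so the more heavily used resource counts for more. The source does not say how the weights are derived, so this weighting rule and the function names are hypothetical.

```python
def weights(cpu_util, mem_util):
    """Assumed rule: weight each resource by its share of total utilization."""
    total = cpu_util + mem_util
    if total == 0:
        return 0.5, 0.5  # nothing in use, so treat both resources equally
    return cpu_util / total, mem_util / total


def move_score(score_cpu, score_mem, cpu_util, mem_util):
    """Score(move) = W(cpu) * Score(cpu) + W(mem) * Score(mem)."""
    w_cpu, w_mem = weights(cpu_util, mem_util)
    return w_cpu * score_cpu + w_mem * score_mem


# Illustrative values: a move that helps cpu but slightly hurts memory,
# in a cluster where cpu is the more contended resource.
print(move_score(score_cpu=1.0, score_mem=-0.5, cpu_util=0.8, mem_util=0.2))
```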

Extra Metric: "Eggs in a basket"

LimitVMsPerESXHost

This metric is new in vSphere 5.1, and the idea is to restrict the number of vms per host.


For example, LimitVMsPerESXHost=6 would do exactly as it suggests: DRS would not admit or migrate more than 6 vms per host.  This may impact vm "happiness" and load balancing, and would typically not be beneficial if the hosts in the cluster have differing resource capacities.
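The effect of a fixed cap can be sketched as a simple eligibility check. The function name and the per-host vm counts are hypothetical, and this models only the counting rule, not how DRS actually applies it during placement or migration.

```python
def eligible_hosts(vm_counts, limit):
    """Hosts that could accept one more vm without exceeding the limit."""
    return [host for host, count in vm_counts.items() if count < limit]


# Illustrative vm counts per host.
counts = {"A": 4, "B": 5, "C": 5, "D": 6}

# Host D is already at the cap of 6, however much spare capacity it has.
print(eligible_hosts(counts, limit=6))  # ['A', 'B', 'C']
```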

The Future

"Eggs in a basket": Auto Tune

LimitVMsPerESXHostPercent=50

The idea of this metric is to restrict the number of vms per host, but also to allow hosts a degree of flexibility in that number, based upon how many vms there are, the number of hosts, and a percentage.

In the original example for the "Eggs in a basket", Host A has 4 vms, Host B has 5, Host C has 5, and Host D (a host with much more capacity than the others) has 6.  Using the previous LimitVMsPerESXHost=6 value, this would limit Host D to only running 6 vms, when it is clearly suited to running more.

If we take the previous example (20 vms in total) and assume that 12 more vms are required, the total would be 32 vms.

32 vms / 4 hosts = 8 vms per host on average, and 8 + (50% * 8) = 12, meaning each host could potentially run 12 vms.
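The same arithmetic as a small sketch. The function name is hypothetical, and since the source does not say how a fractional result would be rounded, the value is returned unrounded.

```python
def per_host_limit(total_vms, num_hosts, percent):
    """Average vms per host, plus the given percentage of headroom."""
    average = total_vms / num_hosts
    return average * (1 + percent / 100)


# 32 vms across 4 hosts with LimitVMsPerESXHostPercent=50
print(per_host_limit(total_vms=32, num_hosts=4, percent=50))  # 12.0
```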


Note - This option is not available yet!

Turbo Mode

This is an option that would allow one-off load balancing, where the administrator could click a button in the client and force rebalancing regardless of the filters.

Pros

  • Reach lowest possible imbalance metric
  • Maximum exploration of solution space
  • No MinGoodness or CostBenefit filtering

Cons

  • No MinGoodness or CostBenefit filtering
  • May cause a large amount of migrations

Note - This option is not available yet!

Source:

VMworld 2012 session VSP2825

Aashish Parikh  (VMware)

Ajay Gulati (VMware)


Andy Fox

Senior Learning Consultant

Andy has been a Consultant Instructor with QA for 10 years, and has 16 years IT Training experience. In his 25+ years in the IT industry he has gained experience working with Novell products and Microsoft from MS-DOS onwards. Since joining QA, his focus moved towards SuSE Linux where he gained CLP and CLE status. Over the past 4 years he has been engaged in the delivery of VMware vSphere training and has gained VCP, VCI and VCAP-DCA status.