VMworld 2015: 5 Functions of SW Defined Availability

Session: INF4535

Duncan Epping, Frank Denneman

Introduction to SDA (software-defined availability): VM, server, storage, data center, networking, management. The business only cares about the application, not the underlying infrastructure.

vSphere HA

  • Configured through vCenter but not dependent on it
  • An agent (FDM) is installed on each host to monitor state
  • HA restarts VMs when a failure impacts them
  • Heartbeats via network and storage to communicate availability
  • Heartbeats use the management network, or the VSAN network if VSAN is enabled
  • Restarts need spare resources in the cluster
  • Admission control – Allows you to reserve resources in case of a host failure
  • Admission control guarantees that VMs receive their reserved resources after a restart, but does not guarantee that they perform well after a restart.
  • Best practices: Select policy that best meets your needs, enable DRS, simulate failures to test performance
  • The percentage-based policy is by far the most used and is the one Duncan recommends
  • Duncan went through various failure scenarios (host failure, host isolation, storage failure) and how HA restarts the VMs.
  • Use VMCP (VM Component Protection, new in 6.0). Helps protect against storage connectivity loss.
  • Generic recommendations: disable “host monitoring” during network maintenance; make sure you have a redundant management network; enable PortFast; use admission control
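
The percentage-based admission control idea above can be sketched roughly as follows. This is a simplified model, not how FDM actually computes it: it only compares summed reservations against the capacity left after the failover percentage is held back, and ignores per-VM overhead. The `Cluster` class, its `failover_pct` field, and `can_power_on` are hypothetical names made up for this illustration.

```python
from dataclasses import dataclass


@dataclass
class Cluster:
    total_cpu_mhz: int   # CPU capacity summed over all hosts
    total_mem_mb: int    # memory capacity summed over all hosts
    failover_pct: int    # hypothetical field: % of capacity held back for restarts


def can_power_on(cluster, reserved_cpu_mhz, reserved_mem_mb, vm_cpu_mhz, vm_mem_mb):
    """Admit the VM only if all reservations still fit outside the failover reserve."""
    usable_cpu = cluster.total_cpu_mhz * (100 - cluster.failover_pct) / 100
    usable_mem = cluster.total_mem_mb * (100 - cluster.failover_pct) / 100
    return (reserved_cpu_mhz + vm_cpu_mhz <= usable_cpu
            and reserved_mem_mb + vm_mem_mb <= usable_mem)


# Assumed example: four hosts of 20,000 MHz / 131,072 MB each, 25% held back
# (roughly one host's worth of capacity).
cluster = Cluster(total_cpu_mhz=80_000, total_mem_mb=524_288, failover_pct=25)

# 58,000 MHz and 300,000 MB are already reserved by running VMs.
small_vm_ok = can_power_on(cluster, 58_000, 300_000, 1_500, 8_192)
large_vm_ok = can_power_on(cluster, 58_000, 300_000, 2_500, 8_192)
print(small_vm_ok, large_vm_ok)  # prints: True False
```

Note that the check only looks at reservations, which matches the point above: admission control guarantees reserved resources after a restart, not good performance.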


vSphere DRS

  • DRS provides load balancing and initial placement
  • DRS is the broker of resources between producers and consumers
  • DRS goal is to provide the resources the VM demands
  • DRS provides cluster management (maintenance mode, affinity/anti-affinity rules)
  • DRS keeps VMs happy; it doesn’t perfectly balance each host
  • DRS affinity rules: Control the placement of VMs on hosts within a cluster.
  • DRS’s highest priority is to resolve any violation of affinity rules.
  • VM-host groups are configurable as mandatory (must) rules or preferential (should) rules, for both affinity and anti-affinity
  • A mandatory (must) rule limits HA, DRS and the user
  • Why use resource pools? Powerful abstraction for managing a group of VMs. Set business requirements on a resource pool.
  • Bottom line is resource pools are complex, and VMs may not get the resources you think they should. Only use them when needed.
  • Keep the number of affinity rules as low as possible. Prefer preferential (should) rules.
  • Tweak the aggressiveness slider if the cluster is unbalanced.
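
A minimal sketch of why must rules limit HA, DRS and the user while should rules do not. It is a toy model, not the DRS algorithm: rules are plain dictionaries mapping a VM name to a set of host names, and `candidate_hosts` is a hypothetical helper invented for this example.

```python
def candidate_hosts(vm, available_hosts, must_rules, should_rules):
    """Return the hosts a VM may be placed on, given VM-host affinity rules.

    must_rules / should_rules map a VM name to the set of hosts it is tied to.
    A must rule filters hosts out entirely; a should rule only reorders the choice.
    """
    allowed = [h for h in available_hosts
               if vm not in must_rules or h in must_rules[vm]]
    preferred = [h for h in allowed
                 if vm in should_rules and h in should_rules[vm]]
    # Fall back to any allowed host when no preferred host is available.
    return preferred or allowed


must_rules = {"db1": {"esx1", "esx2"}}    # db1 MUST run on esx1 or esx2
should_rules = {"web1": {"esx3"}}         # web1 SHOULD run on esx3

# Normal operation: the should rule is honored.
print(candidate_hosts("web1", ["esx1", "esx2", "esx3"], must_rules, should_rules))

# esx3 has failed: web1 can still be restarted elsewhere.
print(candidate_hosts("web1", ["esx1", "esx2"], must_rules, should_rules))

# esx1 and esx2 have failed: the must rule leaves db1 with nowhere to go.
print(candidate_hosts("db1", ["esx3"], must_rules, should_rules))
```

The last call returns an empty list, which is the practical argument for preferring should rules unless there is a hard requirement such as licensing.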


vSphere Storage IO Control and Storage DRS

  • Storage IO Control (SIOC) is not cluster aware; it is focused on storage
  • Enabled at the datastore level
  • Detects congestion and monitors average IO latency for a datastore
  • Latency above a particular threshold indicates congestion
  • SIOC throttles I/Os once congestion is detected
  • Controls the I/Os issued per host
  • Based on VM shares, reservations, and limits
  • Storage DRS (SDRS) runs every 8 hours to check balance, using the 90th percentile of the previous 16 hours of data
  • Capacity threshold per datastore
  • I/O metric threshold per datastore
  • Affinity rules are available
  • SDRS is now aware of storage capabilities through VASA 2.0 (array thin provisioning, dedupe, auto-tiering, snapshot)
  • SDRS integrated with SRM
  • Full support for vSphere Replication
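
Three of the mechanisms above, sketched under stated assumptions: a latency threshold check for congestion, a share-proportional split of per-host I/O slots, and a nearest-rank 90th percentile over collected samples. The function names, the 30 ms threshold, and the idea of modeling throttling as a pool of "queue slots" are illustrative choices for this sketch, not the actual SIOC or SDRS implementation.

```python
def is_congested(avg_latency_ms, threshold_ms=30.0):
    """Datastore counts as congested when average latency exceeds the threshold."""
    return avg_latency_ms > threshold_ms


def split_queue_slots(total_slots, shares_by_host):
    """Divide a datastore's I/O slots across hosts in proportion to VM shares.

    shares_by_host maps a host name to the summed shares of its VMs on the
    datastore. Integer division is used, so a few slots may stay unassigned.
    """
    total_shares = sum(shares_by_host.values())
    return {host: total_slots * shares // total_shares
            for host, shares in shares_by_host.items()}


def percentile_90(samples):
    """Nearest-rank 90th percentile of a non-empty list of samples."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    rank = (9 * len(ordered) + 9) // 10   # ceil(0.9 * n) using integer math
    return ordered[rank - 1]


# Assumed example values.
print(is_congested(35.0))                                     # True
print(split_queue_slots(64, {"esx1": 3000, "esx2": 1000}))    # {'esx1': 48, 'esx2': 16}
print(percentile_90([5, 6, 7, 8, 9, 10, 11, 12, 40, 50]))     # 40
```

Using a high percentile rather than the average means a datastore that is busy only part of the time still shows up as loaded, while a single spike does not dominate the result.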


vSphere vMotion

  • Migrate a live VM to a new compute resource
  • vSphere 6.0: cross vCenter vMotion, long-distance vMotion, vMotion to cloud
  • You may not realize it, but there has been a lot of innovation and many new features here since its introduction in 2003
  • Long-distance vMotion supports up to 150 ms of round-trip latency. No WAN acceleration needed.
  • vMotion anywhere: across vCenters, across hosts without shared storage, and across distributed switches (DVS), folders, and datacenters.

vSphere Network IO Control

  • Outbound QoS
  • Allows you to partition network resources
  • Uses resource pools to differentiate between traffic types (VM, NFS, vMotion, etc.)
  • Bandwidth allocation: Shares and reservations. NIOC v3 allows configuration of bandwidth requirements for individual VMs
  • DRS is aware of network reservations as well.
  • Bandwidth admission control in HA
  • Set reservations to guarantee a minimum amount of bandwidth for critical network traffic. Use VM-level reservations sparingly.
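
How shares and reservations interact can be illustrated with a simplified model: under full contention, each traffic type first gets its reservation, and whatever is left on the link is split in proportion to shares. This is an assumption made for the sketch, not a description of the NIOC v3 scheduler; `allocate_bandwidth` and the pool values are hypothetical.

```python
def allocate_bandwidth(link_mbps, pools):
    """Split a link across traffic types under full contention.

    pools maps a traffic type to (shares, reservation_mbps). Reservations are
    honored first; the remaining bandwidth is divided by shares.
    """
    reserved = sum(reservation for _, reservation in pools.values())
    if reserved > link_mbps:
        raise ValueError("reservations exceed link capacity")
    spare = link_mbps - reserved
    total_shares = sum(shares for shares, _ in pools.values())
    return {name: reservation + spare * shares / total_shares
            for name, (shares, reservation) in pools.items()}


# Assumed example: one 10 GbE uplink and three traffic types.
pools = {
    "vm":      (100, 2000),   # (shares, reservation in Mbps)
    "vmotion": (50, 0),
    "nfs":     (50, 1000),
}
allocation = allocate_bandwidth(10_000, pools)
print(allocation)  # {'vm': 5500.0, 'vmotion': 1750.0, 'nfs': 2750.0}
```

Every reservation shrinks the pool that shares can divide, which is one reason to use reservations, and VM-level reservations in particular, sparingly.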

