VMworld 2016: The Architecture and Future of Network Virtualization

Session: NET8193R. Bruce Davie, CTO Networking

Software developers need to be treated as a first-class customer. The developer is king.

Network virtualization is the bridge to the future.

Network architecture today: Data plane, control plane, management plane, cloud consumption. Distributed data plane, centralized control.

Management Plane Availability

  • Developers need access to the management plane and it needs higher availability than in years past
  • New: The scalable persistence of memory
  • Write and read scalability
  • Durability
  • Shrink-wrapped
  • Consistent snapshots
  • Atomic transactions
  • Driving innovation: Distributed, shared log – No single point of failure
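
The distributed, shared-log design isn't detailed in the session notes, but the abstraction can be sketched: every write is an atomic append that receives a unique, totally ordered sequence number, and a consistent snapshot is just the log replayed up to some point. The sketch below is a single-process toy of that abstraction (names are mine, not VMware's); a real implementation replicates appends across nodes so there is no single point of failure.

```python
import threading

class SharedLog:
    """Toy sketch of a shared-log abstraction: a totally ordered,
    append-only sequence of records. A real system replicates each
    append across nodes to avoid a single point of failure."""

    def __init__(self):
        self._lock = threading.Lock()
        self._records = []

    def append(self, record):
        """Atomically append one record; returns its sequence number."""
        with self._lock:
            self._records.append(record)
            return len(self._records) - 1

    def snapshot(self, upto=None):
        """A consistent snapshot is the log replayed up to a point."""
        with self._lock:
            if upto is None:
                return list(self._records)
            return list(self._records[:upto + 1])

log = SharedLog()
seq = log.append({"op": "create_port", "id": "lp-1"})
log.append({"op": "set_acl", "id": "lp-1"})
state_at_seq = log.snapshot(upto=seq)   # consistent view after the first write
```

Because every consumer sees the same total order, read scalability comes from replaying the log anywhere, while write ordering stays unambiguous.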

Control Plane Evolution

  • Heterogeneity – Hypervisors, gateways, top-of-rack switches, public cloud workloads, containers
  • Scalability – Thousands of hypervisors, 10,000s of logical ports
  • Central control plane – Generalized instructions that don’t need to understand heterogeneity
  • Local control plane – Hypervisor-specific controls (vSphere, KVM, Hyper-V, AWS, Azure, etc.)
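
The split above can be illustrated with a small sketch: the central control plane computes hypervisor-agnostic desired state, and a per-platform local control plane translates it into platform-specific configuration. All class names, field names, and command strings here are hypothetical illustrations, not NSX internals.

```python
# Central control plane emits generalized, platform-agnostic instructions;
# each local control plane translates them for its own hypervisor.

def central_desired_state(logical_port, segment_id):
    # Hypervisor-agnostic description of a logical port attachment.
    return {"port": logical_port, "segment": segment_id}

class KvmLocalControlPlane:
    def realize(self, state):
        # Hypothetical OVS-style translation for a KVM host.
        return f"ovs-style: attach {state['port']} to segment {state['segment']}"

class EsxLocalControlPlane:
    def realize(self, state):
        # Hypothetical vSphere-style translation for an ESXi host.
        return f"vsphere-style: attach {state['port']} to segment {state['segment']}"

desired = central_desired_state("lp-1", 5001)
for lcp in (KvmLocalControlPlane(), EsxLocalControlPlane()):
    print(lcp.realize(desired))
```

The point of the design is that only the thin local layer changes when a new platform (public cloud, container host, ToR switch) is added.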

What about non-virtualized workloads?

  • NSX has solutions for this problem

High-performance Data Plane

  • x86 processors can forward hundreds of millions of packets a second
  • DPDK – Data Plane Development Kit from Intel
  • Active-active edge cluster
  • Active-hot-standby for stateful services
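
For stateful edge services, active/hot-standby means the standby continuously receives state updates, so a failover does not drop established sessions. A minimal sketch of the idea (a toy, not VMware's implementation):

```python
class StatefulEdge:
    """Toy stateful edge service (e.g. NAT/firewall) tracking flows."""
    def __init__(self):
        self.flows = {}

class ActiveHotStandbyPair:
    """The active node handles traffic and replicates every state
    change to the hot standby, so failover preserves sessions."""
    def __init__(self):
        self.active = StatefulEdge()
        self.standby = StatefulEdge()

    def open_flow(self, flow_id, state):
        self.active.flows[flow_id] = state
        self.standby.flows[flow_id] = state   # synchronous state replication

    def failover(self):
        # Promote the standby; existing flow state is already present.
        self.active, self.standby = self.standby, self.active

pair = ActiveHotStandbyPair()
pair.open_flow("10.0.0.5:443", {"nat": "203.0.113.9:443"})
pair.failover()
assert "10.0.0.5:443" in pair.active.flows   # session survives failover
```

Stateless services can run active-active behind ECMP instead, since any node can handle any packet.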

Takeaway: Developers are key, and we need to make them successful.

Extend NSX to the public cloud – VMware is starting with AWS

Network virtualization for containers – Put a vSwitch in the guest OS


TechEd 2014: Mark and Mark on the Cloud

Session DCIM-B386: Mark Russinovich and Mark Minasi on Cloud Computing. Mark and Mark are easily the top two speakers each year at TechEd. Between their delivery style and technical content, you can’t beat them. This session had zero slides and was more of a Q&A format: Minasi asked Russinovich a variety of questions. I’ve captured some of the highlights in the session notes below. For the full effect, and lots of jokes, watch the video on Channel 9 whenever it gets posted.

  • Azure will double capacity this year, and then double again next year. They have over one million servers today and buy 17% of the servers worldwide.
  • Parts of Azure update from daily to every three weeks. Different components have different release cadences.
  • The Azure Hyper-V team branches the code base with new features, then the Azure features are rolled back into the general public release in the future. The merging and branching happens continuously.
  • Boxed products like Windows Server have a much longer test cycle than Azure releases. Different risk mentality.
  • Azure now runs on stock Hyper-V 2012 R2. Previously it was running a branched WS2012 hypervisor.
  • Building Azure is speeding up the pace at which features are added to Windows Server and other MS products.
  • The cloud is becoming cheaper and cheaper. Automation drives the cost of computing down. You must force yourself to automate.
  • Azure buys a zillion servers, custom white boxes, and intense automation drives down the prices.
  • Mark R. states there will be on-prem “forever”. For example, you still see mainframe today.
  • We are at the beginning of the hockey stick and haven’t hit the inflection point for cloud migrations.
  • On-prem will still be growing for the next several years. But the cloud will be growing much, much faster than on-prem.
  • As the cloud scales up, that’s where all the innovation and investments will go.
  • One common path to the cloud is dev/test. Developers are in a hurry and can easily spin up VMs without waiting for IT or for on-prem resources. Fewer security concerns.
  • Another common scenario is using the cloud for DR. Maybe companies will just leave it in the cloud after a failure.
  • Three major cloud players: Azure, Amazon, Google. The others in the short term will still exist, but over the years will fall away.
  • Cloud providers need a global presence and footprint, and it takes three years and $1B per datacenter to build out. MS is building out 20 datacenters concurrently right now. Small cloud providers just can’t compete at that scale.
  • Microsoft thinks they are the best cloud player because customers already have MS software on-prem and know it well. MS has a good connection with customers/products. Azure has Active Directory, which lets you use on-prem credentials for the cloud. Same role based access controls.
  • Active Directory is the center of gravity for cloud identity.
  • Office + Active directory worked extremely well for on-prem, and Azure is duplicating that in the cloud.
  • Over the next two years MS will increase the ‘same experience’ between on-prem and Azure, first starting with developers. Second priority is production workload similarity. Application and management consistency between on-prem and Azure.
  • IP addresses in Azure are not static. If you power cycle (not reboot) a VM it may/will get a different IP address.
  • This week MS announced true static IPs in Azure. You get 5 static IPs for free with every subscription.
  • Multiple NICs are coming to Azure VMs “soon”
  • Azure storage can be geo-replicated at an additional cost
  • Azure offers “site recovery” feature. Symantec is offering Azure backup targets.
  • Microsoft says a bug that would expose customer data to other customers would be “catastrophic” and may be the end of the cloud.
  • Microsoft is very concerned about data security
  • Microsoft does not datamine from VMs in Azure
  • MS is working on encryption technology where you can do compute on encrypted data but MS will not have access to the data.
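
Computing on encrypted data refers to homomorphic encryption. Textbook RSA (without padding) already has a limited homomorphic property, multiplication, which gives the flavor of the idea: the server can multiply two ciphertexts without ever seeing the plaintexts. This is a toy with textbook parameters, not Microsoft's scheme; unpadded RSA with small keys is insecure and for illustration only.

```python
# Classic textbook-RSA parameters (tiny and insecure; illustration only).
p, q = 61, 53
n = p * q                        # modulus: 3233
e = 17                           # public exponent
d = pow(e, -1, (p - 1) * (q - 1))  # private exponent (Python 3.8+)

enc = lambda m: pow(m, e, n)
dec = lambda c: pow(c, d, n)

# The "cloud" multiplies two ciphertexts without seeing 7 or 6...
c = (enc(7) * enc(6)) % n
# ...and only the private-key holder can decrypt the product.
assert dec(c) == 42
```

Fully homomorphic schemes extend this so arbitrary computation (not just multiplication) can run over ciphertexts.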

Beyond informative, the session was very entertaining. I definitely recommend watching the video for the full effect.



TechEd: IaaS with the Azure Pack (MDC-B364)

This session covers how to deploy on-prem IaaS (Infrastructure as a Service) using the Azure Pack for Windows Server 2012 R2 and VMM 2012 R2. The session was more developer-oriented than I expected from the description, so I ended up leaving a bit early since I’m not a developer. However, in the beginning the speaker did several demos of what the Azure Pack does, which I found very useful. He then dove into the back-end details of how it all works and what you have to do to build your own on-prem Azure VM gallery.

Hinted at in this session, and other sessions, is a possible roadmap feature where Microsoft would provide pre-configured gallery templates for certain Microsoft products like System Center and SQL. You would then be able to tweak the config, easily build up a service catalog, and deploy MS services on Hyper-V in a highly controlled, standardized, and automated way. The R2 releases of Windows Server and System Center have a lot of the building blocks to enable those features in the future. Given the accelerated release cadence of MS’s cloud platform, customers will get new features much faster than they historically have.


  • MS is hyper-focused on consistent cloud experience across the clouds (on-prem, Azure, service provider) at all layers (UX, APIs, PowerShell)
  • IaaS (Infrastructure as a service) – Elastic tiers
  • Customer requests: Enable templates to be deployed to any cloud, provide a gallery of applications, provide console access to remote VMs; managing standalone VMs is not enough
  • Vision (not 100% delivered in R2): A consistent service model amongst Windows Server, System Center and Windows Azure for composing, deploying and scaling virtualized applications and workloads.
  • Four pillars: portal user experience, deployment artifacts, management APIs, and clouds (on-prem, hosted, and Azure)
  • Consistent IaaS Platform: Delivered on portal user experience (Azure Pack), deployment artifacts, management APIs, Clouds

Demo #1

  • Showed a gallery for the VM role (new to Azure). Lists various services (SQL Server, IIS web server, SharePoint, etc.) that the admin has configured and curated. Gallery shows different versions of the same template, and can be tied to a subscription. When deploying a VM you can define the number of instances, for scale-out.
  • VM container, and Application container concepts (application payload is delivered into an OS)
  • The Gallery wizard prompts for a number of service properties (website name, admin names, VM sizes, etc.).
  • Shows a usage portal, which lists cores, RAM, storage, and VM usage. Also lists instances, IP address, disks, subscription, VM operations (power, stop, reset, etc.). Scale slider for increasing VM count.
  • Shows the ability to create a virtual network (e.g. creating a site-to-site VPN) in the Azure Pack.
  • Shows the ability to open a console to a Linux VM, or a VM without a network or OS

IaaS Architecture

  • Stack is: Hyper-V, VMM, Orchestrator, Operations manager, and two portals (tenant and service admin)
  • Steps to setup:
  • Load application extensions to VMM
  • Create a gallery item (VMM role template)
  • Create a service admin
  • Expose to tenant

Remote Console

  • Requires a new RDP client to support the new console version
  • Trust is established between all components (Azure Pack, Hyper-V, RDS gateway)
  • RDPTLSv2 is the new protocol

How to Build your Gallery

  • Definitions: VIEWDEF, RESDEF, RESEXT (consistent naming across Azure and on-prem/service provider)
  • RESDEF: Virtual machine role resource definition (VM size, OS settings, OS image reference)
  • RESEXT: Your Application (roles, features, OS image requirements, etc.)
  • VIEWDEF: User GUI experience definition (parameters, grouping, ordering, validation, etc.)
  • RESCONFIG: RESDEF parameter values, single deployment, versioned (e.g. hard coded port number, etc.)
  • Uses JSON, not XML (a more REST- and portal-friendly format)
  • Good support for command line installers/scripting (integrate PowerShell desired state, Puppet, etc.)
  • First class support for SQL deployments, IIS, etc. to make it very easy to configure
  • Built-in full localization support with a default language (which you can change)
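
Since the deployment artifacts are JSON, a RESDEF-style resource definition might look roughly like the sketch below. All field names and the type string are illustrative guesses on my part, not the actual Azure Pack schema:

```python
import json

# Illustrative RESDEF-style sketch; keys are hypothetical,
# not the real Azure Pack resource-definition schema.
resdef = {
    "Name": "IISWebServer",
    "Version": "1.0.0.0",
    "Type": "VMRole",  # hypothetical type string
    "IntrinsicSettings": {
        "HardwareProfile": {"VMSize": "Medium"},
        "OperatingSystemProfile": {"ComputerNamePattern": "web###"},
        "ScaleOutSettings": {
            "InitialInstanceCount": 2,
            "MaximumInstanceCount": 5,
        },
    },
}

text = json.dumps(resdef, indent=2)
print(text)
```

A RESCONFIG would then pin parameter values (instance counts, ports) for a single versioned deployment of this definition.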

TechEd: Building Clouds on Server 2012 R2 (MDC-B312)

This session was a firehose of information on the design considerations when building your private cloud on Server 2012 R2. There are a ton of new features in WS2012 and R2, so this was a high-level roadmap on how to figure out what you want to implement. Bottom line: with WS2012 R2 and System Center 2012 R2, you have a full cloud stack available. The 2012 releases built the foundation but had some missing pieces; the R2 releases fill those gaps, unify the release schedule, and simplify the experience.


  • Windows Server 2012 is Cloud optimized
  • Clouds are dynamic, multi-tenant, high scale, low cost, manageable and extensible
  • Major new cloud enabling features in Server 2012, released last year
  • 2012 built a strong platform, but was not a full cloud solution

WS2012 R2 Improvements

  • Live migration is much faster
  • Live migration from 2012 servers
  • Shared VHDX clustering
  • Automated block-level storage tiering
  • write-back cache
  • Per-share auto-redirection to scale-out file servers
  • Dedupe of VDI workloads
  • iSCSI target VHDX support
  • Multi-tenant site-to-site VPN gateway
  • Hyper-V NAT and forwarding gateway
  • vRSS
  • NIC teaming dynamic-mode
  • Desired state configuration
  • Datacenter abstraction layer
  • All aligned with System Center 2012 R2
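
Desired State Configuration (listed above) is declarative: you describe the state each resource should be in, and the engine tests the current state and enforces the declaration only when it has drifted. DSC itself is PowerShell-based; this is a language-neutral sketch of that test/set loop, with made-up resource names:

```python
class Resource:
    """DSC-style resource: test() checks the state, set() enforces it."""
    def __init__(self, name, desired, actual):
        self.name, self.desired, self.actual = name, desired, actual

    def test(self):
        return self.actual == self.desired

    def set(self):
        self.actual = self.desired   # idempotent enforcement

def apply_configuration(resources):
    """Only touch resources that have drifted from the declaration."""
    drifted = [r.name for r in resources if not r.test()]
    for r in resources:
        if not r.test():
            r.set()
    return drifted

resources = [Resource("FeatureWebServer", "Present", "Absent"),
             Resource("ServiceW3SVC", "Running", "Running")]
drifted = apply_configuration(resources)
```

Because enforcement is idempotent, the same configuration can be re-applied on a schedule to correct drift without side effects.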

Blueprint for a Cloud

  • Build your management stack
  • Start provisioning compute nodes and storage
  • Then you scale out as needed
  • This is a cloud “stamp”
  • Publish a self-service portal or APIs
  • Add network gateways
  • Add users


  • Think about: workloads, networking, storage, resiliency

Designing for the workload

  • Cloud-aware stateless apps or stateful apps?
  • IaaS cloud can support both but with different design considerations
  • What are the workloads performance requirements
  • 2 socket servers offer the best ROI
  • Some workloads will benefit from hosts with SR-IOV
  • Are workloads trusted? Think about level of isolation between workloads and QoS policies
  • Keep it simple and manageable
  • Can’t optimize a unified infrastructure for all possible workloads
  • Standardize VMs, self-service based, managed to an SLA

Network Design

  • Traffic isolation considerations (tenant generated traffic) and hoster/datacenter traffic (cluster traffic, storage, live migration, management, etc.)
  • Use physical isolation as needed, port ACLs, QoS & VM QoS
  • Between tenants and datacenter: separate networks
  • Between tenant VMs of different tenants: Hyper-V network virtualization & VM QoS
  • Hardware offloads for NICs: HW QoS (DCB), RDMA, RSC, RSS, VMQ, IPsecTO, SR-IOV
  • For storage, if using SMB 3.0, then the NIC would benefit from RDMA feature
  • R2: can also use RDMA for Live Migration
  • Look at RSS and RSC for NICs that carry management traffic (Live Migration, management)
  • Look at IPsecTO and VMQ for VM guest NICs
  • SR-IOV bypasses the extensible switch
  • R2: vRSS (spreads NIC traffic load across multiple VM cores)
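
RSS and vRSS spread receive processing by hashing each packet's flow tuple to a queue (and hence a core), so a given flow always lands on the same queue and packet ordering is preserved. Real NICs use a keyed Toeplitz hash; this sketch substitutes a generic hash purely for illustration:

```python
import hashlib

NUM_QUEUES = 4   # e.g. one receive queue per core

def rss_queue(src_ip, src_port, dst_ip, dst_port, proto="tcp"):
    """Hash a flow's 5-tuple to a queue index.
    Real NICs use a keyed Toeplitz hash; SHA-256 stands in here."""
    key = f"{src_ip}:{src_port}-{dst_ip}:{dst_port}/{proto}".encode()
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:4], "big") % NUM_QUEUES

# Packets of the same flow always hit the same queue;
# different flows spread across queues/cores.
q1 = rss_queue("10.0.0.1", 40000, "10.0.0.2", 443)
q2 = rss_queue("10.0.0.1", 40000, "10.0.0.2", 443)
assert q1 == q2
```

vRSS applies the same spreading inside the guest, so one virtual NIC's traffic isn't bottlenecked on a single vCPU.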

Storage Design

  • Hyper-V servers with internal SAS disks are perfectly acceptable if you don’t need super-high HA
  • 2012: Can pool shared JBOD SAS array for some good HA
  • Scaling options: Block based FC or iSCSI or file based (lower cost w/ high performance)
  • Block based enables storage offload with ODX, and high IOPS

Resiliency Approaches

  • Infrastructure-Level Resiliency – VMs not designed to handle failures rely on HA at the server level, failover clustering as another layer of protection, high-end servers, and redundant power.
  • App-Level Resiliency – Cloud-aware apps can sustain failures without infrastructure dependency

WS2012 Representative Configurations

  • Three different approaches are fully documented and validated by Microsoft:
  • aka.ms/CloudBlog
  • aka.ms/CloudConfigs
  • aka.ms/CloudPowerShell

How do you deploy and configure?

  • In 2012 it was a mixture of GUI and a lot of PowerShell
  • With R2 and alignment with System Center 2012 R2, it is much, much easier
  • “Physical computer profile” is new in SC2012R2 – Deploy Hyper-V to bare metal
  • Demo showed provisioning a new scale out file server and creating a file share, all from a GUI

Scaling Considerations

  • Compute (Hyper-V) cluster size
  • Larger clusters improve overall efficiency
  • Consider clustering across failure domains (e.g. cross-rack)
  • Storage: Need JBODs with appropriate number of SAS interfaces
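
Clustering across failure domains means spreading a cluster's nodes so that losing one rack takes out as few members as possible. A round-robin placement sketch of that idea (host and rack names are made up):

```python
from collections import defaultdict

def place_across_racks(nodes, racks):
    """Round-robin nodes across racks so no rack holds many more
    cluster members than any other (toy anti-affinity placement)."""
    placement = defaultdict(list)
    for i, node in enumerate(nodes):
        placement[racks[i % len(racks)]].append(node)
    return dict(placement)

nodes = [f"hv{i:02d}" for i in range(8)]
placement = place_across_racks(nodes, ["rackA", "rackB", "rackC", "rackD"])
# Losing any single rack costs at most 2 of the 8 cluster members.
assert max(len(v) for v in placement.values()) == 2
```

The same spreading logic applies to quorum: keep a majority of votes surviving any single-domain failure.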

Management Stack Improvements In R2

  • Provides a unified PowerShell method to manage physical devices, such as switches
  • MS created a logo program that vendors can certify against
  • MS open sourced the OMI standard for anyone to use
  • Desired State Configuration (DSC) – covered in session MDC-B302

Windows Azure Pack

  • Same self-service portal as Azure
  • Common management experience
  • Workload portability
  • As future services are delivered in Azure, they will be transferred into the private cloud

San Diego VMUG: vCloud Director Best Practices

This session covered some high-level vCloud Director best practices, by Spencer Cuffe (VMware). It was a short session without a lot of detail, but for what it’s worth, here are a few notes:

  • Pre-Reqs – One cell per vCenter, NTP across all devices, one vCNS per vCenter, one provider vDC per vSphere Cluster


  • There is no one size fits all solution
  • There are some things that are very challenging to change post-deployment
  • Allocation models: Changing may require powering down VMs to apply limits/reservations
  • Storage tiering is important (capacity, tiers, I/O requirements, fast provisioning [don’t use it everywhere])
  • Networking: Configure the external network first, then configure the DVS
  • IP addressing within the VM will require a power cycle if changed
  • VXLAN: Plan for it and prepare vCNS with it first
  • Use distributed switches for everything

vCloud Director Use Cases

  • Hosted/Public Cloud – Customer isolation, catalog isolation
  • Development – Consistency, on-demand environments
  • Testing – Dev, IT, vendor packages, etc.
  • QA – Pre-prod clean environments
  • Support desk – Test in isolated environment
  • Private/Hybrid Clouds


  • Storage gets used quickly
  • Services: DHCP, DNS, NTP, etc.
  • Who is responsible? Administrators, organization admins, template maintenance, firewalls, remote access, etc.
  • Access to VMs? Jump box, remote console, SSH, etc.

Redundancy and Maintenance

  • Load balance cells
  • VMware vCenter heartbeat
  • Consider VMware FT/HA for vCNS
  • Use maintenance mode when doing maintenance on a cell
  • Use “Display debug information” to dig deeper into error messages
  • Configure syslog to capture all the activities centrally

Using vCD

  • Set HA host failures to a percentage instead of N+1
  • Use VMware Orchestrator to automate common tasks
  • When using fast provisioning, end users should have a limited lifecycle for vApps.
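
The difference between N+1 "host failures" and a percentage reservation for HA admission control is easy to see with arithmetic: a flat percentage scales smoothly as the cluster grows or gains heterogeneous hosts. A simplified sketch of the reserved-capacity math (real vSphere HA admission control also accounts for slot sizes and per-VM reservations):

```python
def reserved_capacity_ghz(host_ghz, policy, value):
    """Capacity held back by HA admission control (simplified).
    policy='hosts':   reserve the largest `value` hosts (N+1 style).
    policy='percent': reserve a flat percentage of total capacity."""
    if policy == "hosts":
        return sum(sorted(host_ghz, reverse=True)[:value])
    if policy == "percent":
        return sum(host_ghz) * value / 100
    raise ValueError(policy)

hosts = [32, 32, 32, 64]   # GHz per host; one newer, larger host
# N+1 must assume the biggest host fails, reserving a full 64 GHz...
assert reserved_capacity_ghz(hosts, "hosts", 1) == 64
# ...while a 25% reservation holds back 40 GHz of the 160 GHz total.
assert reserved_capacity_ghz(hosts, "percent", 25) == 40
```

With mixed host sizes, the percentage policy avoids over-reserving around the single largest host, which is the motivation behind the recommendation above.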