SQL 2012 Failover Cluster Pt. 3: iSCSI Config

This is Part 3 of the SQL 2012 failover clustering series. In this part we will configure the required LUNs and iSCSI components so that Windows Server 2012 can mount the shared LUNs. As I mentioned before, I’m choosing to do in-guest iSCSI for the shared LUNs instead of RDMs. RDMs are acceptable as well, and there’s no one right or wrong answer here.

On larger SQL servers you will quickly run out of drive letters, so we will be using mount points for the majority of the volumes we create. This minimizes drive letter usage and allows seamless future expansion without worrying about what letter to use, or running out. CSVs (cluster shared volumes) also appear to be an option, but honestly I’ve never used them.

On a security note, iSCSI supports various kinds of CHAP authentication. Depending on your array vendor, it may or may not support CHAP. If you value security then I would encourage you to enable mutual CHAP, so both your target and initiator can authenticate each other. Some vendors also limit the CHAP password complexity, so if you have problems with CHAP authentication, try simpler passwords, such as limiting them to upper case letters and numbers (yes, I’m speaking to you, QNAP).

Blog Series

SQL 2012 Failover Cluster Pt. 1: Introduction
SQL 2012 Failover Cluster Pt. 2: VM Deployment
SQL 2012 Failover Cluster Pt. 3: iSCSI Configuration
SQL 2012 Failover Cluster Pt. 4: Cluster Creation
SQL 2012 Failover Cluster Pt. 5: Service Accounts
SQL 2012 Failover Cluster Pt. 6: Node A SQL Install
SQL 2012 Failover Cluster Pt. 7: Node B SQL Install 
SQL 2012 Failover Cluster Pt. 8: Windows Firewall
SQL 2012 Failover Cluster Pt. 9: TempDB
SQL 2012 Failover Cluster Pt. 10: Email & RAM
SQL 2012 Failover Cluster Pt. 11: Jobs N More
SQL 2012 Failover Cluster Pt. 12: Kerberos n SSL

LUN Design Considerations

First let me say that SQL LUN design is HIGHLY dependent on your workload, number of SQL instances, database size, number of databases, etc. You must understand the business and performance requirements, or you are setting yourself up for failure.

I don’t claim to be a DBA, so consult your nearest DBA for configuration details specific to your environment. However, what I have found works well for smallish SQL servers that have a single instance is the following set of volumes/LUNs:

  • Quorum – Needed for MS failover clustering
  • Database – Database files
  • Database Logs – Database log files
  • TempDB – SQL temporary database (may see high I/O depending on apps)
  • TempDB Logs – Temporary database log files (may see high I/O depending on apps)
  • Backup – Working space for SQL-native backup files

If you are building a true enterprise SQL server with multiple instances, you will very likely have a lot more volumes. For the purposes of my small lab and supporting vCenter 5.5, I kept the LUNs small. One trick that I’ve learned over the years to help keep your LUNs straight is making them all slightly different sizes. This is particularly true if you have multiple SQL instances on a server that use different LUNs. If you look in the Windows storage manager and see 10 LUNs of the same size, but they are backed by different storage QoS, it’s hard to match them all up. So I vary the sizes by 1-2 GB to keep them unique.

For my home lab I configured the following LUNs on my QNAP array. As you can see, I varied the size of all the LUNs so I can quickly match them up with their intended purpose inside Windows. The same logic would apply to RDMs. As with any block storage, you should configure array-side LUN masking so that only the authorized systems can access your LUNs.

[Screenshot: the LUNs configured on the QNAP array, each with a slightly different size]

I won’t go into depth about the RAID levels and number of IOPS you should design for. Modern storage arrays are complex beasts, IOPS requirements wildly vary, and you must know your application requirements. Generally RAID-1 for the SQL logs and RAID-5 for the databases is recommended. Again, consult your DBA and storage architect for the right answers.

Newfangled storage technology like PernixData and VMware Virsto can do wonders for I/O performance (though they won’t help with in-guest iSCSI). That’s another reason why I’m a fan of SQL AlwaysOn replication, since you can fully virtualize all disks and use these exciting new storage technologies to increase performance. Dare I say software defined storage?

Windows Disk Configuration

1. Now that you’ve carved up your iSCSI LUNs we need to install a couple of Windows features to proceed. Run through the Add Roles and Features Wizard and add the following two Features: Failover Clustering and Multipath I/O. Do this on both of your SQL VMs.
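
If you would rather skip the wizard, the same two features can be added from an elevated PowerShell prompt. This is a minimal sketch rather than the method used above; it assumes the standard feature names on Windows Server 2012, and it must be run on both SQL nodes.

```powershell
# Run on BOTH SQL nodes from an elevated PowerShell prompt.
# Adds the Failover Clustering and Multipath I/O features plus their management tools.
Install-WindowsFeature -Name Failover-Clustering, Multipath-IO -IncludeManagementTools

# Confirm that both features now show as Installed
Get-WindowsFeature -Name Failover-Clustering, Multipath-IO
```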

2. Since iSCSI networks should be non-routed for security and performance reasons, I’m going to add a second NIC to each SQL VM. If you are using the VMware distributed virtual switch (with NIOC) or Cisco UCS where you can configure QoS for various pNICs or traffic types, then I would suggest putting some thought into what makes sense for your environment. Even if you aren’t using the DVS, configure an iSCSI port group on your vSwitch and configure it for your non-routed iSCSI VLAN. Don’t configure a vmkernel iSCSI port, since we are using in-guest iSCSI.

Once you add your second NIC to the VM, we need to make some configuration changes. First up, I would strongly recommend you rename the NICs so you don’t get them confused. I called mine Production (regular network traffic) and iSCSI. Do NOT put both NICs on the same IP subnet. Cluster services will only recognize one NIC per subnet. I faked it out by using pseudo IPs for my iSCSI network (and iSCSI actually routed through production).
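
The rename can also be scripted. A small sketch, assuming the adapters still carry the default names "Ethernet" and "Ethernet 2" (check yours first, the names below are only placeholders):

```powershell
# List the adapters so you can tell which one sits on the iSCSI port group
Get-NetAdapter | Format-Table Name, InterfaceDescription, MacAddress, Status

# Assumption: "Ethernet" is the production NIC and "Ethernet 2" is the new iSCSI NIC
Rename-NetAdapter -Name "Ethernet"   -NewName "Production"
Rename-NetAdapter -Name "Ethernet 2" -NewName "iSCSI"
```

Matching on the MAC address shown in the VM's settings is the safest way to be sure which adapter is which.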

3. Open the iSCSI adapter properties and un-bind the File and Printer Sharing protocol.

Configure the adapter with a static IP address and do not configure a gateway or DNS info.

Un-check the box to register the connection details in DNS. Why? You don’t want a private non-routable IP address in DNS, or clients will try to connect to the unreachable address. Clearly this would cause bizarre issues.
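
For those who prefer PowerShell over clicking through adapter properties, the three changes in this step can be sketched as follows. The adapter name "iSCSI" and the IP address are assumptions for illustration; substitute your own values.

```powershell
$nic = "iSCSI"   # assumption: the adapter name chosen when the NICs were renamed

# Un-bind "File and Printer Sharing for Microsoft Networks" (component ID ms_server)
Disable-NetAdapterBinding -Name $nic -ComponentID ms_server

# Static IP, deliberately with no default gateway and no DNS servers.
# 192.168.50.11/24 is a placeholder for your non-routed iSCSI subnet.
New-NetIPAddress -InterfaceAlias $nic -IPAddress 192.168.50.11 -PrefixLength 24

# Do not register this connection's address in DNS
Set-DnsClient -InterfaceAlias $nic -RegisterThisConnectionsAddress $false
```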

Do a DNS lookup of your SQL server hostname to ensure that only a single IP is returned.
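
A quick way to run that check from PowerShell is shown below. The host name is a made-up placeholder; you want exactly one A record back, and it should be the production IP.

```powershell
# Assumption: sqlnode1.contoso.local is a placeholder for your SQL node's FQDN
Resolve-DnsName -Name sqlnode1.contoso.local -Type A |
    Format-Table Name, IPAddress
```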

4. Press the Windows key then type iSCSI. You should get a warning message that the iSCSI service is not running. Click Yes to make sure it starts every time Windows does.
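
Clicking Yes simply sets the Microsoft iSCSI Initiator service to start automatically. If you are scripting the build, a rough equivalent is:

```powershell
# MSiSCSI is the service name of the Microsoft iSCSI Initiator Service
Set-Service   -Name MSiSCSI -StartupType Automatic
Start-Service -Name MSiSCSI

# Should report Status = Running
Get-Service -Name MSiSCSI
```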

5. When the iSCSI Initiator Properties window appears, enter the IP address of your iSCSI target. In my case I entered the IP address of my baby QNAP. Click Quick Connect and your array should now be listed in the discovered targets.
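
Quick Connect can be approximated with the iSCSI cmdlets that ship with Windows Server 2012. This is only a sketch: the portal IP is a placeholder, it connects to every target the portal advertises, and it does not configure CHAP.

```powershell
# Assumption: 192.168.50.10 is a placeholder for your array's iSCSI target IP
New-IscsiTargetPortal -TargetPortalAddress 192.168.50.10

# Connect to each discovered target and make the connection survive reboots
Get-IscsiTarget | Connect-IscsiTarget -IsPersistent $true

# Verify the sessions
Get-IscsiSession | Format-Table TargetNodeAddress, IsConnected, IsPersistent
```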

6. Click on the Volumes and Devices tab, and click Auto Configure. A list of LUNs should appear, equal in number to the LUNs you exported from your array.

7. One common “mistake” that I see people new to Windows Server 2012 make is using the legacy Disk Management MMC snap-in that’s been around for 15 years. Wrong answer these days for most storage tasks. Instead, from Server Manager click on File and Storage Services, then open Disks.

8. From here you should now see all of your LUNs, regardless of their state. Notice that the “number” does not correspond to the LUN IDs from your array (unless you get lucky), hence my propensity for unique LUN sizes.
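
The same view is available from PowerShell, and sorting by size makes the "unique LUN size" trick easy to use. A small sketch:

```powershell
# List only the iSCSI-attached disks, smallest first, with the size in GB
Get-Disk |
    Where-Object { $_.BusType -eq 'iSCSI' } |
    Sort-Object Size |
    Format-Table Number, FriendlyName,
        @{ Name = 'SizeGB'; Expression = { [math]::Round($_.Size / 1GB, 1) } },
        OperationalStatus, PartitionStyle
```

Match each SizeGB value against the sizes you carved on the array to work out which Windows disk number is which LUN.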

Don’t fall into the “click next” trap here and bring all of the disks online with their default values. No no! Another performance optimization is choosing the proper NTFS allocation size based on the LUN’s intended usage. For our D drive, where the SQL binaries are installed, we will select the default size.

However, for all other LUNs (database, logs, and tempdb) we will use 64K allocation size. This better matches the SQL server I/O size, and is generally more efficient. The new 2012 disk wizard also initializes all disks as GPT, which is required to go beyond 2TB. Even for smaller LUNs I now always use GPT, instead of the legacy MBR format. The only LUN configured as MBR is the boot volume. Short name generation is also disabled in the 2012 wizard, and I leave that off as well for a slight performance tweak.
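
For reference, here is roughly what the wizard does for one data LUN, expressed in PowerShell. Treat it as a sketch under assumptions: the disk number and volume label are placeholders, and it should only ever be run on ONE node. The partition is created without a drive letter so the volume can later be mounted under a folder.

```powershell
$diskNumber = 3            # assumption: look up the right number with Get-Disk first
$label      = "SQL-Data"   # hypothetical volume label

# Bring the raw LUN online, make it writable and initialize it as GPT
Set-Disk -Number $diskNumber -IsOffline $false
Set-Disk -Number $diskNumber -IsReadOnly $false
Initialize-Disk -Number $diskNumber -PartitionStyle GPT

# One partition spanning the LUN, formatted NTFS with a 64K (65536 byte) allocation unit
New-Partition -DiskNumber $diskNumber -UseMaximumSize |
    Format-Volume -FileSystem NTFS -AllocationUnitSize 65536 `
                  -NewFileSystemLabel $label -Confirm:$false
```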

A summary of my proposed disk configuration is in the table below. F is a tiny partition that only serves as a mount point holder for the bigger database and log volumes. While I’m still using up a fair number of letters here, as you add more DB and log volumes those would get mounted under F.

[Table: proposed disk configuration]

Now go ahead and on ONE node (ONLY!!!) bring all of the volumes online, format, and mount as shown in the table above.
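
Mounting a volume under a folder instead of a drive letter looks something like this in PowerShell. The disk number and the F:\SQLData folder are assumptions for illustration, and the sketch expects the LUN to be formatted already and the F: volume to exist.

```powershell
$diskNumber = 3               # assumption: the LUN you want to mount
$mountPath  = "F:\SQLData"    # hypothetical empty folder on the mount point holder volume

# The folder must exist (and be empty) before it can act as a mount point
New-Item -ItemType Directory -Path $mountPath -Force | Out-Null

# On a GPT disk the first partition is the small reserved one, so pick the Basic data partition
Get-Partition -DiskNumber $diskNumber |
    Where-Object { $_.Type -eq 'Basic' } |
    Add-PartitionAccessPath -AccessPath $mountPath
```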

9. Assuming that you followed my example, if you look at the Volumes tab you should see something similar to the screenshot below.

[Screenshot: SQL Volumes]

10. Go over to your second SQL node and format your D volume (ONLY!). Do a disk rescan and validate that all of the shared volumes are OFFLINE. Do NOT bring them online. If you did bring them online on the second node, you may have corrupted the NTFS file system. So to be safe I’d unmount the volume(s) from both nodes, reset the disk config, and reformat on one node.
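
On the second node you can do the rescan and the offline check without touching anything. A read-only sketch:

```powershell
# Rescan storage, then list the shared iSCSI disks. This changes nothing on disk.
Update-HostStorageCache

Get-Disk |
    Where-Object { $_.BusType -eq 'iSCSI' } |
    Format-Table Number, Size, OperationalStatus, IsOffline
```

Every shared LUN should show IsOffline as True on this node.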

Multi-Pathing

Built into Windows 2012 is a multi-pathing plug-in that works with a variety of storage vendors. Most commonly this would be for Fibre Channel SANs, or physical Windows servers that have multiple NICs and storage arrays with multiple iSCSI IPs. Depending on your iSCSI infrastructure you may in fact have two paths to your LUNs. If you do, then I’ll show you how to enable MPIO and make sure it has claimed your iSCSI LUNs.

In my home lab MPIO won’t do anything for me since I have a single NIC in my VM and my QNAP has a single iSCSI target IP. I don’t like to configure unnecessary services, so if you can’t take advantage of true multi-pathing, then I would not suggest activating iSCSI MPIO.

1. In the Server Manager dashboard select the MPIO tool. Open the Discover Multi-Paths tab, check the iSCSI box, then click Add. You will then need to reboot.
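
If the Multipath I/O feature is installed, the same claim can be made with the MPIO cmdlets. A sketch, only worth running if you really do have multiple paths:

```powershell
# Tell the Microsoft DSM to automatically claim iSCSI-attached devices
Enable-MSDSMAutomaticClaim -BusType iSCSI

# Review the automatic claim settings
Get-MSDSMAutomaticClaimSettings

# A reboot is still needed afterwards; uncomment when you are ready
# Restart-Computer
```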

2. Your reboot may take longer, so don’t worry if things seem to hang for a couple of minutes. Open Computer Management then find one of your iSCSI devices and you should see an MPIO tab. In there you can find some geeky stats and settings (which I would NOT tweak).

Summary

Ok, so that was a lot of disk configuration and it was somewhat tedious. I can certainly see why some people would just throw up their arms, install SQL on the C drive, put everything else on the D drive and call it a day. And they then wonder why SQL is slow as a snail in winter and blame it on VMware overhead. Blame yourself, not VMware. 🙂

Next up in Part 4 is configuring Microsoft failover cluster services.

VMware ESXi 5.1 Patches Released

Hot off the presses are some ESXi 5.1 patches. This build of ESXi 5.1 (1157734) fixes several bugs and more importantly addresses some security issues. As always in any environment, please test out the patches thoroughly before putting them into production. Each environment is unique, and issues may surface that could cause you some headaches. These bug fixes aren’t earth shattering, so I would not suggest rushing them out to production systems.

ESXi 5.1 Build 1157734

Highlights of the patch bundle included in this release are:

  • Black frames might appear around text boxes in an application running on Virtual Machine Hardware Version 8 or later. This issue occurs on virtual machines with Windows 7 guest operating system and View 5.0 PCoIP.
  • For two ESXi hosts with different host names, identical machine names are generated in the domain controller under certain conditions. As a result, the Active Directory functionality is lost for one of the two ESXi hosts.
  • After you upgrade to ESXi 5.1 from an earlier version, attempts to power on a virtual machine with static MAC address outside the allowed range (00:50:56:[00-3f] or 00:50:56:[80-BF]) fail with the following error message: The MAC address entered is not in the valid range.
  • If a physical NIC is named using non-standard naming conventions (other than vmnic#) and is added to a vSwitch, host profile creation fails with the following error message: Invalid value chosen for active NICs.
  • ESXi 5.1 hosts might get disconnected randomly from the vCenter Server system. This issue might occur if the heartbeat thread in the vpxa agent does not receive a response from the futex_wait system call. As a result, the heartbeat thread stops responding, and the vCenter Server does not receive heartbeat messages from the ESXi hosts for several hours.
  • Upon reboot, ESXi 5.1 hosts configured to obtain DNS configuration and host name from a DHCP server display their host name as localhost in syslog rather than the host name obtained from the DHCP server. As a result, for a remote syslog collector, all ESXi hosts appear to be the same, with the same host name.
  • To prevent buffer overflow, the HPSA proc node truncates LUN details on an ESXi host.
  • This patch updates the esx-base VIB to resolve a stability issue.

As always, you can download the ESXi patches from here. The full KB article for the patch bundle is here.

TechEd: Comparing Microsoft and VMware Private Clouds (MDC-B352)

This was Part 2 of a two-part series comparing VMware and Microsoft virtualization/cloud offerings. Part 1 focused on the hypervisor and how Hyper-V and ESXi compare; I had a schedule conflict, so I didn’t attend it. Part 2 focused on the private cloud offerings. I thought Microsoft did a decent job in the 75 minutes provided. VMware has a leg up in some areas, while in others Microsoft has a leg up or a longer track record (such as Operations Manager and Configuration Manager).

A lot of differences in both products were not discussed, and would take a lot more time than 75 minutes. But it’s clear with Windows Server 2012 R2 and System Center 2012 R2 that they are making rapid and big strides in the private cloud and virtualization arena. Now that VMware and Microsoft appear to be on a yearly release cadence, I see the “Cloud OS” battle really heating up. MS has a lot of ground to make up, and they clearly knew it.

Private Cloud Technologies

Speaker acknowledges this is not a perfect comparison, as each vendor packages up features differently. For example, vCloud Director does a lot more than just self-service, but MS VMM has vCloud Director-like functionality not found in vCenter. So you can’t exactly line up products and say they are the same. Instead, compare the entire stack from each vendor to really see how they shape up, rather than doing per-product comparisons.

  • Hypervisor – Microsoft – Hyper-V; VMware – vSphere Hypervisor
  • VM Management – Microsoft – VMM; VMware – vCenter Server
  • Self-Service – Microsoft – App Controller; VMware – vCloud Director
  • Monitoring – Microsoft – Operations Manager; VMware – vCenter Operations Management Suite
  • Protection – Microsoft – Data Protection Manager; VMware – vSphere Data Protection
  • Service Management – Microsoft – Service Manager ; VMware – vCloud Automation Center
  • Automation – Microsoft – Orchestrator; VMware – vCenter Orchestrator

Private Cloud Software Licensing

Both vendors license their suites on a per-socket basis. You can buy some VMware products a la carte, and some lesser known products aren’t included in the vCloud Suite. So depending on what features you need, you may need a different set of products.

  • Microsoft – System Center 2012 SP1 (per socket) & Hyper-V
  • VMware – vCloud Suite & vCenter

Key Focus Area for this Session

  • Granular App & Service Deployment
  • Deeper insight and remediation
  • Protection for key apps and workloads
  • Hybrid Infrastructure
  • Costs

Granular App & Service Deployment

  • On VMware you use templates to deploy standardized VMs. Templates are simple, but static.
  • In VMM you also have a dedicated Library for VM templates (like VMware) and service templates
  • In VMM you can have lots of templates all pointing to the same VHDX image (templates can have different features/etc.). Or small, medium, large, etc. templates all pointing to the same OS image.
  • In VMM you can add roles/features to the guest VM template and capture them in the template
  • You can have separate guest profiles, and can marry them up with a hardware profile and a VHDX image without using any extra disk space
  • In VMM you can add applications, such as SQL, and easily create a template
  • VMM can directly configure App-V server packages and inject them into the VM template
  • VMM 2012 has a concept of service templates. Service template allows you to build and model multi-tier services. Ability to configure scale out rules, for example. Drag and drop VM templates onto a canvas and you can customize the VM properties.
  • Anything you can do in VMM you can do in PowerShell
  • VMM is more about delivering services to the business unit, not just deploying individual VMs
  • “Create Cloud” button in VMM. Defines resources, networks, load balancers, VIP templates, Port classifications (NIC), Storage, library, define capacity quotas (vCPUs, memory, storage, VMs, etc.). Ability to select hypervisor (Hyper-V, VMware, XenServer).

Service Manager

  • IT self-service management portal, built on SharePoint (also a full helpdesk ticketing system)
  • ITaaS offering
  • Plugs into VMM, Orchestrator
  • BI is built into service manager for deep reporting
  • Download “Cloud Service Process Pack” which pre-configures VMM, Service Manager and Orchestrator for a self-service VM portal

Orchestrator

  • Custom automation with minimal scripting needed
  • MS Orchestrator has a lot of plug-ins for third party products and hardware (integration packs)

Operations Manager

  • Extensible with MS and third-party management packs. Veeam MP can do deep monitoring of VMware environments.
  • Veeam MP is not free, so if you want to monitor VMware with SCOM you will have to license the excellent MP
  • OpsMgr can also monitor network infrastructure (switch CPU usage, memory, port-level stats, etc.)
  • Maintains the relationship between VMs and physical hardware such as switch ports, etc.
  • Server-side, client-side and synthetic transactions for application monitoring
  • Global Service Monitor (GSM) – MS Azure based global services that will test your private cloud app

Visual Studio Integration

  • VMM Library is accessible from Visual Studio
  • Team Foundation Server can use the “Test & Lab Manager” which will spin out VMs for automated dev testing via VMM

System Center Advisor

  • Provides configuration guidance around specific workloads (SQL, etc.) for troubleshooting. Free from MS.

Data Protection Manager

  • Supports Windows server, SQL server, SharePoint, Exchange, Dynamics
  • Up to every 15 minute differential backups
  • DPM can back up to Azure and tape
  • Changed block tracking for VM backups
  • Cluster aware – integrates with CSV
  • Item-level restore
  • DPM has no inline dedupe, but VMware data protection does

Heterogeneous Environments

  • VMM can connect to and provide basic management of vCenter
  • Can use VMM service templates on VMware hosts
  • Many integration and management packs for third party software and hardware (HP, NetApp, Cisco, etc.)

Hybrid Infrastructure

  • Private cloud (VMM can manage XenServer, vSphere, Hyper-V)
  • System Center can link to Service Provider and Azure
  • Single Sign on with AD (Azure)
  • Integrated with DEV (Team Foundation)

Cost Scenario

Cost scenarios can be extremely tricky and misleading. Plus large enterprises will likely get big discounts from both VMware and Microsoft. So take the numbers below with a grain of salt. Not in the cost calculation is the cost of the guest operating systems, since it was assumed both used the same OSes so the cost was a wash. The costs were only for the hypervisor and cloud stack.

The speaker didn’t mention the Microsoft ECI license (enrollment for core infrastructure). This combines the operating system and system center stack licenses into a single SKU, licensed by the socket. The datacenter edition of ECI allows unlimited VM deployment and management using all cloud features. Even if you are a 100% VMware shop for the hypervisor,  you may still have the ECI license if you use system center components (such as SCCM or SCOM). So you may already be fully licensed from the MS perspective and incur no additional software costs for the MS cloud stack.

  • Example: 500 VM Private cloud; 15:1 VM to host ratio; 34 hosts, 2 sockets with 16 cores; Windows Server licensing additional; comprehensive management; 68 licenses of Windows server datacenter
  • 68 CPUs Hyper-V: $0; 68 CPUs of System Center $122K
  • 68 CPUs vCloud Enterprise Suite $781K, vCenter $5K
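
To see where the 34 hosts and 68 licenses come from, and how the two totals compare, here is the arithmetic as a small PowerShell sketch. The dollar figures are the speaker's list prices from the bullets above; the variable names are mine.

```powershell
$vmCount        = 500
$vmsPerHost     = 15
$socketsPerHost = 2

# 500 / 15 = 33.3, so you need to round up to whole hosts
$hosts   = [math]::Ceiling($vmCount / $vmsPerHost)
$sockets = $hosts * $socketsPerHost

# Speaker's numbers: Hyper-V $0 + System Center $122K; vCloud Suite $781K + vCenter $5K
$microsoftTotal = 0 + 122000
$vmwareTotal    = 781000 + 5000

"Hosts: $hosts, sockets: $sockets"
"Microsoft stack: {0:N0} total, {1:N0} per socket" -f $microsoftTotal, ($microsoftTotal / $sockets)
"VMware stack:    {0:N0} total, {1:N0} per socket" -f $vmwareTotal, ($vmwareTotal / $sockets)
```

Swap in your own negotiated prices to get a more realistic comparison.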

Awarded VMware vExpert 2013 Status

It’s that time of year again when VMware vExpert 2013 awards are handed out. Last year I was very surprised and honored that my blogging efforts got such recognition and that I made the 2012 list. Since then I’ve tried to do even more community work, and spent countless hours doing the vSphere 5.1 install guide series. And for 2013 I’m honored to again be selected as a VMware vExpert. You can find the full VMware blog post with all 579 of us here. I look forward to continuing to participate in the community, and expand my audience reach. My last name got fat fingered in the list, so hopefully they can fix that up in short order.

Automate Sysprep on vSphere w/o Custom Specs

I’m a huge fan of using vCenter customization specifications to automate the sysprep process for deploying new Microsoft VMs. The sysprep process ensures a unique Windows SID, sets the VM’s hostname, and can even join a VM to the domain, among other things. However, the customization specifications can only be triggered when you clone a VM. While that may be good for a vast majority of use cases, I recently ran across a scenario where that was not possible since my VMs already existed.

For a VDI project I am looking at a software storage appliance to offload a lot of the IOPS from our back end storage system, to increase performance and reduce costs. One feature our particular solution has is called “fast clone”, which allows the storage appliance to create a VM clone in just a few seconds, instead of several minutes using the vCenter clone method. Internally it adjusts some pointers to the VMDK, and de-dupes, so it doesn’t have to copy every block when you create a new VM. In fact, very few blocks are copied during the cloning process.

However the “fast clone” process literally cloned the master VM and did not have any method to trigger vCenter customization specs. As a result all the Windows hostnames were the same, as were the SIDs. I certainly did not want to run sysprep manually on hundreds of VMs. The vendor workaround was far too complex and cumbersome to consider. So I developed the script below, which automates the major tasks that the vCenter customization specifications perform and is easier (IMHO) than what the vendor suggested.

Script Features

  • Copies an existing sysprep unattend XML file to the VM via the VMware tools VIX interface
  • Each unattended XML file is automatically customized with the VM’s name as it appears in vCenter, so sysprep will change the Windows hostname appropriately
  • Deletes the residual unattended XML files which may contain sensitive passwords or product keys
  • Auto-joins the VM to the domain assuming an appropriately configured unattend XML file and DHCP is available
  • Accepts a command line argument for easy testing against one VM, but it will also read a CSV file for mass processing

It’s up to you to supply an appropriately configured Windows sysprep unattended XML file for the operating system in question. If you include domain join parameters then it will join the VM to the domain as well, all without prompting for a username or password. To delete the residual XML files, the script will upload a setupcomplete.cmd file to c:\windows\setup\scripts. It will not over-write any existing file, so make sure it doesn’t exist. Windows knows to automatically run that script after the sysprep process.

In order to customize the unattended XML file with the VM’s hostname, the script does a simple replace on a string called “CHANGEHOSTNAME”. When you create your XML file be sure to use this name for the machine name, so the search and replace will work properly. Otherwise all the VMs will have the same hostname!
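
To make the search and replace concrete, here is a tiny stand-alone sketch of what happens to the file. The XML is a heavily trimmed, illustrative fragment (a real unattend file has many more elements and attributes) and the VM name is made up.

```powershell
$ReplaceHost = "CHANGEHOSTNAME"

# Trimmed, illustrative fragment of an unattend file; not a complete or valid answer file
$template = @'
<component name="Microsoft-Windows-Shell-Setup">
  <ComputerName>CHANGEHOSTNAME</ComputerName>
</component>
'@

$vmName = "VDI-WIN7-001"   # hypothetical vCenter VM name

# Same string replace the script performs for each VM
$custom = $template.Replace($ReplaceHost, $vmName)
$custom
```

Note that the .Replace() string method is case-sensitive, so the placeholder in your XML must be spelled exactly like the value of $ReplaceHost.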

Using the Script

When you want to run the script against several machines, use the csv option. The csv file must have the vCenter VM name, one VM per line, without any header or empty lines at the end. There’s limited error checking, so I would urge you to take a snapshot of your target VM so you can revert back until you work out the kinks with your unattend file. In the vCenter console you will see some authentication errors when sysprep kicks off and Invoke-VMScript can no longer connect to the VM, but those are harmless messages.
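
If you want to sanity check your CSV before pointing the script at real VMs, the following sketch builds a two-line file in the temp folder and reads it back the same way the script does (Import-Csv with -Header name). The VM names are made up.

```powershell
# Build a sample CSV: one VM name per line, no header, no trailing blank line
$csv = Join-Path ([System.IO.Path]::GetTempPath()) "vms-example.csv"
Set-Content -Path $csv -Value "VDI-WIN7-001", "VDI-WIN7-002"

# Because the file has no header row, supply the column name ourselves
$list = Import-Csv $csv -Header name

foreach ($vm in $list) {
    Write-Host "Would process VM:" $vm.name
}

Remove-Item $csv
```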

In the example below I executed the script on the vCenter server using a PowerCLI console. I had configured the CSV input file with two hostnames. First I entered my password (for my current user account), then the administrator credentials for the guest VM. The script assumes all VMs have the same credentials as you will only be prompted once.

[Screenshot: running the script from a PowerCLI console]

If you watch the vCenter console you will see a bunch of entries. As I mentioned earlier, once sysprep kicks off vCenter is unable to connect to the guest so some authentication errors appear.

After a minute or so the VM rebooted and the sysprep process kicked off. A few minutes later my VM was joined to the domain with its new name and ready for use. Depending on the complexity of your unattended sysprep file you could do a lot of customization within the guest, install software, etc. The sky is really the limit. This script just gives you an easy way to run sysprep against dozens or hundreds of existing VMs if you can’t use vCenter customization specifications.

# This script will copy a sysprep unattend XML file to the guest VM and execute it,
# using the VM's vCenter name. Input can be a single argument on the command line,
# or a csv file. The CSV must have one VM name per line and no blank lines or header.
# The setupcomplete.cmd deletes the two copies of the unattend XML file, which may
# contain sensitive passwords or product keys.
#
# Derek Seaman derekseaman.com
#

# Your vcenter server name
$vCenter = "vcenter.domain.com"

# Your master sysprep unattended file. It will not be modified.
$MasterSysprep = "d:\sysprep-master.xml"

# Optional CSV input file. Only called if no VM argument is provided.
# One vCenter VM name per line with no header
$CSV_File = "D:\vms.csv"

# "Hostname" in the master unattended sysprep file that will be replaced for each VM
$ReplaceHost = "CHANGEHOSTNAME"

# Resulting sysprep file with the custom hostname, overwritten for each VM. Do not change.
$CustomSysprep = "D:\sysprep.xml"

# Don't change anything below here
#

#Validates VMware PowerCLI snap-ins are loaded

$xPsCheck = Get-PSSnapin | Select Name | Where {$_.Name -Like "*VMware*"}
If ($xPsCheck -eq $Null) {Add-PsSnapin VMware.VimAutomation.Core}
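# Builds the VM list. A command line argument is wrapped in an object with a "name"
# property so that $vm.name works the same way as it does for rows read from the CSV.
if ($args[0] -eq $null ) {$list = import-csv $CSV_File -header name} else { $list = @(New-Object PSObject -Property @{ name = $args[0] }) }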

# Function to mask password input
function Read-HostMasked([string]$prompt="Password") {
$password = Read-Host -AsSecureString $prompt;
$BSTR = [System.Runtime.InteropServices.marshal]::SecureStringToBSTR($password);
$password = [System.Runtime.InteropServices.marshal]::PtrToStringAuto($BSTR);
[System.Runtime.InteropServices.Marshal]::ZeroFreeBSTR($BSTR);
return $password;

} # function Read-HostMasked([string]$prompt="Password")

# Connects to vCenter
$currentUser = ([System.Security.Principal.WindowsIdentity]::GetCurrent()).Name
$currentUsePassword = Read-HostMasked "Enter the password for the current user"
Connect-VIServer -Server $vCenter -User $currentUser -Password $currentUsePassword | out-null

# Guest OS administrator credential input
$guestuser = read-host "Enter guest administrator username"
$guestpassword = read-HostMasked "Enter guest administrator password"

Foreach ($vm in $list) {

# Cleans up prior local sysprep output file and replaces hostname in sysprep.xml
remove-item $CustomSysprep -ErrorAction SilentlyContinue

$content = Get-Content $MasterSysprep
$content | foreach { $_.Replace($ReplaceHost, $VM.name) } | Set-Content $CustomSysprep
write-host $vm.name Custom sysprep file created

# Creates setupcomplete.cmd file to delete sysprep XML files post-sysprep. File must not already exist.
$Script1 = "echo `"del /F /Q c:\windows\panther\unattend.xml c:\windows\system32\sysprep\sysprep.xml`" | out-file -encoding ASCII c:\windows\setup\scripts\setupcomplete.cmd"
invoke-vmscript -scripttext $script1 -VM $VM.name -guestuser $guestuser -GuestPassword $GuestPassword | out-null
write-host $vm.name setupcomplete.cmd uploaded

# Copies sysprep.xml to guest and executes asynchronously
$script2 = "c:\windows\system32\sysprep\sysprep.exe /generalize /oobe /unattend:c:\windows\system32\sysprep\sysprep.xml /reboot"
copy-vmguestfile -source $CustomSysprep -destination c:\windows\system32\sysprep -VM $VM.name -localtoguest -guestuser $guestuser -guestpassword $guestpassword
invoke-vmscript -scripttext $script2 -VM $VM.name -guestuser $guestuser -GuestPassword $GuestPassword -scripttype bat -runasync | out-null
write-host $vm.name Sysprep executed
}

Safely virtualizing Windows Server 2012 Active Directory via Generation-ID

Windows Server 2012 generation ID is a great new feature that will allow us to safely virtualize a domain controller, on specific hypervisors. One of the really great features that hypervisors have had for ages is the ability to perform snapshots, then roll back to a prior state with a click of a mouse. Invaluable feature in both the lab, and in production.

I know during all my (failed) vSphere 5.1 installs I practically wore out the revert to snapshot button in vCenter. But there is at least one class of VMs that you almost NEVER want to roll back to a snapshot: those running vector-clock synchronized software such as Active Directory.

Why is rolling back AD bad? I mean why is rolling back AD *REALLY* bad? Microsoft has these little things called USNs, or Update Sequence Numbers. A USN is an Active Directory database instance counter which gets incremented each time an update to AD is made. USNs are unique to each DC, and use a monotonically increasing value. USNs are used to determine what changes need to be replicated to other DCs.

When you revert to a snapshot a USN rollback occurs. What can happen if a USN rollback occurs? Lots of bad things, such as missing AD objects, wrong security group memberships, reset passwords, and re-appearing AD objects. Also, DCs that are rolled back may accumulate many changes which never get replicated to other DCs. In short, the AD consistency of your forest is SHOT. Starting with Windows Server 2003 SP1, an event log ID 2095 is generated if a USN roll-back is detected, but it’s up to you to fix the mess. Microsoft has a great KB article here that goes into a lot more detail.
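
If the mechanics are hard to picture, the toy model below may help. It is a deliberately simplified sketch and not how AD replication is actually implemented: one DC hands out USNs, a partner remembers the highest USN it has already copied, and a snapshot revert makes the first DC reuse a USN the partner thinks it has already seen. All names in it are hypothetical.

```powershell
# Toy model only: DC1 originates changes, DC2 replicates them
$dc1 = @{ Usn = 0; Changes = @{} }          # Changes maps USN -> description
$dc2 = @{ Watermark = 0; Received = @() }   # highest DC1 USN seen, and what was copied

function Add-Change([string]$what) {
    $dc1.Usn++                               # every update bumps the DC's USN
    $dc1.Changes[$dc1.Usn] = $what
}

function Sync-Partner {
    # DC2 only asks for changes with a USN above its watermark
    $pending = @($dc1.Changes.Keys | Where-Object { $_ -gt $dc2.Watermark } | Sort-Object)
    foreach ($usn in $pending) {
        $dc2.Received += $dc1.Changes[$usn]
        $dc2.Watermark = $usn
    }
}

Add-Change 'create user A'                   # USN 1
Add-Change 'create group B'                  # USN 2
$snapshot = @{ Usn = $dc1.Usn; Changes = $dc1.Changes.Clone() }   # VM snapshot taken here

Add-Change 'reset password for A'            # USN 3
Sync-Partner                                 # DC2 has now seen everything up to USN 3

# Revert DC1 to the snapshot: its USN counter goes back in time
$dc1.Usn     = $snapshot.Usn
$dc1.Changes = $snapshot.Changes.Clone()

Add-Change 'create user C'                   # reuses USN 3
Sync-Partner                                 # nothing is above DC2's watermark, so nothing replicates

"DC2 watermark: $($dc2.Watermark)"
"DC2 received : $($dc2.Received -join ', ')"
```

In this model DC2 never learns about 'create user C', and DC1 has forgotten the password reset that DC2 still holds. That is the kind of silent divergence a USN rollback causes.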

What has Microsoft done in Windows Server 2012 (and Windows 8) to address this problem? They’ve introduced a safeguard called a VM-Generation ID, which can be implemented by any hypervisor. This generation ID can be used by applications and operating systems to detect if a virtual machine has been rolled back in time, and take appropriate measures.

So what happens when AD detects that the Generation IDs have changed? First, it dumps the RID pool, then does a non-authoritative synchronization of the SYSVOL folder. AD replication is then re-established to other DCs, to bring the reverted DC back into a consistent state with the rest of the forest.
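The detection logic can be sketched in a few lines of Python. This is only an illustration of the sequence described above, not Microsoft’s implementation: the `VirtualDC` class, the RID range, and the use of a random UUID as a stand-in for the hypervisor-supplied identifier are all assumptions.

```python
import uuid


class VirtualDC:
    """Toy DC that remembers the last generation ID it saw from the hypervisor."""

    def __init__(self, hypervisor_gen_id):
        self.stored_gen_id = hypervisor_gen_id
        self.rid_pool = list(range(1000, 1500))  # hypothetical RID block
        self.actions = []

    def check_generation_id(self, hypervisor_gen_id):
        """Return True if a rollback was detected and the safeguards ran."""
        if hypervisor_gen_id == self.stored_gen_id:
            return False  # same generation: nothing to do
        # The ID changed, so this VM was reverted, restored, or copied.
        self.rid_pool = []
        self.actions.append("discard RID pool")
        self.actions.append("non-authoritative SYSVOL sync")
        self.actions.append("re-establish replication with other DCs")
        self.stored_gen_id = hypervisor_gen_id
        return True


gen_before = uuid.uuid4()  # stand-in for the ID the hypervisor exposes
dc = VirtualDC(gen_before)

# A plain reboot: the hypervisor presents the same ID, so nothing happens.
rebooted = dc.check_generation_id(gen_before)

# A snapshot revert: the hypervisor presents a new ID, so the safeguards fire.
gen_after = uuid.uuid4()
reverted = dc.check_generation_id(gen_after)

print(rebooted, reverted, dc.actions)
```

The key design point is that the comparison value comes from outside the guest. A reverted VM cannot know by itself that it has gone back in time, but it can notice that the hypervisor is now handing it a different ID than the one it last recorded.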

Sounds great, right? Well it is, but only a very limited number of hypervisors support VM-Generation ID. As of this writing the hypervisors are Hyper-V 3.0, vSphere 5.0 U2, and vSphere 5.1. Since a USN rollback is quite unpleasant, you of course want to verify that WS2012 and your hypervisor are playing nice and using the Generation-ID feature. If you look in the Directory Service event log, you will see event IDs 2168 and 2172. In the screenshots below they have the same Generation-ID, since the VM was not reverted to a previous snapshot.

To test out this new feature I fired up my vCenter 5.1 web console and took a snapshot of my WS2012 domain controller. After the snapshot completed, I created a new group on another DC, then reverted the WS2012 DC back to my snapshot. Let’s look in the event viewer and see what happened:

Yes, AD realized it was reverted back to a prior snapshot…

Microsoft even tells you that snapshots are not backups and that, silly, you should use an AD-aware backup program to restore AD.

And now life is almost good…

Let’s freshen up FRS a little bit while we are at it…

Nothing like a new database to start off the day with…

A touch of USN cleanup…

And a few minutes later, everything is back in sync! As you can see from the screenshots, Microsoft is very verbose in the logs on exactly what is happening and why. In a very large forest with a lot of DCs the recovery process could take longer.

So under what circumstances does the Generation-ID change and not change? Here’s a list:

Generation-ID NOT changed when:
VM is paused or resumed
VM is rebooted
VM host reboots
VM is vMotioned/Live Migrated

Generation-ID IS changed when:
VM starts executing a snapshot
VM is recovered from a backup
VM is failed over in a disaster recovery environment
VM is imported, copied, or cloned

This feature alone should be a huge driver for deploying WS2012 based DCs on all of your hypervisors. Never thought I’d say this..but happy snapshotting your domain controllers! For even more detailed information on virtualized domain controllers, Microsoft has a great series of articles here you can read.

P.S. This feature does NOT work with array-based snapshots. The hypervisor tracks and creates the new Generation-IDs. So DO NOT revert a domain controller back to a prior state by reverting to a previous snapshot that your array created vice your hypervisor. With the forthcoming VVOLS in vSphere .Next, Generation-ID could be supported with hardware-snapshot offloads but we will have to wait and see if that’s the case.

vCenter 5.1 U1 Installation: Part 13 (VUM Configuration)

This installment of the 15-part vSphere 5.1 installation series covers some basic VUM configuration that most people will want to do. In Part 12 we configured VUM to use trusted SSL certificates. Now that the under-the-covers configuration of VUM is done, we need to perform some basic GUI configuration to make VUM useful. Every environment is different, and VUM is quite customizable, so the steps below are just basic guidance for a vanilla VUM setup. Creating custom baselines, tweaking remediation options, and other settings are not covered below.

Before we get started, listed below are the other related articles in this series:

Part 1 (SSO Service)
Part 2 (Create vCenter SSL Certificates)
Part 3 (Install vCenter SSO SSL Certificate)
Part 4 (Install Inventory Service)
Part 5 (Install Inventory Service SSL Certificate)
Part 6 (Create vCenter and VUM Databases)
Part 7 (Install vCenter Server)
Part 8 (Install Web Client)
Part 9 (Optional SSO Configuration)
Part 10 (Create VUM DSN)
Part 11 (Install VUM)
Part 12 (VUM SSL Configuration)
Part 14 (Web Client and Log Browser SSL)
Part 15 (ESXi Host SSL Certificate)

VUM Configuration

1. If you haven’t already installed the vSphere 5.1 vSphere Client for Windows, now is the time. After that is installed, connect to your vCenter server and click on the Plug-Ins menu. You should now see an available plug-in.

2. Click on Download and Install. Run through the installation wizard using all default values.

3. After the installation is completed, close the vSphere client.

4. Reconnect to the vCenter server using the vSphere client. If all goes well you should NOT get an SSL certificate warning and you should see an Update Manager tab in vCenter.

5. Depending on your server hardware vendor, you may want to add the vendor’s depot URL to VUM so you know when they release updated software. Unfortunately at this time I’m not aware of a Cisco VIB depot. Open the Admin View of VUM.

6. Once the VUM Admin page opens click on Configuration. Add a Download Source and use the following URLs:

HP:

http://vibsdepot.hp.com/index.xml

Dell:

http://vmwaredepot.dell.com/index.xml

Validate the URL then click OK.

7. After the URL is added, click on Download Now and wait a minute or two.

8. If you open the Patch Repository tab and sort by vendor you should now see some HP patches listed.

9. You can create your own patch baselines, which is out of the scope of this article. But I would recommend you at least attach the host and VM critical patch baselines. Switch to the Hosts and Cluster view, then click on the Update Manager tab.

10. Right click in the left pane and select Attach. I would recommend you select both baselines, unless you build your own.

11. Switch to the VMs and Templates view, change to the Update Manager view, then right click in the left baseline pane and select Attach. Again, unless you have a custom baseline, I would select all three options.

And there you go…a pretty vanilla VUM configuration. You will probably want to tweak some remediation settings, and possibly schedule regular scans (say weekly) for updates for both VMs and ESXi hosts. Next up is Part 14, which fixes the LogBrowser service SSL issue.

Note: When using VUM if you try and import a patch file, you will likely get an SSL security warning. You will notice that a self-signed VMware certificate is presented for you to trust. I’ve seen mentioned elsewhere that for now users are unable to change this particular certificate to a trusted one. So just “ignore” the error, as much as it may pain you.

Windows Recovery Environment VMware Driver Injection

This post will show you how to inject VMware drivers into the Windows Recovery Environment. In a previous blog post here I described how to inject the VMware pvscsi storage and VMXNET3 network drivers into your Windows Server 2008 or Windows 7 image. However, that did not cover injecting the same drivers into the Windows Recovery Environment, which is a separate WIM within the install.wim, thus requiring extra work.

Here’s how to inject drivers into the winRE.wim file and repackage the install.wim with the updated recovery environment.

1. Follow the first five steps at my blog here to install WAIK, find the right drivers, and create a scratch directory.

2. Mount the install.wim file from your Windows installation DVD:
dism /Mount-Wim /WimFile:D:\install.wim /Index:1 /MountDir:D:\mount

3. Copy the winRE.wim to a working folder:
copy D:\mount\windows\system32\recovery\winre.wim D:\

4. Create another mount directory, D:\mount2, then run this command:
dism /Mount-Wim /WimFile:D:\winre.wim /Index:1 /MountDir:D:\mount2

5. Inject the pvscsi and VMXNET3 drivers:
dism /image:D:\mount2 /Add-Driver /driver:d:\drivers\pvscsi.inf
dism /image:D:\mount2 /Add-Driver /driver:d:\drivers\vmxnet3ndis6.inf

6. Unmount the winRE image:
dism /unmount-wim /mountdir:d:\mount2 /commit

7. Copy the modified winRE.wim file to:
D:\mount\windows\system32\recovery

8. Unmount and commit the changes to the install.wim:
dism /unmount-wim /mountdir:d:\mount /commit

At this point you now have a modified install.wim file that you can copy back into your Windows OS ISO. Depending on your install.wim file, you may have multiple operating systems that need to be modified. To do this you would serially mount all OS indexes, inject the new winRE.wim then unmount the image. For a typical Windows Server 2008 R2 DVD, you could have 8 images that need to be modified to cover all of your bases. So this process can be a bit tedious and would lend itself to scripting.
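Since the multi-index case lends itself to scripting, here is a rough Python sketch that generates the command sequence for every image index. It only builds and prints the commands; it does not run them. The function name `winre_update_commands`, the paths, and the index count of 8 are assumptions for illustration, and the copy line simply mirrors step 7, so adjust it if the file attributes on winre.wim get in the way on your system. The dism switches are the same ones used in the steps above.

```python
def winre_update_commands(install_wim, winre_wim, mount_dir, index_count):
    """Build the mount / copy / unmount commands for each image index."""
    recovery_dir = mount_dir + r"\windows\system32\recovery"
    commands = []
    for index in range(1, index_count + 1):
        # Mount one OS image from install.wim (step 2, per index).
        commands.append(
            f"dism /Mount-Wim /WimFile:{install_wim} /Index:{index} /MountDir:{mount_dir}"
        )
        # Drop the already-patched winre.wim into that image (step 7).
        commands.append(f"copy /Y {winre_wim} {recovery_dir}")
        # Commit the change back into install.wim (step 8).
        commands.append(f"dism /Unmount-Wim /MountDir:{mount_dir} /Commit")
    return commands


commands = winre_update_commands(
    install_wim=r"D:\install.wim",
    winre_wim=r"D:\winre.wim",
    mount_dir=r"D:\mount",
    index_count=8,  # hypothetical; check how many images your install.wim holds
)

for line in commands:
    print(line)
```

Review the printed list before pasting it into an elevated command prompt or saving it as a batch file. The sketch assumes the driver injection into winre.wim (steps 3 through 6) has already been done once, and that the same patched winre.wim is appropriate for every index.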

Potential CommVault licensing gotchas for VMs

Some backup software manufacturers are offering a capacity-based licensing model instead of an agent-based model. Depending on your situation, this may be a dramatically easier and cheaper licensing model. Traditionally you had to buy per-server licenses, per-agent licenses, per-library licenses, and maybe other options as well. Backup software licensing could get very complex and very expensive. But it’s extremely important to understand how the capacity model works for both physical and virtual servers, or you may be in for sticker shock. How CommVault counts capacity in your environment may surprise you.

CommVault, Symantec, and others have introduced capacity-based models. With this model, you have an unlimited number of agents and servers, but the total amount of data you want backed up must be licensed. Depending on the vendor, there may be slight variations on this model. Tivoli capacity licensing is nearly as complex as their agent-based model.

When can this model be more cost effective? Generally if you have a lot of servers with minimal data on them, this model works well. Per-server and per-application agents can get expensive. If you have a small number of servers with huge data stores, then a per-server/agent method could be more cost effective.

But, you need to be VERY careful and understand how the backup product measures capacity, if you go with that model. In a physical environment, it’s generally very simple. If your pizza box server has 4TB of storage but you’ve only written 500GB of data, 500GB would count towards your capacity license. The remaining 3.5TB is ‘free’, until you start to physically use it.

In a virtual environment, this can get more complicated, and you could be in for some nasty surprises. Let’s say you do a P2V migration of that same pizza box server, but you downsize the virtual disks to just 1.5TB. Now the million dollar question is, how much capacity counts towards your backup license? 500GB or 1.5TB, or something in between?

With CommVault Simpana 9.0, the answer is ‘it depends’. CommVault counts the VMware VMDK disk size against your capacity license, regardless of how much physical space the VM is using. If you use VMware thin provisioned VMDKs, then at least 500GB comes out of your capacity license. If you use thick VMDKs and rely on your storage array to do the thin provisioning (which nearly all modern arrays support), the full 1.5TB counts against your license!! This is because Simpana is not intelligent enough to look inside the VM for actual disk usage and charge you only for the space actually in use. It looks at the VMDKs as a big blob, and charges you accordingly.

As a result, licensing for a virtual server can be significantly more expensive per GB than a physical environment, at least with Simpana 9.0. I don’t know how NetBackup or other capacity based products count VMDK storage. I’d hope they are more consistent and use the physical server logic. If my VM has a 2TB hardware thin provisioned disk, but I’ve only written 500GB of data, only 500GB should count.

Using VMware thin provisioned disks doesn’t fully solve the problem. Why? Let’s say you have a 2TB software thin provisioned disk, and the allocated VMDK space is 500GB. If you copy 1TB of data into the VM, then delete it, the VMDK is now 1.5TB but only contains 500GB of data. So you pay 3x the price for the same 500GB of data, unless you somehow shrink the VMDK.

Finally, if you leverage VMware fault tolerance, you simply can’t use VMware thin provisioned disks. You must use EZT (eager zeroed thick) disks. So regardless of how much or how little disk space you use, the total VMDK disk size counts against your license.
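The scenarios above boil down to some simple arithmetic, sketched here in Python. This models only the counting behavior described in this post, not CommVault’s actual licensing logic; the function names are made up, and sizes use round numbers (1TB = 1000GB) to match the examples.

```python
def licensed_gb_physical(written_gb):
    """Physical server: only the data actually written counts."""
    return written_gb


def licensed_gb_vmdk(provisioned_gb, peak_written_gb, vmware_thin):
    """VM: the VMDK file size counts, whatever is inside it.

    A VMware thin disk grows to the most data it has ever held and does not
    shrink by itself; a thick disk is always its full provisioned size.
    """
    if vmware_thin:
        return min(peak_written_gb, provisioned_gb)
    return provisioned_gb


live_data_gb = 500

scenarios = {
    "physical pizza box": licensed_gb_physical(live_data_gb),
    "thick 1.5TB VMDK (array does the thin provisioning)": licensed_gb_vmdk(1500, 500, vmware_thin=False),
    "thin 2TB VMDK, never grew past 500GB": licensed_gb_vmdk(2000, 500, vmware_thin=True),
    "thin 2TB VMDK after copying and deleting 1TB": licensed_gb_vmdk(2000, 1500, vmware_thin=True),
    "FT-protected 1.5TB VMDK (must be eager zeroed thick)": licensed_gb_vmdk(1500, 500, vmware_thin=False),
}

for name, gb in scenarios.items():
    print(f"{name}: {gb}GB licensed, {gb / live_data_gb:.1f}x the live data")
```

Plug in your own provisioned sizes and high-water marks to get a feel for how far the licensed figure can drift from the data you actually care about backing up.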

This can be very confusing. CommVault has two methods for counting capacity that you need to be aware of: one model for physical servers, and another for virtual servers that can catch you off guard depending on how your VMDKs are configured.

Geekdom with VMware Storage – Know thy array!

Today I attended a killer session at VMworld 2009 on getting the most performance out of your disk array with VMware. After the session, my head was spinning with all the new information and the realization that optimizing your vSphere storage can be non-trivial. If you attended VMworld, please listen to session TA2467. Understanding VMware storage concepts is critical.

The speaker covered all supported network storage protocols, including Fibre Channel, iSCSI and NFS. The primary root cause of poor performance in a virtualized environment is poor storage performance. VMware vSphere 4.0 has massive improvements in the storage stack for all protocols. However, it’s NOT a simple plug and play matter if you really want to optimize your environment. Both iSCSI and Fibre Channel require low-level knowledge about ESX and your storage array to make the most of your disk array.

Each protocol and each version of ESX have very specific requirements and storage multi-pathing implementations that you MUST understand. Many concerns are array specific, so get with your storage vendor and read their whitepapers on best practices for VMware. VMware has their own set of whitepapers, which should be required reading prior to any deployment. You can read their Fibre Channel configuration guide here. Also, triple check that your exact storage configuration is supported in the VMware storage compatibility guide. Many customer problems can be traced back to running in an unsupported configuration.

In vSphere 4.0 VMware added native multi-pathing, but even with this major enhancement you still need to deep dive into your array to understand how it handles LUNs, load balancing, and what you need to do to tweak settings. For example, one critical feature of your array that you must understand is the controller architecture. Is it active-active concurrent, active-active non-concurrent, active-passive, etc.? If you only ‘know’ your array is active-active, it’s vital to know if it’s concurrent or non-concurrent. Only A/A concurrent is active-active in the eyes of vSphere.

How do you tell and what’s the difference? First, ask your disk vendor if you aren’t sure. Secondly, look at how much you spent on the array. 🙂 If you mortgaged your business to buy the array and got an EMC Symmetrix or HDS USP, you have A/A concurrent controllers. Or, if you did your market research and found other storage vendors like 3PAR or Compellent, you also have A/A concurrent controllers but didn’t die of sticker shock from the price tag. If you went the ‘safe’ mid-range route and bought something like the HP EVA, EMC Clariion or any NetApp then you don’t have concurrent controllers and are active/passive in vSphere speak.

If you are lucky enough to have an A/A concurrent controller, all is not well out of the box. The default pathing mode is fixed path which doesn’t make the most of your array. You can change the mode to round robin, but even that simple change isn’t optimal. By default round robin doesn’t alternate single I/Os down each path. Instead it sends a stream of 1000 I/Os at once, then changes the path. This can lead to a non-optimal configuration. If your storage vendor concurs, changing this value to ‘1’ will evenly distribute the load among all paths. But check with your array vendor before changing anything!
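To see why the I/O count per path matters, here is a small Python simulation of round robin path selection. It is a deliberately simple model, assuming strictly sequential I/Os and ignoring queue depths and everything else a real storage stack does; the function name and the burst size of 1500 I/Os are made up for illustration. Only the switch-after-N behavior and the values 1000 and 1 come from the text above.

```python
from collections import Counter


def round_robin_paths(total_ios, path_count, iops_limit):
    """Count how many I/Os land on each path when switching every iops_limit I/Os."""
    counts = Counter()
    path = 0
    sent_on_path = 0
    for _ in range(total_ios):
        counts[path] += 1
        sent_on_path += 1
        if sent_on_path == iops_limit:
            # Limit reached: move on to the next path and start counting again.
            path = (path + 1) % path_count
            sent_on_path = 0
    return [counts[p] for p in range(path_count)]


burst = 1500  # hypothetical short burst of I/Os

print("switch every 1000 I/Os:", round_robin_paths(burst, 4, 1000))
print("switch every 1 I/O:   ", round_robin_paths(burst, 4, 1))
```

With the default of 1000, a short burst lands almost entirely on one or two paths while the rest sit idle. With a value of 1 the same burst is spread evenly. Over a long steady stream both settings even out, so the difference shows up mostly with bursty workloads, which is one more reason to check with your array vendor before changing the value.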

According to EMC, even the native round robin feature in vSphere leaves something to be desired. So they wrote PowerPath VE, which completely replaces the vSphere MPIO subsystem. News to me was that PowerPath VE works with many non-EMC arrays as well as EMC arrays. EMC claims that PowerPath VE can increase storage performance 30% to 300% over native MPIO. They have a 45-day trial version you can download and try out, to let you measure how much it may help you out. Can’t hurt to try it out, as long as your array is on their compatibility list.

iSCSI has an entirely different set of issues, which I may tackle in another blog. If you want to check out the blog for the presenter of the session, you can see it here. The session was jam packed with information, and this post only covered a tiny piece of his fire hose of information.
