The purpose of this article is to explain how to enable round robin multipathing on an ESXi 4 cluster.
Our environment consists of:
- 48 HP BL460c G1 servers running ESXi 4 embedded, with QLogic FC mezzanine cards (2 HBAs per host)
- 3 C7000 Chassis with VC-Enet modules and Cisco MDS9124 switches
- SAN fabric connected to an HP EVA 8400
These instructions come primarily from HP and are SPECIFICALLY FOR THE EVA! Check with your SAN vendor for their recommendations. The shared storage in our environment is all Fibre Channel, so these instructions will most likely not work for iSCSI or shared storage over other protocols. This article also assumes you have two HBAs per host.
Also make sure that your SAN or LUNs are set up for an active/active configuration; otherwise you’ll have problems with LUN trespassing. Most newer SANs are active/active by default, but some, such as some of the older EMC CX series, are set up as active/passive, and you have to use PowerPath or another vendor-specific product to set up true multipathing on a host.
Perform these steps at your own risk! If you’re not comfortable with any part of this, do some research, reference the sources at the bottom of the page, or call VMware support before you go ahead.
Now that we’ve got the disclaimers out of the way, let’s get down to the good stuff. The whole process consists of three steps: enabling round robin on all LUNs on all hosts in the cluster, setting each host to use both preferred and non-preferred paths, and finally telling each host how many I/Os to send down a path before it switches to the next one. Together these use both paths more effectively and help spread the load across both controllers on your SAN.
Enabling Round Robin
Set the multipath policy to Round Robin on all LUNs on all hosts in a cluster using PowerCLI:
Get-VMHost -Location <Clustername> | Get-ScsiLun -LunType "disk" | where {$_.MultipathPolicy -ne "RoundRobin"} | Set-ScsiLun -MultipathPolicy "RoundRobin"
Check to see if it took:
Get-VMHost -Location <Clustername> | Get-ScsiLun
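The following steps are run from the “unsupported” console (Tech Support Mode) on each host, either locally or over SSH. Search for how to enable SSH on ESXi 4 if you have not done so already.
Set the default Path Selection Policy (PSP) for the VMW_SATP_ALUA Storage Array Type Plugin (SATP) to Round Robin on each host: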
esxcli nmp satp setdefaultpsp --satp VMW_SATP_ALUA --psp VMW_PSP_RR
Set the LUNs to use preferred and non-preferred paths
Log in to each host and run the following command:
for i in `ls /vmfs/devices/disks/ | grep naa.600` ; do esxcli nmp roundrobin setconfig --useANO 1 --device $i ;done
You might get some errors, but run this command to see if it took:
esxcli nmp device list |grep ANO=
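With a lot of LUNs the grep output gets long. As a rough sketch, a small shell function can summarize it by counting how many lines report useANO=1 and how many still report useANO=0. The function name count_ano is made up for this example, and it assumes the device list output contains the literal text useANO=0 or useANO=1, matching the flag name used above. Check the raw output on your own hosts before relying on it.

```shell
# Sketch: summarize the useANO state from `esxcli nmp device list` output.
# count_ano is a hypothetical helper name; the "useANO=<0|1>" text is an
# assumption about the output format, based on the --useANO flag name.
count_ano() {
    input=$(cat)
    # grep -c prints 0 and returns non-zero when nothing matches, hence "|| true"
    on=$(printf '%s\n' "$input" | grep -c 'useANO=1' || true)
    off=$(printf '%s\n' "$input" | grep -c 'useANO=0' || true)
    echo "useANO=1: $on"
    echo "useANO=0: $off"
}
```

On a host it would be used as: esxcli nmp device list | count_ano. A non-zero count on the useANO=0 line means some devices did not take the setting.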
Set the number of I/Os sent down a path before it switches paths
for i in `ls /vmfs/devices/disks/ | grep naa.600` ; do esxcli nmp roundrobin setconfig --type "iops" --iops=1 --device $i ;done
By default this is set to 1000. The setting doesn’t persist across a reboot, so you’ll have to write a script that reapplies it on startup. In fact, it seems that once you touch the iops= setting, after a reboot it’s replaced with what looks like a random number.
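As a starting point for that startup script, here is a minimal sketch, not a tested HP or VMware recommendation. It reapplies both round robin settings from this article to every matching device. The names DISK_DIR, DEVICE_PREFIX, DRY_RUN, run, list_devices and apply_rr_settings are made up for this example. It also assumes that partition entries under /vmfs/devices/disks carry a ":N" suffix and skips them, which may avoid some of the errors mentioned earlier.

```shell
#!/bin/sh
# Sketch: re-apply the round robin settings from this article after a reboot.
# DISK_DIR, DEVICE_PREFIX and DRY_RUN are hypothetical knobs for illustration.
DISK_DIR="${DISK_DIR:-/vmfs/devices/disks}"
DEVICE_PREFIX="${DEVICE_PREFIX:-naa.600}"

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# List matching device names, skipping entries with a ":N" suffix
# (assumed here to be partitions rather than whole devices).
list_devices() {
    ls "$DISK_DIR" 2>/dev/null | grep "^${DEVICE_PREFIX}" | grep -v ':'
}

# Apply the same two esxcli commands used earlier, once per device.
apply_rr_settings() {
    for dev in $(list_devices); do
        run esxcli nmp roundrobin setconfig --useANO 1 --device "$dev"
        run esxcli nmp roundrobin setconfig --type "iops" --iops=1 --device "$dev"
    done
}
```

Set DRY_RUN=1 and call apply_rr_settings first to see the commands it would issue, then call it without DRY_RUN to apply them. Where to hook it into startup depends on your build; /etc/rc.local is a commonly used place on ESXi 4 hosts, but treat that as an assumption and verify it on yours.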
Check the sources below for more detailed information, especially the top link which is the HP “Official” best practices document for this scenario.
-bb
Sources:
thanks for the info, very useful
Thank you for the script to enable round robin. It saved me a ton of time implementing it on our VDI clusters. Now when I go to the server clusters those have a lot more hosts and luns. Do you have a script to backout the change? In other words, do you have a script to go from round robin to fixed?
Thanks in advance!
Tom