Minimize storage response time on ESXi when a partner server has gone offline unexpectedly
Description
With the iSCSI protocol, I/O pauses during unexpected failures such as a network cable disconnection, a switch failure, or an interruption of the partner node. In most environments, the default settings work fine because ESXi queues I/O and hides some of these issues from the production VMs.
While the default failover time of 25-30 seconds goes unnoticed in most environments, such a delay might be critical in others, especially those running applications that require faster recovery times, such as video surveillance applications.
The steps below are for cases where the default settings must be tuned to meet application requirements.
Solution
Several changes are required, both in the StarWind VSAN configuration file and in the iSCSI initiator settings on the ESXi server.
- Changes in StarWind VSAN configuration file:
– In the StarWind Management Console, check that all StarWind HA devices have the “Synchronized” status on all servers;
– Check that all datastores have active paths from all StarWind servers and MPIO policy set to Round Robin;
– Stop the StarWind VSAN service (Windows: CMD->net stop starwindservice, Linux: systemctl stop StarWindVSA);
– Open the StarWind VSAN configuration file (Windows: “C:\Program Files\StarWind Software\StarWind\StarWind.cfg”, Linux: nano /opt/StarWind/StarWindVSA/drive_c/StarWind/StarWind.cfg);
– In the StarWind VSAN configuration file, add the line ‘iScsiPingCmdSendCmdTimeoutInSec’ = ‘1’ and save the changes;
– Start the StarWind VSAN service (Windows: CMD->net start starwindservice, Linux: systemctl start StarWindVSA);
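On a Linux-based StarWind VSAN node, the stop, edit, and start sequence above could be wrapped in a small script. The following is only a sketch: the service name and file path are taken from the steps above, while the `run` helper, the `DRY_RUN` switch, and the backup copy are assumptions added for illustration and are not part of the official procedure. With `DRY_RUN=1` (the default here) the script only prints what it would do.

```shell
#!/bin/sh
# Sketch: stop the service, back up the config, pause for the manual edit, restart.
CFG="/opt/StarWind/StarWindVSA/drive_c/StarWind/StarWind.cfg"  # path from the steps above
DRY_RUN="${DRY_RUN:-1}"   # assumption: 1 = print commands only, 0 = execute them

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

apply_change() {
    run systemctl stop StarWindVSA
    run cp -p "$CFG" "$CFG.bak"   # assumption: extra precaution, not in the original steps
    echo "Add the iScsiPingCmdSendCmdTimeoutInSec setting to $CFG now."
    if [ "$DRY_RUN" != "1" ]; then
        printf 'Press Enter when the file is saved: '
        read -r _
    fi
    run systemctl start StarWindVSA
}

apply_change
```

The edit itself is deliberately left manual, so the script does not have to guess at the syntax of the configuration file.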
- Changes in ESXi iSCSI initiator settings, where StarWind iSCSI targets are connected:
– Make sure that none of the iSCSI devices are in use;
– Under Storage, open the advanced settings of the software iSCSI initiator;
– Navigate to the “NoopInterval” setting and set it to 3. Click the Save Configuration button.
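The same setting can likely be applied from the ESXi shell with `esxcli`. The sketch below makes two assumptions: that the adapter is named `vmhba64` (a placeholder; `esxcli iscsi adapter list` shows the real name) and that the “NoopInterval” setting in the UI corresponds to the `NoopOutInterval` key in `esxcli`. The `run` helper and `DRY_RUN` switch are hypothetical additions so that the commands can be reviewed before they are executed.

```shell
#!/bin/sh
# Sketch: set the iSCSI NOOP interval on a software iSCSI adapter via esxcli.
ADAPTER="${ADAPTER:-vmhba64}"   # assumption: placeholder adapter name
NOOP_INTERVAL=3                 # value from the steps above
DRY_RUN="${DRY_RUN:-1}"         # assumption: 1 = print commands only, 0 = execute them

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

set_noop_interval() {
    # $1 = adapter name, $2 = interval in seconds
    run esxcli iscsi adapter param set --adapter="$1" --key=NoopOutInterval --value="$2"
}

show_params() {
    # Lists the adapter parameters so the new value can be checked.
    run esxcli iscsi adapter param get --adapter="$1"
}

set_noop_interval "$ADAPTER" "$NOOP_INTERVAL"
show_params "$ADAPTER"
```

Running the script with `DRY_RUN=0` executes the commands; checking the output of the `param get` call afterwards confirms whether the key name assumed here matches the host.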
Additionally, it is recommended to change the StarWind VSAN virtual machine settings proactively to prevent controller resets, which might cause delays on the underlying storage and issues with iSCSI target availability. Detailed instructions are available here: https://knowledgebase.starwindsoftware.com/troubleshooting/fixing-reset-to-device-error-when-running-the-lsi_sas-controller/