
Performance issues on 1 node (Hyper-V 2-Node HCI using CVM)

Posted: Mon Jan 13, 2025 5:02 pm
by andrew.kns
I am experiencing an issue where the disk latency is > 10,000 ms for any guest VM hosted on node 1. Guest VMs hosted on node 2 have an average disk latency of < 500 ms, and I don't know why node 1 performs significantly worse than node 2. Migrating a VM from node 1 to node 2 improves its storage performance (and, as a side effect, its usability). As a result, almost all VMs are currently hosted on node 2 instead of being split 50/50 between the two nodes.

Hosts 1 and 2 are configured identically and are directly connected to each other (no switches) over 25G fiber SFPs. The storage is passed directly through to the CVM. Moving the VHD of a VM from one CSV to another does not make a difference. I have re-validated all configuration steps and cannot find a difference between the two host nodes, so I am beginning to think that it may be an issue with how Windows is accessing the iSCSI resources offered by the CVMs.

What troubleshooting steps could I try next to investigate the performance issue? I will also note that the Windows event log shows high I/O latency events for the CSVs.

Re: Performance issues on 1 node (Hyper-V 2-Node HCI using CVM)

Posted: Mon Jan 13, 2025 5:50 pm
by yaroslav (staff)
What is the storage configuration?
Look at the underlying storage processes and MDADM configuration.
See if there are any processes in mdstat. Also, make sure the disks are healthy. Please also consider rebooting the CVMs one by one if there is nothing running there.
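
For the mdstat check, here is a minimal Python sketch, assuming the CVM is Linux-based and exposes /proc/mdstat in the usual format. The function name find_md_activity and the sample text are illustrative only and are not part of any StarWind tooling. It reports arrays that show a running background operation (resync, recovery, reshape, check, or repair) with a progress percentage. Delayed or pending operations, which have no percentage, are not detected by this sketch.

```python
import re

# A progress line in /proc/mdstat looks like: "[=>....]  resync =  8.5% (...)".
_ACTIVITY = re.compile(r"\b(resync|recovery|reshape|check|repair)\s*=\s*([\d.]+)%")
# An array header line looks like: "md0 : active raid1 sdb1[1] sda1[0]".
_ARRAY = re.compile(r"^(md\w+)\s*:")


def find_md_activity(mdstat_text):
    """Return {array_name: (operation, percent_done)} for arrays with a running operation."""
    active = {}
    current = None
    for line in mdstat_text.splitlines():
        header = _ARRAY.match(line)
        if header:
            current = header.group(1)
            continue
        progress = _ACTIVITY.search(line)
        if progress and current is not None:
            active[current] = (progress.group(1), float(progress.group(2)))
    return active


# Hypothetical sample for illustration; real output will differ.
SAMPLE = """\
Personalities : [raid1] [raid10]
md0 : active raid1 sdb1[1] sda1[0]
      976630336 blocks super 1.2 [2/2] [UU]
      [=>...................]  resync =  8.5% (83012608/976630336) finish=75.2min speed=197952K/sec

md1 : active raid10 sdf[3] sde[2] sdd[1] sdc[0]
      1953260544 blocks super 1.2 512K chunks 2 near-copies [4/4] [UUUU]

unused devices: <none>
"""

if __name__ == "__main__":
    print(find_md_activity(SAMPLE))
```

On a live system, the same function could be fed the contents of /proc/mdstat instead of SAMPLE. An empty result would suggest that no rebuild or check is competing with guest I/O at that moment.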

Re: Performance issues on 1 node (Hyper-V 2-Node HCI using CVM)

Posted: Mon Jan 13, 2025 6:48 pm
by andrew.kns
The storage Configuration is as follows:
- PERC H755 controller passed to the CVM using PCIe passthrough
- 6x 1TB SSD (Hardware RAID 10)
- 6x 10TB HDD (Hardware RAID 10)

The storage processes inside the CVMs appear to be OK. The disks are healthy and running the latest firmware pushed by Dell. The CVMs have been updated to the latest version and restarted within the last week. This issue has been lingering for quite some time now (more than 3 months). I replaced the NIC in both hosts, upgrading from 10G to 25G, to see if that would help. Unfortunately, this did not resolve the latency problems.

Re: Performance issues on 1 node (Hyper-V 2-Node HCI using CVM)

Posted: Tue Jan 14, 2025 12:03 am
by yaroslav (staff)
Could you please run the tests for the underlying storage with FIO (4k and 64k sequential and random patterns)?
Also, could you please share the network diagram? Please include the network aggregations if there are any.
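
One possible way to generate the requested test matrix is sketched below in Python, assuming fio is installed in the CVM and the libaio engine is available. The helper name build_fio_commands, the target path, and the queue depth, runtime, and size values are assumptions chosen for illustration, not recommended settings. The script only prints the command lines; it does not run them. Note that the write patterns are destructive if the target is a raw device that holds data, so a scratch file or an unused volume is the safer choice.

```python
import shlex


def build_fio_commands(target, runtime_s=60, iodepth=16, size="4G"):
    """Build fio command lines for 4k and 64k blocks, sequential and random, read and write."""
    patterns = ["read", "write", "randread", "randwrite"]  # fio --rw values
    block_sizes = ["4k", "64k"]
    commands = []
    for bs in block_sizes:
        for rw in patterns:
            args = [
                "fio",
                f"--name={rw}-{bs}",
                f"--filename={target}",
                f"--rw={rw}",
                f"--bs={bs}",
                f"--size={size}",
                "--ioengine=libaio",
                "--direct=1",            # bypass the page cache
                f"--iodepth={iodepth}",
                f"--runtime={runtime_s}",
                "--time_based",
                "--output-format=json",
            ]
            commands.append(shlex.join(args))
    return commands


if __name__ == "__main__":
    # Hypothetical scratch file; replace with a path on the storage under test.
    for cmd in build_fio_commands("/mnt/test/fio-scratch.bin"):
        print(cmd)
```

Running the same eight jobs against the underlying storage on each node and comparing the latency figures side by side should show whether the gap already exists below the iSCSI layer.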

Re: Performance issues on 1 node (Hyper-V 2-Node HCI using CVM)

Posted: Fri Apr 04, 2025 11:08 am
by alicebelinda
It's smart that you've already ruled out some of the usual suspects like VM placement and CSV location. Identical hardware and direct 25G fiber? That should be a recipe for snappy performance across the board.