Hello,
I have installed Slackware64 14.2 on my Dell PowerEdge T110 II server, which has an Intel Xeon E3-1230 V2. I am having trouble getting SR-IOV/IOMMU to work properly with my Intel Ethernet Server Adapter I350-T2. SR-IOV and the IOMMU are enabled in both the BIOS and the kernel. I have added the option to enable 7 virtual functions, and that works properly on the host, as can be seen below:
Code:
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
link/ether a0:36:9f:87:b0:6a brd ff:ff:ff:ff:ff:ff
vf 0 MAC 00:00:00:00:00:00, spoof checking on, link-state auto
vf 1 MAC 00:00:00:00:00:00, spoof checking on, link-state auto
vf 2 MAC 00:00:00:00:00:00, spoof checking on, link-state auto
vf 3 MAC 00:00:00:00:00:00, spoof checking on, link-state auto
vf 4 MAC 00:00:00:00:00:00, spoof checking on, link-state auto
vf 5 MAC 00:00:00:00:00:00, spoof checking on, link-state auto
vf 6 MAC 00:00:00:00:00:00, spoof checking on, link-state auto
4: eth2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
link/ether a0:36:9f:87:b0:6b brd ff:ff:ff:ff:ff:ff
vf 0 MAC 00:00:00:00:00:00, spoof checking on, link-state auto
vf 1 MAC 00:00:00:00:00:00, spoof checking on, link-state auto
vf 2 MAC 00:00:00:00:00:00, spoof checking on, link-state auto
vf 3 MAC 00:00:00:00:00:00, spoof checking on, link-state auto
vf 4 MAC 00:00:00:00:00:00, spoof checking on, link-state auto
vf 5 MAC 00:00:00:00:00:00, spoof checking on, link-state auto
vf 6 MAC 00:00:00:00:00:00, spoof checking on, link-state auto
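For anyone reproducing this, the number of active VFs can also be read from sysfs rather than from the link listing. This is only a sketch: `vf_count` and its optional sysfs-root argument are hypothetical names used for illustration, and it assumes the kernel exposes the standard `sriov_numvfs` and `sriov_totalvfs` attributes for the port.

```shell
#!/bin/sh
# Print "<iface> <active VFs>/<max VFs>" for a network interface.
# $1 = interface name
# $2 = sysfs root (hypothetical hook for testing; defaults to /sys)
vf_count() {
    iface="$1"
    root="${2:-/sys}"
    dev="$root/class/net/$iface/device"
    if [ -r "$dev/sriov_numvfs" ] && [ -r "$dev/sriov_totalvfs" ]; then
        # Both attributes are plain integers exposed by the PCI core.
        printf '%s %s/%s\n' "$iface" \
            "$(cat "$dev/sriov_numvfs")" "$(cat "$dev/sriov_totalvfs")"
    else
        printf '%s no SR-IOV attributes found\n' "$iface"
    fi
}
```

On the host this would be called as `vf_count eth1` and `vf_count eth2`.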
The problem is that when I attempt to start a guest VM under QEMU/KVM, it fails with "group 1 is not viable. Please ensure that all devices within the iommu_group are bound to their vfio bus driver." All of the virtual functions are using the igbvf module. IOMMU group 1 appears to contain both of the I350-T2 network interfaces and the PCIe root port, as can be seen below:
Code:
root@Power:~# find /sys/kernel/iommu_groups/ -type l | grep "iommu_groups/1"
/sys/kernel/iommu_groups/1/devices/0000:00:01.0
/sys/kernel/iommu_groups/1/devices/0000:01:00.0
/sys/kernel/iommu_groups/1/devices/0000:01:00.1
/sys/kernel/iommu_groups/1/devices/0000:02:10.0
/sys/kernel/iommu_groups/1/devices/0000:02:10.1
/sys/kernel/iommu_groups/1/devices/0000:02:10.4
/sys/kernel/iommu_groups/1/devices/0000:02:10.5
/sys/kernel/iommu_groups/1/devices/0000:02:11.0
/sys/kernel/iommu_groups/1/devices/0000:02:11.1
/sys/kernel/iommu_groups/1/devices/0000:02:11.4
/sys/kernel/iommu_groups/1/devices/0000:02:11.5
/sys/kernel/iommu_groups/1/devices/0000:02:12.0
/sys/kernel/iommu_groups/1/devices/0000:02:12.1
/sys/kernel/iommu_groups/1/devices/0000:02:12.4
/sys/kernel/iommu_groups/1/devices/0000:02:12.5
/sys/kernel/iommu_groups/1/devices/0000:02:13.0
/sys/kernel/iommu_groups/1/devices/0000:02:13.1
root@Power:~# lspci -v | grep 00:01.0
00:01.0 PCI bridge: Intel Corporation Xeon E3-1200 v2/3rd Gen Core processor PCI Express Root Port (rev 09) (prog-if 00 [Normal decode])
root@Power:~# lspci -v | grep 01:00.0
01:00.0 Ethernet controller: Intel Corporation I350 Gigabit Network Connection (rev 01)
root@Power:~# lspci -v | grep 01:00.1
01:00.1 Ethernet controller: Intel Corporation I350 Gigabit Network Connection (rev 01)
The rest of the devices in the group are the Intel Corporation I350 Ethernet Controller Virtual Functions. Their PCI addresses are listed below:
Code:
02:10.0 Ethernet controller: Intel Corporation I350 Ethernet Controller Virtual Function (rev 01)
Subsystem: Intel Corporation I350 Ethernet Controller Virtual Function
02:10.1 Ethernet controller: Intel Corporation I350 Ethernet Controller Virtual Function (rev 01)
Subsystem: Intel Corporation I350 Ethernet Controller Virtual Function
02:10.4 Ethernet controller: Intel Corporation I350 Ethernet Controller Virtual Function (rev 01)
Subsystem: Intel Corporation I350 Ethernet Controller Virtual Function
02:10.5 Ethernet controller: Intel Corporation I350 Ethernet Controller Virtual Function (rev 01)
Subsystem: Intel Corporation I350 Ethernet Controller Virtual Function
02:11.0 Ethernet controller: Intel Corporation I350 Ethernet Controller Virtual Function (rev 01)
Subsystem: Intel Corporation I350 Ethernet Controller Virtual Function
02:11.1 Ethernet controller: Intel Corporation I350 Ethernet Controller Virtual Function (rev 01)
Subsystem: Intel Corporation I350 Ethernet Controller Virtual Function
02:11.4 Ethernet controller: Intel Corporation I350 Ethernet Controller Virtual Function (rev 01)
Subsystem: Intel Corporation I350 Ethernet Controller Virtual Function
02:11.5 Ethernet controller: Intel Corporation I350 Ethernet Controller Virtual Function (rev 01)
Subsystem: Intel Corporation I350 Ethernet Controller Virtual Function
02:12.0 Ethernet controller: Intel Corporation I350 Ethernet Controller Virtual Function (rev 01)
Subsystem: Intel Corporation I350 Ethernet Controller Virtual Function
02:12.1 Ethernet controller: Intel Corporation I350 Ethernet Controller Virtual Function (rev 01)
Subsystem: Intel Corporation I350 Ethernet Controller Virtual Function
02:12.4 Ethernet controller: Intel Corporation I350 Ethernet Controller Virtual Function (rev 01)
Subsystem: Intel Corporation I350 Ethernet Controller Virtual Function
02:12.5 Ethernet controller: Intel Corporation I350 Ethernet Controller Virtual Function (rev 01)
Subsystem: Intel Corporation I350 Ethernet Controller Virtual Function
02:13.0 Ethernet controller: Intel Corporation I350 Ethernet Controller Virtual Function (rev 01)
Subsystem: Intel Corporation I350 Ethernet Controller Virtual Function
02:13.1 Ethernet controller: Intel Corporation I350 Ethernet Controller Virtual Function (rev 01)
Subsystem: Intel Corporation I350 Ethernet Controller Virtual Function
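Since the error message is about which driver each device in the group is bound to, it can help to print every group member next to its current driver. A minimal sketch, assuming the usual sysfs layout where each device directory has a `driver` symlink when a driver is bound; `list_group_drivers` and its optional sysfs-root argument are hypothetical names, not an existing tool.

```shell
#!/bin/sh
# List every device in an IOMMU group together with its bound driver.
# $1 = IOMMU group number
# $2 = sysfs root (hypothetical hook for testing; defaults to /sys)
list_group_drivers() {
    group="$1"
    root="${2:-/sys}"
    for dev in "$root/kernel/iommu_groups/$group/devices/"*; do
        # Skip the unexpanded glob when the group does not exist.
        [ -e "$dev" ] || continue
        if [ -L "$dev/driver" ]; then
            # The symlink target ends in the driver name, e.g. .../drivers/igbvf
            drv=$(basename "$(readlink "$dev/driver")")
        else
            drv="(none)"
        fi
        printf '%s %s\n' "$(basename "$dev")" "$drv"
    done
}
```

Running `list_group_drivers 1` on the host prints one line per device in group 1; any device that shows a driver other than the vfio bus driver is one the error message is complaining about.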
I have been searching for a few hours now, and from what I have found it appears that ACS (Access Control Services) is not isolating the devices from one another. I did find this article from Lenovo that seems to indicate that Xeon E3 processors do not support ACS:
https://support.lenovo.com/us/nb/solutions/ht504019
However, I haven't found much else to confirm that this is true. I have read about an ACS override, but that seems dangerous, as it can break the isolation between VMs.
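Whether the root port advertises ACS at all can be checked from the verbose lspci output, which lists an "Access Control Services" capability when the device has one. A minimal sketch, assuming lspci is run as root so that the extended capabilities are readable; `has_acs` is a hypothetical helper name.

```shell
#!/bin/sh
# Read `lspci -vvv` output on stdin and report whether an
# "Access Control Services" capability line is present.
has_acs() {
    if grep -q 'Access Control Services'; then
        echo "ACS capability advertised"
    else
        echo "no ACS capability listed"
    fi
}

# Usage on the host, as root, for the root port from the listing above:
#   lspci -vvv -s 00:01.0 | has_acs
```

A "no ACS capability listed" result for 00:01.0 would be consistent with everything behind that root port ending up in a single IOMMU group.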
Does anyone know of a solution to this? Or do the Xeon E3s not support ACS properly, leaving me pretty much out of luck getting SR-IOV working? Since the devices in this IOMMU group are the PCI root port and the NIC/VF interfaces themselves, I'm unsure whether physically moving the card to a different PCI Express slot would help at all.
Any help would be much appreciated.
Thanks!