Issue installing ib0 for lustre
Hi,
I am trying to set up a Lustre environment but am running into a problem with the InfiniBand setup. To bring up ib0 I run:
Code:
[root@slave3 ~]# /etc/rc.d/init.d/rdma restart
Unloading OpenIB kernel modules: Found opensm running.
Please stop all RDMA applications before downing the stack.
[FAILED]
Loading OpenIB kernel modules:FATAL: Error inserting ib_addr (/lib/modules/2.6.32-279.14.1.el6_lustre.x86_64/kernel/drivers/infiniband/core/ib_addr.ko): Unknown symbol in module, or unknown parameter (see dmesg)
Failed to load module
WARNING: Error inserting ib_core (/lib/modules/2.6.32-279.14.1.el6_lustre.x86_64/kernel/drivers/infiniband/core/ib_core.ko): Unknown symbol in module, or unknown parameter (see dmesg)
WARNING: Error inserting ib_mad (/lib/modules/2.6.32-279.14.1.el6_lustre.x86_64/kernel/drivers/infiniband/core/ib_mad.ko): Unknown symbol in module, or unknown parameter (see dmesg)
WARNING: Error inserting ib_sa (/lib/modules/2.6.32-279.14.1.el6_lustre.x86_64/kernel/drivers/infiniband/core/ib_sa.ko): Unknown symbol in module, or unknown parameter (see dmesg)
WARNING: Error inserting iw_cm (/lib/modules/2.6.32-279.14.1.el6_lustre.x86_64/kernel/drivers/infiniband/core/iw_cm.ko): Unknown symbol in module, or unknown parameter (see dmesg)
WARNING: Error inserting ib_cm (/lib/modules/2.6.32-279.14.1.el6_lustre.x86_64/kernel/drivers/infiniband/core/ib_cm.ko): Unknown symbol in module, or unknown parameter (see dmesg)
FATAL: Error inserting rdma_cm (/lib/modules/2.6.32-279.14.1.el6_lustre.x86_64/kernel/drivers/infiniband/core/rdma_cm.ko): Unknown symbol in module, or unknown parameter (see dmesg)
Failed to load module
WARNING: Error inserting iw_cm (/lib/modules/2.6.32-279.14.1.el6_lustre.x86_64/kernel/drivers/infiniband/core/iw_cm.ko): Unknown symbol in module, or unknown parameter (see dmesg)
WARNING: Error inserting ib_cm (/lib/modules/2.6.32-279.14.1.el6_lustre.x86_64/kernel/drivers/infiniband/core/ib_cm.ko): Unknown symbol in module, or unknown parameter (see dmesg)
WARNING: Error inserting rdma_cm (/lib/modules/2.6.32-279.14.1.el6_lustre.x86_64/kernel/drivers/infiniband/core/rdma_cm.ko): Unknown symbol in module, or unknown parameter (see dmesg)
FATAL: Error inserting rdma_ucm (/lib/modules/2.6.32-279.14.1.el6_lustre.x86_64/kernel/drivers/infiniband/core/rdma_ucm.ko): Unknown symbol in module, or unknown parameter (see dmesg)
Failed to load module
FATAL: Error inserting ib_ipoib (/lib/modules/2.6.32-279.14.1.el6_lustre.x86_64/kernel/drivers/infiniband/ulp/ipoib/ib_ipoib.ko): Unknown symbol in module, or unknown parameter (see dmesg)
Failed to load module
[FAILED]
FYI, ibstat and ibstatus both work fine:
Code:
[root@slave3 ~]# ibstat
CA 'mlx4_0'
	CA type: MT26428
	Number of ports: 1
	Firmware version: 2.9.1000
	Hardware version: b0
	Node GUID: 0x0002c903000be516
	System image GUID: 0x0002c903000be519
	Port 1:
		State: Active
		Physical state: LinkUp
		Rate: 40
		Base lid: 7
		LMC: 0
		SM lid: 2
		Capability mask: 0x0251086a
		Port GUID: 0x0002c903000be517
		Link layer: InfiniBand
[root@slave3 ~]#
[root@slave3 ~]# ibstatus
Infiniband device 'mlx4_0' port 1 status:
	default gid:	 fe80:0000:0000:0000:0002:c903:000b:e517
	base lid:	 0x7
	sm lid:		 0x2
	state:		 4: ACTIVE
	phys state:	 5: LinkUp
	rate:		 40 Gb/sec (4X QDR)
	link_layer:	 InfiniBand
|
Did you run dmesg and see what the error is?
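If the dmesg output is long, something like the sketch below may help narrow it down. It assumes the kernel logs lines of the form "module: Unknown symbol name", which is what I would expect on a 2.6.32 kernel; `unknown_symbols` is just a hypothetical helper name, not part of any package.

```shell
#!/bin/sh
# Hypothetical helper: read kernel log text on standard input and print
# each distinct symbol name that appears after "Unknown symbol".
# Assumes log lines look like "ib_addr: Unknown symbol some_symbol".
unknown_symbols() {
    sed -n 's/.*Unknown symbol \([A-Za-z0-9_]*\).*/\1/p' | sort -u
}

# Usage on the affected node (needs a readable kernel log):
#   dmesg | unknown_symbols
```

The symbol names usually hint at which kernel feature or module is missing.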
|
After I rebooted the machine and it came back up, dmesg showed:
Code:
alloc irq_2_iommu on node -1
Code:
[root@slave3 ~]# /etc/rc.d/init.d/rdma restart
Code:
dmesg shows:
|
That's pretty clear. Your kernel doesn't seem to have IPv6 configured. Either recompile the IB driver to match (without IPv6), or get a new kernel.
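One way to check this before rebuilding anything is to look at the kernel's build config. A rough sketch, assuming the distro installs the config as /boot/config-<release> (RHEL-style systems normally do); `ipv6_kernel_support` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: report how IPv6 is configured in a kernel config file.
# $1 = path to the config file (assumed default: /boot/config-<running kernel>).
# Prints one of: builtin, module, absent, unknown.
ipv6_kernel_support() {
    cfg=${1:-/boot/config-$(uname -r)}
    if [ ! -r "$cfg" ]; then
        echo unknown            # no readable config file, cannot tell
    elif grep -q '^CONFIG_IPV6=y' "$cfg"; then
        echo builtin            # compiled into the kernel image
    elif grep -q '^CONFIG_IPV6=m' "$cfg"; then
        echo module             # built as a loadable module (ipv6.ko)
    else
        echo absent             # not enabled in this kernel build
    fi
}

# Usage on the affected node:
#   ipv6_kernel_support
```

If it prints "module", the kernel does have IPv6 support and the module only needs to be loaded. If it prints "absent", the driver and kernel really are mismatched.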
|
smallpond,
How do I enable IPv6 at the kernel level? |
I think I'm wrong. Try just doing
Code:
modprobe ipv6 |
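To spell that out a bit: the idea is to make sure the ipv6 module is loaded before the rdma script tries to insert ib_addr and the rest. A minimal sketch, assuming ipv6 is built as a module on this kernel; `module_loaded` is a hypothetical helper that just reads /proc/modules.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the named module is listed as loaded.
# $1 = module name, $2 = module list file (defaults to /proc/modules).
module_loaded() {
    grep -q "^$1 " "${2:-/proc/modules}"
}

# On the affected node, as root:
#   module_loaded ipv6 || modprobe ipv6
#   /etc/rc.d/init.d/rdma restart
```

If the restart still complains about opensm running, as in the first post, stop that first and try again.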