05-07-2013, 01:31 PM   #1
your_shadow03
Senior Member
 
Registered: Jun 2008
Location: Germany
Distribution: Slackware
Posts: 1,429
Blog Entries: 6

Rep: Reputation: 51
Issue installing ib0 for lustre


Hi,

I am trying to set up a Lustre environment, but I am running into a problem
with the InfiniBand setup.

To bring up ib0, I restart the rdma service:

[root@slave3 ~]# /etc/rc.d/init.d/rdma restart
Unloading OpenIB kernel modules:
Found opensm running.
Please stop all RDMA applications before downing the stack.
[FAILED]
Loading OpenIB kernel modules:FATAL: Error inserting ib_addr
(/lib/modules/2.6.32-279.14.1.el6_lustre.x86_64/kernel/drivers/infiniband/core/ib_addr.ko):
Unknown symbol in module, or unknown parameter (see dmesg)

Failed to load module WARNING: Error inserting ib_core
(/lib/modules/2.6.32-279.14.1.el6_lustre.x86_64/kernel/drivers/infiniband/core/ib_core.ko):
Unknown symbol in module, or unknown parameter (see dmesg)
WARNING: Error inserting ib_mad
(/lib/modules/2.6.32-279.14.1.el6_lustre.x86_64/kernel/drivers/infiniband/core/ib_mad.ko):
Unknown symbol in module, or unknown parameter (see dmesg)
WARNING: Error inserting ib_sa
(/lib/modules/2.6.32-279.14.1.el6_lustre.x86_64/kernel/drivers/infiniband/core/ib_sa.ko):
Unknown symbol in module, or unknown parameter (see dmesg)
WARNING: Error inserting iw_cm
(/lib/modules/2.6.32-279.14.1.el6_lustre.x86_64/kernel/drivers/infiniband/core/iw_cm.ko):
Unknown symbol in module, or unknown parameter (see dmesg)
WARNING: Error inserting ib_cm
(/lib/modules/2.6.32-279.14.1.el6_lustre.x86_64/kernel/drivers/infiniband/core/ib_cm.ko):
Unknown symbol in module, or unknown parameter (see dmesg)
FATAL: Error inserting rdma_cm
(/lib/modules/2.6.32-279.14.1.el6_lustre.x86_64/kernel/drivers/infiniband/core/rdma_cm.ko):
Unknown symbol in module, or unknown parameter (see dmesg)

Failed to load module WARNING: Error inserting iw_cm
(/lib/modules/2.6.32-279.14.1.el6_lustre.x86_64/kernel/drivers/infiniband/core/iw_cm.ko):
Unknown symbol in module, or unknown parameter (see dmesg)
WARNING: Error inserting ib_cm
(/lib/modules/2.6.32-279.14.1.el6_lustre.x86_64/kernel/drivers/infiniband/core/ib_cm.ko):
Unknown symbol in module, or unknown parameter (see dmesg)
WARNING: Error inserting rdma_cm
(/lib/modules/2.6.32-279.14.1.el6_lustre.x86_64/kernel/drivers/infiniband/core/rdma_cm.ko):
Unknown symbol in module, or unknown parameter (see dmesg)
FATAL: Error inserting rdma_ucm
(/lib/modules/2.6.32-279.14.1.el6_lustre.x86_64/kernel/drivers/infiniband/core/rdma_ucm.ko):
Unknown symbol in module, or unknown parameter (see dmesg)

Failed to load module FATAL: Error inserting ib_ipoib
(/lib/modules/2.6.32-279.14.1.el6_lustre.x86_64/kernel/drivers/infiniband/ulp/ipoib/ib_ipoib.ko):
Unknown symbol in module, or unknown parameter (see dmesg)

Failed to load module [FAILED]

FYI:

ibstat and ibstatus both work and show the port as active:

[root@slave3 ~]# ibstat
CA 'mlx4_0'
CA type: MT26428
Number of ports: 1
Firmware version: 2.9.1000
Hardware version: b0
Node GUID: 0x0002c903000be516
System image GUID: 0x0002c903000be519
Port 1:
State: Active
Physical state: LinkUp
Rate: 40
Base lid: 7
LMC: 0
SM lid: 2
Capability mask: 0x0251086a
Port GUID: 0x0002c903000be517
Link layer: InfiniBand
[root@slave3 ~]#

[root@slave3 ~]# ibstatus
Infiniband device 'mlx4_0' port 1 status:
default gid: fe80:0000:0000:0000:0002:c903:000b:e517
base lid: 0x7
sm lid: 0x2
state: 4: ACTIVE
phys state: 5: LinkUp
rate: 40 Gb/sec (4X QDR)
link_layer: InfiniBand
 
05-07-2013, 03:11 PM   #2
smallpond
Senior Member
 
Registered: Feb 2011
Location: Massachusetts, USA
Distribution: Fedora
Posts: 1,163

Rep: Reputation: 258
Did you run dmesg and see what the error is?
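
If the full dmesg is too noisy, the lines that matter can be pulled out on their own. This is only a minimal sketch, assuming the failures are logged with the usual "Unknown symbol" wording; the function name unknown_symbols is made up for this example.

```shell
#!/bin/sh
# List each "module: Unknown symbol ..." line once.
# Reads kernel log text on stdin, so it works on live dmesg or a saved copy.
unknown_symbols() {
    grep 'Unknown symbol' | LC_ALL=C sort -u
}

# Typical use right after a failed "rdma restart":
#   dmesg | unknown_symbols
```

Repeated lines collapse to one, which makes it easier to see which modules are missing which symbols.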
 
05-08-2013, 04:28 AM   #3
your_shadow03
Senior Member
 
Registered: Jun 2008
Location: Germany
Distribution: Slackware
Posts: 1,429
Blog Entries: 6

Original Poster
Rep: Reputation: 51
I rebooted the machine, and once it was back up the dmesg output was:
Code:
alloc irq_2_iommu on node -1
ioatdma 0000:00:04.1: irq 79 for MSI/MSI-X
ioatdma 0000:00:04.2: PCI INT C -> GSI 31 (level, low) -> IRQ 31
ioatdma 0000:00:04.2: setting latency timer to 64
  alloc irq_desc for 80 on node -1
  alloc kstat_irqs on node -1
alloc irq_2_iommu on node -1
ioatdma 0000:00:04.2: irq 80 for MSI/MSI-X
ioatdma 0000:00:04.3: PCI INT D -> GSI 39 (level, low) -> IRQ 39
ioatdma 0000:00:04.3: setting latency timer to 64
  alloc irq_desc for 81 on node -1
  alloc kstat_irqs on node -1
alloc irq_2_iommu on node -1
ioatdma 0000:00:04.3: irq 81 for MSI/MSI-X
ioatdma 0000:00:04.4: PCI INT A -> GSI 31 (level, low) -> IRQ 31
ioatdma 0000:00:04.4: setting latency timer to 64
  alloc irq_desc for 82 on node -1
  alloc kstat_irqs on node -1
alloc irq_2_iommu on node -1
ioatdma 0000:00:04.4: irq 82 for MSI/MSI-X
ioatdma 0000:00:04.5: PCI INT B -> GSI 39 (level, low) -> IRQ 39
ioatdma 0000:00:04.5: setting latency timer to 64
  alloc irq_desc for 83 on node -1
  alloc kstat_irqs on node -1
alloc irq_2_iommu on node -1
ioatdma 0000:00:04.5: irq 83 for MSI/MSI-X
ioatdma 0000:00:04.6: PCI INT C -> GSI 31 (level, low) -> IRQ 31
ioatdma 0000:00:04.6: setting latency timer to 64
  alloc irq_desc for 84 on node -1
  alloc kstat_irqs on node -1
alloc irq_2_iommu on node -1
ioatdma 0000:00:04.6: irq 84 for MSI/MSI-X
ioatdma 0000:00:04.7: PCI INT D -> GSI 39 (level, low) -> IRQ 39
ioatdma 0000:00:04.7: setting latency timer to 64
  alloc irq_desc for 85 on node -1
  alloc kstat_irqs on node -1
alloc irq_2_iommu on node -1
ioatdma 0000:00:04.7: irq 85 for MSI/MSI-X
mlx4_core: Mellanox ConnectX core driver v1.1 (Dec, 2011)
mlx4_core: Initializing 0000:05:00.0
mlx4_core 0000:05:00.0: PCI INT A -> GSI 32 (level, low) -> IRQ 32
mlx4_core 0000:05:00.0: setting latency timer to 64
udev: renamed network interface eth0 to em1
  alloc irq_desc for 86 on node -1
  alloc kstat_irqs on node -1
alloc irq_2_iommu on node -1
mlx4_core 0000:05:00.0: irq 86 for MSI/MSI-X
  alloc irq_desc for 87 on node -1
  alloc kstat_irqs on node -1
alloc irq_2_iommu on node -1
mlx4_core 0000:05:00.0: irq 87 for MSI/MSI-X
  alloc irq_desc for 88 on node -1
  alloc kstat_irqs on node -1
alloc irq_2_iommu on node -1
mlx4_core 0000:05:00.0: irq 88 for MSI/MSI-X
  alloc irq_desc for 89 on node -1
  alloc kstat_irqs on node -1
alloc irq_2_iommu on node -1
mlx4_core 0000:05:00.0: irq 89 for MSI/MSI-X
  alloc irq_desc for 90 on node -1
  alloc kstat_irqs on node -1
alloc irq_2_iommu on node -1
mlx4_core 0000:05:00.0: irq 90 for MSI/MSI-X
  alloc irq_desc for 91 on node -1
  alloc kstat_irqs on node -1
alloc irq_2_iommu on node -1
mlx4_core 0000:05:00.0: irq 91 for MSI/MSI-X
  alloc irq_desc for 92 on node -1
  alloc kstat_irqs on node -1
alloc irq_2_iommu on node -1
mlx4_core 0000:05:00.0: irq 92 for MSI/MSI-X
  alloc irq_desc for 93 on node -1
  alloc kstat_irqs on node -1
alloc irq_2_iommu on node -1
mlx4_core 0000:05:00.0: irq 93 for MSI/MSI-X
  alloc irq_desc for 94 on node -1
  alloc kstat_irqs on node -1
alloc irq_2_iommu on node -1
mlx4_core 0000:05:00.0: irq 94 for MSI/MSI-X
  alloc irq_desc for 95 on node -1
  alloc kstat_irqs on node -1
alloc irq_2_iommu on node -1
mlx4_core 0000:05:00.0: irq 95 for MSI/MSI-X
  alloc irq_desc for 96 on node -1
  alloc kstat_irqs on node -1
alloc irq_2_iommu on node -1
mlx4_core 0000:05:00.0: irq 96 for MSI/MSI-X
  alloc irq_desc for 97 on node -1
  alloc kstat_irqs on node -1
alloc irq_2_iommu on node -1
mlx4_core 0000:05:00.0: irq 97 for MSI/MSI-X
  alloc irq_desc for 98 on node -1
  alloc kstat_irqs on node -1
alloc irq_2_iommu on node -1
mlx4_core 0000:05:00.0: irq 98 for MSI/MSI-X
shpchp: Standard Hot Plug PCI Controller Driver version: 0.4
iTCO_vendor_support: vendor-support=0
iTCO_wdt: Intel TCO WatchDog Timer Driver v1.07rh
iTCO_wdt: unable to reset NO_REBOOT flag, device disabled by hardware/BIOS
i801_smbus 0000:00:1f.3: PCI INT C -> GSI 19 (level, low) -> IRQ 19
ACPI: resource 0000:00:1f.3 [io  0x4000-0x401f] conflicts with ACPI region SMBI [io 0x4000-0x400f]
ACPI: If an ACPI driver is available for this device, you should use it instead of the native driver
EDAC MC: Ver: 2.1.0 Dec 14 2012
EDAC sbridge: Seeking for: dev 0e.0 PCI ID 8086:3ca0
EDAC sbridge: Seeking for: dev 0e.0 PCI ID 8086:3ca0
EDAC sbridge: Seeking for: dev 0f.0 PCI ID 8086:3ca8
EDAC sbridge: Seeking for: dev 0f.0 PCI ID 8086:3ca8
EDAC sbridge: Seeking for: dev 0f.1 PCI ID 8086:3c71
EDAC sbridge: Seeking for: dev 0f.1 PCI ID 8086:3c71
EDAC sbridge: Seeking for: dev 0f.2 PCI ID 8086:3caa
EDAC sbridge: Seeking for: dev 0f.2 PCI ID 8086:3caa
EDAC sbridge: Seeking for: dev 0f.3 PCI ID 8086:3cab
EDAC sbridge: Seeking for: dev 0f.3 PCI ID 8086:3cab
EDAC sbridge: Seeking for: dev 0f.4 PCI ID 8086:3cac
EDAC sbridge: Seeking for: dev 0f.4 PCI ID 8086:3cac
EDAC sbridge: Seeking for: dev 0f.5 PCI ID 8086:3cad
EDAC sbridge: Seeking for: dev 0f.5 PCI ID 8086:3cad
EDAC sbridge: Seeking for: dev 11.0 PCI ID 8086:3cb8
EDAC sbridge: Seeking for: dev 11.0 PCI ID 8086:3cb8
EDAC sbridge: Seeking for: dev 0c.6 PCI ID 8086:3cf4
EDAC sbridge: Seeking for: dev 0c.6 PCI ID 8086:3cf4
EDAC sbridge: Seeking for: dev 0c.7 PCI ID 8086:3cf6
EDAC sbridge: Seeking for: dev 0c.7 PCI ID 8086:3cf6
EDAC sbridge: Seeking for: dev 0d.6 PCI ID 8086:3cf5
EDAC sbridge: Seeking for: dev 0d.6 PCI ID 8086:3cf5
EDAC MC0: Giving out device to 'sbridge_edac.c' 'Sandy Bridge Socket#0': DEV 0000:ff:0e.0
EDAC sbridge: Driver loaded.
sr0: scsi3-mmc drive: 24x/24x cd/rw xa/form2 cdda tray
Uniform CD-ROM driver Revision: 3.20
sr 6:0:0:0: Attached scsi CD-ROM sr0
microcode: CPU0 sig=0x206d5, pf=0x1, revision=0x513
platform microcode: firmware: requesting intel-ucode/06-2d-05
microcode: CPU1 sig=0x206d5, pf=0x1, revision=0x513
platform microcode: firmware: requesting intel-ucode/06-2d-05
microcode: CPU2 sig=0x206d5, pf=0x1, revision=0x513
platform microcode: firmware: requesting intel-ucode/06-2d-05
microcode: CPU3 sig=0x206d5, pf=0x1, revision=0x513
platform microcode: firmware: requesting intel-ucode/06-2d-05
microcode: CPU4 sig=0x206d5, pf=0x1, revision=0x513
platform microcode: firmware: requesting intel-ucode/06-2d-05
microcode: CPU5 sig=0x206d5, pf=0x1, revision=0x513
platform microcode: firmware: requesting intel-ucode/06-2d-05
microcode: CPU6 sig=0x206d5, pf=0x1, revision=0x513
platform microcode: firmware: requesting intel-ucode/06-2d-05
microcode: CPU7 sig=0x206d5, pf=0x1, revision=0x513
platform microcode: firmware: requesting intel-ucode/06-2d-05
Microcode Update Driver: v2.00 <tigran@aivazian.fsnet.co.uk>, Peter Oruba
dcdbas dcdbas: Dell Systems Management Base Driver (version 5.6.0-3.2)
sd 1:0:0:0: Attached scsi generic sg0 type 0
sr 6:0:0:0: Attached scsi generic sg1 type 5
EXT4-fs (sda1): mounted filesystem with ordered data mode. Opts:
EXT4-fs (dm-2): mounted filesystem with ordered data mode. Opts:
Adding 4030456k swap on /dev/mapper/VolGroup-lv_swap.  Priority:-1 extents:1 across:4030456k
igb: eth1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX
802.1Q VLAN Support v1.8 Ben Greear <greearb@candelatech.com>
All bugs added by David S. Miller <davem@redhat.com>
8021q: adding VLAN 0 to HW filter on device eth1
cnic: Unknown symbol ip6_route_output
8021q: adding VLAN 0 to HW filter on device em1
RPC: Registered named UNIX socket transport module.
RPC: Registered udp transport module.
RPC: Registered tcp transport module.
RPC: Registered tcp NFSv4.1 backchannel transport module.
type=1305 audit(1368108959.718:48954): auid=4294967295 ses=4294967295 op="remove rule" key=(null) list=4 res=1
type=1305 audit(1368108959.718:48955): audit_enabled=0 old=1 auid=4294967295 ses=4294967295 res=1
readahead-collector: sorting
readahead-collector: finished
When I then run /etc/rc.d/init.d/rdma restart:
Code:
[root@slave3 ~]# /etc/rc.d/init.d/rdma restart
Unloading OpenIB kernel modules:                           [  OK  ]
Loading OpenIB kernel modules:FATAL: Error inserting ib_addr (/lib/modules/2.6.32-279.14.1.el6_lustre.x86_64/kernel/drivers/infiniband/core/ib_addr.ko): Unknown symbol in module, or unknown parameter (see dmesg)

Failed to load module WARNING: Error inserting ib_core (/lib/modules/2.6.32-279.14.1.el6_lustre.x86_64/kernel/drivers/infiniband/core/ib_core.ko): Unknown symbol in module, or unknown parameter (see dmesg)
WARNING: Error inserting ib_mad (/lib/modules/2.6.32-279.14.1.el6_lustre.x86_64/kernel/drivers/infiniband/core/ib_mad.ko): Unknown symbol in module, or unknown parameter (see dmesg)
WARNING: Error inserting ib_sa (/lib/modules/2.6.32-279.14.1.el6_lustre.x86_64/kernel/drivers/infiniband/core/ib_sa.ko): Unknown symbol in module, or unknown parameter (see dmesg)
WARNING: Error inserting iw_cm (/lib/modules/2.6.32-279.14.1.el6_lustre.x86_64/kernel/drivers/infiniband/core/iw_cm.ko): Unknown symbol in module, or unknown parameter (see dmesg)
WARNING: Error inserting ib_cm (/lib/modules/2.6.32-279.14.1.el6_lustre.x86_64/kernel/drivers/infiniband/core/ib_cm.ko): Unknown symbol in module, or unknown parameter (see dmesg)
FATAL: Error inserting rdma_cm (/lib/modules/2.6.32-279.14.1.el6_lustre.x86_64/kernel/drivers/infiniband/core/rdma_cm.ko): Unknown symbol in module, or unknown parameter (see dmesg)

Failed to load module WARNING: Error inserting iw_cm (/lib/modules/2.6.32-279.14.1.el6_lustre.x86_64/kernel/drivers/infiniband/core/iw_cm.ko): Unknown symbol in module, or unknown parameter (see dmesg)
WARNING: Error inserting ib_cm (/lib/modules/2.6.32-279.14.1.el6_lustre.x86_64/kernel/drivers/infiniband/core/ib_cm.ko): Unknown symbol in module, or unknown parameter (see dmesg)
WARNING: Error inserting rdma_cm (/lib/modules/2.6.32-279.14.1.el6_lustre.x86_64/kernel/drivers/infiniband/core/rdma_cm.ko): Unknown symbol in module, or unknown parameter (see dmesg)
FATAL: Error inserting rdma_ucm (/lib/modules/2.6.32-279.14.1.el6_lustre.x86_64/kernel/drivers/infiniband/core/rdma_ucm.ko): Unknown symbol in module, or unknown parameter (see dmesg)

Failed to load module FATAL: Error inserting ib_ipoib (/lib/modules/2.6.32-279.14.1.el6_lustre.x86_64/kernel/drivers/infiniband/ulp/ipoib/ib_ipoib.ko): Unknown symbol in module, or unknown parameter (see dmesg)

Failed to load module                                      [FAILED]
[root@slave3 ~]#
After that, dmesg shows:
Code:

dcdbas dcdbas: Dell Systems Management Base Driver (version 5.6.0-3.2)
sd 1:0:0:0: Attached scsi generic sg0 type 0
sr 6:0:0:0: Attached scsi generic sg1 type 5
EXT4-fs (sda1): mounted filesystem with ordered data mode. Opts:
EXT4-fs (dm-2): mounted filesystem with ordered data mode. Opts:
Adding 4030456k swap on /dev/mapper/VolGroup-lv_swap.  Priority:-1 extents:1 across:4030456k
igb: eth1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX
802.1Q VLAN Support v1.8 Ben Greear <greearb@candelatech.com>
All bugs added by David S. Miller <davem@redhat.com>
8021q: adding VLAN 0 to HW filter on device eth1
cnic: Unknown symbol ip6_route_output
8021q: adding VLAN 0 to HW filter on device em1
RPC: Registered named UNIX socket transport module.
RPC: Registered udp transport module.
RPC: Registered tcp transport module.
RPC: Registered tcp NFSv4.1 backchannel transport module.
type=1305 audit(1368108959.718:48954): auid=4294967295 ses=4294967295 op="remove rule" key=(null) list=4 res=1
type=1305 audit(1368108959.718:48955): audit_enabled=0 old=1 auid=4294967295 ses=4294967295 res=1
readahead-collector: sorting
readahead-collector: finished
mlx4_ib: Mellanox ConnectX InfiniBand driver v1.0 (April 4, 2008)
ib_addr: Unknown symbol ipv6_dev_get_saddr
ib_addr: Unknown symbol ip6_route_output
ib_addr: Unknown symbol ipv6_chk_addr
ib_addr: Unknown symbol ipv6_dev_get_saddr
ib_addr: Unknown symbol ip6_route_output
ib_addr: Unknown symbol ipv6_chk_addr
ib_addr: Unknown symbol ipv6_dev_get_saddr
ib_addr: Unknown symbol ip6_route_output
ib_addr: Unknown symbol ipv6_chk_addr
ib_ipoib: Unknown symbol icmpv6_send
 
05-08-2013, 08:08 AM   #4
smallpond
Senior Member
 
Registered: Feb 2011
Location: Massachusetts, USA
Distribution: Fedora
Posts: 1,163

Rep: Reputation: 258
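That's pretty clear. ib_addr and ib_ipoib are failing on IPv6 symbols (ipv6_dev_get_saddr, ip6_route_output, ipv6_chk_addr, icmpv6_send), so your kernel does not seem to have IPv6 configured. Either recompile the IB driver the same way, i.e. without IPv6, or get a new kernel.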
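
To check that diagnosis on the box itself, one option is to look the symbols up in /proc/kallsyms, which lists the symbols the running kernel and its loaded modules provide. This is a rough sketch, not a definitive test. The helper name symbol_present is hypothetical, and the symbol names are the ones from the dmesg output above.

```shell
#!/bin/sh
# Return 0 if the named symbol appears in a kallsyms-style table
# (columns: address, type, name, optional [module]).
# $1 = symbol name, $2 = table file (defaults to the running kernel's).
symbol_present() {
    awk -v sym="$1" '$3 == sym { found = 1 } END { exit !found }' "${2:-/proc/kallsyms}"
}

# Example loop over the symbols reported in dmesg:
#   for s in ipv6_dev_get_saddr ip6_route_output ipv6_chk_addr icmpv6_send; do
#       if symbol_present "$s"; then echo "$s: present"; else echo "$s: missing"; fi
#   done
```

If all four come back missing, nothing currently loaded provides IPv6. That alone does not tell you whether IPv6 was left out of the kernel build or is just an unloaded module.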
 
05-08-2013, 09:31 AM   #5
your_shadow03
Senior Member
 
Registered: Jun 2008
Location: Germany
Distribution: Slackware
Posts: 1,429
Blog Entries: 6

Original Poster
Rep: Reputation: 51
smallpond,

How do I enable IPv6 at the kernel level?
 
05-08-2013, 09:50 AM   #6
smallpond
Senior Member
 
Registered: Feb 2011
Location: Massachusetts, USA
Distribution: Fedora
Posts: 1,163

Rep: Reputation: 258
I think I was wrong about that. Try just loading the ipv6 module:
Code:
modprobe ipv6
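
If the modprobe appears to succeed but ib_addr still reports unknown symbols, the module may be blocked by modprobe configuration. That is only a guess, since nothing in this thread confirms it. The sketch below lists any install, blacklist, or options lines for ipv6 under /etc/modprobe.d. The function name ipv6_modprobe_rules is made up for this example.

```shell
#!/bin/sh
# Print every modprobe config line that installs, blacklists, or sets
# options for the ipv6 module, prefixed with the file it came from.
# $1 = config directory (defaults to /etc/modprobe.d).
ipv6_modprobe_rules() {
    grep -rHE '^[[:space:]]*(install|blacklist|options)[[:space:]]+ipv6([[:space:]]|$)' "${1:-/etc/modprobe.d}"
}

# Sequence to try as root, using the init script path from this thread:
#   modprobe ipv6 && /etc/rc.d/init.d/rdma restart
#   ipv6_modprobe_rules    # only needed if the restart still fails
```

A line such as "install ipv6 /bin/true" would make modprobe report success without actually loading the module.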
 
  


All times are GMT -5. The time now is 09:03 PM.
