LinuxQuestions.org
Old 09-12-2007, 11:40 AM   #1
subram
LQ Newbie
 
Registered: Sep 2007
Posts: 3

Rep: Reputation: 0
Clustering Dell PowerEdge 1900 servers using a D1000 array


Hardware details:
2 Dell PowerEdge 1900 servers.
1 D1000 array with 6 drives.
Adaptec 3944AUWD controller.

SCSI BIOS setup:
* "Reset SCSI Bus at IC Initialization" changed from "Enabled" to "Disabled" on both servers.
* On one server "SCSI controller ID" is set to 7, and on the second server it is set to 14.
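
On a shared SCSI bus every device, including both host adapters, needs its own ID. A minimal sketch of that sanity check follows. The device names and the `find_id_conflicts` helper are made up for illustration; IDs 7 and 14 are the controller IDs given above, and drive IDs 0 to 5 are an assumption based on the DevQ lines in the log further down.

```python
from collections import Counter


def find_id_conflicts(bus_ids):
    """Return the SCSI IDs claimed by more than one device on a shared bus."""
    counts = Counter(bus_ids.values())
    return sorted(scsi_id for scsi_id, n in counts.items() if n > 1)


# Hypothetical inventory: the names are invented for this sketch.
# 7 and 14 are the two controller IDs; 0-5 are assumed drive IDs.
shared_bus = {"srv1-hba": 7, "srv2-hba": 14}
shared_bus.update({f"disk{i}": i for i in range(6)})

conflicts = find_id_conflicts(shared_bus)
print("conflicting IDs:", conflicts or "none")
```

With the values from this setup no ID is used twice, so an ID clash alone would not explain the hang.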

Linux:
Linux srv2 2.6.9-42.ELsmp #1 SMP Wed Jul 12 23:27:17 EDT 2006 i686 i686 i386 GNU/Linux

When I boot either server on its own (with the other server powered off), I can see all the hard drives. I have also created logical volumes.

But when one server is up and running and I try to bring up the second server, the second server does not boot. It hangs during the SCSI scan. At the same time, the first server logs SCSI errors in /var/log/messages; a snippet is attached at the end of this post.

Would the hardware mentioned above work for clustering?

Any help fixing this problem is appreciated.

/var/log/messages snippet:

Sep 12 10:15:20 localhost kernel: scsi1: Someone reset channel A
Sep 12 10:15:31 localhost kernel: scsi1:A:0: no active SCB for reconnecting target - issuing BUS DEVICE RESET
Sep 12 10:15:31 localhost kernel: SAVED_SCSIID == 0x7, SAVED_LUN == 0x0, ARG_1 == 0xff ACCUM = 0x80
Sep 12 10:15:31 localhost kernel: SEQ_FLAGS == 0xc0, SCBPTR == 0x0, BTT == 0xff, SINDEX == 0x31
Sep 12 10:15:31 localhost kernel: SCSIID == 0x57, SCB_SCSIID == 0x57, SCB_LUN == 0x0, SCB_TAG == 0xff, SCB_CONTROL == 0xe8
Sep 12 10:15:31 localhost kernel: SCSIBUSL == 0x0, SCSISIGI == 0x46
Sep 12 10:15:31 localhost kernel: SXFRCTL0 == 0x88
Sep 12 10:15:31 localhost kernel: SEQCTL == 0x10
Sep 12 10:15:31 localhost kernel: >>>>>>>>>>>>>>>>>> Dump Card State Begins <<<<<<<<<<<<<<<<<
Sep 12 10:15:31 localhost kernel: scsi1: Dumping Card State in Data-in phase, at SEQADDR 0x19b
Sep 12 10:15:31 localhost kernel: Card was paused
Sep 12 10:15:31 localhost kernel: ACCUM = 0x80, SINDEX = 0x31, DINDEX = 0x65, ARG_2 = 0x0
Sep 12 10:15:31 localhost kernel: HCNT = 0x0 SCBPTR = 0x0
Sep 12 10:15:31 localhost kernel: SCSISIGI[0x46] ERROR[0x0] SCSIBUSL[0x0] LASTPHASE[0x40]
Sep 12 10:15:31 localhost kernel: SCSISEQ[0x12] SBLKCTL[0x2] SCSIRATE[0x0] SEQCTL[0x10]
Sep 12 10:15:31 localhost kernel: SEQ_FLAGS[0xc0] SSTAT0[0x7] SSTAT1[0x23] SSTAT2[0x15]
Sep 12 10:15:31 localhost kernel: SSTAT3[0x0] SIMODE0[0x0] SIMODE1[0x8c] SXFRCTL0[0x88]
Sep 12 10:15:31 localhost kernel: DFCNTRL[0x0] DFSTATUS[0x28]
Sep 12 10:15:31 localhost kernel: STACK: 0x135 0x0 0x159 0xfe
Sep 12 10:15:31 localhost kernel: SCB count = 8
Sep 12 10:15:31 localhost kernel: Kernel NEXTQSCB = 2
Sep 12 10:15:31 localhost kernel: Card NEXTQSCB = 2
Sep 12 10:15:31 localhost kernel: QINFIFO entries:
Sep 12 10:15:31 localhost kernel: Waiting Queue entries:
Sep 12 10:15:31 localhost kernel: Disconnected Queue entries:
Sep 12 10:15:31 localhost kernel: QOUTFIFO entries:
Sep 12 10:15:31 localhost kernel: Sequencer Free SCB List: 0 1 3 2 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31
Sep 12 10:15:31 localhost kernel: Sequencer SCB Info:
Sep 12 10:15:31 localhost kernel: 0 SCB_CONTROL[0xe8] SCB_SCSIID[0x57] SCB_LUN[0x0] SCB_TAG[0xff]
Sep 12 10:15:31 localhost kernel: 1 SCB_CONTROL[0xe8] SCB_SCSIID[0x7] SCB_LUN[0x0] SCB_TAG[0xff]
Sep 12 10:15:31 localhost kernel: 2 SCB_CONTROL[0xe8] SCB_SCSIID[0x17] SCB_LUN[0x0] SCB_TAG[0xff]
....
....
Sep 12 10:15:31 localhost kernel: Pending list:
Sep 12 10:15:31 localhost kernel: Kernel Free SCB list: 7 3 0 1 6 5 4
Sep 12 10:15:31 localhost kernel: DevQ(0:0:0): 0 waiting
Sep 12 10:15:31 localhost kernel: DevQ(0:1:0): 0 waiting
Sep 12 10:15:31 localhost kernel: DevQ(0:2:0): 0 waiting
Sep 12 10:15:31 localhost kernel: DevQ(0:3:0): 0 waiting
Sep 12 10:15:31 localhost kernel: DevQ(0:4:0): 0 waiting
Sep 12 10:15:31 localhost kernel: DevQ(0:5:0): 0 waiting
Sep 12 10:15:31 localhost kernel: DevQ(0:14:0): 0 waiting
Sep 12 10:15:31 localhost kernel:
Sep 12 10:15:31 localhost kernel: <<<<<<<<<<<<<<<<< Dump Card State Ends >>>>>>>>>>>>>>>>>>
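
The SCB_SCSIID values in the dump can be read more easily once split into their two halves. The sketch below assumes the aic7xxx layout in which the high nibble is the target ID and the low nibble is the adapter's own (initiator) ID; that layout is an assumption, not something the log itself states. The sample lines are abridged from the dump above, and `decode_scsiid` and `decoded_ids` are hypothetical helper names.

```python
import re


def decode_scsiid(value):
    """Split a SCSIID byte into (target_id, initiator_id).

    Assumption: high nibble = target ID, low nibble = adapter's own ID.
    """
    return (value >> 4) & 0xF, value & 0xF


# Lines abridged from the log dump above.
SAMPLE = """\
kernel: SCSIID == 0x57, SCB_SCSIID == 0x57, SCB_LUN == 0x0
kernel: 1 SCB_CONTROL[0xe8] SCB_SCSIID[0x7] SCB_LUN[0x0] SCB_TAG[0xff]
kernel: 2 SCB_CONTROL[0xe8] SCB_SCSIID[0x17] SCB_LUN[0x0] SCB_TAG[0xff]
"""

# Matches both "SCB_SCSIID == 0x57" and "SCB_SCSIID[0x57]".
PATTERN = re.compile(r"SCB_SCSIID(?: == |\[)0x([0-9a-fA-F]+)")


def decoded_ids(text):
    """Decode every SCB_SCSIID value found in the given log text."""
    return [decode_scsiid(int(hexval, 16)) for hexval in PATTERN.findall(text)]


for target, initiator in decoded_ids(SAMPLE):
    print(f"target {target}, initiator {initiator}")
```

Under that assumption, every entry shows initiator 7, which matches the controller ID set on the first server.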
 
Old 09-12-2007, 02:44 PM   #2
farslayer
LQ Guru
 
Registered: Oct 2005
Location: Northeast Ohio
Distribution: linuxdebian
Posts: 7,249
Blog Entries: 5

Rep: Reputation: 191
If that's a Sun D1000 array, the install guide shows it either attached to a single server, or split and attached to two servers with each accessing half the drives. It does not show an arrangement where two servers can access the same drives/logical array at the same time (page 28 of the .pdf).
http://www.sun.com/products-n-soluti...05-2624-12.pdf

I know that when we set up a shared storage cluster in Windows, both servers were directly attached to a single Dell disk array like the one you are talking about, but the Windows NT clustering software prevented both servers from trying to access the directly connected array at the same time. If one server failed, the other server could take over the resources on the array.

If you were using a NAS or SAN, that would be a different story.

Last edited by farslayer; 09-12-2007 at 02:46 PM.
 
Old 09-12-2007, 04:49 PM   #3
MensaWater
LQ Guru
 
Registered: May 2005
Location: Atlanta Georgia USA
Distribution: Redhat (RHEL), CentOS, Fedora, CoreOS, Debian, FreeBSD, HP-UX, Solaris, SCO
Posts: 7,831
Blog Entries: 15

Rep: Reputation: 1669
When we set up Sun E10Ks with D1000 and A1000 arrays, only the latter were used for the cluster configuration. The D1000s were used only for things local to the node. I'm not sure that the D1000 couldn't be used for the cluster, only that it was NOT used, so maybe that was because it couldn't be. Otherwise I can't see any reason why we would have used two different kinds of arrays in the same systems.
 
Old 09-13-2007, 09:59 AM   #4
subram
LQ Newbie
 
Registered: Sep 2007
Posts: 3

Original Poster
Rep: Reputation: 0
Thanks for the feedback.

According to Sun, the D1000 can be used for clustering. I came across a couple of documents that talk about clustering using the D1000. Obviously, they all use Sun servers.

One example: http://www.filibeto.org/sun/lib/appl...l/817-0170.pdf

My concern is making it work with these Dell servers.

I am wondering whether setting the SCSI BIOS not to scan for SCSI devices at boot might fix the problem. As each server scans, it tries to reset the SCSI devices, which seems to be what causes the other server to panic.

Your suggestions are greatly appreciated.
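
One way to test the reset theory is to pull the "Someone reset channel" lines out of /var/log/messages on the running server and compare their timestamps with the moment the other server starts its SCSI scan. The following is only a sketch: `find_bus_resets` is a hypothetical helper, and the regular expression assumes the syslog line format seen in the snippet from the first post.

```python
import re

# Assumes lines shaped like:
# "Sep 12 10:15:20 localhost kernel: scsi1: Someone reset channel A"
RESET_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+[\d:]{8})\s+\S+\s+kernel:\s+"
    r"(?P<host>scsi\d+): Someone reset channel (?P<chan>[A-Z])"
)


def find_bus_resets(lines):
    """Return (timestamp, host, channel) for each externally triggered reset."""
    hits = []
    for line in lines:
        match = RESET_RE.match(line)
        if match:
            hits.append((match["stamp"], match["host"], match["chan"]))
    return hits


# Two lines taken from the log snippet in the first post.
sample = [
    "Sep 12 10:15:20 localhost kernel: scsi1: Someone reset channel A",
    "Sep 12 10:15:31 localhost kernel: Card was paused",
]

for stamp, host, chan in find_bus_resets(sample):
    print(f"{stamp}  {host} channel {chan} was reset by another device")
```

To run it against the real log, the open file can be passed in directly, for example `find_bus_resets(open("/var/log/messages"))`. If each reset lines up with a boot attempt on the other server, that would support the idea that the boot-time scan is what disturbs the bus.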
 
  



