LinuxQuestions.org
Old 09-12-2007, 11:40 AM   #1
subram
LQ Newbie
 
Registered: Sep 2007
Posts: 3

Rep: Reputation: 0
Clustering Dell PowerEdge 1900 servers using a D1000 array


Hardware details:
2 Dell PowerEdge 1900 servers.
1 D1000 array with 6 drives.
Adaptec 3944AUWD controller.

SCSI BIOS setup:
* "Reset SCSI Bus at IC Initialization" changed from "Enabled" to "Disabled" on both servers.
* "SCSI controller ID" is set to 7 on one server and to 14 on the other.
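
Since both host adapters sit on the same bus as the drives, every device needs its own SCSI ID. Below is a minimal sketch of that check. The adapter IDs 7 and 14 come from the BIOS settings above; the device names and the drive IDs 0-5 are assumptions for illustration (the drive IDs are guessed from the DevQ entries in the log further down), so adjust them to the real enclosure settings.

```python
def find_id_conflicts(initiators, targets):
    """Return SCSI IDs claimed by more than one device on a shared bus."""
    seen = {}
    for name, scsi_id in list(initiators.items()) + list(targets.items()):
        seen.setdefault(scsi_id, []).append(name)
    # Keep only the IDs that more than one device is using.
    return {scsi_id: names for scsi_id, names in seen.items() if len(names) > 1}


# Host adapter IDs from the BIOS setup (names are hypothetical labels).
initiators = {"server1-hba": 7, "server2-hba": 14}
# Assumed drive IDs: six drives at IDs 0 through 5.
targets = {f"disk{i}": i for i in range(6)}

conflicts = find_id_conflicts(initiators, targets)
print(conflicts or "no ID conflicts")
```

With the values above nothing collides, so a duplicate ID is unlikely to be the cause here, provided the assumed drive IDs are right.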

Linux:
Linux srv2 2.6.9-42.ELsmp #1 SMP Wed Jul 12 23:27:17 EDT 2006 i686 i686 i386 GNU/Linux

When I boot either server while the other one is powered off, I can see all the hard drives. I have also created logical volumes.

But when one server is up and running and I try to bring up the second, the second server does not boot: it hangs during the SCSI scan. At the same time, the first server logs SCSI errors in /var/log/messages; a snippet is attached at the end of this post.

Would the hardware listed above work for clustering?

Any help to fix this problem is appreciated.

/var/log/messages snippet:

Sep 12 10:15:20 localhost kernel: scsi1: Someone reset channel A
Sep 12 10:15:31 localhost kernel: scsi1:A:0: no active SCB for reconnecting target - issuing BUS DEVICE RESET
Sep 12 10:15:31 localhost kernel: SAVED_SCSIID == 0x7, SAVED_LUN == 0x0, ARG_1 == 0xff ACCUM = 0x80
Sep 12 10:15:31 localhost kernel: SEQ_FLAGS == 0xc0, SCBPTR == 0x0, BTT == 0xff, SINDEX == 0x31
Sep 12 10:15:31 localhost kernel: SCSIID == 0x57, SCB_SCSIID == 0x57, SCB_LUN == 0x0, SCB_TAG == 0xff, SCB_CONTROL == 0xe8
Sep 12 10:15:31 localhost kernel: SCSIBUSL == 0x0, SCSISIGI == 0x46
Sep 12 10:15:31 localhost kernel: SXFRCTL0 == 0x88
Sep 12 10:15:31 localhost kernel: SEQCTL == 0x10
Sep 12 10:15:31 localhost kernel: >>>>>>>>>>>>>>>>>> Dump Card State Begins <<<<<<<<<<<<<<<<<
Sep 12 10:15:31 localhost kernel: scsi1: Dumping Card State in Data-in phase, at SEQADDR 0x19b
Sep 12 10:15:31 localhost kernel: Card was paused
Sep 12 10:15:31 localhost kernel: ACCUM = 0x80, SINDEX = 0x31, DINDEX = 0x65, ARG_2 = 0x0
Sep 12 10:15:31 localhost kernel: HCNT = 0x0 SCBPTR = 0x0
Sep 12 10:15:31 localhost kernel: SCSISIGI[0x46] ERROR[0x0] SCSIBUSL[0x0] LASTPHASE[0x40]
Sep 12 10:15:31 localhost kernel: SCSISEQ[0x12] SBLKCTL[0x2] SCSIRATE[0x0] SEQCTL[0x10]
Sep 12 10:15:31 localhost kernel: SEQ_FLAGS[0xc0] SSTAT0[0x7] SSTAT1[0x23] SSTAT2[0x15]
Sep 12 10:15:31 localhost kernel: SSTAT3[0x0] SIMODE0[0x0] SIMODE1[0x8c] SXFRCTL0[0x88]
Sep 12 10:15:31 localhost kernel: DFCNTRL[0x0] DFSTATUS[0x28]
Sep 12 10:15:31 localhost kernel: STACK: 0x135 0x0 0x159 0xfe
Sep 12 10:15:31 localhost kernel: SCB count = 8
Sep 12 10:15:31 localhost kernel: Kernel NEXTQSCB = 2
Sep 12 10:15:31 localhost kernel: Card NEXTQSCB = 2
Sep 12 10:15:31 localhost kernel: QINFIFO entries:
Sep 12 10:15:31 localhost kernel: Waiting Queue entries:
Sep 12 10:15:31 localhost kernel: Disconnected Queue entries:
Sep 12 10:15:31 localhost kernel: QOUTFIFO entries:
Sep 12 10:15:31 localhost kernel: Sequencer Free SCB List: 0 1 3 2 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31
Sep 12 10:15:31 localhost kernel: Sequencer SCB Info:
Sep 12 10:15:31 localhost kernel: 0 SCB_CONTROL[0xe8] SCB_SCSIID[0x57] SCB_LUN[0x0] SCB_TAG[0xff]
Sep 12 10:15:31 localhost kernel: 1 SCB_CONTROL[0xe8] SCB_SCSIID[0x7] SCB_LUN[0x0] SCB_TAG[0xff]
Sep 12 10:15:31 localhost kernel: 2 SCB_CONTROL[0xe8] SCB_SCSIID[0x17] SCB_LUN[0x0] SCB_TAG[0xff]
....
....
Sep 12 10:15:31 localhost kernel: Pending list:
Sep 12 10:15:31 localhost kernel: Kernel Free SCB list: 7 3 0 1 6 5 4
Sep 12 10:15:31 localhost kernel: DevQ(0:0:0): 0 waiting
Sep 12 10:15:31 localhost kernel: DevQ(0:1:0): 0 waiting
Sep 12 10:15:31 localhost kernel: DevQ(0:2:0): 0 waiting
Sep 12 10:15:31 localhost kernel: DevQ(0:3:0): 0 waiting
Sep 12 10:15:31 localhost kernel: DevQ(0:4:0): 0 waiting
Sep 12 10:15:31 localhost kernel: DevQ(0:5:0): 0 waiting
Sep 12 10:15:31 localhost kernel: DevQ(0:14:0): 0 waiting
Sep 12 10:15:31 localhost kernel:
Sep 12 10:15:31 localhost kernel: <<<<<<<<<<<<<<<<< Dump Card State Ends >>>>>>>>>>>>>>>>>>
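
For anyone reading the dump, the SCSIID and SCB_SCSIID bytes can be split into two IDs. This is a rough sketch, assuming the aic7xxx layout as I understand it, where the upper nibble is the target ID and the lower nibble is the adapter's own ID; the function name is made up for illustration.

```python
def decode_scsiid(value):
    """Split an aic7xxx SCSIID byte into (target_id, own_id).

    Assumes the upper nibble is the target ID and the lower nibble is
    the adapter's own ID.
    """
    target_id = (value >> 4) & 0x0F
    own_id = value & 0x0F
    return target_id, own_id


# Values taken from the SCSIID / SCB_SCSIID fields in the dump.
for raw in (0x57, 0x07, 0x17):
    target, own = decode_scsiid(raw)
    print(f"SCSIID 0x{raw:02x}: target {target}, adapter ID {own}")
```

If that layout holds, 0x57 reads as target 5 seen from adapter ID 7, which would match the server whose controller ID is set to 7.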
 
Old 09-12-2007, 02:44 PM   #2
farslayer
Guru
 
Registered: Oct 2005
Location: Willoughby, Ohio
Distribution: linuxdebian
Posts: 7,230
Blog Entries: 5

Rep: Reputation: 185
If that's a Sun D1000 array, the install guide shows it either attached to a single server, or split and attached to two servers, each accessing half the drives. It does not show an arrangement where two servers access the same drives or logical array at the same time (page 28 of the .pdf).
http://www.sun.com/products-n-soluti...05-2624-12.pdf

I know that when we set up a shared-storage cluster in Windows, both servers were directly attached to a single Dell disk array, as you describe, but the Windows NT clustering software prevented both servers from accessing the array at the same time. If one server failed, the other could take over the resources on the array.

If you were using a NAS or SAN, that would be a different story.

Last edited by farslayer; 09-12-2007 at 02:46 PM.
 
Old 09-12-2007, 04:49 PM   #3
MensaWater
Guru
 
Registered: May 2005
Location: Atlanta Georgia USA
Distribution: Redhat (RHEL), CentOS, Fedora, Debian, FreeBSD, HP-UX, Solaris, SCO
Posts: 5,192

Rep: Reputation: 468
When we ran Sun E10Ks with D1000 and A1000 arrays, only the A1000s were used for the cluster configuration; the D1000s were used only for things local to each node. I'm not certain the D1000 couldn't be used for the cluster, only that it was NOT used, so maybe that was because it couldn't be. Otherwise I can't see any reason why we would have used two different kinds of array in the same systems.
 
Old 09-13-2007, 09:59 AM   #4
subram
LQ Newbie
 
Registered: Sep 2007
Posts: 3

Original Poster
Rep: Reputation: 0
Thanks for the feedback.

Per Sun, the D1000 can be used for clustering. I came across a couple of documents that describe clustering with the D1000. Obviously, they all use Sun servers.

One example: http://www.filibeto.org/sun/lib/appl...l/817-0170.pdf

My concern is making it work with these Dell servers.

I am wondering whether setting the SCSI BIOS not to scan for SCSI devices at boot might fix the problem. As the servers scan, they try to reset the SCSI devices, which is causing the other server to panic.
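
One way to test that theory is to pull the bus-reset events out of /var/log/messages on the running server and compare their timestamps with the moment the other server starts its scan. A minimal sketch, assuming the syslog line format shown in my first post; the function name and the regular expression are my own and not part of any tool.

```python
import re

# Matches lines like:
# "Sep 12 10:15:20 localhost kernel: scsi1: Someone reset channel A"
RESET_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+\S+\s+kernel:\s+"
    r"(?P<host>scsi\d+): Someone reset channel (?P<channel>\w)"
)


def find_bus_resets(lines):
    """Return (timestamp, host, channel) for each reset seen on the bus."""
    events = []
    for line in lines:
        match = RESET_RE.match(line)
        if match:
            events.append((match["stamp"], match["host"], match["channel"]))
    return events


sample = [
    "Sep 12 10:15:20 localhost kernel: scsi1: Someone reset channel A",
    "Sep 12 10:15:31 localhost kernel: Card was paused",
]
print(find_bus_resets(sample))
```

To run it on the real log, pass an open file instead of the sample list, for example `find_bus_resets(open("/var/log/messages"))`.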

Your suggestions are greatly appreciated.
 
  

