Slow response from server
I am having a problem with a Linux Samba server.
Initially I was running a Dell PowerEdge 400 with two 7200 RPM IDE drives.
The OS was on one drive and the other drive was shared with the network as a Samba share.
The Samba configuration is very simple (one share which everybody has access to).
We recently decided to upgrade the server to a Dell PowerEdge 1600 with one IDE drive and
a RAID 5 array comprising three 36 GB 10,000 RPM SCSI disks.
The thought was that the information on the RAID array would not only be safer but accessing
it via Samba would also be faster.
This is not the case. When it comes to opening and saving files, the RAID system is
much slower. I think that the problem is in how I migrated the server settings
from the old server to the new server. These are the steps I followed:
1. Install the OS on the new system using the same name (not connected to the network).
2. Back up the /etc directory to a .tar file on the Samba share.
3. Copy all of the files from the share on the first server to a backup system.
4. Bring the first server down.
5. Bring the second server up on the network.
6. Copy the files from the backup system to the second server.
7. Extract the .tar file to overwrite the /etc directory on the second server.
With the exception of an erroneous network card setting that I deleted and the
slow connections when users try to connect to it, the server seems to be operating fine.
I think that I overwrote something in the /etc directory that I should not have, but
I really have no idea which directories within /etc were the wrong ones
to tar, or which are the only ones that should have been tarred.
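One way to narrow down which files under /etc were changed by the restore is to compare the restored tree against a pristine one from a fresh install. This is only a minimal sketch in Python, assuming both trees have been unpacked side by side; the directory names in the usage note are hypothetical.

```python
import filecmp
import os


def compare_trees(old_root, new_root):
    """Compare two directory trees recursively.

    Returns three sorted lists of paths relative to the roots:
    (changed, only_in_old, only_in_new).
    """
    changed, only_old, only_new = [], [], []

    def walk(cmp, rel):
        # shallow=False forces a content comparison of same-named files
        _, mismatch, errors = filecmp.cmpfiles(
            cmp.left, cmp.right, cmp.common_files, shallow=False
        )
        # Files that could not be compared are reported as changed
        changed.extend(os.path.join(rel, name) for name in mismatch + errors)
        only_old.extend(os.path.join(rel, name) for name in cmp.left_only)
        only_new.extend(os.path.join(rel, name) for name in cmp.right_only)
        for name, sub in cmp.subdirs.items():
            walk(sub, os.path.join(rel, name))

    walk(filecmp.dircmp(old_root, new_root), "")
    return sorted(changed), sorted(only_old), sorted(only_new)
```

For example, `compare_trees("/tmp/etc-old", "/tmp/etc-fresh")` would list the files that differ between the old server's /etc and a fresh install, which are the candidates to review one by one. Hardware-related files are the likeliest suspects, but that is a guess, not something the comparison can tell you.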
The first idea that comes to mind is that the configuration files in the /etc directory might be inappropriate for the new system. For example, your modprobe.conf file might be loading the wrong modules. You can check that by looking at the dmesg output or the /var/log/boot.msg file to see if there are any warnings or errors. The /var/log/messages file would be a good place to look for error messages as well.
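To make that log check less tedious, the files can be filtered for lines that look like trouble. The following is a rough Python sketch; the keyword list is an assumption and will need adjusting to whatever your drivers actually print.

```python
import re

# Keywords are an assumption; extend or trim them for your hardware and drivers.
PATTERN = re.compile(r"error|warn|fail|timeout", re.IGNORECASE)


def suspicious_lines(lines):
    """Return (line_number, text) pairs for lines that look like warnings or errors."""
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if PATTERN.search(line)
    ]
```

It accepts any iterable of lines, so an open file works: `with open("/var/log/messages") as log: hits = suspicious_lines(log)`. The dmesg output can be saved to a file first and scanned the same way.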
Another possibility is the RAID 5 implementation. Is the RAID 5 array implemented completely in software, partially in software and partially in hardware, or completely in hardware? A RAID 5 configuration requires a lot of processing. If it isn't implemented 100% in hardware then you have probably added a lot of CPU overhead by using RAID 5. Cards that do 100% of the work cost a lot of money but they may be worth it.
IMO if you want to implement a RAID 5 array completely in software then you would do well to make that machine a file server and run your application on another computer, accessing the files over the network. Here's an interesting idea to make that work really fast. You could have a couple of gigabit NICs connecting the application server to the file server using a couple of crossover cables, and bind the two NICs into one virtual NIC. This would directly connect the file server to the application server without using the rest of the network. 2 Gbps full duplex with no other network traffic should be pretty fast. I haven't done this yet; I'm just speculating.
The simpler solution would be to get a good RAID controller that implements RAID 5 100% in hardware. I saw some the other day that had 4 SATA ports for about $850(US).
Here is a good link for a recent article on RAID controllers. It focuses on performance tests for RAID 6 but it also tests the same cards for RAID 5 performance.
Thank you for your response.
Your first option is along the correct lines, so
any other ideas in this direction would be really helpful.
The SCSI RAID setup is 100% hardware, installed and configured
by the manufacturer, and the OS just sees it as one big drive.
The system is really just a sophisticated NAS device and all
of the processing is done on the individual desktops.
The users run the application locally but use documents stored
on the server's drives, and when they open or save the documents
the time lag is very significant.
Some other things that you could consider are file system tuning, compiling the kernel to better match your particular machine, and looking for particular files that cause an excessive i/o bottleneck.
Ideas for file system tuning:
What file system type are you using? Although I've only used ext2 I've read here on LQ that the Reiser and XFS file systems are good performers.
You could mount the file system with the noatime option. This would reduce the number of writes to the disk because file access times are not recorded. I heard about this here at LQ as well. You can see that we're all learning all the time. :)
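The noatime option normally goes in the options field (the fourth column) of the share's line in /etc/fstab. As a small illustration in Python, here is a helper that adds an option to one fstab line; the device name and mount point in the example are hypothetical.

```python
def add_mount_option(fstab_line, option="noatime"):
    """Return the fstab line with `option` added to its options field (4th column).

    Blank lines, comments, and incomplete entries are returned unchanged.
    """
    stripped = fstab_line.strip()
    if not stripped or stripped.startswith("#"):
        return fstab_line
    fields = stripped.split()
    if len(fields) < 4:
        return fstab_line  # not a full entry; do not guess
    options = fields[3].split(",")
    if option not in options:
        options.append(option)
    fields[3] = ",".join(options)
    return "\t".join(fields)
```

Given `"/dev/sda1 /data ext3 defaults 1 2"` it returns the same entry with `defaults,noatime` in the options column. The change takes effect the next time the file system is mounted; keep a copy of the original fstab before editing it.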
What is the block size of your file system? If your files are large then you can use a large block size for your partition. Small block sizes reduce waste of disk space but take a lot of time when modifying large files. Large block sizes on large files speed up file operations.
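The trade-off can be put into numbers. This Python sketch counts how many blocks a file of a given size occupies and how much space is wasted in its last block; it ignores file system metadata, so treat the figures as a simplification.

```python
def blocks_needed(file_size, block_size):
    """Number of whole blocks required to hold file_size bytes."""
    return (file_size + block_size - 1) // block_size


def slack_bytes(file_size, block_size):
    """Bytes left unused in the final block of the file."""
    return blocks_needed(file_size, block_size) * block_size - file_size
```

A 5,000,000 byte drawing needs 4883 blocks at a 1 KB block size but only 1221 blocks at 4 KB, so there are far fewer blocks to track for the same file. A 100 byte file, on the other hand, wastes 924 bytes at 1 KB and 3996 bytes at 4 KB.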
What is the chunk size of your RAID read/write operations? This can make a big difference in performance. Regrettably I think that you need to do your own testing to find the optimum size. I've seen some performance tests where 4K appears to be a good setting.
Is your RAID set to perform write through or write back operations? The write through is safer but takes longer. The write back is faster because it doesn't wait for the operation to complete before moving to the next disk operation. Write back can speed up disk operations. (The same is true for your CPU cache.)
Speaking of the CPU, you might want to consider looking at the motherboard tuning. I'm NOT saying to overclock your CPU or memory but you may be able to enable things like PCI bus mastering.
Back to the file system. If you have one file in the file system that is used a LOT while others are used less, then maybe you could move that file to another physical disk. Spreading your i/o load over more spindles can make a big difference in disk response. Notice that I'm not saying to move files between partitions on the same disk. I'm talking about a new physical disk (set).
There is a commercial product called SARCheck. I haven't used it but it claims to be able to watch your computer's work load and make recommendations about tuning the settings in the /proc area. It might be worth a look. I used to use something like this on VMS and it was my best friend regarding performance tuning. I've also heard good things about BMC Patrol but I haven't used that either.
Profile your workload. Find what resources are being used, at what rate, and at what time of day. Naturally you are mostly interested in the work load during the time that users are doing work with the machine. You could do this with a cron job that periodically runs vmstat or iostat or sar for a limited time, putting the data into a file that you can read at your leisure. Vmstat, iostat, and sar will show you things like the number of blocked processes, the page in/page out rate, and CPU wait times. You can get a really good view of the disk load by using iostat with the -x and -p options. That will show you the i/o on each disk partition.
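Once vmstat output has been collected in a file, picking out the busy samples by eye gets old quickly. Here is a minimal Python sketch that parses the usual column layout and flags samples with blocked processes or high I/O wait. The 20% threshold is an arbitrary assumption, and the "wa" column is not printed by every vmstat version, so it is treated as optional.

```python
def parse_vmstat(text):
    """Parse vmstat output into a list of dicts keyed by column name."""
    samples = []
    header = None
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "r":
            # Column-name row: "r b swpd free ..."
            header = fields
        elif (
            header
            and len(fields) == len(header)
            and all(field.isdigit() for field in fields)
        ):
            samples.append(dict(zip(header, map(int, fields))))
    return samples


def io_bound_samples(samples, wait_threshold=20):
    """Return samples with blocked processes or I/O wait at or above the threshold."""
    return [
        sample
        for sample in samples
        if sample.get("b", 0) > 0 or sample.get("wa", 0) >= wait_threshold
    ]
```

A cron entry could append the output of something like `vmstat 5 12` to a file during working hours, and `io_bound_samples(parse_vmstat(open(path).read()))` would then show when the disks were the bottleneck, where `path` is wherever you chose to store the data.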
You can and should keep working your way outward toward the client computers. Look at your NIC(s). Are you using 10/100 Mbps or gigabit speed? Are you certain that they're running at full duplex? If you are using gigabit, are you sure that your hub is gigabit speed on all ports? Many hubs advertised as gigabit speed actually only have gigabit speed on the port that connects them to the LAN, while the ports that connect the computer NICs are 10/100 speed.
You can look at network topology. Put a sniffer on different parts of the LAN and measure network traffic. Look for saturation and resulting bottlenecks. Also look for packet patterns that indicate faulty NIC hardware or faulty configuration. I recently found a client that was experiencing a lot of duplicate ACKs, packet retransmissions, and out-of-order packets. I replaced the NIC and the performance on that machine increased dramatically. Apparently the NIC was bad.
Finally, look at the client computers. Maybe they are overloaded with this application. Of course you can and should do several of these things concurrently, and all of these things should be part of your normal system administration routine.
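The speed and duplex questions above can be checked from the text that `ethtool <interface>` prints on the server. This is only a sketch in Python that reads the "Speed:" and "Duplex:" lines from that output; the expected speed of 1000 Mb/s is an assumption you should set to what your network is supposed to deliver.

```python
import re


def link_settings(ethtool_output):
    """Extract (speed in Mb/s, duplex) from ethtool output; None where not reported."""
    speed = re.search(r"Speed:\s*(\d+)\s*Mb/s", ethtool_output)
    duplex = re.search(r"Duplex:\s*(\w+)", ethtool_output)
    return (
        int(speed.group(1)) if speed else None,
        duplex.group(1).lower() if duplex else None,
    )


def link_problems(ethtool_output, expected_speed=1000):
    """Return a list of human-readable problems with the negotiated link."""
    speed, duplex = link_settings(ethtool_output)
    problems = []
    if speed is None:
        problems.append("speed not reported (link down?)")
    elif speed < expected_speed:
        problems.append(f"link at {speed} Mb/s, expected {expected_speed} Mb/s")
    if duplex != "full":
        problems.append(f"duplex is {duplex}, expected full")
    return problems
```

An empty list means the link negotiated as expected. A half-duplex or lower-than-expected speed result points at the cable, the hub port, or a bad autonegotiation rather than at Samba or the disks.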
Post what you end up doing and the results. I'm very interested in following your progress.
Using KDE System Guard I have kept tabs on the running processes
on the system. The CPU usage is almost always less than 5% and the
memory usage is at about 850 MB of the total 1 GB, with no swap usage.
The system is used for only one purpose, and that is as a file server
for CAD drawings. The drafters run the application at their desktops
and access files that are stored on the server share, so the server doesn't
really process much. The users are experiencing much longer times
when opening or saving files than before.
I am thinking that reinstalling the OS will be the solution.
If I do this I really don't want to re-enter all of the user information
again (that is a lot of typing and time). I saw in another post a list
of files that would be the minimum I would need to copy to transfer the user info.
I think that this is all of the information I would need to copy from the
previous system to get all of the user ID and password information.
The server is set up so that the OS is installed on IDE drive A and the network-accessed Samba share
is on the RAID array. Is it possible to just reformat the IDE drive,
reinstall the OS, and then insert the already partitioned RAID drive into the file system,
maybe by editing the fstab file?
Don't forget to consider possible problems from outside your box. Networks can be a real bottleneck and performance bugaboo.
I got the problem fixed. It was in the config files in the /etc directory.
I backed up
and then reformatted the drive that had the OS installed on it and reinstalled the OS.
I am using Red Hat ES 3.0, and during the install process it uses Disk Druid to
set up the drives. With it I was able to insert the RAID array into the file system
without reformatting the drive or needing to edit the fstab file after installation.
The server is now operating much quicker when accessing the Samba share.