LinuxQuestions.org


roderich 12-30-2013 08:52 AM

slow disk write in Slackware 14.1
 
I have successfully upgraded my Slackware 14.0 system to 14.1, but now I am facing a severe degradation in disk write speed: file copies run at only about 5 MB/s.
Reads, OTOH, look OK: when I copy to /dev/null or to a network drive, I see "normal" speeds.
It looks like something fundamental has changed in the 14.1 system, but I have no real idea where to look; the cause could be at any level: HW driver, LVM, or filesystem. My main suspect is the ext4 filesystem. Does anybody know of recent changes there?

Some data about my disk situation:

From lspci:

00:1f.2 SATA controller: Intel Corporation 7 Series/C210 Series Chipset Family 6-port SATA Controller [AHCI mode] (rev 04)

hdparm -i /dev/sda

Model=ST2000DM001-9YN164, FwRev=CC4G, SerialNo=Z1E1BP3D
Config={ HardSect NotMFM HdSw>15uSec Fixed DTR>10Mbs RotSpdTol>.5% }
RawCHS=16383/16/63, TrkSize=0, SectSize=0, ECCbytes=4
BuffType=unknown, BuffSize=unknown, MaxMultSect=16, MultSect=16
CurCHS=16383/16/63, CurSects=16514064, LBA=yes, LBAsects=3907029168
IORDY=on/off, tPIO={min:120,w/IORDY:120}, tDMA={min:120,rec:120}
PIO modes: pio0 pio1 pio2 pio3 pio4
DMA modes: mdma0 mdma1 mdma2
UDMA modes: udma0 udma1 udma2 udma3 udma4 udma5 *udma6
AdvancedPM=yes: unknown setting WriteCache=enabled
Drive conforms to: unknown: ATA/ATAPI-4,5,6,7

hdparm -tT /dev/sda

Timing cached reads: 24918 MB in 1.99 seconds = 12491.03 MB/sec
Timing buffered disk reads: 614 MB in 3.01 seconds = 204.27 MB/sec

fdisk -l /dev/sda

Disk /dev/sda: 2000.4 GB, 2000398934016 bytes
255 heads, 63 sectors/track, 243201 cylinders, total 3907029168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disk identifier: 0x0001f80b

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1            2048   104859647    52428800   83  Linux
/dev/sda2       104859648   109053951     2097152   82  Linux swap
/dev/sda3       109053952  3907029167  1898987608   8e  Linux LVM

And all filesystems are ext4.

micnet 12-30-2013 09:12 AM

Look into the Advanced Format feature of the disk drive.

When the logical sector size is smaller than the physical sector size, write speed is apparently a casualty.

Partition boundaries should be divisible by 8, which yours are not, to prevent problems.

There is lots of info about this on the Western Digital and Seagate sites. You clearly have a 4K physical sector and a 512-byte logical sector.

Get those partition boundaries aligned for starters.

roderich 12-30-2013 09:40 AM

Quote:

Originally Posted by micnet (Post 5089034)
Look into the Advanced Format feature of the disk drive.

When the logical sector size is smaller than the physical sector size, write speed is apparently a casualty.

Partition boundaries should be divisible by 8, which yours are not, to prevent problems.

There is lots of info about this on the Western Digital and Seagate sites. You clearly have a 4K physical sector and a 512-byte logical sector.

Get those partition boundaries aligned for starters.

Yeah, this is obviously one of these disks.

*But*:

I see that the partition boundaries *are* divisible by 8.
I think this was done automagically by the Linux partition tools when I began to use this disk half a year ago.

Besides, misalignment would have been a problem right from the beginning, not one that started with the 14.1 upgrade.
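
For reference, the divisibility check can be done in the shell instead of a calculator. This is only a minimal sketch, assuming 512-byte logical and 4096-byte physical sectors as reported by fdisk above; the function name is made up for illustration.

```shell
# Hypothetical helper: succeeds if a start sector (counted in 512-byte units)
# falls on a 4096-byte physical sector boundary, i.e. is a multiple of 8.
is_4k_aligned() {
    [ $(( $1 % 8 )) -eq 0 ]
}

# Start sectors of sda1, sda2 and sda3 from the fdisk listing above.
for start in 2048 104859648 109053952; do
    if is_4k_aligned "$start"; then
        echo "$start aligned"
    else
        echo "$start NOT aligned"
    fi
done
```

All three start sectors come out as aligned.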

micnet 12-30-2013 10:46 AM

That's the last time I'll use the Microsoft calculator on my office PC to divide by 8.

I stayed with 14.1 and changed my disk drive to a 512/512 and things got better for me regarding speed. Better, but not good enough in other areas.

I am thinking of downgrading to 14.0. Too many things seem to be broken in 14.1 for my taste. For instance, I use the client/server features in X Windows and can't seem to make them work in 14.1.

Good luck, and I will watch for other contributors' ideas about your problem.

metaschima 12-30-2013 12:10 PM

This is a difficult question and we don't have much info to go on. Try creating a large file using dd and see the throughput.
Code:

dd if=/dev/zero of=file bs=4M count=1000 conv=fdatasync

roderich 12-31-2013 08:46 AM

Well, dd gives similar results.

On my main disk: read 200 MB/s, write 2-5 MB/s.

Meanwhile I have set up an (LVM) test partition on this disk, and I already see only 5-7 MB/s when writing to the raw block device.

I have also run the same experiments with some external USB disks and get: read 30 MB/s, write 2-5 MB/s.

So it is beginning to look to me as if something general is going wrong at some lower disk I/O level. And I think this (bug?) is triggered by something special in my system configuration, as nobody else seems to have the problem (various searches on the net turned up nothing similar).

metaschima 12-31-2013 11:32 AM

You could try changing the I/O scheduler and see if it helps:
http://www.cyberciti.biz/faq/linux-c...-for-harddisk/
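
A minimal sketch of what that looks like on the command line, assuming the disk is /dev/sda and that the kernel exposes the usual sysfs file /sys/block/sda/queue/scheduler. The helper name is made up, and the available scheduler names differ between kernels, so look at the file before writing to it.

```shell
# Hypothetical helper: print the active scheduler, which the kernel marks
# with square brackets in a line such as "noop deadline [cfq]".
active_scheduler() {
    printf '%s\n' "$1" | sed -n 's/.*\[\(.*\)\].*/\1/p'
}

sched_file=/sys/block/sda/queue/scheduler   # assumption: the disk is sda
if [ -r "$sched_file" ]; then
    echo "available: $(cat "$sched_file")"
    echo "active:    $(active_scheduler "$(cat "$sched_file")")"
fi

# To switch (as root), write one of the listed names into the file, e.g.:
#   echo deadline > /sys/block/sda/queue/scheduler
```

The change only lasts until the next reboot, so it is safe to experiment with.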

roderich 12-31-2013 01:24 PM

Thanks, this looked interesting, because it could have been one of those cases where a default was silently changed in a newer kernel version. But changing the scheduler did not have any effect. :(

I noticed another thing that looks different from how it used to be, regarding kernel memory usage:

Code:

            total      used      free    shared    buffers    cached
Mem:      16585240  10533272    6051968          0    368408    4772448

Doesn't the Linux kernel tend to use all available main memory for buffers and caches?
I have the feeling that it is currently not doing this on my system.
And too little buffer memory could also cause slowdowns.
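
To put a number on that, here is a rough calculation from the free output above. It assumes the values are in KiB, which is what free prints by default; the variable names are only for illustration.

```shell
# Values copied from the free output above (assumed to be KiB).
total=16585240
buffers=368408
cached=4772448

# Share of RAM currently used for buffers plus page cache,
# rounded to the nearest whole percent.
cache_kib=$(( buffers + cached ))
cache_pct=$(( (cache_kib * 100 + total / 2) / total ))
echo "buffers+cached: ${cache_kib} KiB (about ${cache_pct}% of RAM)"
```

So only about a third of the RAM is used for buffers and cache, while almost 6 GB is reported as free.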

metaschima 12-31-2013 02:08 PM

From the 'hdparm' command this HDD seems to be a very new, high performance one, am I right ? I just bought a new one (Seagate Barracuda 1TB) with slightly less performance than yours according to hdparm.

Code:

bash-4.2# hdparm -tT /dev/sda                                   

/dev/sda:
 Timing cached reads:  30494 MB in  1.99 seconds = 15358.98 MB/sec
 Timing buffered disk reads: 484 MB in  3.01 seconds = 161.05 MB/sec

bash-4.2$ dd if=/dev/zero of=file bs=4M count=1000 conv=fdatasync
1000+0 records in
1000+0 records out
4194304000 bytes (4.2 GB) copied, 26.9838 s, 155 MB/s

Maybe try running a SMART long test using smartctl and see if the HDD passes. I'm not sure how to explain 5 MB/s write speed on a new HDD.

roderich 01-01-2014 07:32 AM

Quote:

Originally Posted by metaschima (Post 5089773)
From the 'hdparm' command this HDD seems to be a very new, high performance one, am I right ? I just bought a new one (Seagate Barracuda 1TB) with slightly less performance than yours according to hdparm.

Yes, it is a similar disk, a 2TB Seagate Barracuda named ST2000DM001-9YN1

Quote:

Originally Posted by metaschima (Post 5089773)
Maybe try running a SMART long test using smartctl and see if the HDD passes. I'm not sure how to explain 5 MB/s write speed on a new HDD.

I still think this is not the real disk speed; see my next message for the current state of my research.

roderich 01-01-2014 07:51 AM

The whole affair is still very strange.

First, I seem to have fallen once again into the classic trap of spurious correlation when I thought that the Slackware 14.1 upgrade was the cause of my problem.

Following this track, I have now reinstalled the kernel files of the previous Slackware version, but did not really get the expected improvement. Only the necessary reboots changed something: I now get similar results with both kernel versions. The write speed is still not fast, but it is tolerable and no longer inconvenient.

The actual figures are now:
12 MB/s with kernel 3.2.29
14 MB/s with kernel 3.10.17

And I see the same speed with my internal SATA disk and with several external USB disks!
Therefore I still believe that it is not a problem with the disks as such, but something in the Linux low-level disk I/O, buffer handling, bus access or whatever.

For the moment I can live with the situation. I have all my usual programs running and so far the speed has not gone down again. Let's see how it develops in the coming days. ;)

metaschima 01-01-2014 12:14 PM

I remember someone complaining of unexplained low HDD throughput and it turned out to be vibration or movement as the main cause. If this is a laptop try putting it on a solid surface.

I still recommend running the SMART test, as a HDD will slow down before it breaks.

roderich 01-02-2014 09:22 AM

Quote:

Originally Posted by metaschima (Post 5090224)
I remember someone complaining of unexplained low HDD throughput and it turned out to be vibration or movement as the main cause. If this is a laptop try putting it on a solid surface.

Mine is a classical desktop tower standing firmly on the floor. ;)

Quote:

Originally Posted by metaschima (Post 5090224)
I still recommend running the SMART test, as a HDD will slow down before it breaks.

Hmm, that should do no damage. Could you elaborate a bit? How would I go about running such a test? With the Linux smartmontools? Is this a non-destructive test that can be done on the running system?

metaschima 01-02-2014 11:06 AM

Yeah, run:

Code:

smartctl -t long /dev/sda

Wait for it to finish and then run:

Code:

smartctl -a /dev/sda

to check the results. The test can be done on a running system without problems. However, try keeping disk usage to a minimum if you want the test to be done on time, otherwise it will take longer. It takes longer on a larger disk. For example, it takes 2 hours on a 1TB disk.
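
If you only want the self-test verdict rather than the full smartctl -a dump, something like the following could work. It is a sketch only: the helper name is made up, and it assumes the self-test log format that smartmontools prints for ATA disks, where each entry line starts with "#" and a passed test reads "Completed without error".

```shell
# Hypothetical helper: read a smartctl self-test log on stdin and succeed
# only if there is at least one entry and every entry completed cleanly.
# A test that is still running or was aborted counts as "not OK" here.
selftests_ok() {
    awk '
        /^#/ { n++; if ($0 !~ /Completed without error/) bad++ }
        END  { if (n > 0 && bad == 0) exit 0; exit 1 }
    '
}

# Usage (as root), assuming the disk is /dev/sda:
#   smartctl -l selftest /dev/sda | selftests_ok && echo "self-tests passed"
```

This only looks at the self-test log; the attribute table in the smartctl -a output is still worth reading by eye.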

roderich 01-03-2014 04:52 AM

Quote:

Originally Posted by metaschima (Post 5090718)
Yeah, run:

The test can be done on a running system without problems. However, try keeping disk usage to a minimum if you want the test to be done on time, otherwise it will take longer. It takes longer on a larger disk. For example it takes 2 hours on a 1TB disk.

Thanks for the smart hints ;) On my disk it took nearly 4 hours. I let it run overnight and here are the results:

Code:

smartctl -l selftest /dev/sda

=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error      00%      5932        -
# 2  Short offline      Completed without error      00%        3        -

I think this rules out hardware problems with the disk as such.

So again I suspect some strange Linux kernel error triggered by my configuration.
The funny thing is that the PC really has abundant resources for everything (it was bought like that on purpose ;) and should be blazingly fast:
Intel Quad-Core i7-3770 CPU (8 virtual CPUs)
16GB RAM
2TB SATA-III disk
As far as I can see in the startup logs, all the disk goodies are enabled (UDMA, AHCI, NCQ).

My current suspicions point in the direction of multiprocessing or memory management.
The kernel runs in 32-bit PAE mode, because for various reasons I did not (and still do not) want a full 64-bit system.
When I find a quiet hour, I will run some tests with other Linux live CDs and other kernel options (non-SMP, non-PAE, 64-bit) to see if these make any difference.


All times are GMT -5.