LinuxQuestions.org
Slackware This Forum is for the discussion of Slackware Linux.

Old 10-12-2019, 07:37 AM   #16
Twigster
Member
 
Registered: Oct 2019
Location: France
Distribution: Slackware64 14.2
Posts: 58

Original Poster
Rep: Reputation: Disabled

Yeah, the laptop is definitely dodgy; I plugged the drive into another computer and could boot from it and use it fine. Unfortunately that other computer is AMD-based, so the intel_idle cstate setting should be meaningless there.

I managed to freeze the laptop without its drive from the liveUSB, which makes things even weirder. I'm testing RAM again now. I'm still really puzzled by the processor.max_cstate having no effect.
 
Old 10-12-2019, 09:36 AM   #17
abga
Senior Member
 
Registered: Jul 2017
Location: EU
Distribution: Slackware
Posts: 1,634

Rep: Reputation: 929
@Twigster

In post #11 I advised you to disable "Device-Initiated Power Management" for your HardDrive in the BIOS settings, have you tried that?

The problem you report, lilo not passing the processor.max_cstate=1 parameter to the kernel, is indeed puzzling. I just tried it now and it works on a standard 14.2 Slackware installation (no multilib).
According to this post:
https://access.redhat.com/articles/65410
I also added idle=poll
My lilo.conf append line (note that I removed the space before vt.default_utf8=0):
Code:
append="vt.default_utf8=0 processor.max_cstate=1 idle=poll"
dmesg result:
Code:
dmesg | grep cstate
[    0.046683] Kernel command line: BOOT_IMAGE=LinuxNew ro root=801 vt.default_utf8=0 processor.max_cstate=1 idle=poll
processor.max_cstate is listed as supported in the kernel documentation:
https://www.kernel.org/doc/Documenta...parameters.txt
and it doesn't look to be dependent on the CPU pm driver (intel-pstate or acpi-cpufreq)
I just checked: on the older Atom CPU where I did the tests I'm using acpi-cpufreq, since intel_pstate works only on newer processors:
Code:
cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_driver
acpi-cpufreq
If on your system intel-pstate is loaded by the kernel, you can disable it by adding the intel_pstate=disable kernel boot parameter, acpi-cpufreq will be used instead.
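A minimal sketch of that check extended to every CPU, assuming a standard sysfs layout. The helper name `scaling_driver` is made up for illustration; it prints "none" when a CPU has no cpufreq directory at all, which means no scaling driver is bound to it.

```shell
#!/bin/sh
# scaling_driver CPU_DIR  (hypothetical helper, not a system tool)
# Prints the cpufreq scaling driver for one CPU directory, or "none"
# when the cpufreq subdirectory is missing (no driver bound).
scaling_driver() {
    f="$1/cpufreq/scaling_driver"
    if [ -r "$f" ]; then
        cat "$f"
    else
        echo "none"
    fi
}

# Walk every cpuN directory that exists on this machine.
for cpu in /sys/devices/system/cpu/cpu[0-9]*; do
    [ -d "$cpu" ] || continue
    echo "${cpu##*/}: $(scaling_driver "$cpu")"
done
```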

I believe your CPU has the C-states implemented, that's according to this old Intel doc:
https://software.intel.com/en-us/blo...-more-c-states

Finally, you should try using an older second-hand HDD, a spinning one. You could use one of the older ones that were destroyed by Windows 7 (notorious for that). Usually a zone at the beginning of the drive, the first 20-30GB, is filled with bad sectors due to the swap file. If you skip that zone and define the partitions after the first, say, 50-100GB, then you can use the drive for a long, long time. I have a few such drives that are working perfectly.
 
1 members found this post helpful.
Old 10-13-2019, 04:21 PM   #18
Twigster
Member
 
Registered: Oct 2019
Location: France
Distribution: Slackware64 14.2
Posts: 58

Original Poster
Rep: Reputation: Disabled
Hello,
Quote:
In post #11 I advised you to disable "Device-Initiated Power Management" for your HardDrive in the BIOS settings, have you tried that?
yep, DIPM is disabled, and so is SpeedStep.


I've done the steps you described :
Code:
~# cat /etc/lilo.conf | grep append
append="vt.default_utf8=0 processor.max_cstate=1 idle=poll"
Code:
~# dmesg | grep cstate
[    0.000000] Command line: BOOT_IMAGE=Linux ro root=802 vt.default_utf8=0 processor.max_cstate=1
[    0.000000] Kernel command line: BOOT_IMAGE=Linux ro root=802 vt.default_utf8=0 processor.max_cstate=1
Somehow it does not pick up the idle=poll parameter; that is probably due to:
Code:
~# dmesg | grep intel_idle 
[    0.553238] intel_idle: does not run on family 6 model 15
So that could explain :
Code:
~# cat /sys/module/intel_idle/parameters/max_cstate 
9
if intel_idle does nothing.
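One possible explanation for the missing idle=poll, offered as an assumption since nothing in the thread confirms it: LILO only picks up changes to /etc/lilo.conf after /sbin/lilo has been re-run, so a parameter added without that step never reaches the kernel. The sketch below compares what was booted against what was intended; `has_kparam` is an illustrative helper name, and the command line string is copied from the dmesg output above.

```shell
#!/bin/sh
# has_kparam PARAM [CMDLINE]  (hypothetical helper)
# Succeeds if PARAM appears as a whole word in CMDLINE.
# CMDLINE defaults to the running kernel's /proc/cmdline.
has_kparam() {
    param=$1
    cmdline=${2-$(cat /proc/cmdline 2>/dev/null)}
    case " $cmdline " in
        *" $param "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Command line as reported by dmesg in this post.
booted="BOOT_IMAGE=Linux ro root=802 vt.default_utf8=0 processor.max_cstate=1"
for p in processor.max_cstate=1 idle=poll; do
    if has_kparam "$p" "$booted"; then
        echo "$p: present"
    else
        echo "$p: missing"
    fi
done
```

Called with a single argument, e.g. `has_kparam idle=poll`, it checks the running kernel instead of the sample string.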


Also, my computer does not have any cpufreq directory so I can't tell which pm driver is used:
Code:
:/sys/devices/system/cpu/cpu0# ls
cache/          microcode/  thermal_throttle/
driver@         power/      topology/
firmware_node@  subsystem@  uevent
Here is more info about loaded modules. According to this, I do not have anything regulating CPU frequency and it should stay at the maximum of 1.6 GHz all the time.
/proc/cpuinfo shows nothing on the "power management" line; do you get an entry with your Atom CPU?


Quote:
Finally, you should try using an older second-hand HDD, a spinning one. You could use one of the older ones that were destroyed by Windows 7 (notorious for that). Usually a zone at the beginning of the drive, the first 20-30GB, is filled with bad sectors due to the swap file. If you skip that zone and define the partitions after the first, say, 50-100GB, then you can use the drive for a long, long time. I have a few such drives that are working perfectly.
Thing is, the drive seems to be ruled out as the cause, since a) it works fine as a boot drive on another PC and b) the PC can freeze without any drive inserted, running only from the liveUSB.

Last edited by Twigster; 10-13-2019 at 04:43 PM.
 
Old 10-13-2019, 06:36 PM   #19
abga
Senior Member
 
Registered: Jul 2017
Location: EU
Distribution: Slackware
Posts: 1,634

Rep: Reputation: 929
It's been a while since I played with Core 2 CPUs (I retired them all and only run newer Core "i" ones now), but AFAIK a performance scaling driver was running on these Core 2 systems: acpi-cpufreq should load by default on older Intel CPUs. Note that these drivers are not built as external modules, at least not in the huge kernel I'm usually running, so you won't see them with lsmod.

The only very old system I own and still running (doesn't want to die) is an Atom N270, pretty much the same age as your laptop, and on this system I'm using the SpeedStep technology (enabled in BIOS) and have acpi-cpufreq active and handling the CPU clock.
You did disable SpeedStep in your BIOS and maybe that's why you don't have a performance scaling driver activated. I didn't recommend disabling SpeedStep; I was only focusing on the C-states, speculating that the CPU entering a deep C-state might make the system unstable.
I only advised to disable the Device-Initiated Power Management, which you did and it didn't help.

On idle=poll, I believe it's related to cpuidle and not intel_idle. In one of your older dmesg logs you have:
Code:
[    0.129008] cpuidle: using governor ladder
[    0.133007] cpuidle: using governor menu
If you want to dig deeper into these CPU related drivers:
https://www.kernel.org/doc/html/v5.0...m/cpuidle.html
https://www.kernel.org/doc/html/v5.0...m/cpufreq.html
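As a rough way to see what cpuidle is actually doing, the sketch below reads the driver, governor and per-CPU idle states from sysfs. It assumes the usual cpuidle sysfs layout; `list_cstates` is an illustrative name. With processor.max_cstate=1 in effect one would expect only the shallow states to be listed, though that expectation is not verified here.

```shell
#!/bin/sh
# list_cstates CPU_DIR  (hypothetical helper)
# Prints one line per idle state the kernel exposes for that CPU,
# in the form "stateN NAME". Prints nothing if cpuidle is not active.
list_cstates() {
    for s in "$1"/cpuidle/state[0-9]*; do
        [ -r "$s/name" ] || continue
        echo "${s##*/} $(cat "$s/name")"
    done
}

# Driver and governor currently selected by the cpuidle core, if exposed.
for f in current_driver current_governor_ro; do
    p=/sys/devices/system/cpu/cpuidle/$f
    if [ -r "$p" ]; then
        echo "$f: $(cat "$p")"
    fi
done

list_cstates /sys/devices/system/cpu/cpu0
```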

I just now read your post #16 and realized that your system MB&CPU&RAM is behaving weird, before #16 I was still considering the HDD drive (SSD) ATA-AHCI standard to be a possible cause for the instability.

One thing I'd try is to remove the CD-ROM (I believe it's PATA on your old system), that unit is affecting the SATA controller behavior and could also be the cause of instability. Just a try ... before you dump that system.

P.S. Actually you could temporarily remove everything that's modular and easy to dismount, the WiFi module comes to mind. Try to narrow down the issue, replace the RAM modules too. Run it on batteries only or on DC Adapter only. etc.
(On those older systems you could even replace the CPU - it wasn't soldered on the MB, but had a thin plastic CPU socket)

Last edited by abga; 10-13-2019 at 06:49 PM. Reason: P.S.
 
Old 10-14-2019, 03:49 AM   #20
Twigster
Member
 
Registered: Oct 2019
Location: France
Distribution: Slackware64 14.2
Posts: 58

Original Poster
Rep: Reputation: Disabled
Hello, and thanks again for your time
You were right about acpi-cpufreq, that is what my computer uses in
Code:
/sys/devices/system/cpu/cpu0/cpufreq/scaling_driver
So if I understood what you said,
Code:
processor.max_cstate
should have an effect no matter if the power driver is intel_idle or acpi-cpufreq. Now my issue is that I do not know how to confirm which cstate parameter is taken into account by acpi-cpufreq.

If I look here : https://github.com/torvalds/linux/bl...acpi-cpufreq.c
https://forums.opensuse.org/showthre...requency/page3

it seems I have to pass acpi_pstate_strict=1 as well :

Code:
root@lagann:~# dmesg | grep acpi_p
[    0.000000] Command line: BOOT_IMAGE=Linux ro root=802 vt.default_utf8=0 processor.max_cstate=1 idle=poll acpi_pstate_strict=1
[    0.000000] Kernel command line: BOOT_IMAGE=Linux ro root=802 vt.default_utf8=0 processor.max_cstate=1 idle=poll acpi_pstate_strict=1
~# cat /sys/module/acpi_cpufreq/parameters/acpi_pstate_strict 
0
Weird, right? I would assume the kernel would tell me that it failed to set the parameter or something.
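A possible explanation, not confirmed anywhere in this thread: when a driver is compiled into the kernel image, its parameters are given on the kernel command line with the module name as a prefix (module.param=value), and a bare name the kernel does not recognise is not reported as an error. Assuming acpi-cpufreq is built into the huge kernel, the spelling to try would be the one printed by the sketch below. Both helper names are illustrative.

```shell
#!/bin/sh
# module_param MODULE PARAM [SYSFS_ROOT]  (hypothetical helper)
# Prints the current value of a module parameter as exposed in sysfs.
# SYSFS_ROOT defaults to /sys and is only overridable for testing.
module_param() {
    f="${3:-/sys}/module/$1/parameters/$2"
    if [ -r "$f" ]; then
        cat "$f"
    else
        echo "unavailable"
        return 1
    fi
}

# builtin_form MODULE PARAM VALUE  (hypothetical helper)
# Prints the command-line spelling for a parameter of a driver
# that is compiled into the kernel: module.param=value
builtin_form() {
    echo "$1.$2=$3"
}

builtin_form acpi_cpufreq acpi_pstate_strict 1
module_param acpi_cpufreq acpi_pstate_strict || true
```

After rebooting with the prefixed form in the append line, the second call should show whether the value was applied.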

Last edited by Twigster; 10-14-2019 at 04:09 AM.
 
Old 10-14-2019, 06:30 AM   #21
abga
Senior Member
 
Registered: Jul 2017
Location: EU
Distribution: Slackware
Posts: 1,634

Rep: Reputation: 929
Usually the kernel will inform you that a chosen parameter could not be considered, the code should be mature enough to handle exceptions properly.
I wish I could help you more with the CPU PM states, but I'm also using the internet to learn about particularities. The last time I did some more investigation was in this thread (you might find the discussion and some of the links I posted useful):
https://www.linuxquestions.org/quest...or-4175637326/

You said that your system was working well under Windows XP, and that made me believe that there could be some PM-related issues under Linux (kernel + drivers) that are making your system unstable. First I considered the CPU C-states, just because I found references on the internet about them causing instabilities.
In parallel to the C-states I was also thinking of advising you to disable all the PM-related features in the kernel (ACPI), but I considered the kernel code sound enough and advised you to play with the BIOS first (disable all PM there).

I was looking for Linux experiences on the D620 and found a few interesting links on the internet, out of which only one reports instability issues like yours:
http://seclab.cs.stonybrook.edu/seka...untu-d620.html
" Resumption fails once in a while -- may be once in 20 times. It is rare enough that it does not seem to bother me. In fact, it seems no more frequent than random lockups I experience once in a while that require a reboot. (May be once in 10 days.) I wonder if these lockups have any thing to do with bugs reported in Core 2 duo processors --- I bring this up because the lockups leave absolutely no error messages or indication of any thing at all going on at the time of lockup. "
Given the above observation I'd suggest starting your system with ACPI disabled (APIC and LAPIC too). Your new lilo.conf append line should look like:
Code:
append="vt.default_utf8=0 acpi=off noapic nolapic"
You can find references about these parameters in the kernel doc:
https://www.kernel.org/doc/Documenta...parameters.txt
And some extra info in the following links:
https://access.redhat.com/solutions/58790
https://askubuntu.com/questions/5209...apic-etc/52100

Some other old links I found detailing Linux experiences on D620 (none of them mentioning stability issues):
https://www.fzu.cz/~kolorenc/d620/
https://wiki.archlinux.org/index.php/Dell_Latitude_D620
https://wiki.ubuntu.com/LaptopTestin...llLatitudeD620

I mentioned in an older post that I used such a D620 system at a company I worked for. It was loaded with Windows XP, but I remember some of my colleagues chose to have Linux on theirs and I don't recall any stability issues. I stayed on Windows XP because some of the tools we used for testing and configuring the products we developed and deployed were coded for, and worked better under, Windows...

Last edited by abga; 10-14-2019 at 06:31 AM. Reason: you=your
 
2 members found this post helpful.
Old 10-14-2019, 08:13 AM   #22
Twigster
Member
 
Registered: Oct 2019
Location: France
Distribution: Slackware64 14.2
Posts: 58

Original Poster
Rep: Reputation: Disabled
I'm still reading your links (you are a true detective; I tried searching before posting here and never found those). I tried the append line you mentioned and now I no longer have the power manager applet in Xfce (I guess that's normal if we do not load any power manager program), but my screen is stuck at 1024x768.

Xorg.0.log, I do not have any xorg.conf file
Xorg.0.log.old, as a comparison

We can see that the module "i915.ko" does not load now. How do I get more information about what is happening?

I do not see what link between power management and the display driver would cause such a problem.
It's also worth noting that I do not have any problems with hibernation.

EDIT: this is what happens when i try to load the module manually :
Code:
:/lib/modules/4.4.190/kernel/drivers/gpu/drm/i915# insmod i915.ko 
insmod: ERROR: could not insert module i915.ko: Unknown symbol in module

Last edited by Twigster; 10-14-2019 at 09:43 AM.
 
Old 10-14-2019, 08:59 AM   #23
abga
Senior Member
 
Registered: Jul 2017
Location: EU
Distribution: Slackware
Posts: 1,634

Rep: Reputation: 929
In the Xorg.log you posted, the laptop display is enumerated with the native resolution of 1024x768, have no clue why it doesn't see it as 1280 x 800.
Disabling acpi for good is a radical solution that can have some other unwanted consequences. I advised you to do it just because I wanted to check if it's the acpi that's causing the stability issues. For the moment live with the 1024x768 and just observe your system, if it's stable now we could try some "softer" fixes. One would be to play with the acpi_osi kernel boot parameter (we could try acpi_osi=Linux ), some detailed info:
- look for the acpi_osi= section in:
https://www.kernel.org/doc/Documenta...parameters.txt
- and here some additional explanations:
https://unix.stackexchange.com/quest...ight-vendor-do

Here in this ArchLinux article you can learn about the kernel acpi modules and the devices that are supported and affected if acpi is disabled.
https://wiki.archlinux.org/index.php...are_available?

And some HW could not be properly identified/initialized - drivers/modules might need some manual tuning/loading:
https://en.wikipedia.org/wiki/Acpi#Architecture
"As ACPI also replaces PnP BIOS, it also provides a hardware enumerator, mostly implemented in the Differentiated System Description Table (DSDT) ACPI table. "
 
2 members found this post helpful.
Old 10-14-2019, 09:49 AM   #24
Twigster
Member
 
Registered: Oct 2019
Location: France
Distribution: Slackware64 14.2
Posts: 58

Original Poster
Rep: Reputation: Disabled
OK, I will use the system some more and see if hangs still happen. Thing is, I cannot reproduce the hangs easily. Now that we suspect either i915.ko or something ACPI-related, are there specific actions I can take to stress those parts of my system and make it more likely to hang?
 
Old 10-14-2019, 10:10 AM   #25
abga
Senior Member
 
Registered: Jul 2017
Location: EU
Distribution: Slackware
Posts: 1,634

Rep: Reputation: 929
I also noticed that there was an error loading the i915 module in your Xorg.0.log, but focused only on the display.
On your error:
Code:
:/lib/modules/4.4.190/kernel/drivers/gpu/drm/i915# insmod i915.ko 
insmod: ERROR: could not insert module i915.ko: Unknown symbol in module
Try:
Code:
/sbin/depmod 4.4.190
/sbin/modprobe i915
and check dmesg for info about success/failure.

No idea what to test to reproduce the hangs/reboots, just work with it as you did before.
 
Old 10-15-2019, 05:11 AM   #26
Twigster
Member
 
Registered: Oct 2019
Location: France
Distribution: Slackware64 14.2
Posts: 58

Original Poster
Rep: Reputation: Disabled
Hey. I'm still waiting to see if it's gonna hang.

In the meantime I've done what you recommended :
Code:
:~# depmod 4.4.190
~# modprobe i915
modprobe: ERROR: could not insert 'i915': No such device
Nothing showed up in dmesg.

When i did insmod i915.ko :
Code:
[ 3688.412638] i915: Unknown symbol acpi_lid_notifier_register (err 0)
[ 3688.419600] i915: Unknown symbol acpi_lid_notifier_unregister (err 0)
[ 3688.419923] i915: Unknown symbol acpi_lid_open (err 0)
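To pull just the module and symbol names out of messages like these, a small filter can help. This is a sketch with an illustrative function name, `missing_symbols`; the sample lines are the ones from the dmesg output above. All three symbols carry an acpi_ prefix, which suggests the module depends on ACPI code that is not available in the running kernel configuration.

```shell
#!/bin/sh
# missing_symbols  (hypothetical helper)
# Reads kernel log lines on stdin and prints "module symbol" for
# every "Unknown symbol" message, one per line, without duplicates.
# The leading "[ timestamp ]" field is optional.
missing_symbols() {
    sed -n 's/^\(\[[^]]*\] *\)\{0,1\}\([^ :]*\): Unknown symbol \([A-Za-z0-9_]*\).*$/\2 \3/p' |
        sort -u
}

# Typical use on a live system (dmesg may need root):
#   dmesg | missing_symbols
printf '%s\n' \
    '[ 3688.412638] i915: Unknown symbol acpi_lid_notifier_register (err 0)' \
    '[ 3688.419600] i915: Unknown symbol acpi_lid_notifier_unregister (err 0)' \
    '[ 3688.419923] i915: Unknown symbol acpi_lid_open (err 0)' |
    missing_symbols
```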
 
Old 10-15-2019, 05:01 PM   #27
abga
Senior Member
 
Registered: Jul 2017
Location: EU
Distribution: Slackware
Posts: 1,634

Rep: Reputation: 929
Well, you disabled acpi. I warned you in #23 that this is a radical solution and it might have some other unwanted consequences. There are already 3 such consequences available:
1. The Intel graphic adapter is not recognized by your system - a possible workaround - see:
https://www.kernel.org/doc/Documenta...parameters.txt
- in section acpi=, last line "See also.." try adding to your actual lilo kernel append line: pci=noacpi
2. Since your system doesn't recognize the Intel GPU, your X server is defaulting on the vesa frame buffer - just took a look again at your Xorg.log
3. The errors you got from your manual attempt (insmod i915.ko) are ACPI-related and expected, since the kernel is instructed to drop ACPI support with the acpi=off parameter.

Just keep observing your system with the acpi turned off and if you don't experience any hangs/reboots, then we could try some "softer" measures, like cancelling the acpi=off and playing with acpi_osi= , although it'll be a sort of trial and error territory and I cannot guarantee any success.
 
1 members found this post helpful.
Old 10-16-2019, 04:00 AM   #28
Twigster
Member
 
Registered: Oct 2019
Location: France
Distribution: Slackware64 14.2
Posts: 58

Original Poster
Rep: Reputation: Disabled
OK, I will keep the acpi=off noapic nolapic options, give it another day, and then try your #23 and #27 suggestions, cheers
 
Old 10-16-2019, 04:43 AM   #29
abga
Senior Member
 
Registered: Jul 2017
Location: EU
Distribution: Slackware
Posts: 1,634

Rep: Reputation: 929
If your system is stable now with acpi=off, I suggest adding pci=noacpi first (as advised in #27). The odds of success are higher, and if pci=noacpi sorts out your graphics adapter, then you have a solution. As I mentioned, playing with acpi_osi= is guesswork...

Your new lilo append line should look like:
Code:
append="vt.default_utf8=0 acpi=off pci=noacpi noapic nolapic"
You could even dump "noapic nolapic", I don't believe they are helping, I'm 90% sure acpi is the bugger.
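When trying several append lines in a row, a small helper can cut down on editing mistakes. The sketch below works on a scratch copy rather than on /etc/lilo.conf itself; `set_append` is an illustrative name, and it assumes the parameters contain no `|`, `&` or backslash characters. On the real file, /sbin/lilo has to be run afterwards for the change to take effect.

```shell
#!/bin/sh
# set_append FILE PARAMS  (hypothetical helper)
# Rewrites every uncommented append="..." line in FILE to hold PARAMS.
# PARAMS must not contain '|', '&' or '\', since it is pasted
# into a sed replacement.
set_append() {
    sed "s|^\([[:space:]]*\)append *= *\".*\"|\1append=\"$2\"|" "$1" > "$1.new" &&
        mv "$1.new" "$1"
}

# Demonstration on a temporary file standing in for lilo.conf.
conf=$(mktemp)
printf '%s\n' 'append=" vt.default_utf8=0"' 'boot = /dev/sda' > "$conf"
set_append "$conf" "vt.default_utf8=0 acpi=off pci=noacpi noapic nolapic"
cat "$conf"
rm -f "$conf"
```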
 
Old 10-18-2019, 11:11 AM   #30
Twigster
Member
 
Registered: Oct 2019
Location: France
Distribution: Slackware64 14.2
Posts: 58

Original Poster
Rep: Reputation: Disabled
Hello, sorry for not having answered quicker, I actually got back to this laptop only today.

Here's what I tested so far :
append="vt.default_utf8=0" -> crash
append="vt.default_utf8=0 noapic nolapic acpi=off" -> no crash no video driver
append="utf8xx acpi=off" -> no video driver no crash
append="utf8xx acpi=off pci=noacpi" -> no vid no crash
append="utf8xx pci=noacpi" -> no boot
append="utf8xx acpi_osi=Linux" -> video ok, no crashes so far

It seems you solved my issue, thanks =)
 
  

