LinuxQuestions.org
Linux - Hardware This forum is for Hardware issues.
Having trouble installing a piece of hardware? Want to know if that peripheral is compatible with Linux?

Old 01-09-2018, 12:01 PM   #16
Emerson
LQ Sage
 
Registered: Nov 2004
Location: Saint Amant, Acadiana
Distribution: Gentoo ~amd64
Posts: 7,671

Rep: Reputation: Disabled

Intel microcode built into kernel.
 
Old 01-11-2018, 01:25 AM   #17
hazel
LQ Guru
 
Registered: Mar 2016
Location: Harrow, UK
Distribution: LFS, AntiX, Slackware
Posts: 7,704

Original Poster
Blog Entries: 19

Rep: Reputation: 4503
Well, I now know one thing that running without acpi does: it stops the computer shutting down properly. It just types "Halted" and stares at you. But I always switch off at the mains anyway, so that's not really a problem. On the other hand I don't like the thought that I might be risking my processor overheating, so I don't think I'll go down that road.

Yesterday I was playing about with microcode (thank you Emerson for that Gentoo link). I tried it first in a kernel that I knew worked. It booted just fine, but dmesg showed that the acpi bug was still present. Then I tried it in 4.14.12 and got the usual panic. I can't prove that the acpi bug is causing the panic but it seems quite likely.

Today I shall download and build 4.14.6. I want eventually to identify the two kernels on either side of the regression, do a diff and find out which acpi files have changed. Then I want to try a hybrid tree. I read somewhere some time ago about doing this as a way for non-programmers who have time on their hands to help track down kernel bugs. The hybrid usually builds successfully and then you have a patch.

There is also another possibility. I read somewhere (don't know if I can find it again) that you can cure acpi syntax errors by dumping /dev/nvram into a file and running a program that disassembles it. Then you can correct the error, reassemble it and write it back. A bit dangerous but a lot less so than flashing the BIOS.
 
Old 01-11-2018, 12:54 PM   #18
hazel
LQ Guru
 
Registered: Mar 2016
Location: Harrow, UK
Distribution: LFS, AntiX, Slackware
Posts: 7,704

Original Poster
Blog Entries: 19

Rep: Reputation: 4503
I've found the regression. It is in 4.14.1. This kernel panics for me, whereas the last kernel of the 4.13 series (4.13.16) doesn't. But I'm a bit at a loss on what to do next. Running diff on the acpi part of the tree gave a load of stuff. Most of it seems to be in drivers/acpi/acpica.
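A minimal sketch of that comparison, assuming both source trees are unpacked side by side. The function name `acpi_changed_files` is made up for illustration; it only lists the names of files that differ, which is easier to survey than the full diff.

```shell
# Hypothetical helper: list files under drivers/acpi that differ
# between two unpacked kernel source trees.
# $1 = older (working) tree, $2 = newer (panicking) tree.
acpi_changed_files() {
    old_tree=$1
    new_tree=$2
    # -r recurses, -q prints only the names of files that differ,
    # -N treats a file missing on one side as empty, so added and
    # removed files are reported as well.
    # diff exits 1 when it finds differences; that is not an error here.
    diff -rqN "$old_tree/drivers/acpi" "$new_tree/drivers/acpi" || [ $? -eq 1 ]
}
```

Something like `acpi_changed_files linux-4.13.16 linux-4.14.1 | grep -c acpica` would then show how much of the change sits in drivers/acpi/acpica.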

Maybe I should do parallel menu builds so that I can see if 4.14.1 has extra options which I should block.
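As an alternative to eyeballing two menus, the two resulting .config files can be compared directly. This is only a sketch, assuming each build left its own .config behind; `new_config_symbols` is a hypothetical name.

```shell
# Hypothetical helper: print CONFIG_ symbols that appear in the newer
# .config but not in the older one.
# $1 = older .config, $2 = newer .config.
new_config_symbols() {
    old_cfg=$1
    new_cfg=$2
    tmp_old=$(mktemp) || return 1
    tmp_new=$(mktemp) || { rm -f "$tmp_old"; return 1; }
    # Pick up the symbol name from both "CONFIG_X=value" lines and
    # "# CONFIG_X is not set" lines.
    for pair in "$old_cfg:$tmp_old" "$new_cfg:$tmp_new"; do
        sed -n \
            -e 's/^\(CONFIG_[A-Za-z0-9_]*\)=.*/\1/p' \
            -e 's/^# \(CONFIG_[A-Za-z0-9_]*\) is not set$/\1/p' \
            "${pair%%:*}" | LC_ALL=C sort -u > "${pair#*:}"
    done
    # comm -13 keeps only the lines unique to the second (newer) list.
    LC_ALL=C comm -13 "$tmp_old" "$tmp_new"
    rm -f "$tmp_old" "$tmp_new"
}
```

Piping the result through `grep ACPI` would narrow it to the options of interest. The sketch assumes the file paths contain no colon.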
 
Old 01-11-2018, 01:02 PM   #19
Emerson
LQ Sage
 
Registered: Nov 2004
Location: Saint Amant, Acadiana
Distribution: Gentoo ~amd64
Posts: 7,671

Rep: Reputation: Disabled
hazel,

not all options are visible and user controlled. Run 'make nconfig' and hit F4 to see all of them.
 
Old 01-13-2018, 02:31 AM   #20
hazel
LQ Guru
 
Registered: Mar 2016
Location: Harrow, UK
Distribution: LFS, AntiX, Slackware
Posts: 7,704

Original Poster
Blog Entries: 19

Rep: Reputation: 4503
Quote:
Originally Posted by Emerson
hazel,

not all options are visible and user controlled. Run 'make nconfig' and hit F4 to see all of them.
So I had a look, but I don't see any ultimate use for this. If I can't change it, forgeddaboutit!

Yesterday, I completed my "hybrid kernel" project. Very finicky work but fascinating. Basically, I took 4.14.1, removed the acpi driver tree, and replaced it with the working tree from 4.13.16. Of course some of the old code wouldn't build in the new environment, so I had to selectively restore those files to 4.14.1. I knew that there were basically three possible outcomes:

1) The bug would be left behind and the kernel would boot.
2) The files that I was forced to restore to 4.14 would bring the bug with them.
3) Something else would go wrong and prevent booting because of the mixed code. But in that case, the panic traceback would look different.

In fact the outcome was 2) so the bug must be in one of those files that I was forced to upgrade again. I can provide a list of them if anyone is interested. Obviously no kernel developer is going to care about my obsolete hardware when they are up to their eyebrows in Meltdown- and Spectre-control so this is probably as far as it goes.
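The hybrid-tree procedure described above could be scripted roughly as follows. This is a sketch under assumptions, not the exact steps used here: the function names are invented, and the trees are assumed to be plain unpacked source directories.

```shell
# Hypothetical helper: build a "hybrid" source tree by copying the
# newer kernel and swapping in drivers/acpi from the older one.
# $1 = older tree, $2 = newer tree, $3 = output directory (must not exist).
make_hybrid_tree() {
    old_tree=$1
    new_tree=$2
    hybrid=$3
    [ -d "$old_tree/drivers/acpi" ] || return 1
    [ -d "$new_tree/drivers/acpi" ] || return 1
    [ ! -e "$hybrid" ] || return 1
    cp -Rp "$new_tree" "$hybrid" || return 1
    rm -rf "$hybrid/drivers/acpi"
    cp -Rp "$old_tree/drivers/acpi" "$hybrid/drivers/acpi"
}

# Put one file back to its newer version when the older copy will not
# compile inside the hybrid tree.
# $1 = newer tree, $2 = hybrid tree, $3 = path relative to the tree root.
restore_new_file() {
    cp -p "$1/$3" "$2/$3"
}
```

Keeping a note of every path passed to `restore_new_file` gives the list of suspect files for free.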

I've already said that I am not even going to try to flash my BIOS. As long as there are still reasonably up-to-date kernels that bigboy can run, I'm not going to risk killing him prematurely. But there is another possibility, which I found in the Arch Wiki. There is a program called iasl, which can read the acpi code from /sys/firmware/acpi/tables and disassemble it. Then you can correct the errors and reassemble it into a blob to be loaded into the kernel, much like vendor-supplied microcode. That isn't dangerous because you're not making permanent changes in anything. Whether at my age I still have enough brainpower to understand how to do this is another matter.

Last edited by hazel; 01-13-2018 at 02:34 AM.
 
Old 01-14-2018, 08:04 AM   #21
hazel
LQ Guru
 
Registered: Mar 2016
Location: Harrow, UK
Distribution: LFS, AntiX, Slackware
Posts: 7,704

Original Poster
Blog Entries: 19

Rep: Reputation: 4503
I built iasl in Crux because the documentation says it should be 32-bit and Crux is multilib while lfs isn't. It's the first time I've ever built a 32-bit program in a 64-bit system, and I had to jiggle the compiler flags a bit, but it runs just fine.

The instructions in the Gentoo wiki say to use iasl to decompile a copy of the dsdt file in /sys/firmware/acpi/tables and then recompile it, looking for compilation errors. Well, I didn't get any errors but I got over 70 warnings! That file was compiled by Microsoft's asl compiler and it's as full of bugs as a slum landlord's mattress. One of the warnings seems to relate to the "bug" that earlier kernels report at boot time; that seems the likeliest cause of the later kernel panic. Whether I can learn enough to correct it, I don't know.
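For anyone wanting to repeat the decompile-and-recompile check, a rough sketch follows. The function names are hypothetical, and the warning count assumes iasl reports each warning on a line beginning with `Warning`, which may differ between iasl versions.

```shell
# Hypothetical helper: dump the DSDT, disassemble it and recompile it,
# keeping the compiler messages in compile.log.
# $1 = scratch directory. Reading the table normally needs root.
dsdt_round_trip() {
    workdir=$1
    mkdir -p "$workdir" || return 1
    (
        cd "$workdir" || exit 1
        cat /sys/firmware/acpi/tables/DSDT > dsdt.dat || exit 1
        iasl -d dsdt.dat || exit 1          # disassemble: writes dsdt.dsl
        iasl dsdt.dsl > compile.log 2>&1    # recompile, messages to the log
    )
}

# Hypothetical helper: count warning lines in a saved compile log.
count_iasl_warnings() {
    # grep -c prints the count; it exits 1 when the count is zero,
    # which is not a failure here.
    grep -c '^Warning' "$1" || [ $? -eq 1 ]
}
```

Running `count_iasl_warnings compile.log` after each edit to the .dsl file shows whether a change reduced the number of warnings or added new ones.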
 
Old 01-16-2018, 11:18 AM   #22
hazel
LQ Guru
 
Registered: Mar 2016
Location: Harrow, UK
Distribution: LFS, AntiX, Slackware
Posts: 7,704

Original Poster
Blog Entries: 19

Rep: Reputation: 4503
I found an outright syntax error in an ancillary file called ssdt. I corrected that and got the warnings in dsdt down from 74 to about 30. Then I assembled the new files, put them into an initrd image and tried to boot the 4.13.16 kernel with this initrd. It panicked! So obviously at least one of my acpi syntax corrections introduced a logic error somewhere else in the program. I shall have to do this step by step, I think: one "correction" at a time.

@Emerson The great advantage of using an initrd compared with putting a blob in the kernel is that you can do experiments like this without having to rebuild the kernel repeatedly.
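A sketch of how such an initrd can be put together, assuming the kernel was built with support for upgrading ACPI tables from the initrd and looks for them under kernel/firmware/acpi in an uncompressed cpio archive. The function names and file names are only illustrative.

```shell
# Hypothetical helper: copy compiled tables into the directory layout
# the kernel is assumed to search inside the initrd.
# $1 = staging directory, remaining arguments = .aml files.
stage_acpi_override() {
    stage=$1
    shift
    mkdir -p "$stage/kernel/firmware/acpi" || return 1
    for aml in "$@"; do
        cp "$aml" "$stage/kernel/firmware/acpi/" || return 1
    done
}

# Hypothetical helper: pack the staged tree as an uncompressed
# "newc" cpio archive.
# $1 = staging directory, $2 = output file.
pack_acpi_override() {
    ( cd "$1" && find kernel | cpio -o -H newc ) > "$2"
}
```

For example, `stage_acpi_override stage dsdt.aml ssdt.aml` followed by `pack_acpi_override stage acpi_override.img`. If a normal initrd is also in use, the uncompressed archive has to come first, with the existing image appended after it.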

The funny thing is that the panic messages from the 4.13 kernel were, as far as I could see, identical to those which 4.14 kernels give with the built-in acpi. I commented out the initrd instruction in lilo and 4.13.16 now boots normally.

Oh, well! Back to the drawing board.
 
Old 01-30-2018, 01:28 PM   #23
hazel
LQ Guru
 
Registered: Mar 2016
Location: Harrow, UK
Distribution: LFS, AntiX, Slackware
Posts: 7,704

Original Poster
Blog Entries: 19

Rep: Reputation: 4503
I've got the syntax warnings down to 22 now and can still boot a 4.13 kernel. The three acpi warnings given by the kernel are down to two. But those, as far as I can see, are caused by a run-time logic error, not bad syntax. Even though I am reading up on the ASL language, I doubt if I will ever find out what is wrong there. And I don't know that those reported errors are actually what is causing the 4.14 kernel to panic, though it seems reasonable to suppose that.

In the meantime Linux-4.15 has come out, so I built that today and tried booting from it, hoping that they may have corrected whatever was wrong in the 4.14 series. No such luck! Clearly they are not going to correct it just to satisfy someone who can't be bothered to upgrade their hardware. For the time being, there are two LTS kernels, 4.9 and 4.12, that I can use. But one day in the future, I am going to run out of options and will need a new computer. Boohoo!

PS: The next LFS (which will be out soon) uses 4.15. I'll just have to build it with an older kernel.

Last edited by hazel; 01-30-2018 at 01:31 PM.
 
Old 01-31-2018, 02:16 AM   #24
business_kid
LQ Guru
 
Registered: Jan 2006
Location: Ireland
Distribution: Slackware, Slarm64 & Android
Posts: 16,496

Rep: Reputation: 2373
Time, IMHO, to get on to the maintainers. There's a list or forum on linux-laptop.net where some of them used to hang out, and I would post there. There are also countless options whose main function is to disable sections of code for ThinkPads and other dodgy laptops, and you may make headway with them.
 
Old 01-31-2018, 02:37 AM   #25
hazel
LQ Guru
 
Registered: Mar 2016
Location: Harrow, UK
Distribution: LFS, AntiX, Slackware
Posts: 7,704

Original Poster
Blog Entries: 19

Rep: Reputation: 4503
Quote:
Originally Posted by business_kid
Time, IMHO, to get on to the maintainers. There's a list or forum on linux-laptop.net where some of them used to hang out, and I would post there. There are also countless options whose main function is to disable sections of code for ThinkPads and other dodgy laptops, and you may make headway with them.
But is there some group like that for desktop towers? Because that's what Bigboy is.

Today, out of curiosity, I copied over the dsdt for Littleboy, my Samsung laptop, hoping to see how that handles the _OSC method. Unfortunately it doesn't use that method at all. Unsurprisingly, a quick look at dmesg|grep ACPI showed that the reported errors I have been chasing on Bigboy don't occur on Littleboy at all. Laptops are obviously a very different kind of beast. I'll build a 4.14 kernel on it and see if it boots; I suspect it will without problems.

Next week I shall be paying a visit to my friend the former computer virgin to give her a lesson on LibreOffice templates. She uses a Dell Dimension tower which I fixed up for her. I shall copy her dsdt and see what it looks like.
 
Old 01-31-2018, 05:10 AM   #26
hazel
LQ Guru
 
Registered: Mar 2016
Location: Harrow, UK
Distribution: LFS, AntiX, Slackware
Posts: 7,704

Original Poster
Blog Entries: 19

Rep: Reputation: 4503
Here it is:
Quote:
Originally Posted by ACPI specification
Note: Creation of a named object more than once in a given scope is not allowed. As such, unconditionally creating named objects within a While loop must be avoided. A fatal error will be generated on the second iteration of the loop, during the attempt to create the same named object a second time.
And here is the code in my dsdt.dsl file:
Code:
            While (Local0)
            {
                CreateDWordField (CAPB, Local1, CAPD)
                If (And (CAPF, 0x01))
                {
                    If (LEqual (Local1, 0x01))
                    {
                        And (CAPD, 0x09, CAPD) /* \_SB_.PCI0._OSC.CAPD */
                    }
                    Else
                    {
                        Store (Zero, CAPD) /* \_SB_.PCI0._OSC.CAPD */
                    }
                }
                ElseIf (LEqual (Local1, 0x01))
                {
                    If (And (CAPD, 0x08))
                    {
                        Store (Zero, \_SB.PCI0.PEG1.PMGE)
                        Store (Zero, \_SB.PCI0.PCX1.PMCE)
                        Store (Zero, \_SB.PCI0.PCX2.PMCE)
                        Store (Zero, \_SB.PCI0.LPC.BPEN)
                    }
                }

                Decrement (Local0)
                Add (Local1, 0x04, Local1)
            }

            Return (CAPB) /* \_SB_.PCI0._OSC.CAPB */
        }
That CreateDWordField inside the while loop is causing the trouble. The question is, how do I reword it? Maybe I could take the Create out of the loop (so that the field only gets created once) and replace it within the loop by something equivalent like Store. Or alternatively use a CondRefOf to create a conditional reference to the field and only create it if it doesn't yet exist.

PS: NO! Neither of those solutions will work. The CAPD field is actually a series of fields travelling along the buffer at an offset that increases by 4 bytes at every iteration. I think the old version will have to be removed each time a new version is created.
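To check whether the same pattern occurs elsewhere in the file, a rough scan of the disassembled source is possible. This is a heuristic sketch only: it assumes the layout iasl's disassembler produces (one statement per line, braces on their own lines), it ignores braces hidden in strings or comments, and the function name is made up.

```shell
# Hypothetical helper: list Create*Field statements that sit inside a
# While block in a disassembled .dsl file.
# $1 = path to the .dsl file. Output: "line-number: statement".
find_create_in_while() {
    awk '
    {
        text = $0
        sub(/^[ \t]+/, "", text)
        # Report a field creation while at least one While block is open.
        if (open_whiles > 0 && text ~ /^Create(Bit|Byte|Word|DWord|QWord)?Field[ \t]*\(/)
            printf "%d: %s\n", NR, text
        # Remember that the next opening brace belongs to a While.
        if (text ~ /^While[ \t]*\(/)
            pending = 1
        n = length(text)
        for (i = 1; i <= n; i++) {
            c = substr(text, i, 1)
            if (c == "{") {
                depth++
                if (pending) { while_depth[++open_whiles] = depth; pending = 0 }
            } else if (c == "}") {
                if (open_whiles > 0 && while_depth[open_whiles] == depth)
                    open_whiles--
                depth--
            }
        }
    }' "$1"
}
```

`find_create_in_while dsdt.dsl` only locates candidates; it says nothing about how each one should be rewritten.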

Last edited by hazel; 01-31-2018 at 08:56 AM. Reason: Added PS
 
Old 01-31-2018, 11:18 AM   #27
hazel
LQ Guru
 
Registered: Mar 2016
Location: Harrow, UK
Distribution: LFS, AntiX, Slackware
Posts: 7,704

Original Poster
Blog Entries: 19

Rep: Reputation: 4503
Back to the drawing board!

According to the ACPI spec, the _OSC method is optional. Its purpose is to allow legacy OS's to use newer hardware. I have the opposite problem: legacy hardware and an up-to-date OS. Also Littleboy manages to run Linux without having this method defined. So I removed it from Bigboy. Now my older kernels boot without reporting any acpi errors at all. Problem solved? No, unfortunately. Because the 4.14 kernel still panics on boot.

I had been working throughout on the assumption that the acpi errors reported by older kernels were what was causing the newer kernels to panic. Seemingly I was wrong, unless these kernels need _OSC to be present and syntactically correct. But the two acpi driver files that actually use _OSC don't seem significantly different between 4.13.16 and 4.14.1.
 
Old 02-01-2018, 03:25 AM   #28
business_kid
LQ Guru
 
Registered: Jan 2006
Location: Ireland
Distribution: Slackware, Slarm64 & Android
Posts: 16,496

Rep: Reputation: 2373
Back to the drawing board? You have my sympathy.

Let's start with a photo/transcription of the panic. Go for the latest kernel in your collection, because if you go to the LKML, they will want the latest git.
 
Old 02-01-2018, 05:48 AM   #29
hazel
LQ Guru
 
Registered: Mar 2016
Location: Harrow, UK
Distribution: LFS, AntiX, Slackware
Posts: 7,704

Original Poster
Blog Entries: 19

Rep: Reputation: 4503
Quote:
Originally Posted by business_kid
Back to the drawing board? You have my sympathy.

Let's start with a photo/transcription of the panic. Go for the latest kernel in your collection, because if you go to the LKML, they will want the latest git.
I can give you a transcription of the last 15 or so lines of the panic traceback. Unfortunately I can't get at the beginning of the traceback, which is where I would expect to find the actual error. Since the ACPI driver is loaded fairly early in the boot sequence, there are no logs that show it, only what is visible on the screen.

The latest kernel I have tried is 4.15.0 and that panics too.

Do you really think the kernel devs would be interested in this? I'm pretty sure they'd just say, "Get your BIOS flashed or buy a proper modern computer. Nobody else is having these problems."
 
Old 02-01-2018, 11:29 AM   #30
business_kid
LQ Guru
 
Registered: Jan 2006
Location: Ireland
Distribution: Slackware, Slarm64 & Android
Posts: 16,496

Rep: Reputation: 2373
The short answer is no, they probably wouldn't like it but they would react.

In 2008 I bought a laptop with a crappy ATI video card - really crappy. In 2014/15 that hit a console blanking issue (consoles would blank to white). I had to build gits of 3.19.0-rc1, and got a patch on that. So they did react and patch it - with some annoyance, but they patched it.

What made this funny was that there was an elementary syntax error in the patch, which even my primitive knowledge of C could pick up. So when it didn't work, I edited the patch, and that worked. If the guy wasn't irritated already, he was now! But my edit to the maintainer's patch went upstream. It's just what they do.
 
  


All times are GMT -5.