LinuxQuestions.org
Bisecting a kernel

Posted 09-04-2018 at 09:12 AM by hazel
Updated 11-27-2019 at 11:53 AM by hazel

Sometimes you find yourself doing things that you would previously have considered suitable only for geeks.

I don't consider it particularly geeky to build your own kernel. When I was starting out with Linux, hardware was pretty limited, and building a custom kernel was often the best way to get a quick boot. Nowadays I prefer to use the stock kernel if there is one, but two of my regular distros (Crux and LFS) require a hand-rolled kernel.

I have always regarded the Linux kernel as a singularly well-mannered program. It seldom causes any trouble. If you get a kernel panic when booting a new kernel for the first time, it's usually because you did something silly during configuration, like forgetting to build in the filesystem driver for your root partition. So I was quite surprised when I booted the 4.14 kernel which I had built for LFS-8.2 and it panicked while initialising ACPI. Rebooting with "acpi=off" appeared to solve the problem.

Obviously that's not much of a solution. You really need ACPI on a modern computer. I found a site https://01.org/linux-acpi/documentat...ux-acpi-issues that provided guidance on troubleshooting ACPI problems by using various levels of ACPI restriction, but I found that even the most conservative one did not stop the panic. Only when ACPI was completely switched off at boot (or the ACPI driver not built in the first place) would this kernel boot on my machine.

This is what is known as a regression: something that worked perfectly well before suddenly stops working after an update. The recommended way to troubleshoot regressions is to do a "bisect". This requires you to install (or, in the case of LFS, build) git and then download the git tree for the program. The git tree for the kernel is about 3 GB, equal to my entire monthly download allowance. Fortunately a friend agreed to download it for me and I carried it home on a memory stick.

The next job is to find two endpoints for your search: a version that definitely works and one that doesn't. For a kernel bisect, you can identify the versions either by their git commit hashes or by their release tags. However, if you use tags, you need to follow exactly the approved syntax, for example "v4.14". Use "v." instead of "v" and you'll be told there is no such version!
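As a rough illustration of that syntax, here is a small check for whether a string has the usual shape of a kernel release tag. The function name is my own invention and the pattern only covers the common forms (v4.14, v4.14.3, v4.14-rc2), so treat it as a sketch. The authoritative list is whatever "git tag -l" prints inside the kernel tree.

```shell
# Hypothetical helper: succeeds if $1 has the usual shape of a kernel tag.
# It does not check that the tag exists in your tree.
looks_like_kernel_tag() {
    printf '%s\n' "$1" | grep -Eq '^v[0-9]+\.[0-9]+(\.[0-9]+)?(-rc[0-9]+)?$'
}

for candidate in v4.14 v.4.14 4.14 v4.14-rc2; do
    if looks_like_kernel_tag "$candidate"; then
        echo "$candidate: plausible tag"
    else
        echo "$candidate: not tag syntax"
    fi
done
```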

Enter the git directory that you have created and type (for example)
Code:
git bisect start
git bisect good v4.13
followed by
Code:
git bisect bad v4.14
git will chunter along for a while and then check out a kernel roughly halfway between the good and bad versions. It will also tell you how many revisions are still left to test. This will probably be a huge number, well over 1000, but don't worry: it shrinks very fast.

Now copy over a suitable configuration file as .config, or type
Code:
make defconfig
to create one. You can use
Code:
make menuconfig
to add your own configuration choices. Remember to build in the SATA driver and your root filesystem driver, or the kernel won't boot without an initrd. Another useful option is to reboot automatically a set number of seconds after a panic (you'll find it under "Kernel hacking").
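Since you will be rebuilding many times, it can save a wasted reboot to check the .config before each build. The helper below is only a sketch: the function name is made up, and the symbols in the usage comment (CONFIG_SATA_AHCI, CONFIG_EXT4_FS) are assumptions about a typical machine, so substitute the drivers your own disk controller and root filesystem need.

```shell
# Hypothetical helper: report any symbol that is not built in (=y).
# A driver built as a module (=m) counts as missing, because it cannot
# be loaded before the root filesystem is mounted without an initrd.
# Usage: check_builtin .config CONFIG_SATA_AHCI CONFIG_EXT4_FS
check_builtin() {
    config=$1
    shift
    missing=0
    for sym in "$@"; do
        if ! grep -q "^${sym}=y\$" "$config"; then
            echo "not built in: $sym"
            missing=1
        fi
    done
    return "$missing"
}
```

It prints one line per missing symbol and returns non-zero if any are missing, so it can sit in front of the build command.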

Build the kernel with "make", then copy arch/x86/boot/bzImage to your /boot directory and add it to your bootloader's menu. Also install the modules with
Code:
make modules_install
Now reboot and see what happens. Go back into your git directory and type either
Code:
git bisect good
or
Code:
git bisect bad
as appropriate. git will check out a new kernel tree and tell you how many revisions are left to test. It will only be about half the previous number!
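Because the number halves each time, you can estimate in advance how many kernels you will have to build. The function below is my own back-of-envelope version (a rounded-up base-2 logarithm), not git's exact calculation, so expect its answer to differ slightly from the "roughly N steps" that git prints.

```shell
# Hypothetical helper: estimate build-and-test rounds for a given
# number of candidate revisions by halving until one is left.
bisect_steps() {
    revisions=$1
    steps=0
    while [ "$revisions" -gt 1 ]; do
        revisions=$(( (revisions + 1) / 2 ))
        steps=$(( steps + 1 ))
    done
    echo "$steps"
}

bisect_steps 1000   # prints 10
```

So even a thousand candidate revisions means only about ten builds.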

Continue to build and test kernels in this way until you reach the commit that caused the problem. You will know when you have got there because git will display a full description of that “bad” commit. Do not stop building and testing until you see this final display, or you will fix on the wrong commit!
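If you want to rehearse the whole loop before committing hours to kernel builds, you can try it on a throwaway repository. Everything in this sketch is invented for the demonstration: the file names, the tags v1 to v8, and the grep that stands in for "build, reboot and see what happens". It assumes git is installed and that mktemp is available.

```shell
# Toy bisect on a throwaway repository; every name below is made up.
demo=$(mktemp -d)
cd "$demo" || exit 1
git init -q .

# Eight commits, tagged v1..v8; "change 6" introduces the regression.
for i in 1 2 3 4 5 6 7 8; do
    echo "$i" > counter
    if [ "$i" -lt 6 ]; then echo ok > status; else echo broken > status; fi
    git add counter status
    git -c user.name=demo -c user.email=demo@example.invalid \
        commit -q -m "change $i"
    git tag "v$i"
done

git bisect start >/dev/null 2>&1
git bisect good v1 >/dev/null 2>&1
out=$(git bisect bad v8 2>&1)

# Stand-in for the boot test: inspect one file in the checked-out tree.
# The loop is capped at 10 rounds so a surprise cannot make it spin forever.
rounds=0
while [ "$rounds" -lt 10 ]; do
    case $out in *"is the first bad commit"*) break ;; esac
    if grep -q ok status; then
        out=$(git bisect good 2>&1)
    else
        out=$(git bisect bad 2>&1)
    fi
    rounds=$((rounds + 1))
done

culprit=$(git log -1 --format=%s refs/bisect/bad)
echo "first bad commit: $culprit (found after $rounds tests)"
```

The loop stops only when git announces the first bad commit, which mirrors the warning above about not stopping early. When you have finished, "git bisect reset" puts the tree back where it was before the bisection started.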

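A full log of the bisection will be found in .git/BISECT_LOG. Copy it to a safe place for reference. Its last line names the first bad commit, and passing that commit's hash to "git show" will give you a fresh description of it. Subsequently you can print the log again by using:
Code:
git bisect log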
The next step is to correct the problem. In my case, the "bad commit" added several snippets of text to arch/x86/mm/ioremap.c. Most of these were comments, but two lines of active code had also been deleted as unnecessary. I reintroduced them by hand. When the edited kernel was rebuilt, it no longer panicked.

Notice that although the panic occurred in the ACPI driver, it was actually a memory mapping problem. I would never have guessed it!

The final stage is to create a patch that can be used on published kernels. This can be done as follows:
1) Untar the kernel you want to use and make the necessary edits.
2) Configure, build and install the kernel. Make sure it behaves properly.
3) Use "make mrproper" to clear the kernel tree of built files. Then change the top level directory name, for example from linux-x.y.z to linux-x.y.z.new.
4) Untar the kernel again. You now have two parallel trees, one of them modified. Now run diff to make the patch:
Code:
diff -aur linux-x.y.z linux-x.y.z.new > linux-panic-patch
The file linux-panic-patch can then be used to patch other kernels.
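To see how the patch is made and then applied, here is the same procedure in miniature on dummy directory trees. The file mm/demo.c, its contents and the directory other-tree are all invented for the illustration. The sketch assumes the diff and patch utilities are installed.

```shell
# Miniature version of the two-tree diff; all file contents are dummies.
work=$(mktemp -d)
cd "$work" || exit 1
mkdir -p linux-x.y.z/mm linux-x.y.z.new/mm other-tree/mm

printf 'line one\nline two\n' > linux-x.y.z/mm/demo.c
printf 'line one\nrestored line\nline two\n' > linux-x.y.z.new/mm/demo.c
cp linux-x.y.z/mm/demo.c other-tree/mm/demo.c   # a second pristine tree

# diff exits with status 1 when it finds differences, which is what we
# want here; only a status of 2 or more means something went wrong.
diff -aur linux-x.y.z linux-x.y.z.new > linux-panic-patch || [ "$?" -eq 1 ]

# Apply from inside the target tree. -p1 strips the leading directory
# name (linux-x.y.z/) from the paths recorded in the patch.
(cd other-tree && patch -p1 < ../linux-panic-patch)
```

On a real kernel tree it is worth running patch with --dry-run first (a GNU patch option), since a later kernel version may have changed the surrounding lines and the patch may no longer apply cleanly.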
Posted in Linux kernel