LinuxQuestions.org


rakesh_roy25 09-13-2007 04:48 AM

how to add system call
 
Can anybody here tell me how I can add a new system call without recompiling the kernel, i.e. dynamically at runtime?

osor 09-13-2007 06:57 PM

Quote:

Originally Posted by rakesh_roy25 (Post 2890467)
Can anybody here tell me how I can add a new system call without recompiling the kernel, i.e. dynamically at runtime?

First of all, this is almost certainly a bad idea. Perhaps you could explain why you want to do this in the first place. Very likely, the underlying problem can be solved in a way that avoids this implementation nightmare. Second, there are a few caveats to keep in mind. Obviously, the solution will not be portable across arches. Moreover, the solution I present might not even be portable across kernel versions for the same arch. Also, as with any new syscall (especially a temporary one), the C library will have no wrapper for it, so the only way to access it is either through assembly or through C with the generic syscall() function (declared in <unistd.h>, with the syscall numbers in <sys/syscall.h>). The problem with this is that the syscall numbers (__NR_foo) will not include your new syscall unless you patch the headers used by every application that wants to call it.
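To make the syscall() route concrete, here is a minimal sketch that calls an existing system call by number instead of through its libc wrapper. The helper name raw_getpid() is made up for illustration; SYS_getpid and syscall() are the standard glibc names.

```c
#define _GNU_SOURCE
#include <sys/syscall.h>   /* SYS_* / __NR_* syscall numbers */
#include <unistd.h>        /* syscall(), getpid() */

/* Hypothetical helper: ask the kernel for our PID by syscall number,
 * bypassing the getpid() wrapper. A brand-new syscall would be called
 * the same way, just with a number the headers do not know about. */
static long raw_getpid(void)
{
        return syscall(SYS_getpid);
}
```

Calling raw_getpid() should give the same value as getpid(). For a syscall of your own, the only part that changes is the number, and that number is exactly what the stock headers will be missing.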

Now that you’ve read and understood my disclaimer, I’ll try to answer your original question:
YOU CANNOT ADD A NEW SYSCALL WITHOUT RECOMPILING THE KERNEL! The syscall table is a fixed-size array of pointers to functions (making it incredibly difficult to find and expand in place). Additionally, __NR_syscall_max is used throughout the magic that makes kernel syscalls work, and it would be difficult to patch the live kernel image to change all occurrences of this value (since it is not a symbol but a preprocessor macro).

You can, however, do something almost as effective. Over the course of kernel development, some system calls have been superseded by newer ones. To keep the userspace interface stable, the existing system call numbers have been left the same. So what happens to a deprecated system call? Well, for a while it is kept, and both the old and the new syscall can be used side by side. Eventually it can no longer be maintained, and the now-obsolete function is removed from the syscall table. What’s left is a “hole” (a number in the table which corresponds to no function). Nowadays, such a hole is made to point to a placeholder function known in kernel-space as sys_ni_syscall(). Since nobody uses these holes any longer, they can be filled with whatever function you want.

The problem is that the holes occur in differing amounts and in different locations depending on the architecture. Additionally, you can always go back to a kernel old enough that the slot was still a real syscall rather than a hole. In such a case, it might not really matter unless someone uses it. But if you run an ancient userspace on an ancient kernel, and your module usurps a system call which is not a hole, you will seriously break something. So as a rule of thumb, it is best to fill the oldest holes first.

To find all holes for a given arch, run a recursive grep for “sys_ni_syscall” from within the arch’s subdir in the kernel tree. For example, in 2.6.22 on i386, we find holes at syscalls numbered 17, 31, 32, 35, 44, 53, 56, 58, 98, 113, 127, 130, 137, 167, 188, 189, 222, 223, 251, 273, and 285.
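If the idea of a hole is hard to picture, the bookkeeping can be modelled in plain userspace C. This is only a toy: the table size, toy_add(), and every other name here are invented for the illustration and have nothing to do with the real kernel table.

```c
#include <errno.h>

#define NR_MAX 7        /* toy table size, not a real kernel value */

typedef long (*syscall_fn)(long, long);

/* Stand-in for the kernel's placeholder: always "not implemented". */
static long ni_syscall(long a, long b) { (void)a; (void)b; return -ENOSYS; }

/* Stand-in for any real syscall. */
static long toy_add(long a, long b) { return a + b; }

/* Fixed-size table of function pointers; slots 1, 3 and 6 are holes. */
static syscall_fn toy_table[NR_MAX + 1] = {
        toy_add, ni_syscall, toy_add, ni_syscall,
        toy_add, toy_add,    ni_syscall, toy_add,
};

/* Return the lowest-numbered hole, or -1 if there is none. */
static int find_first_hole(void)
{
        for (int nr = 0; nr <= NR_MAX; nr++)
                if (toy_table[nr] == ni_syscall)
                        return nr;
        return -1;
}

/* Fill a slot only if it is still a hole; never usurp a live entry. */
static int toy_fill_hole(int nr, syscall_fn fn)
{
        if (nr < 0 || nr > NR_MAX || toy_table[nr] != ni_syscall)
                return -EBUSY;
        toy_table[nr] = fn;
        return 0;
}

/* Look the number up in the table and call whatever is there. */
static long dispatch(int nr, long a, long b)
{
        if (nr < 0 || nr > NR_MAX)
                return -ENOSYS;
        return toy_table[nr](a, b);
}
```

Before filling, dispatch() on a hole returns -ENOSYS; after toy_fill_hole() it reaches the new function. The refusal to overwrite a non-hole entry is the toy version of the "do not usurp a live syscall" rule.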

Basically, the hole replacement technique goes something like this:
Code:

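#include <linux/module.h>
#include <linux/kernel.h>

#define __NR_fill_hole 17       /* a hole on i386 in 2.6.22 */

/* Typically not exported to modules; making it reachable is up to you. */
extern void *sys_call_table[];

static void *saved_entry;       /* what the slot held before (sys_ni_syscall) */

asmlinkage long sys_fill_hole(int arg1, int arg2)
{
        printk(KERN_INFO "sys_fill_hole: arg1 is %d and arg2 is %d.\n", arg1, arg2);
        return 0;
}

int init_module(void)
{
        saved_entry = sys_call_table[__NR_fill_hole];
        sys_call_table[__NR_fill_hole] = (void *)sys_fill_hole;
        return 0;
}

void cleanup_module(void)
{
        /* Restore the placeholder so the slot never points into unloaded code. */
        sys_call_table[__NR_fill_hole] = saved_entry;
}

MODULE_LICENSE("GPL");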

Now, you will have to make some userspace headers to accommodate this new syscall.
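A userspace header for this could look roughly like the sketch below. The wrapper name fill_hole() is hypothetical, and the number 17 only makes sense on i386 with the module above loaded, so on any other arch this sketch refuses to make the call rather than hit an unrelated syscall.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Must match the slot the module filled (17 is a hole on i386 in 2.6.22). */
#ifndef __NR_fill_hole
#define __NR_fill_hole 17
#endif

/* Hypothetical wrapper for the new syscall.
 * Returns the syscall's result, or -1 with errno set on failure. */
static inline long fill_hole(int arg1, int arg2)
{
#if defined(__i386__)
        return syscall(__NR_fill_hole, arg1, arg2);
#else
        /* Slot 17 means something else on other arches; do not call it. */
        (void)arg1;
        (void)arg2;
        errno = ENOSYS;
        return -1;
#endif
}
```

Without the module loaded, fill_hole(1, 2) on i386 should fail with ENOSYS, because the slot still points at the placeholder. With it loaded, the call returns 0 and the printk line shows up in the kernel log.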

wjevans_7d1@yahoo.co 09-14-2007 07:09 AM

Wow. This. Is. Positively. Evil.

Can't wait to try it.

osor 09-14-2007 11:20 AM

Another caveat: this becomes much more complicated for archs that also carry a 32-bit compatibility syscall table. For example, on x86_64, you will have to fill a hole in the normal sys_call_table, and you will also need to fill a (differently numbered) hole in the i386 table if the kernel supports 32-bit emulation (which it usually does). How you then access the system call depends on whether you use 32-bit or 64-bit assembly.

Here’s a simple example calling the sys_exit syscall from userspace. In the first code sample, we merely call the x86_64 version of sys_exit(0):
Code:

.section .text

.global _start
_start:
        movq $60, %rax          # __NR_exit on x86_64
        movq $0,  %rdi          # first argument: exit status 0
        syscall                 # enter the kernel via the 64-bit path

The next one calls the i386 version of sys_exit(0):
Code:

.section .text

.global _start
_start:
        movl $1,  %eax          # __NR_exit on i386
        movl $0,  %ebx          # first argument: exit status 0
        int $0x80               # enter the kernel via the 32-bit path

Notice the difference in the system call numbers between the first and second. For x86_64, __NR_exit is defined as 60. For i386, __NR_exit is defined as 1. Even if the second one is assembled to produce a “64-bit” ELF executable, it will still call the i386 syscall (since it uses “int $0x80”).
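The same thing can be done from C with inline assembly. The sketch below uses getpid rather than exit so that the program survives the call; raw_syscall0() is a made-up helper name. SYS_getpid resolves to whatever number matches the arch you build for (39 on x86_64, 20 on i386), which is the header dependence mentioned earlier.

```c
#define _GNU_SOURCE
#include <sys/syscall.h>   /* SYS_getpid */
#include <unistd.h>        /* syscall(), getpid() */

/* Hypothetical helper: issue a zero-argument system call by number.
 * On x86_64 this uses the raw "syscall" instruction, which goes through
 * the 64-bit table and returns -errno on failure. On other arches it
 * falls back to libc's syscall(), which returns -1 and sets errno. */
static long raw_syscall0(long nr)
{
#if defined(__x86_64__)
        long ret;
        __asm__ volatile ("syscall"
                          : "=a"(ret)                 /* result comes back in rax */
                          : "a"(nr)                   /* syscall number goes in rax */
                          : "rcx", "r11", "memory");  /* clobbered by the instruction */
        return ret;
#else
        return syscall(nr);
#endif
}
```

raw_syscall0(SYS_getpid) should match getpid(). Passing the i386 number to the 64-bit entry point (or the other way round) would reach a different syscall entirely, so the number and the entry mechanism always have to come from the same table.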

