Will rewrite kernel with length prefix string make things faster

lloydsu · 06-19-2014, 11:03 AM

I'm new to linux kernel, during the study I start wondering if rewrite entire kernel with length prefix string instead of null terminal string (for constants also) can make kernel be more efficient?

rtmistler · 06-19-2014, 01:34 PM

There's nothing stopping someone from using length prefixed string types in any of their programming, be that kernel, device drivers, or just regular application programming.

I'm wondering how many strings the kernel is using, except debug output or logs?

Can the kernel be more efficient? I'm sure it can be, I'd recommend you visit Kernel.org and read up on their efforts; if you are able to make improvements to either data types, as above or to performance, the kernel and Linux community would certainly be happy to benefit from your efforts.

I guess the first steps would be to establish a way to measure the performance metrics of your kernel.

Build the kernel the current way and obtain your metrics.

Then rebuild the kernel by changing the type definition of a string to use length prefixed strings; figure the most painless way to make that conversion, and then rebuild and take the metrics again.

sundialsvcs · 06-22-2014, 08:07 AM

... but also ... try to make a business-case for why such a potentially massive and therefore disruptive change would be justifiable, and exactly how length-prefix strings would seamlessly, reliably (100% accuracy is required) work with all the existing zero-terminated string constants.

lloydsu · 06-22-2014, 06:27 PM

Thanks for the reply

Like I said, I'm new to kernel, it is exactly the question that whether it is potentially worthwhile to do such data type change, and this is pure curiosity about why null terminal string is used, what's the reason behind it (except because we are using C),
At lease I think length prefix string is faster in strlen, strcmp and strcpy etc.?
So, I asked here and want to see what's the suggestion from you experienced guys
I think it might be difficult, but not impossible to change that.

As Rtmistler suggested there might be no much string operation there, so, I would try to find how many string operation there in the kernel first, thanks for the suggestion.

jpollard · 06-22-2014, 06:53 PM

Quote:

Originally Posted by rtmistler

There's nothing stopping someone from using length prefixed string types in any of their programming, be that kernel, device drivers, or just regular application programming.

I'm wondering how many strings the kernel is using, except debug output or logs?

Can the kernel be more efficient? I'm sure it can be, I'd recommend you visit Kernel.org and read up on their efforts; if you are able to make improvements to either data types, as above or to performance, the kernel and Linux community would certainly be happy to benefit from your efforts.

I guess the first steps would be to establish a way to measure the performance metrics of your kernel.

Build the kernel the current way and obtain your metrics.

Then rebuild the kernel by changing the type definition of a string to use length prefixed strings; figure the most painless way to make that conversion, and then rebuild and take the metrics again.

It isn't just the kernel strings - those are fairly constant (MAJOR exception are the formatted strings, and the entire sysfs and procfs are loaded with those). There are also a ton of strings passed to the kernel by userspace applications as well (again, mostly through the sysfs and procfs).

The usual problem is knowing "how large a number to allow" for the number of bytes. An 8 bit count is totally insufficient. A 16 bit count would work... except maybe for the drivers sending lots of strings (such as syslog) - messages are concatenated into one buffer... Then you get into the problem of "how to represent the string".

sometimes the string is static - in which case, the length must also be static... sometimes the string is dynamic, possibly from format conversions... So now, it becomes harder to use generic buffers (not impossible, just harder). It also becomes necessary to know which is which.

The usual problem with strings is that you need two lengths - size of the storage allocated, and the length of the used storage for data... And a storage class (static, dynamic).

What gets worse is trying to handle strings in a generic manner - when you don't know the length ahead of time, how do you determine how much you need. And you need to ensure suitable alignment for the lengths, and flags (if any).

The usual problem is that handling strings that way is slow.

lloydsu · 06-23-2014, 03:18 AM

Quote:

Originally Posted by jpollard

It isn't just the kernel strings - those are fairly constant (MAJOR exception are the formatted strings, and the entire sysfs and procfs are loaded with those). There are also a ton of strings passed to the kernel by userspace applications as well (again, mostly through the sysfs and procfs).

The usual problem is knowing "how large a number to allow" for the number of bytes. An 8 bit count is totally insufficient. A 16 bit count would work... except maybe for the drivers sending lots of strings (such as syslog) - messages are concatenated into one buffer... Then you get into the problem of "how to represent the string".

sometimes the string is static - in which case, the length must also be static... sometimes the string is dynamic, possibly from format conversions... So now, it becomes harder to use generic buffers (not impossible, just harder). It also becomes necessary to know which is which.

The usual problem with strings is that you need two lengths - size of the storage allocated, and the length of the used storage for data... And a storage class (static, dynamic).

What gets worse is trying to handle strings in a generic manner - when you don't know the length ahead of time, how do you determine how much you need. And you need to ensure suitable alignment for the lengths, and flags (if any).

The usual problem is that handling strings that way is slow.

Thanks for the reply.
I think the size of storage is irrelevant, null terminal string has the same problem.
And for some cases, like strncat, if you know the length of both string, you can determine the length of new buffer faster.
Why you think handling string that way is slower? I expect it would be an faster because determine the length of string is very slow.
Let me take a look on sysfs and procfs

It is a good point about how many bytes are needed to keep the length, usually sizeof(int) should be enough?
Seems it does complicate things when try to use PAE, serialize it or transfer across different machine with different CPU word length.
It will need convert between null terminal (a buffer) and length prefix string back and forth sometime (a lot?).
A lot of enlightenment, thanks

jpollard · 06-23-2014, 05:12 AM

Quote:

Originally Posted by lloydsu

Thanks for the reply.
I think the size of storage is irrelevant, null terminal string has the same problem.
And for some cases, like strncat, if you know the length of both string, you can determine the length of new buffer faster.

Why you think handling string that way is slower? I expect it would be an faster because determine the length of string is very slow.

Look at the API for strncat -

Code:

char * strncat(char *dest, const char *src, size_t n);

The destination is modified.

Without knowing the storage available (by subtracting the used space of dest from the allocated space), you always have to copy both strings to new storage instead of just appending to the end of the existing used storage of a string.

The allocation is slow, as is having to copy the first string. The existing strncat will just copy the second string to the end of the first, for at most n bytes of src. The n is the amount of remaining unused bytes of dest. No allocation done, and determining the used bytes of src takes half the time of a copy.

Next, remember that dest is being modified, and the return value must be the new allocation - so using a fixed size buffer (one that is being updated) is now impossible, so code sequences like:

Code:

char buffer[1024];
...

buffer[0] = '\0';
strncat(buffer, str1, 1024);
...
strncat(buffer, str2, 1024-strlen(buffer))

can't be used as buffer cannot be modified. These now can't use a string (the buffer) that is just updated, so all the uses of strncat have to be modified as well as the library.

Doing counted strings is completely reasonable in user space, but in the kernel there is a LOT of buffer management of fixed size blocks (it reduced memory fragmentation). So reallocation (meaning that sometimes you have to deallocate dest, after allocating a new string) just to append two strings when the storage space in the first string is available would be frowned on.

Also remember - some buffers are on the stack... thus not dynamic, and can't be deallocated except by the containing function return (this is why I indicated having flags for the storage class; at a minimum you need "const", "static", "stack", and "heap").

The last architecture that I used with a string type (not character array) was the VAX. Strings there were always represented by a structure that included a pointer to the actual character array. Thus the character array could be replaced with a new one (a reallocation if the result was larger than the actual storage), or just updated (if there was enough room). Of course, constant strings couldn't have either operation applied to them (only copied). But it also meant that the structure had to be passed around, and sometimes it was "const", "static", "stack", or "heap" - thus the storage class of the header has to be known as well (and allocated sometimes). BTW, I don't think the kernel used the hardware supported strings... too slow due to the frequent reallocations required.

BTW, another place counted strings don't work is in functions like strtok where the input string is modified (a byte gets replaced by a null)...

Take a look at http://www.gnugeneration.com/mirrors...api/book1.html for the string functions used in the kernel. Then take a look at (a now dated version, but it should still give you an idea of how much added overhead counted strings would impose)
https://www.kernel.org/pub/linux/ker...4/lib/string.c

Quite frequently "strings" just point into other "strings"... This is trivial with character arrays, but non-trivial (actually, impossible to do) with counted strings. It is possible if there is a separate "string header" that can point to the storage. But now you have the added overhead of managing the string headers...

Quote:

Let me take a look on sysfs and procfs

It is a good point about how many bytes are needed to keep the length, usually sizeof(int) should be enough?

More than enough, and it would maintain a common alignment.

Quote:

Seems it does complicate things when try to use PAE, serialize it or transfer across different machine with different CPU word length.
It will need convert between null terminal (a buffer) and length prefix string back and forth sometime (a lot?).

A lot of enlightenment, thanks