Replacement utilities written in pure shell

gnashley · 08-15-2007, 12:19 PM

I recently had a need for some utilities that would run on a pretty bare system and had the idea to see if I could write them in pure shell. While searching around and writing a few, I found a couple written by others.
So far I've put together over a dozen programs which provide at least the basic functionality of their C counterparts, including these:

basename
cat
cut
dirname
grep
head
rev
sort
tail
tr
wc
which

I'm wondering if anyone else has written any BASH-only utilities, or would like to contribute by writing one or some.

I've been writing them using versatile functions which can be used in other programs of the group. When the number of programs included gets large I may write them all into a multi-call program like busybox.

The individual programs do not call any external binaries at all. All processing is done using the substitution, expansion and mathematical internals of the shell. I've only tested them with bash>=3.0 but they are meant to work with dash also. I do not currently use the internal getopts as I'm not sure that it will always be present, though this may change.
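To give a rough idea of the approach, here is a minimal sketch of basename and dirname using only parameter expansion, `case` and `printf` (a builtin in both bash and dash). The function names `sh_basename` and `sh_dirname` are made up for this illustration; this is not the actual code from the collection, and it skips the suffix-stripping argument and other options of the real tools.

```shell
# Hypothetical sketch, not the code from the collection described above.
# Variables are global because POSIX sh has no 'local'.

# sh_basename PATH: print the last component of PATH.
sh_basename() {
    path=$1
    # Strip trailing slashes, but leave a lone "/" alone.
    while [ "${path%/}" != "$path" ] && [ "$path" != "/" ]; do
        path=${path%/}
    done
    case $path in
        /) printf '%s\n' / ;;
        *) printf '%s\n' "${path##*/}" ;;   # drop everything up to the last slash
    esac
}

# sh_dirname PATH: print PATH with its last component removed.
sh_dirname() {
    path=$1
    while [ "${path%/}" != "$path" ] && [ "$path" != "/" ]; do
        path=${path%/}
    done
    case $path in
        /) printf '%s\n' / ;;
        */*)
            dir=${path%/*}                  # drop the last component
            # Strip any slashes left at the end of the directory part.
            while [ "${dir%/}" != "$dir" ]; do
                dir=${dir%/}
            done
            printf '%s\n' "${dir:-/}" ;;    # an empty result means the root
        *) printf '%s\n' . ;;               # no slash at all: current directory
    esac
}
```

With these definitions, `sh_basename /usr/bin/sort` prints `sort` and `sh_dirname /usr/bin/sort` prints `/usr/bin`, without starting any external process.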

Anybody interested in this? I find it to be an interesting challenge.

ta0kira · 08-15-2007, 01:22 PM

I'd check out BusyBox. That's what the Slackware install CDs run from.
ta0kira

PS Most of those things require libc and can't be done in terms of shell built-ins alone. Bash itself is a program, so you just need to mitigate external dependencies since you won't be getting rid of the binary aspect. I really can't think of anything useful written purely of built-in statements. Even a comparison statement ([ "x" = "y" ]) is a call to the external program [.

gnashley · 08-15-2007, 01:36 PM

I know all about busybox and have even created patches to include some stuff in it which doesn't come with it. I've also created a version of ash/dash which includes over 40 builtins.
The whole point of this exercise is to reproduce the functionality of many of these utilities using pure internal shell functionality. Obviously many programs can't be written this way for one reason or another. But I believe enough of them can be put together to create a system which starts with just a statically-compiled shell and, using these shell utilities, is able to bootstrap a system. Actually the tools above will nearly do it, though being able to emulate a bit of sed and awk will help a lot. You know, you need sed to compile busybox...
Actually sed is very kind and provides a bootstrap script which can be used to compile sed without 'make'. I don't dare put 'make' on the TODO list until I've dealt some more with sed and awk. The cut and grep programs serve as testbeds for the same sort of code, such as search/replace operations and dealing with regular expressions.
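As an illustration of the kind of search/replace code meant here, the following is a minimal sketch that handles literal strings only, with no regular expressions. The names `sh_fgrep` and `sh_replace_first` are hypothetical and the code is not taken from the actual cut or grep scripts; it only relies on `read`, `case`, `printf` and parameter expansion.

```shell
# Hypothetical sketch: literal-string matching only, no regular expressions.

# sh_fgrep PATTERN: print the lines of stdin that contain the literal PATTERN.
# Returns 0 if at least one line matched, 1 otherwise.
sh_fgrep() {
    pattern=$1
    status=1
    # '|| [ -n "$line" ]' also handles a last line without a trailing newline.
    while IFS= read -r line || [ -n "$line" ]; do
        case $line in
            *"$pattern"*) printf '%s\n' "$line"; status=0 ;;
        esac
    done
    return $status
}

# sh_replace_first STRING OLD NEW: replace the first literal OLD in STRING.
sh_replace_first() {
    string=$1 old=$2 new=$3
    case $string in
        *"$old"*)
            head=${string%%"$old"*}   # text before the first occurrence
            tail=${string#*"$old"}    # text after the first occurrence
            printf '%s\n' "$head$new$tail" ;;
        *) printf '%s\n' "$string" ;; # no match: print unchanged
    esac
}
```

Quoting `"$old"` and `"$pattern"` inside the patterns makes the shell treat them literally, so characters such as `*` or `?` in the search string are not interpreted as wildcards. Real regular expressions would need a hand-written matcher on top of this.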

mr-roboto · 08-15-2007, 02:59 PM

Perhaps I'm missing the point of your exercise, but have you considered Python for something like this? While I'm not an ancient Linux head, I've been around computers (professionally) for nearly thirty years and can't believe the "Software Tools" thing is still alive and kicking. Anyway, for many months I've wondered whether a Python-based "BusyBox" wouldn't be more efficient code-wise and time-wise (WRT long-term maintenance). Right out of the box (so to speak), Python is very shell-like. It has a more coherent syntax than the jumble of misc programs that make up the "Software Tools" approach. It also lends itself to object-oriented programming and code reuse. Finally, it's pretty fast, certainly as compared to pure Bash code.

This is all academic for me, as I can't justify the time to contribute to rewriting BusyBox, unless someone were paying me to do so. However, I *am* working (part-time) on using Python for GUI-type "Software Tools". The work is slow, bec it's harder to design reusable little programs, as in sed, tr, etc for GUI apps.

FWIW, that's one person's take on your idea. Good luck....Jet

ta0kira · 08-15-2007, 06:02 PM

Quote:

Originally Posted by mr-roboto

It has a more coherent syntax than the jumble of misc programs that make up the "Software Tools" approach. It also lends itself to object-oriented programming and code reuse. Finally, it's pretty fast, certainly as compared to pure Bash code.

I can think of many reasons to compartmentalize functionality. One is to limit what a setuid program can do. Another is that integration of that sort of thing is a lot of work, so new commands couldn't just be written as C programs. They would have to be either plug-ins or scripts. I don't think either is at all appropriate for fundamental utilities that are needed before / is remounted R/W, especially since Python lives on /usr, and on some systems the only shell in /bin is sh. Additionally, you can't have a setuid script, and working around that would only create security holes.

Grouping things together to the point they become an immense project of unrelated tools subjects all to the same slow development process. Whereas a simple change to a simple tool would be simple if it were independent, those changes become a low priority when pooled together with a major package which may have broad coordination issues that take away the focus.
ta0kira

gnashley · 08-16-2007, 12:56 AM

([ "x" = "y" ]) is a call to the external program '['
This is not true with most shells, which have their own internal version of '['. You can prove this to yourself by typing '[ --help', then typing '/usr/bin/[ --help'. The internal function gets used unless you specify the path to /usr/bin/[.
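Another way to check is to ask the shell itself with `type`. The sketch below wraps that in a small helper; the name `is_builtin` is made up for this example, and it assumes that the shell's `type` output contains the word "builtin" for builtins, which is the case in bash and dash but is not guaranteed for every shell.

```shell
# Hypothetical helper: succeed if the running shell reports NAME as a builtin.
# Assumes 'type' prints something like "[ is a shell builtin".
is_builtin() {
    case $(type "$1" 2>/dev/null) in
        *builtin*) return 0 ;;
        *) return 1 ;;
    esac
}

# Report on a few commands that are commonly assumed to be external.
for cmd in '[' test echo printf; do
    if is_builtin "$cmd"; then
        printf '%s: builtin\n' "$cmd"
    else
        printf '%s: not a builtin in this shell\n' "$cmd"
    fi
done
```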

mr-roboto · 08-16-2007, 09:32 AM

Quote:

Originally Posted by ta0kira

Grouping things together to the point they become an immense project of unrelated tools subjects all to the same slow development process. Whereas a simple change to a simple tool would be simple if it were independent, those changes become a low priority when pooled together with a major package which may have broad coordination issues that take away the focus.
ta0kira

That was a mouthful, but I think you're missing my point completely. Python is itself a shell as well as a powerful programming language. It can actually take the place of BusyBox w/ some work, the same caliber of work as suggested w/ rewriting the "Software Tools" as scripts.

My point is merely an academic one. I've never gotten into shell scripting (exc for the most simplistic scripts), bec I hate the syntax. I'm merely making a case for a 2nd-gen "Software Tools" architecture.

Take TR for example. In my 2nd-gen world, TR could be a function, an object class, a stand-alone script, as well as a stand-alone executable, or any combination of the four. TR could be employed in scripts that have a nice, clean, modern syntax.

BASH and its cousins will never go away, nor would I argue that they should go away. My central issue is that for all of the progress made w/ operating systems over the last thirty-plus years (how long it's been since Kernighan and Plauger wrote their classic "Software Tools"), this part seems frozen in the past, often evangelized by people who weren't even alive when the book was published. The ideas are still valid, maybe it's time to think about a revolutionary change to the underlying platform.

Soap box mode off....Jet

ta0kira · 08-16-2007, 11:18 AM

I agree with the principle to a point, but what you are talking about is essentially a new operating system with few ties to Unix. After all, why go to the trouble of integrating legacy tools into a shell for a revolutionary system? I often scrap a project and rebuild it from its best ideas. It's good for individual projects, but you can't go doing that with an OS people depend on and expect to upgrade.

Here is another question: can you exec a built-in? If you could, would the entire shell process have to be duplicated? Would all of the "Software Tools" reside in the same shared library, in a static binary, or in categorized libraries?

What's nice about "Software Tools" is they don't have to be around forever. If ever there were a superb replacement for sed, sed would eventually go away, etc.

It doesn't matter who was or wasn't alive to appreciate the work in its true societal context or whether or not it advocated progress which wasn't heeded. When systems try to abide by a standard and appear familiar they will follow that standard and what people are used to doing. It seems to me that MS tries to consolidate all of their tools into one, but IMHO it isn't for the better (MS opinions aside.)

Back to consolidation: One program can't hold all command line tools, so there is no way around incorporating external commands to some point. My personal opinion is that the default shell on all Linux systems should have a built-in for all of the user-relevant system calls like chmod, chown, setuid, etc. and essentially get rid of all related binaries. As I implied before, though, that does introduce security risks since that shell then must be a setuid program, giving potential access to scripts or shells without the need. This would also require that all setuid built-ins be owned by root; another security risk. And then, of course, there would be no splitting sudo hairs. A shell like that would become a single point of security failure for a system.
ta0kira

cfaj · 08-17-2007, 11:05 AM

Quote:

Originally Posted by gnashley

([ "x" = "y" ]) is a call to the external program '['
This is not true with most shells, which have their own internal version of '['. You can prove this to yourself by typing '[ --help', then typing '/usr/bin/[ --help'. The internal function gets used unless you specify the path to /usr/bin/[.

In fact, [ (and its synonym, test) has been built into every Bourne-type shell for more than 25 years.

cfaj · 08-17-2007, 11:23 AM

Quote:

Originally Posted by ta0kira

I really can't think of anything useful written purely of built-in statements.

I have many useful scripts that do not call any external commands, even without using dynamically loadable builtins.

Among other things, I use a script to retrieve my e-mail from remote POP3 servers. If it weren't for spam filtering, there would be no external commands used.

Quote:

Even a comparison statement ([ "x" = "y" ]) is a call to the external program [.

That hasn't been true for more than 25 years.