Old 12-14-2011, 05:55 AM   #1
hydraMax
Member
 
Registered: Jul 2010
Location: Skynet
Distribution: Debian + Emacs
Posts: 467
Blog Entries: 60

Rep: Reputation: 51
Aren't text-based interfaces inherently inefficient? (like procfs)


A certain aspect of the Linux system paradigm doesn't quite make sense to me: In the procfs and sys directories, there are many files that provide various statistics on the system, in text format; e.g., /proc/meminfo will tell you all about your system memory, and /proc/cpuinfo will tell you all about your CPUs. And the expectation is that programs are supposed to parse these files to get said statistics. (At least from the articles I have read.) Indeed, I am aware of a number of programs that depend heavily on procfs.

Nevertheless, is not a text-based interface an inherently inefficient way to receive information from the system? For example, let's say I was developing a program that needed to know the amount of free memory: Not only would the program have to go to the trouble of skipping over all the ASCII for "MemFree:" (and a bunch of spaces) but then it would have to read the number in ASCII. Now, if the amount was 319956 (as it happens to be in my system) it would need to process a total of 6 bytes (not including a necessary delimiter) to understand a number that, in binary, could be represented in 3. (The problem being more marked when dealing with larger numbers like 34359385596; though perhaps, depending on the design, we might use a number of bits more convenient for our system architecture.) Furthermore, the application is likely going to need to convert the ASCII number to binary, for the purposes of more efficient calculation or storage. (This is the greatest evil.)
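The byte counts in the paragraph above can be checked with a short sketch. The sample text, the helper names `parse_meminfo_field` and `min_binary_bytes`, and the assumed line shape (`MemFree:   319956 kB`) are illustrative only and are not taken from any particular kernel version.

```python
# Sketch only: parse a MemFree-style line and compare ASCII vs. binary sizes.
# SAMPLE is made-up text in the general shape of /proc/meminfo.
SAMPLE = "MemTotal:        1024000 kB\nMemFree:          319956 kB\n"


def parse_meminfo_field(text, key):
    """Return the integer value for `key`, or None if the key is absent."""
    prefix = key + ":"
    for line in text.splitlines():
        if line.startswith(prefix):
            # Skip the label, then take the first whitespace-separated token.
            return int(line[len(prefix):].split()[0])
    return None


def min_binary_bytes(n):
    """Smallest whole number of bytes that can hold the non-negative int n."""
    return max(1, (n.bit_length() + 7) // 8)


free = parse_meminfo_field(SAMPLE, "MemFree")
# Prints the value, its count of ASCII digits, and its minimal binary size.
print(free, len(str(free)), min_binary_bytes(free))
```

For 319956 this gives 6 ASCII digits against 3 bytes; for 34359385596 it gives 11 digits against 5 bytes, before any rounding up to a convenient word size.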

Now, one could answer that the text-based representation makes development easier for the programmer. But the reply would be that nobody really cares about the programmer, once the compiling is over, because the performance is what everyone has to continue living with. Furthermore, programmers can always provide for themselves development libraries which hide the complexities of dealing with a binary-based interface.

Last edited by hydraMax; 12-14-2011 at 05:58 AM. Reason: typo
 
Old 12-14-2011, 06:30 AM   #2
Cedrik
Senior Member
 
Registered: Jul 2004
Distribution: Slackware
Posts: 2,140

Rep: Reputation: 242
The beauty of using ASCII characters to present system information is that you can use simple text tools like cat, grep, cut, and awk to display it in any format you want. Why would you need hyperspeed performance to read system information anyway?
A binary format means a specific program to read it, so more programs to install.
I don't agree that the application is necessarily going to convert the ASCII information to binary; for display purposes, in most cases it is ASCII in > ASCII out.
 
2 members found this post helpful.
Old 12-14-2011, 07:22 AM   #3
H_TeXMeX_H
Guru
 
Registered: Oct 2005
Location: $RANDOM
Distribution: slackware64
Posts: 12,928
Blog Entries: 2

Rep: Reputation: 1269
I can see that you are using Gentoo...

Well, all I can say is that:
1) It is more important that the info be available in a readable format than for it to be available in a format that is most efficient for programmers.
2) Using procfs is not the only way to get this info, at least some of it.
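As one illustration of a route that does not involve parsing procfs text in the application, the sketch below asks the C library for the number of available physical pages through `sysconf`. It assumes a Linux system where Python's `os.sysconf` exposes the names `SC_AVPHYS_PAGES` and `SC_PAGE_SIZE`; on platforms without them the helper (a name invented for this example) returns None. How the C library obtains the figure underneath is an implementation detail this sketch does not cover.

```python
import os


def available_memory_bytes():
    """Return available physical memory in bytes via sysconf, or None."""
    names = getattr(os, "sysconf_names", {})
    if "SC_AVPHYS_PAGES" not in names or "SC_PAGE_SIZE" not in names:
        return None  # this platform does not offer these sysconf names
    try:
        pages = os.sysconf("SC_AVPHYS_PAGES")
        page_size = os.sysconf("SC_PAGE_SIZE")
    except (OSError, ValueError):
        return None
    if pages < 0 or page_size < 0:
        return None  # sysconf reported the value as indeterminate
    # Both values arrive as integers; the caller parses no text.
    return pages * page_size


print(available_memory_bytes())
```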
 
Old 12-14-2011, 07:31 AM   #4
syg00
LQ Veteran
 
Registered: Aug 2003
Location: Australia
Distribution: Lots ...
Posts: 12,428

Rep: Reputation: 1055
Huh ???
You think it is "more efficient" to have to write code to reformat data so I can work out how much memory I have, or process IDs, or ...?

Feel free to write a kernel module to acquire the data in any form you want.
BTW, do you happen to use X, and still have the audacity to raise a query about efficiency?

Balderdash.
 
Old 12-14-2011, 02:35 PM   #5
dugan
Senior Member
 
Registered: Nov 2003
Location: Canada
Distribution: distro hopper
Posts: 4,872

Rep: Reputation: 1523
Storing them in text format makes them human-readable as well as machine-readable. That's an advantage.

Last edited by dugan; 12-14-2011 at 02:39 PM.
 
Old 12-15-2011, 09:56 AM   #6
Sed_Awk
Member
 
Registered: Dec 2011
Location: USA
Distribution: Crux 2.7.1
Posts: 41

Rep: Reputation: 0
Another reason for text interfaces is that some people don't install X servers/clients on their servers. They log in via ssh and use text tools to get system info.
 
Old 12-15-2011, 10:30 AM   #7
Cedrik
Senior Member
 
Registered: Jul 2004
Distribution: Slackware
Posts: 2,140

Rep: Reputation: 242
Also, performance-wise, reading characters directly takes fewer processor cycles than reading binary numbers and converting them to characters before sending them to stdout. I mean, if every byte counts, let's take processor cycles into account.
 
Old 12-15-2011, 10:39 AM   #8
Sed_Awk
Member
 
Registered: Dec 2011
Location: USA
Distribution: Crux 2.7.1
Posts: 41

Rep: Reputation: 0
Quote:
Originally Posted by Cedrik View Post
I mean, if every byte counts, let's take processor cycles into account.
Yeah, why waste precious computer cycles drawing GUI objects

Last edited by Sed_Awk; 12-15-2011 at 10:47 AM.
 
Old 12-15-2011, 01:07 PM   #9
DavidMcCann
Senior Member
 
Registered: Jul 2006
Location: London
Distribution: CentOS, Salix
Posts: 3,182

Rep: Reputation: 801
This sums up what happens if you rely on unreadable code:

http://www.linuxquestions.org/questi...ml#post4550801

And think of all the problems people have with the Windows registry.
 
1 members found this post helpful.
Old 12-16-2011, 01:32 AM   #10
hydraMax
Member
 
Registered: Jul 2010
Location: Skynet
Distribution: Debian + Emacs
Posts: 467
Blog Entries: 60

Original Poster
Rep: Reputation: 51
Many, I dare say, have missed the point of my original post. I never wrote that it was bad to be able to get a text-based representation of system information. What I wrote was that it is inherently inefficient for programs to get system information via a text-based representation of it. Yet, this is the general expectation and ideal held out for us. I could not pull the references off of the top of my head, but I have read several articles in which programmers were encouraged to get system information from the proc and sys files. It should also be noted that many proc and sys files, despite being text-based, are in fact not at all formatted for easy human reading. For an example, run cat /proc/1/stat. One article I read encouraged users not to read proc files directly, but rather to always view the information through an intermediate program (ps for example).
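To show why a line from /proc/<pid>/stat is awkward to read by eye yet still parseable by a program, here is a minimal sketch. The sample line is invented, and the field order used (pid, command name in parentheses, state, parent pid) follows the layout documented in proc(5); the helper name `parse_stat_line` is hypothetical.

```python
# Invented sample in the shape of /proc/<pid>/stat; the command name
# deliberately contains spaces and parentheses to show the pitfall.
SAMPLE_STAT = "4242 (my (odd) prog) S 1 4242 4242 0 -1 4194304 100 0 0 0 7 3"


def parse_stat_line(line):
    """Split a stat line into pid, comm, state, ppid and the remaining fields."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so locate the first '(' and the LAST ')' instead of splitting on spaces.
    open_idx = line.index("(")
    close_idx = line.rindex(")")
    pid = int(line[:open_idx])
    comm = line[open_idx + 1:close_idx]
    rest = line[close_idx + 1:].split()
    return {"pid": pid, "comm": comm, "state": rest[0],
            "ppid": int(rest[1]), "rest": rest[2:]}


info = parse_stat_line(SAMPLE_STAT)
print(info["pid"], repr(info["comm"]), info["state"], info["ppid"])
```

Every field after the command name is an unlabeled number, which is why tools such as ps are usually recommended for human viewing.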

Rebuking me for the "inefficiency" of using X11 is ridiculous. Cycles spent drawing GUI objects are not wasted because they provide me with a direct service. My point was that having our programs interact with the system through an ASCII interface wastes cycles to no benefit; except perhaps for the programmer; though as I mentioned the programmer's comfort is no longer a factor once the program has been compiled.

Drawing a parallel to text configuration files is also ridiculous. Text configuration files exist solely to provide a human interface to the configuration of a program or system. Furthermore, configuration files need (generally speaking) to be read only once by the program upon execution, or, as in the example cases of certain postfix configuration files, translated into a binary database or hash format before being first utilized.

Cedrik's point seems to be more relevant to the discussion, but I believe it can be answered: First of all, we are not taking into account the fact that, in order to provide the text interface, the system itself must first translate the system data from binary format to ASCII format. Unless, of course, the system is storing and processing all those numbers, UIDs, and so forth, in ASCII format, which I truly hope is not the case! And to answer the principal objection, we should recognize that, although there are some programs that simply take the system data and output it raw to stdout, this cannot be assumed to be the case. (One could give many counter-examples.) The data should be provided in its raw form; it should be up to the application layer to decide how it wishes the data to be displayed (if at all) and to format it for its own purposes.

And finally, I will respond to the question of "why would you need hyperspeed performance to read system infos?" by stating that hyper-speed performance is an inherent good, and that, being as system information is quite often read, this would seem like a very sensible place for optimization.

Last edited by hydraMax; 12-16-2011 at 01:32 AM. Reason: typo
 
Old 12-16-2011, 05:26 AM   #11
Cedrik
Senior Member
 
Registered: Jul 2004
Distribution: Slackware
Posts: 2,140

Rep: Reputation: 242
The "system" (the kernel) converts binary into ASCII characters once to produce the /proc information in most cases, while the text utilities that access it can be used more often.

Standard output uses ASCII characters, and the information is usually displayed on stdout, so in order to display it you need ASCII characters... Efficiency-wise, you cannot do better than reading ASCII characters and writing ASCII characters.

You can always get this information in raw form if you want; e.g., the syscall for getting the uid returns a binary number:
Code:
mov     eax, 24     ; i386 syscall number 24 = getuid
int     80h         ; trap into the kernel
; the uid comes back as a binary number in the 32-bit register eax
But to display this binary number, you have to write a function that converts a 32-bit binary number into a few 8-bit characters = more CPU cycles = slower than reading 8-bit characters and displaying 8-bit characters.
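The conversion described here, turning a binary integer into 8-bit characters, can be sketched as repeated division by ten. This illustrates the general technique only, not how any particular C library or kernel implements it; the function name `uint_to_ascii` is made up for the example.

```python
def uint_to_ascii(n):
    """Convert a non-negative integer to its decimal ASCII bytes."""
    if n < 0:
        raise ValueError("expected a non-negative integer")
    if n == 0:
        return b"0"
    digits = bytearray()
    while n > 0:
        n, remainder = divmod(n, 10)   # one division per output digit
        digits.append(ord("0") + remainder)
    digits.reverse()                   # digits were produced least-significant first
    return bytes(digits)


print(uint_to_ascii(1000))
```

The loop runs once per decimal digit, so a 32-bit value needs at most ten divisions.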

[edit]
Sorry, the uid example is actually irrelevant; there is no uid information in /proc (OK, yes, there is if you cat /proc/self/status...).
But the theory stands, and the more information you have to read, the more binary-to-character conversions you have to make if it is provided in binary...
To sum up, accessing the information in binary may speed up reading, but it slows down displaying, makes the CPU work more, needs a specific program to access it, etc.

Last edited by Cedrik; 12-16-2011 at 05:46 AM.
 
Old 12-16-2011, 05:34 AM   #12
jlinkels
Senior Member
 
Registered: Oct 2003
Location: Bonaire
Distribution: Debian Wheezy/Jessie/Sid, Linux Mint DE
Posts: 4,187

Rep: Reputation: 513
Yes, it is inefficient. But the inefficiency is by far outweighed by the advantage that this is readable text, accessible by anything.

Text-based configuration and interfaces are part of the Unix philosophy, later adopted by Linux to ensure openness.

Processing the /proc information might be 10 times as inefficient as binary, but since you use this information so seldom (maybe a few times per second) the cost is 10 times nothing.
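The "10 times nothing" estimate can be checked by measurement. The sketch below times parsing one value out of a small in-memory sample shaped like /proc/meminfo; it leaves out the read system call itself, and the sample text and function names are assumptions made for the illustration. No timing figures are quoted because they depend on the machine.

```python
import timeit

# Invented sample text; a real /proc/meminfo has many more lines.
SAMPLE = "MemTotal:  1024000 kB\nMemFree:   319956 kB\nBuffers:   20480 kB\n"


def parse_memfree(text):
    """Return the MemFree value from meminfo-style text as an int."""
    for line in text.splitlines():
        if line.startswith("MemFree:"):
            return int(line.split()[1])
    raise KeyError("MemFree")


def seconds_per_parse(runs=10000):
    """Average wall-clock seconds for one parse over `runs` repetitions."""
    total = timeit.timeit(lambda: parse_memfree(SAMPLE), number=runs)
    return total / runs


print(f"{seconds_per_parse():.9f} s per parse")
```

Multiplying the per-parse time by the number of reads per second gives the fraction of one CPU spent on text parsing.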

jlinkels
 
Old 12-16-2011, 08:15 AM   #13
sundialsvcs
Guru
 
Registered: Feb 2004
Location: SE Tennessee, USA
Distribution: Gentoo, LFS
Posts: 5,423

Rep: Reputation: 1158
The "inefficiency" is irrelevant. Computers routinely execute hundreds of millions of instructions per second now, sometimes "per CPU," and what really matters most is that such information is exceptionally easy to get to by any sort of procedure that you care to use. Including the cat command. (Meow.)
 
Old 12-16-2011, 12:49 PM   #14
dugan
Senior Member
 
Registered: Nov 2003
Location: Canada
Distribution: distro hopper
Posts: 4,872

Rep: Reputation: 1523
Quote:
Originally Posted by jlinkels View Post
Processing the /proc information might be 10 times as inefficient as binary, but since you use this information so seldom (maybe a few times per second) the cost is 10 times nothing.
That is correct. There would be absolutely no benefit to this optimization. At all. Whereas the cost, making this information inaccessible to Unix's standard toolchain, would be massive.

Quote:
being as system information is quite often read
Is it? It looks to me as if the exact opposite is true.

Now, have you ever seen a source code profiler detect a delay in the part of a computer program that reads this information? No? That's what I thought. That means that if you think that this very costly optimization would bring you "hyper-speed performance", or any performance increase at all, you're lying to yourself.

Last edited by dugan; 12-16-2011 at 05:59 PM.
 
Old 12-16-2011, 05:15 PM   #15
hydraMax
Member
 
Registered: Jul 2010
Location: Skynet
Distribution: Debian + Emacs
Posts: 467
Blog Entries: 60

Original Poster
Rep: Reputation: 51
Quote:
Originally Posted by Cedrik View Post
The "system" (the kernel) converts binary into ASCII characters once to produce the /proc information in most cases, while the text utilities that access it can be used more often.
So, the kernel actually stores all this data in ASCII format in volatile memory, and then simply copies it out on request? Or does it make a binary to ASCII conversion whenever a call is made to read data from a proc file? You seem to be affirming the former, but as it is an important technical point I was hoping for clarification.
 
  


All times are GMT -5.