Old 08-20-2012, 01:10 AM   #1
Registered: Jun 2012
Location: ghaziabad , delhi , india
Posts: 105

Rep: Reputation: Disabled
How does the command interpreter understand the meaning of a command?

I use many different commands in the shell, but I don't know how the command interpreter works. For example, when I type cat > f_1, how is this command interpreted by the CLI?

I mean, in what form does it understand the command: are commands converted to binary, or are they understood directly?
Old 08-20-2012, 01:31 AM   #2
Registered: Nov 2004
Location: Russia (St.Petersburg)
Distribution: Debian
Posts: 666

Rep: Reputation: 68
probably you'd like to read this:
Old 08-20-2012, 01:46 AM   #3
Wim Sturkenboom
Senior Member
Registered: Jan 2005
Location: Roodepoort, South Africa
Distribution: Ubuntu 12.04, Antix19.3
Posts: 3,794

Rep: Reputation: 282
The shell will first parse the string that you entered. In your example it will say
  1. "hey, the user wants to run the command cat"
  2. "next, do I know '>'? Yes, the user wants to redirect the output to a file"
  3. "wait, what does redirection require? Oh, a file, so f_1 must be the file where the user wants to store the output"
So at the end it will have a command, a redirect instruction and the file to write the output to. It will also keep track of the arguments that you specify; arguments are basically pieces of text on the command line that the interpreter itself does not interpret (like -A, a filename etc.).
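To illustrate that the redirect instruction is consumed by the shell and never reaches the program as an argument, here is a minimal sketch. The helper name count_args, the file name notes.txt and the scratch file (standing in for f_1) are made up for the example.

```shell
# count_args is a hypothetical helper: it prints how many arguments it received.
count_args() {
    echo "$#"
}

tmp=$(mktemp)            # scratch file standing in for f_1

# The shell strips "> $tmp" before the command runs, so only two
# arguments (-A and notes.txt) reach count_args.
count_args -A notes.txt > "$tmp"
seen=$(cat "$tmp")

# A redirection may appear anywhere in a simple command; this is equivalent.
> "$tmp" count_args -A notes.txt
seen_again=$(cat "$tmp")

rm -f "$tmp"
echo "arguments seen: $seen and $seen_again"
```

Both calls report the same count, because in both cases the '>' and the file name were handled by the shell before the command started.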

Next the shell checks whether it knows the requested command internally.
  1. If it does, it executes the internal function and passes it the arguments (-A, filename etc.).
  2. If it does not, it searches the directories in the PATH variable for a file (program) called 'cat'. If it finds one, it runs that file with the arguments that you passed (-A, filename etc.).
  3. If it can't find the file, the shell informs you with an error message.
The executed program processes the arguments (-A, filename etc.) and does whatever it is supposed to do, writing its output to its standard output (stdout). If no redirection was requested, stdout is the screen; otherwise (as in your case) the shell has connected stdout to the file before starting the program, so the output ends up there.
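The lookup order above can be observed from the shell itself. This is only a sketch using the standard "command -v" utility; the name no_such_command_xyz is invented and assumed not to exist on your system.

```shell
# "command -v" reports how the shell would resolve a name.
# For a builtin it prints just the name; for a program found
# through PATH it prints the full path of the file.
cd_kind=$(command -v cd)
cat_path=$(command -v cat)
echo "cd  resolves to: $cd_kind"
echo "cat resolves to: $cat_path"

# A name that is neither internal nor on PATH produces an error message
# and exit status 127. The name below is made up for the example.
status=0
no_such_command_xyz 2>/dev/null || status=$?
echo "exit status for an unknown command: $status"
```

The exact path printed for cat depends on the distribution, so treat it as illustrative.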

Last edited by Wim Sturkenboom; 08-20-2012 at 01:47 AM.
Old 08-20-2012, 10:08 PM   #4
LQ Guru
Registered: Aug 2004
Location: Sydney
Distribution: Rocky 9.2
Posts: 18,329

Rep: Reputation: 2745
Some commands are built into the shell, some are standalone executables, e.g.

 file `which /bin/cat`
/bin/cat: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.18, stripped
If you want to know how it really works, you'll have to read the source code.
Old 08-21-2012, 10:16 AM   #5
LQ 5k Club
Registered: Aug 2005
Distribution: OpenSuse, Fedora, Redhat, Debian
Posts: 5,399
Blog Entries: 2

Rep: Reputation: 908
It sounds like you are interested in the underlying mechanism for parsing the commandline string. One method that may be used in modern shells is based on tools such as lex and yacc (or in GNU/Linux, flex and bison). These are tools that are used to create parsers. Using them, parsers can be created that do their work in two parts.
One part is called a lexical analyzer. The lexical analyzer simply breaks up the input stream (the commandline, in the case of a shell) into tokens. The tokens have 'types' and values, but do not have semantic properties. For instance, a token may be a 'word' consisting of those characters that would be valid as a program name. Or a token may be numeric, either integer format, some real number format, perhaps even imaginary. A token may be an 'operator', such as arithmetic operators, assignments, etc. Tokens are recognized by the lexical analyzer without any contextual relationships. In other words, an operator is just an operator, regardless of its position in the input stream. The same goes for any other type of token. The lexical analyzer only identifies the type and value of the tokens.
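As a rough illustration of what a lexical analyzer does, here is a hand-written sketch in plain shell. It is not how any real shell is implemented: the function name tokenize and the two token types (WORD and OP) are invented for the example, and quoting and escapes are left out entirely.

```shell
# tokenize is a hypothetical helper: it prints one "TYPE value" line per token.
# Only two token types are recognised here: WORD and OP (one of > < |).
# Quoting and escapes are deliberately left out of this sketch.
tokenize() {
    rest=$1
    word=
    while [ -n "$rest" ]; do
        ch=${rest%"${rest#?}"}      # first character of what is left
        rest=${rest#?}              # drop that character
        case $ch in
            ' ')
                if [ -n "$word" ]; then printf 'WORD %s\n' "$word"; fi
                word=
                ;;
            '>'|'<'|'|')
                if [ -n "$word" ]; then printf 'WORD %s\n' "$word"; fi
                word=
                printf 'OP %s\n' "$ch"
                ;;
            *)
                word=$word$ch
                ;;
        esac
    done
    if [ -n "$word" ]; then printf 'WORD %s\n' "$word"; fi
}

tokenize 'cat > f_1'
```

Note that the '>' is tagged as OP wherever it appears; deciding what it means is left to the grammar.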
The other part of a parser is the grammar. The grammar is a rigorous specification that defines how sequences of tokens can be interpreted. The grammar specifies one or more general expressions, made up of sequences of tokens of specified types. Also associated with an expression is a potential action to be performed when such an expression is recognized in the input stream. The grammar may use the values of some of the tokens in order to perform the action.
In a shell commandline interpreter, an example of an expression might be a simple command to be executed. The grammar would define this as simply a token in isolation, containing only those characters that can be used to compose a filename. The shell, having recognized such an expression, would take the value of the token (a string containing the token itself), and use it as an argument to an exec() function. Another example of an expression in the grammar of a shell commandline interpreter would be a comment. The grammar would describe this as something like "zero or more whitespace characters, followed by the '#' character, followed by zero or more of any characters". The action to perform upon recognizing such an expression would be to do nothing at all.
Yet another example of an expression would be an assignment: a token of type 'variable', followed by the '=' character, followed by some other expression. The action would be to evaluate the expression (remembering the value), create an instance of the variable (the grammar doesn't specify how this is done), and associate the instance of the variable with the value of the expression already evaluated.
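The idea of a grammar with attached actions can be sketched in a few lines of shell. Everything here is an assumption made for illustration: the function name parse_simple_command, the "TYPE value" token format, and the tiny grammar itself, which covers only a command name, its arguments and a '>' redirection. A parser generated by bison would look quite different.

```shell
# parse_simple_command is a hypothetical helper. It reads "TYPE value" token
# lines on standard input and applies a tiny, made-up grammar:
#     simple_command := WORD { WORD | OP(>) WORD }
# The "action" for each recognised piece is just to print what it means.
parse_simple_command() {
    state=command                 # what the grammar expects next
    while read -r type value; do
        case $state:$type:$value in
            command:WORD:*)  printf 'command: %s\n' "$value"; state=any ;;
            any:WORD:*)      printf 'argument: %s\n' "$value" ;;
            any:OP:'>')      state=target ;;
            target:WORD:*)   printf 'stdout to file: %s\n' "$value"; state=any ;;
            *)               printf 'syntax error near %s\n' "$value"; return 1 ;;
        esac
    done
    if [ "$state" != any ]; then
        echo "syntax error: unexpected end of line"
        return 1
    fi
}

printf 'WORD cat\nWORD -A\nOP >\nWORD f_1\n' | parse_simple_command
```

The same WORD token becomes a command name, an argument or a redirection target purely because of where it sits in the sequence, which is the part the lexical analyzer knows nothing about.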
The grammar used by a shell commandline interpreter would of course be fairly complex. You can use the tools flex and bison to create parsers for applications of your own, and for some of us, it is a subject of some fascination. There are also other types of parsers that use other techniques, and other tools that generate other types of parsers. I have no knowledge of whether modern and common shells do use flex & bison to generate their parsers, but I strongly suspect that some formal parser generators are used in at least some of them.
--- rod.
Old 08-21-2012, 05:25 PM   #6
David the H.
Bash Guru
Registered: Jun 2004
Location: Osaka, Japan
Distribution: Arch + Xfce
Posts: 6,852

Rep: Reputation: 2037
The exact parsing order is still a bit unclear to me in a few details, particularly the first few steps, but I'm pretty sure it goes something like this:

1) The line is first broken up into words/tokens, separated by whitespace. Quotation marks and escape backslashes will be processed to determine the exact tokens, and removed from the line.

2) The line is parsed to look for command list separators and multi-line constructs. If necessary, additional lines will be grabbed, and/or it will be broken up into multiple commands for separate processing.

3) The first word of each recognized simple command is inspected, and checked for matching aliases. If one is found, it substitutes it, and the resulting command line is re-processed (minus recursive alias checks). Somewhere in here any environment parameters that are being passed to the command will also be processed and saved for the execution step.

4) The command is scanned for redirection patterns. The necessary file descriptors will be set up, and those tokens removed from the line.

5) Brace and tilde expansions are performed on any tokens that were not initially protected by quotes. This may result in new tokens being created.

6) Each token is processed in left-to-right order, with variable substitutions, command substitutions, process substitutions, and arithmetic expansions completed and substituted, unless they were protected by single quotes/escapes.

7) Word splitting is also performed on any expanded tokens that were not initially protected by double-quotes. Unlike the original tokenizing in the first step, this word splitting is done based on the setting of the IFS variable, which may not be whitespace.

8) Pathname (globbing) expansions are done, again on unquoted patterns, which also may result in new tokens being created. Note that since it's done after the word-splitting step, spaces in the filenames do not result in new tokens being created.

9) The first token becomes the command name (function/executable/keyword/whatever), and its location determined (i.e. the PATH is checked), so that the final command can be assembled and error-checked accordingly.

10) The final command is assembled and passed to the system for execution.
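Steps 7 and 8 above are easy to check for yourself. The following is a small sketch, not a description of the shell's internals; the helper name count, the variable names and the file names are invented, and the files are created in a scratch directory so nothing else is touched.

```shell
# Scratch directory so the glob below only sees files made here.
dir=$(mktemp -d)
: > "$dir/one file.txt"          # a name containing a space
: > "$dir/two.txt"

# count is a hypothetical helper that reports how many words it was given.
count() { echo "$#"; }

list='a:b c d'

# Step 7: an unquoted expansion is split on IFS, not on plain whitespace.
old_ifs=$IFS
IFS=:
split_on_colon=$(count $list)    # "a" and "b c d"
IFS=$old_ifs
split_default=$(count $list)     # "a:b", "c" and "d"
quoted=$(count "$list")          # double quotes suppress splitting

# Step 8: globbing happens after word splitting, so "one file.txt"
# stays a single word even though it contains a space.
globbed=$(count "$dir"/*.txt)

rm -rf "$dir"
echo "$split_on_colon $split_default $quoted $globbed"
```

This prints 2 3 1 2: two words when splitting on ':', three with the default IFS, one when quoted, and two file names from the glob.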

