LinuxQuestions.org
Old 08-25-2010, 08:47 PM   #1
jason_m
Member
 
Registered: Jun 2009
Posts: 33

Rep: Reputation: 12
Passing command line arguments


I'm looking for some confirmation / clarification on passing arguments to a program.

The code I'm going to post below is assembly, but I think my analysis holds generally for any language. I'm running Linux (Ubuntu 10.04), my shell is bash, and my processor is a 64-bit Intel Core i3.

What I set out to determine is whether I can pass integer (as opposed to "C String") arguments to my program. Generally, command line args are treated as null-terminated strings. The prototype for main in C/C++ illustrates this:
Code:
int main(int argc, char* argv[])
At the end of the day though, whether you consider the value in a word, doubleword, or quadword to be character data or something else, like integer data, is only a matter of interpretation. The four bytes 0x48 0x69 0x21 0x21 are both a valid character array of length 4 ("Hi!!") and a 32-bit integer (0x48692121 if the first byte is taken as the most significant).

To test whether I could do this or not, I wrote the following little program:
Code:
	.text
	.global _start

_start:
	pop	%rcx  #argc
	pop	%rcx  #argv[0]: address of the first argument string
	mov	(%rcx), %rdi #load the first 8 bytes of argv[0] into rdi
	
	pop	%rcx  #argv[1]: address of the second argument string
	mov	(%rcx), %rdi #load the first 8 bytes of argv[1] into rdi

	pop	%rcx  #argv[2]: address of the third argument string
	mov	(%rcx), %rdi #load the first 8 bytes of argv[2] into rdi
...
My understanding is that the stack pointer will be pointing at the argument count when the program begins execution. Directly "on top" of that (8 bytes toward higher memory addresses) will be the address of the first command line/passed-in parameter. "On top" of that will be the address of the second command line parameter, etc., until all of the arguments are accounted for.

So I fired this program up in gdb and gave it the following input
Code:
(gdb) run $'\x00\x00\x00\x42' $'\x01\x02\x03\x00\x04'
I will explain the results I am seeing as well as the arguments that I chose. The first argument turns out to be the full path to the executable file. This is implicitly passed to the program and I was expecting this to be the case.

The next argument demonstrates a difficulty I am having passing in non-character data. It appears that the environment is seeing a null and not looking at the rest of the argument. That is, I don't think the 0x42 ever makes it onto the stack because it is preceded by 0x00, which is interpreted as a null terminator, ending the "string" argument.

Similarly, the last argument is a test to see if the 0x00 prevents the 0x04 from ever making it onto the stack.

So here's what gdb tells me:
Code:
Breakpoint 1, _start () at better_args.s:9
(gdb) n
_start () at better_args.s:10
(gdb) n
(gdb) p (char*)$rcx
$18 = 0x7fffffffeb2e "/home/jason/Development/asm/better_args"
(gdb) p
$19 = 0x7fffffffeb2e "/home/jason/Development/asm/better_args"
(gdb) n
_start () at better_args.s:13
(gdb) p/x {long}$rcx
$20 = 0x42524f0003020100
(gdb) p/x {long}($rcx+2)
$21 = 0x544942524f000302
(gdb) n
(gdb) n
_start () at better_args.s:16
(gdb) p/x {long}$rcx
$22 = 0x4942524f00030201
(gdb)
So, let's look one at a time:
Code:
(gdb) p (char*)$rcx
$18 = 0x7fffffffeb2e "/home/jason/Development/asm/better_args"
Here we can confirm that argv[0] is the full path and filename.

Code:
(gdb) p/x {long}$rcx
$20 = 0x42524f0003020100
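This should be pointing to the second "string" argument. Note that the rightmost byte, which is the one at the lowest address, is 0x00: the argument arrived as an empty string, and the 01 02 03 00 after it is already the next argument. Curiously enough, the leftmost byte is 0x42. However, my testing tells me this is purely by chance - I normally do not see this. For further evidence that this is not my 0x42, let's see if there is a null terminator after this "character".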
Code:
(gdb) p/x {long}($rcx+2)
$21 = 0x544942524f000302
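Nope, 0x49 comes next. Read as ASCII, the bytes 0x4f 0x52 0x42 0x49 0x54 spell "ORBIT", so this looks like the start of whatever string sits after my arguments on the stack (possibly an environment variable) rather than anything I passed. I feel comfortable saying that *my* 0x42 never made it onto the stack. My question is: is this due to the way that bash parses the command line arguments? Could something else deliver my argument to the program? Perhaps if I set up argv[] myself in C and ran the program using execve()? Testing that is on my TODO list.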

Finally, if you didn't notice it in the previous output, my last argument is there, and the 0x04 didn't make it.
Code:
(gdb) p/x {long}$rcx
$22 = 0x4942524f00030201
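One takeaway for me is that the string looks "backwards" when printed as a 64-bit integer. As far as I can tell that is just byte order: x86-64 is little-endian, so the byte at the lowest address is the least significant one and gets printed rightmost. The bytes themselves sit in memory in the order I passed them. But that's no big deal.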

So that was a little long-winded, but I wanted to put the facts out there as well as everything I have looked into and my understanding of the results I'm observing. Maybe this is just restating the obvious, but I didn't know what the results were going to be, so it wasn't obvious to me.

Where this is going: eventually another program is going to load up a little compiled binary that a parser of mine spits out. The second program basically just applies a (parsed and compiled) user-input algorithm to some data, and then returns the result back to the first program when finished. I'm thinking of using write() with the write end of a pipe to pass the result back.

I'd like to get to the point where I could write this little framework test: First program obtains a pipe, fork()s a new process, starts up the second program, passing it the write file descriptor for the pipe, and two integer arguments to add together. The second program adds the arguments together and writes the result to the pipe. Originally I started by passing string arguments and converting them to integers, which isn't all that difficult. But if possible and safe, I'd love to skip the conversion and pass the integer values directly. That is, I could pass the integer value 1, and after doing a
Code:
pop %rcx
mov (%rcx), %rsi
%rsi would contain the value 1. It is looking like I cannot do this directly. Leading bytes that are all 0, or any intermediate 0 byte, will ruin the value I'm trying to pass. Is sticking with passing character data, and then parsing the integer value out of it, my best course of action? What if instead of integers, it was double-precision floating point values that were being passed in? Is it still better to pass in a character representation and parse the double value out of it? I have to believe there is something better.

Any thoughts or comments on the testing I have done so far, my observations, and what I am trying to accomplish are greatly appreciated.

Thanks,
 
Old 08-25-2010, 09:58 PM   #2
Sergei Steshenko
Senior Member
 
Registered: May 2005
Posts: 4,481

Rep: Reputation: 451
Quote:
Originally Posted by jason_m
I'm looking for some confirmation / clarification passing arguments to a program.

The code I'm going to post below is assembly, but I think my analysis holds generally for any language. I'm running linux (ubuntu 10.04), my shell is bash ...
'bash' is not black magic; it is a program written in C that uses the C standard library to start processes.

Because of this, one's options are limited to what the standard C library offers for launching processes: 'man 3 exec'.
 
Old 08-26-2010, 07:59 AM   #3
jason_m
Member
 
Registered: Jun 2009
Posts: 33

Original Poster
Rep: Reputation: 12
So you're suggesting it is bash breaking up the arguments (and not some "higher power")? I can understand that, I just wasn't sure ahead of time if that was going to be the case or not. I figured bash still has to parse the entire string I give it, in this case to the final single quote, so maybe it would just toss those bytes in a buffer, slap a 0x00 at the ends and send the arguments off to the program.

Are you suggesting that I could accomplish this by setting up char* argv[] myself? If I have time, I'm going to try and put a program together tonight to test that out.

Maybe a more general question then is: what is the best way to pass binary data to a program? This program is going to need to know two things: (1) the file descriptor to write() back its result, and (2) where to find its input(s). Passing integers to the program was just a way for me to learn more about passing binary data. I think at the end of the day, I'll pass the address of the start of a table in memory with all of the formula inputs. All of the inputs should be at known, fixed offsets in the table once the algorithm is parsed/compiled. Should I just parse strings with these values? Or should I continue to explore passing binary data by setting up argv[] myself? Or should I be thinking about something else entirely?
 
Old 08-26-2010, 08:05 AM   #4
Sergei Steshenko
Senior Member
 
Registered: May 2005
Posts: 4,481

Rep: Reputation: 451
You are limited to what the exec* functions do. Your freedom is limited by the following quote from the man page:

Code:
The const char *arg and subsequent ellipses in the execl(), execlp(), and execle() functions can be thought of as arg0, arg1, ..., argn.
Together they describe a list of one or more pointers to null-terminated strings that represent the argument list available to the executed program.
The first argument, by convention, should point to the filename associated with the file being executed.
The list of arguments must be terminated by a NULL pointer, and, since these are variadic functions, this pointer must be cast (char *) NULL.
 
Old 08-26-2010, 10:36 PM   #5
jason_m
Member
 
Registered: Jun 2009
Posts: 33

Original Poster
Rep: Reputation: 12
Below is an example using a call to execve() that accomplishes what I wanted to test.

write_test2.c:
Code:
#include <stdlib.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

int main(int argc, char* argv[]) {
  pid_t identity;
  int* aptr;
  int* bptr;
  int pfd[2];

  aptr = malloc(sizeof(int));
  bptr = malloc(sizeof(int));

  *aptr = 40;
  *bptr = 2;

  pipe(pfd);
  identity = fork();

  if (identity == 0) {
    // Child process
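    // Setup argv[]: each entry points at the raw bytes of an int. execve()
    // copies every argument only up to its first zero byte.
    char* arrrgs[4]; // Pirates?
    arrrgs[0] = (char*)(&pfd[1]);
    arrrgs[1] = (char*)aptr;
    arrrgs[2] = (char*)bptr;
    arrrgs[3] = (char*)0;
    char* envp[] = { (char*)0 }; // empty environment

    execve("write_test2_child", arrrgs, envp);
    perror("execve"); // only reached if execve() fails
    _exit(1);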
  } else {
    int buf;
    read(pfd[0], &buf, sizeof(int));
    printf("Parent received value: %x, %d\n", buf, buf);
  }

  wait(NULL);  // Don't exit until the child is done

  printf("All done!\n");
  
  return 0;
}
write_test2_child.c
Code:
#include <stdlib.h>
#include <stdio.h>
#include <unistd.h>
#include <string.h>

int main(int argc, char* argv[]) {
  pid_t id;
  int c;

  id = getpid();

  printf("Yo, attach a debugger to: %d\n", (int)id);
  c = getc(stdin);

  printf("argc: %d\n", argc);

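  // argv[i] is a char*, so each dereference reads only the first byte of
  // the argument; the values here (4, 40, 2) each fit in that one byte.
  int fd = *argv[0];
  int a = *argv[1];
  int b = *argv[2];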
  printf("fd: %x, %d\n", fd, fd);
  printf("a: %x, %d\n", a, a);
  printf("b: %x, %d\n", b, b);

  int rslt = a + b;
  write(fd, &rslt, sizeof(int));

  return 0;
}
Running:
Code:
jason@c0mpy:~/Development/asm$ ./write_test2
Yo, attach a debugger to: 2481
a
argc: 3
fd: 4, 4
a: 28, 40
b: 2, 2
Parent received value: 2a, 42
All done!
jason@c0mpy:~/Development/asm$
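Note that when manually setting up argv[], argc is still computed correctly (the kernel counts the pointers up to the terminating null pointer), but nothing enforces that argv[0] is the filename.

It is nice to know I can pass some binary data if necessary. The limits are still there, though: each argument is copied only up to its first 0x00 byte, and the child reads a single char from each one, so this test works because 4, 40, and 2 each fit in one non-zero byte.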
 
  


All times are GMT -5. The time now is 01:00 AM.
