LinuxQuestions.org

mlaich 01-04-2006 12:25 PM

default array size for variable in gcc compiler
 
hi,
i am running a program where i have to read the text in a flat file into runtime variable. i did something like:
Code:

  char chr[], FileText[];
  int FileHandlr;

  FileHandlr = open("file_name", O_RDONLY);
  ... ... ...
  while (read(FileHandlr, &chr, 1024)==1)
    strcat(FileText, &chr, 1024);
  ... ... ...

i am on gcc v3.2.2 and v3.4.5. this part of the code gave me errors saying that the array size is not defined.
my problem is that there may be (at least theoretically) no defined size for this array, which is going to hold the complete file. is there some better way to do it, or should i just go on and give some extra large number like
"char FileText[200000]"
or something? i personally don't think that is a good idea, as it might cause memory issues even if the file is small, since a large quantity of memory has already been allocated to it.
please help me with how it is done in other professional programs.

Thanks --mlaich

graemef 01-04-2006 12:36 PM

A built-in array declared like this must have its size known at compile time. So you will either need to dynamically allocate the memory or fix the size in your program.

Questions to ask...
  1. Do you need all the data at once, or can you read it in and process it as you go?
  2. Can you find out the size of the file before you start?

If you don't need to read in the entire file before processing, then you can work with much lower memory requirements. However, if you need the whole file to be read in, then I'd suggest that you allocate the memory using something like malloc (since you're using C) and then play around with it.

graeme.

mlaich 01-04-2006 12:41 PM

hi,
i have to read the whole file, but since it has to be at least readable, i can get the size of the file first. i found something similar at "http://www-h.eng.cam.ac.uk/help/tpl/languages/C/teaching_C/node21.html" about the malloc stuff.

i think i have to start like that only...

thanks --mlaich

paulsm4 01-04-2006 12:56 PM

Bottom line: when you do this:
Code:

char chr[]
you have merely "declared" variable "chr". You have NOT allocated any space for it yet. If you're lucky, your first "read()" will crash and burn with a hairy segmentation fault. If you're not-so-lucky, the program might *appear* to run ... while corrupting adjacent data ... perhaps failing much later, far removed ... and insidiously difficult to track down and debug.


Anyway, the first thing is to understand the difference in C/C++ between "declaration" and "definition":
http://publications.gbdirect.co.uk/c...claration.html
http://www.lysator.liu.se/c/c-faq/c-10.html
http://www.sun.com/971124/cover-linden/cchap.html

The second issue is "What should I do"?

If you're coding in C, you basically have two alternatives:
1. Declare a static array:
Code:

#define MAX_ELEMENTS 20000
  ...
  char myarray[MAX_ELEMENTS];
  ...

2. Dynamically allocate memory at runtime:
Code:

  char *myarray;
  ...
  n = xyz;
  myarray = (char *)malloc (n);

If you're coding in C++, you can also use the C++ "new" operator (instead of the C "malloc()" function), or (better), you could take advantage of the C++/STL "vector" class:
Code:

#include <vector>
#include <string>
#include <algorithm>
...
using namespace std;
  ...
  vector<char> myvector;
  myvector.push_back ('A');
  ...
  vector<string> my_string_vector;
  my_string_vector.push_back ("Hello STL");

Here's a reasonably good tutorial on STL:
http://www.yolinux.com/TUTORIALS/Lin...alC++STL.html;

PS:
You can always "stat()" the file, get the file size, and make your array (at least) that big.

mlaich 01-04-2006 01:18 PM

thanks...
as my program (in C) is more along the lines of string handling and processing, i think i will stick with the malloc stuff.
one small doubt... as we are allocating the size of the array at run time, suppose the size is changed (increased from 300 to 400) while it already contains data of length 298. would that cause any problem for the data (of length 298) already present in memory?

thanks for the links anyway...

mlaich

paulsm4 01-04-2006 01:30 PM

I'm curious about the idiomatic use of the noun "doubt" (as opposed, say, to "I have a question", or simply *asking* the question). Are you originally from Montana?

ANYWAY:
1. Yes, trying to read 400 bytes into a 300 element character array would be Bad (if that's what you're asking). That's precisely the reason that "#define MAX_ELEMENTS ..." is important if you have a static array:
Code:

#define MAX_ELEMENTS 300
  char myarray[MAX_ELEMENTS];
...
  if (n < MAX_ELEMENTS)
  {
    iret = fread (myarray, n, 1, fp);
  }
  else
  {
    fprintf (stderr, "Array too small: current size= %d, needed size= %d!\n",
      MAX_ELEMENTS, n);
    return ERROR_STATUS;
  }

2. Static arrays are easier and less prone to memory leaks than malloc(). Whenever I have the luxury of being able to choose, I generally prefer static arrays.

sundialsvcs 01-04-2006 02:51 PM

If possible, I would use C++ because it's much easier to write reliable code involving variable-sized arrays and variable-sized strings when you have its features to work with.

If you are constrained to C, at least make the best of it by defining subroutines for manipulating this data-structure, which is concealed from the view of the rest of the program by placing it inside of that unit... not as "extern."

Pursuing this "poor man's object-oriented" line of thinking, consider that your "array of strings" does not have to be physically "an array" at all. All that it has to be is a collection of string-values accessed by means of an integer representing its ordinal position within the collection. There are many, many ways that you could represent such a thing. It is absolutely harmless to choose to use a set of subroutine/function calls to provide access to that collection, rather than stuffing the entire program with references to a C "array."

What you need is...
  • A routine to be called at program initialization to prepare the data structure.
  • A routine to be called upon termination to release its storage.
  • A routine that will return a string given its ordinal. (I suggest that the parameters be: ordinal#, pointer to your buffer, size of your buffer. You don't need to expose any pointer-values to anyone.)
  • A routine that will store or replace a string given its new value and its ordinal. (Same parameters.)
Call this "defensive coding." It will work well, it will run fast enough, and it will greatly increase the reliability of your code.
Quote:

So your 'efficient' coding saved a few microseconds. Big deal! Every hour that you spend debugging a goofball pointer-issue costs billions of microseconds! And real money! The computer is paid-for but you cost money by the minute, and guess who costs more? Exactly. Furthermore, you have a brain, and I need to use that brain for more important things than debugging 'clever' code. Give me a program that takes one more minute to run, but can be utterly relied upon not to crash.

dmail 01-04-2006 02:54 PM

Quote:

please help me as how it is done in other professional programs...
If I were to use C to do this, I would do it the way graeme suggested in this quote.

Quote:

Originally Posted by graemef
...Do you need all the data or can you read it in and then process ...

If you don't need to read in the entire file before processing then you can work with much less memory requirements....
graeme.

Here's a question for you: is there a reason why you need to read all of the data in before you can process it?

mlaich 01-04-2006 03:50 PM

hi,
thanks for the comments. i think i don't necessarily need to read the really big files completely. as far as the small files are concerned, i think something like
Code:

  struct stat buffer;
  int status;

  status = stat("file_name", &buffer);
  printf("%ld\n", (long) buffer.st_size);  /* st_size is an off_t, not an int */

would help me in locating the size of array to be defined and then processing further...

actually, my major constraint is not C or C++ (I really have 0 experience in C++, for C it IS nearby 0, but something bigger :)), but that i have to use flat file and not database.

thanks --mlaich

paulsm4 01-04-2006 05:40 PM

Don't forget to check and make sure that "status" is zero (success)! ;-)


All times are GMT -5.