LinuxQuestions.org

LinuxQuestions.org (/questions/)
-   Programming (https://www.linuxquestions.org/questions/programming-9/)
-   -   Why directory record length is variable in ext2 filesystem? (https://www.linuxquestions.org/questions/programming-9/why-directory-record-length-is-variable-in-ext2-filesystem-854772/)

password636 01-07-2011 12:48 AM

Why directory record length is variable in ext2 filesystem?
 
Hi,

The declaration for directory record length in ext2 filesystem is as follows:
Code:

#define EXT2_NAME_LEN 255

struct ext2_dir_entry_2 {
  __u32 inode;      /* Inode number */
  __u16 rec_len;    /* Directory entry length */
  __u8  name_len;  /* Name length */
  __u8  file_type;
  char  name[EXT2_NAME_LEN];  /* File name */
};

Some say because the record is not in a fixed length so rec_len is the real record length. Why is the length of the array `name' not fixed? I thought C arrays like this should be fixed length. C99 has variable-length arrays, does this structure count on C99?

THANKS!

wje_lq 01-08-2011 12:43 PM

Quote:

Originally Posted by password636 (Post 4216137)
Some say because the record is not in a fixed length so rec_len is the real record length.

They are correct. Look here, in section 18.2.6.2.

The struct is a convenient way for the C programmer to look at the data. If the final element of a struct is an array, it is often not necessary in real life for the number of elements in the array to be a true indication of how large the array is.

If the struct is itself in an array, then it is indeed necessary. One needs to know that the length of each element of the array of structs is constant, so one can index into that array correctly.

But this is different. You use the length of each item (as stored in the rec_len element) to calculate where the next entry begins.

Why do they do things this way? Because most names as stored in directory entries are much, much shorter tha the maximum, so this way we can store more directory entries within a directory block.

password636 01-09-2011 09:01 PM

Quote:

Originally Posted by wje_lq (Post 4217577)
If the final element of a struct is an array, it is often not necessary in real life for the number of elements in the array to be a true indication of how large the array is.

Thank you very much. Can you tell me any documentation or code example to illustrate this? I wrote a small piece of code trying to tell what's going on there but got no lucky. The result is "268 268". I don't see the length of the struct ext_dir_entry variable....
Code:

struct ext_dir_entry
{
        int inode;
        int rec_len;
        int name_len;
        char name[255];
};

int main()
{
        struct ext_dir_entry dir1 = {1000, 10, 10, "t"};            // random values except the last
        printf ("%d %d\n", sizeof( struct ext_dir_entry ), sizeof dir1);
        return 0;
}


wje_lq 01-09-2011 09:52 PM

Quote:

Originally Posted by password636 (Post 4218836)
Thank you very much. Can you tell me any documentation

Sure. The ANSI C standard, ISO/IEC 9899-1999(E), section 6.7.2.1, paragraph 16, begins with this:

Quote:

As a special case, the last element of a structure with more than one named member may have an incomplete array type; this is called a flexible array member.
Quote:

Originally Posted by password636 (Post 4218836)
or code example to illustrate this?

Sure. I get this output:
Code:

sizeof()s: 8 1024
returned data: 11 17

... when I run the following perfectly legal program:
Code:

#include <stdio.h>
#include <stdlib.h>

typedef struct betty_s
{
  int fred;
  int barney[1];

} betty_t,
 *betty_p;

typedef struct wilma_s
{
  int fred;
  int barney[255];

} wilma_t,
 *wilma_p;

void
write_betty(betty_p structure, int index, int value)
{
  structure->barney[index]=value;

} /* write_betty() */

void
write_wilma(wilma_p structure, int index, int value)
{
  structure->barney[index]=value;

} /* write_wilma() */

int
read_betty(betty_p structure, int index)
{
  return structure->barney[index];

}  /* read_betty() */

int
read_wilma(wilma_p structure, int index)
{
  return structure->barney[index];

}  /* read_wilma() */

int main(void)
{
  betty_p the_betty;
  wilma_p the_wilma;

  the_betty=malloc(7*sizeof(int));
  the_wilma=malloc(7*sizeof(int));

  if((the_betty==NULL) || (the_wilma==NULL))
  {
    fprintf(stderr,"malloc() fail\n");

    return 1;
  }

  write_betty(the_betty,5,11);
  write_wilma(the_wilma,5,17);

  printf("sizeof()s: %d %d\n",sizeof(*the_betty),sizeof(*the_wilma));

  printf("returned data: %d %d\n",
        read_betty(the_betty,5),
        read_wilma(the_wilma,5)
        );

  return 0;

} /* main() */

As you can see, the sizeof() has nothing to do with the actual size of the data. I didn't use it in the malloc(), and I don't pay attention to it later.
Quote:

Originally Posted by password636 (Post 4218836)
I wrote a small piece of code trying to tell what's going on there but got no lucky. The result is "268 268". I don't see the length of the struct ext_dir_entry variable....
Code:

struct ext_dir_entry
{
        int inode;
        int rec_len;
        int name_len;
        char name[255];
};

int main()
{
        struct ext_dir_entry dir1 = {1000, 10, 10, "t"};            // random values except the last
        printf ("%d %d\n", sizeof( struct ext_dir_entry ), sizeof dir1);
        return 0;
}


Don't use the sizeof() operator here. It won't help you. Use the rec_len member to calculate where the next element is. And yes, this means that you can't use an array of struct ext_dir_entry to access any individual entries after the first one in a directory block.

password636 01-10-2011 01:26 AM

Quote:

As a special case, the last element of a structure with more than one named member may have an incomplete array type; this is called a flexible array member.
I think an incomplete array type should have empty value between its square brackets, like `[]'. The size can't be determined. But in this case, between in the square brackets is 255. Can this be called an incomplete array type? Are flexible array member and the array named `name' the same thing?

And in my small code example, using sizeof got the same results no matter how short the `name' array was. If the lengths of the structure `ext_dir_entry' instances could be variable, then does it mean the sizeof operator returned wrong size? Or the sizeof operator should not be used to determine the size a structure of this kind?

wje_lq 01-10-2011 04:53 AM

Quote:

Originally Posted by password636 (Post 4218978)
I think an incomplete array type should have empty value between its square brackets, like `[]'.

That would be fine, but plugging a "wrong" number in there would be fine also. It makes no difference.
Quote:

Originally Posted by password636 (Post 4218978)
But in this case, between in the square brackets is 255. Can this be called an incomplete array type?

Yes.
Quote:

Originally Posted by password636 (Post 4218978)
Are flexible array member and the array named `name' the same thing?

The array named name is an example of a flexible array member, yes.
Quote:

Originally Posted by password636 (Post 4218978)
And in my small code example, using sizeof got the same results no matter how short the `name' array was.

Of course it did. sizeof is entirely determined at compile time. As far as the compiler is concerned, what you place in fields rec_len and name_len at runtime is not only irrelevant at compile time, it is unknowable at compile time.
Quote:

Originally Posted by password636 (Post 4218978)
If the lengths of the structure `ext_dir_entry' instances could be variable, then does it mean the sizeof operator returned wrong size?

No. sizeof returned the (correct) length of the structure as defined at compile time.
Quote:

Originally Posted by password636 (Post 4218978)
Or the sizeof operator should not be used to determine the size a structure of this kind?

Almost correct. sizeof can be used to determine the size of the structure, as defined at compile time. But that information is entirely useless to you. Use the data that's stored in the structure to determine where the next directory entry is.

password636 01-11-2011 02:43 AM

wje_lq, thank you so much for the detailed answers.
I think many of my questions were answered. But I still can't figure why people do not put such kind of information into the standard. I mean, putting a constant in the square brackets instead of empty, you still get variable-length array here in the structure. Quite confused.

wje_lq 01-11-2011 07:24 AM

Quote:

Originally Posted by password636 (Post 4220217)
I still can't figure why people do not put such kind of information into the standard.

The reason is that this confusion really has nothing to do with flexible array members. It has to do with not the language itself, but how one uses structs at runtime. Permit me to show you an unusual, ugly, perfectly standard-conforming example that contains almost no arrays at all. There's no array within the struct, and there's no array of structs. The only arrays are the (casted) arrays of char which we use for pointer arithmetic. And "ugly" means "Please don't ever write code like this."

Suppose you have the following struct, designed to handle up to twelve children:
Code:

/* Disclaimer: This ugly code has not been tested, or even compiled. */

typedef struct children_s
{
  int number_of_children;
  int firstborn_age;
  int secondborn_age;
  int thirdborn_age;
  int fourthborn_age;
  int fifthborn_age;
  int sixthborn_age;
  int seventhborn_age;
  int eighthborn_age;
  int ninthborn_age;
  int tenthborn_age;
  int eleventhborn_age;
  int twelfthborn_age;

} children_t,
 *children_p;

It is perfectly legal to malloc() an area for an example of this struct which is not as large as the whole struct, if you're not going to use the whole thing. In this case, we could use the number of children as an indication of how much memory the example occupies. Suppose we had one large chunk of memory which held a collection of these ugly things. Suppose we marked the end of the collection with one of these things having -1 as the number of children.

We could write a function which calculates the number of bytes required to store a particular family's information:
Code:

/* Disclaimer: This ugly code has not been tested, or even compiled. */

int family_storage_size(children_p ar_family)
{
  return *ar_family->number_of_children+1)*sizeof(int);

} /* family_storage_size() */

We could then write a function which takes the address of the whole collection and calculates the address of the struct which marks the end of the collection.
Code:

/* Disclaimer: This ugly code has not been tested, or even compiled. */

children_p find_end(children_p ar_whole_collection)
{
  children_p lo_result;

  for(lo_result=ar_whole_collection;
      lo_result->number_of_children>=0;
      lo_result=(children_p)(((char *)lo_result)+family_storage_size(lo_result));
  {
    /* Empty block. */
  }

  return lo_result;

} /* find_end() */

A function which adds a new struct to the end of the collection could look like this.
Code:

/* Disclaimer: This ugly code has not been tested, or even compiled. */

children_p add_family(children_p ar_collection,
                      int        ar_number_of_children,
                      int        ar_firstborn_age,
                      int        ar_secondborn_age,
                      int        ar_thirdborn_age,
                      int        ar_fourthborn_age,
                      int        ar_fifthborn_age,
                      int        ar_sixthborn_age,
                      int        ar_seventhborn_age,
                      int        ar_eighthborn_age,
                      int        ar_ninthborn_age,
                      int        ar_tenthborn_age,
                      int        ar_eleventhborn_age,
                      int        ar_twelfthborn_age
                    )
{
  children_p lo_new_family;
  children_p lo_result;

  if((ar_number_of_children< 0) ||
    (ar_number_of_children>12)
    )
  {
    /* Handle this error. */
  }

  lo_result=realloc(ar_collection,((char *)find_end(ar_collection)
                                -((char *)ar_collection)
                                +(ar_number_of_children+1+1)*sizeof(int)
  /*                                                    ^ ^              */
  /*                                                    | |              */
  /*              an extra int for the number of children |              */
  /*                                                      |              */
  /*          an extra int for the end-of-collection marker              */
                  );

  if(lo_result==NULL)
  {
    /* Put error handling code here. */
  }

  lo_new_family=find_end(lo_result);

  lo_new_family->number_of_children=ar_number_of_children;

  /* Really ugly: a switch statement containing no breaks. */

  switch(ar_number_of_children)
  {
    default: /* handle error; should never get here */
    case 12: lo_new_family->twelfthborn_age =ar_twelfthborn_age ;
    case 11: lo_new_family->eleventhborn_age=ar_eleventhborn_age;
    case 10: lo_new_family->tenthborn_age  =ar_tenthborn_age  ;
    case  9: lo_new_family->ninthborn_age  =ar_ninthborn_age  ;
    case  8: lo_new_family->eighthborn_age  =ar_eighthborn_age  ;
    case  7: lo_new_family->seventhborn_age =ar_seventhborn_age ;
    case  6: lo_new_family->sixthborn_age  =ar_sixthborn_age  ;
    case  5: lo_new_family->fifthborn_age  =ar_fifthborn_age  ;
    case  4: lo_new_family->fourthborn_age  =ar_fourthborn_age  ;
    case  3: lo_new_family->thirdborn_age  =ar_thirdborn_age  ;
    case  2: lo_new_family->secondborn_age  =ar_secondborn_age  ;
    case  1: lo_new_family->firstborn_age  =ar_firstborn_age  ;
    case  0: /* nothing else */
  }

  /* Also really ugly: Add the new marker for the end of the collection. */

  lo_new_family=(children_p)(((char *)lo_new_family)+family_storage_size(lo_new_family));

  lo_new_family->number_of_children=-1;

  return lo_result;

} /* add_family() */

As I said, please don't ever write code like this. But it illustrates the point that the size of a struct need not indicate how much memory you're really using.

But programmers do it all the time with flexible array members. You can put anything between the square brackets ([], [1], [255], anything) and it will work. And the only real reason to put any mention in the standard is that [] is allowed. If, in a different world, [] were not allowed, programmers could put any non-negative integer between the square brackets, and it would be just as legal as the ugly code I wrote above.

Allowing [], with no integer between the square brackets, is the only reason this is mentioned in the standard at all. It's a good way of documenting that the array length is flexible (determined at runtime), and one could argue that the implementers of ext2 should have done that.


All times are GMT -5. The time now is 05:54 PM.