[SOLVED] C code review, too much error checking, is it handled well?

Andy Alt · 06-22-2019, 04:54 PM

Hi... was looking for a little code review.

In this function, is_time_to_purge, does it look like the amount of error checking I'm using is appropriate? Too much? If there are any failures, does it seem like it's ok to just use exit()?

I know it may be kind of difficult to make an accurate assessment without reviewing more of the code base, but generally speaking I guess, does anything really stick out as "Oh, no!"?

Not all linked lists are freed at that point, such as *waste_curr, but I really can't imagine using free on every malloc'ed variable any time exit() is called (though I've seen comments by people that suggest that although exit() is supposed to free everything, it shouldn't be relied upon and vars should be explicity freed before calling exit()).

BW-userx · 06-22-2019, 05:03 PM

is there really too much error checking, and handling?

whenever I write a program, I put in the obvious error checking, then when I run it I try to break it, playing them what ifs,and I have come across the getting into over kill on this one particular thing that comes to mind that I did once. to where it was getting to ridiculous to try and conceive every possible way someone could type something in and try to catch it and throw an error, or handle the error, then I chalked that up to user error for not using the program as it was written to be used. ie wrong information on the cli.

So I do suppose if it gets to that point, you should know you've gone too far.

as far as exit and freeing everything before exit, I've read that today's pc are more capable of freeing memory upon exiting a program then yesterdays were. I think it is still an open debate on that one.

ntubski · 06-22-2019, 10:13 PM

It's definitely a legitimate strategy for a short-running command line tool to just call exit() on errors, and rely on the OS to clean up memory and file handles. It does have the drawback of confusing automated tools for catching memory leaks like valgrind (it's hard for the tool to know whether some memory is leaked by accident or you're just intending to let the OS clean up).

Code:

 * @return a bool value

This isn't a useful comment, the code already tells me you're returning a bool value. Say what true or false signifies (e.g., "@return true if trash should be purged").

Code:

  char file_lastpurge[MP];
  sprintf (file_lastpurge, "%s%s", HOMEDIR, PURGE_DAY_FILE);

  bufchk (file_lastpurge, MP);

This makes me nervous about buffer overruns. And don't think that bufchk will save you, the program can easily crash before it's called.

Your "Create file if it doesn't exist" code duplicates the most of the code for updating file_lastpurge.

Code:

  if (exists (file_lastpurge))
  {
     fp = fopen (file_lastpurge, "r");

It's better to use something like

Code:

if (fp = fopen(file_lastpurge, "r"))
{
  // exists and readable
}
else
{
  // doesn't exist or is unreadable for some reason
}

See https://en.wikipedia.org/wiki/TOCTOU

Andy Alt · 06-23-2019, 04:48 PM

@ntubski, @BW-userx, Thanks for the suggestions and review. is this quite a bit better?

Code:

/*!
 * Called in @ref main() to determine whether or not @ref purge() was run today, reads
 * and writes to the 'lastpurge` file. If it hasn't been run today, the
 * current day will be written. If 'lastpurge' doesn't exist, it gets
 * created.
 * @return FALSE if the day is the same (don't purge); otherwise TRUE and
 * writes the current day to 'lastpurge'
 */
bool
is_time_to_purge (void)
{
  int req_len = strlen (HOMEDIR) + strlen (PURGE_DAY_FILE) + 1;
  char file_lastpurge[req_len];
  snprintf (file_lastpurge, req_len, "%s%s", HOMEDIR, PURGE_DAY_FILE);

  char today_dd[3];
  get_time_string (today_dd, 3, "%d");

  FILE *fp = fopen (file_lastpurge, "r");

  if (fp)
  {
    char last_purge_dd[3];

    if (fgets (last_purge_dd, sizeof (last_purge_dd), fp) == NULL)
    {
      print_msg_error ();
      printf ("while getting line from %s\n", file_lastpurge);
      perror (__func__);
      close_file (fp, file_lastpurge, __func__);
      exit (ERR_FGETS);
    }

    bufchk (last_purge_dd, 3);
    trim_white_space (last_purge_dd);

    close_file (fp, file_lastpurge, __func__);

    /*
     * if these are the same, purge has already been run today
     */

    if (!strcmp (today_dd, last_purge_dd))
      return FALSE;
  }

  bool init = exists(file_lastpurge);
  fp = fopen (file_lastpurge, "w");
  if (fp)
  {
    fprintf (fp, "%s\n", today_dd);
    close_file (fp, file_lastpurge, __func__);

    /*
     * if this is the first time the file got created, it's very likely
     * indeed that purge does not need to run. Only return FALSE if the
     * file didn't previously exist.
     */
    return init;
  }

  /*
   * pretty sure the stream should be closed (free the pointer)
   * even if it's returned NULL
   */
  close_file (fp, file_lastpurge, __func__);

  /*
   * if we can't even write this file to the config directory, something
   * is not right. Make it fatal.
   */
  open_err (file_lastpurge, __func__);
  msg_return_code (ERR_OPEN);
  exit (ERR_OPEN);
}

ntubski · 06-24-2019, 05:37 PM

Quote:

Code:

bool init = exists(file_lastpurge);

I suggest moving this upward and writing it like this instead:

Code:

  FILE *fp = fopen (file_lastpurge, "r");
  bool init = (fp != NULL);

Quote:

Code:

bufchk (last_purge_dd, 3);

By the way, I'm skeptical that this bufchk is useful. Has its check ever triggerred for you? (anywhere, not just this particular usage)

Quote:

Code:

  /*
   * pretty sure the stream should be closed (free the pointer)
   * even if it's returned NULL
   */

Not the case apparently (I had to look this up myself)
https://stackoverflow.com/questions/...eturning-error

GazL · 06-24-2019, 07:38 PM

bool init == (fp != NULL);

Shouldn't that be an assignment (single '=') ?

ntubski · 06-24-2019, 07:43 PM

Quote:

Originally Posted by GazL

bool init == (fp != NULL);

Shouldn't that be an assignment (single '=') ?

Oops, yes, edited.

GazL · 06-24-2019, 07:58 PM

BTW, what's with the '@ref' and '@return' stuff in the comments? Is that some sort of markup for an IDE or other developer tool?

BW-userx · 06-24-2019, 09:43 PM

I think that is self documentation noting for some other software to help write out the documentation on coding. for a lack of a better way of saying it. I seen something like that before.

Andy Alt · 06-25-2019, 12:37 AM

Quote:

Originally Posted by GazL

BTW, what's with the '@ref' and '@return' stuff in the comments? Is that some sort of markup for an IDE or other developer tool?

Hi GazL... I use that to generate API docs for rmw using Doxygen.

Docs like that probably aren't needed for a program like this, but I wanted to learn how to use Doxygen, and figured maybe having things documented like that on a web site could be helpful to future contributors.

Quote:

Originally Posted by ntubski

I suggest moving this upward and writing it like this instead:

Code:

  FILE *fp = fopen (file_lastpurge, "r");
  bool init = (fp != NULL);

Nice! Thanks again.

Quote:

By the way, I'm skeptical that this bufchk is useful. Has its check ever triggerred for you? (anywhere, not just this particular usage)

Ok, so for the benefit of other thread readers, I'll post some of the other functions related to this (rather than linking to them)...

Yes (and good question!), bufchk() has triggered for me. Not too recently as most of the program bugs are worked out. In the past, it would happen more often when I introduce some changes and accidentally muck something up. For testing the entire program before releases, after I make a lot of code changes I install rmw and use it as I normally would (also I implemented unit tests several months ago).

To test this particular function, I've defined MP to different values at compile-time (8, 20, 40). As long as I know the destination array size, I can use bufchk to.. for example, check a malloc'ed string that I want to put in a char[] array with an already declared size. If bufchk() says ok, I then copy the string to the array.

I do have some unit tests written for it if you're more curious than that.. buffer_overrun.c and buffer_overrun.sh.in (the actual script gets created when '../configure' is run). If you're interested in a demo, just clone the repo and use 'make check`, and look at test/buffer_overrun.sh.log.

Code:

/*!
 * Verify that *str doesn't exceed boundary, otherwise exit with an error code
 * @param[in] str The string to check
 * @param[in] boundary boundary
 * @return void
 * @see msg_err_buffer_overrun
 * @see bufchk_len
 */
void
bufchk (const char *str, ushort boundary)
{
  entry_NULL_check (str, __func__);

  /* STR_PART defines the first n characters of the string to display.
   * This assumes 10 will never exceed a buffer size. In this program,
   * there are no buffers that are <= 10 (that I can think of right now)
   */
#define STR_PART 10

  static ushort len;
  len = strlen (str);

  if (len < boundary)
    return;

#ifdef TEST_LIB
  errno = 1;
  return;
#endif

  print_msg_error ();
  /*
   * For now, using arbitrary arguments instead of changing every call to bufchk
   */
  msg_err_buffer_overrun ("unknown", 0);

  /*
   * This will add a null terminator within the boundary specified by
   * display_len, making it possible to view part of the strings to help
   * with debugging or tracing the error.
   */
  static ushort display_len;
  display_len = 0;

  display_len = (boundary > STR_PART) ? STR_PART : boundary;

  char temp[display_len];
  strncpy (temp, str, display_len);
  temp[display_len - 1] = '\0';

  /* Though we've set STR_PART to 10, we don't really know what's "safe",
   * so only display this if verbosity is on
   */
  if (verbose)
  {
    fprintf (stderr,
             _
             (" <--> Displaying part of the string that caused the error <-->\n\n"));
    fprintf (stderr, "%s\n\n", temp);
  }

  msg_return_code (EXIT_BUF_ERR);
  exit (EXIT_BUF_ERR);
}

This is a new function I made. I guess the comments are good enough I don't need to explain it.

Code:

/*!
 * Verify that len doesn't exceed boundary, otherwise exit with an error code.
 * Usually used before concatenating 2 or more strings. Program will exit
 * with an error code if len exceeds boundary. len should already have space
 * for the null terminator when this function is called.
 * @param[in] len The string to check
 * @param[in] boundary boundary
 * @param[in] func The calling function
 * @param[in] line The line number from where the function was called
 * @return void
 * @see msg_err_buffer_overrun
 * @see bufchk
 */
void
bufchk_len (const int len, const int boundary, const char *func, const int line)
{
  if (len <= boundary)
    return;

#ifdef TEST_LIB
  errno = 1;
  return;
#endif

  print_msg_error ();
  msg_err_buffer_overrun (func, line);
  exit (EXIT_BUF_ERR);
}

Another function I recently made. First "variadic" function I ever made.

I use it with bufchk_len(). If I want to concatenate 3 separate strings into an array, I can check the combined length with just one function, instead of using strlen() before each one. It has some flaws that are probably obvious so I have to be pretty careful with the calling functions (such as making sure argc matches, and only passing it strings).

Code:

/*!
 * Get the combined length of multiple strings.
 *
 * @param[in] argc The number of arguments
 * @param[in] ... variable argument list, each argument should be a string
 * @return the combined length of each string
 */
int
multi_strlen (int argc, ...)
{
  va_list vlist;
  int len = 0;
  int i;
  va_start (vlist, argc);
  for (i = 0; i < argc; i++)
    len += strlen (va_arg (vlist, char*));
  va_end(vlist);
  return len;
}

To help keep my error messages consistent.

Code:

/*!
 * Used by functions that prevent buffer overflows
 * @param[in] func the function from which it was originally called
 * @param[in] line the line number of the original calling function
 * @return void
 * @see bufchk
 * @see bufchk_len
 */
void
msg_err_buffer_overrun (const char *func, const int line)
{
  fprintf (stderr, "func = %s, line = %d\n", func, line);
  /* TRANSLATORS:  "buffer" in the following instances refers to the amount
   * of memory allocated for a string  */
  fputs (_("buffer overrun (segmentation fault) prevented.\n"), stderr);
  fputs (_("If you think this may be a bug, please report it to the rmw developers.\n"), stderr);
}

I really haven't gotten much feedback yet about rmw from other developers (because I haven't asked much yet), but I'm open to receiving more feedback from anyone who has the time and interest. Thanks.

NevemTeve · 06-25-2019, 02:07 AM

Maybe it would be more robust if you stored unix-timestamp instead of just day-of-month. (Or you could go for human-readable timestamp (strftime("%Y%m%d%H%M%S")), but then you have to choose from UTC/localtime. I'd suggest UTC.)

GazL · 06-25-2019, 04:40 AM

Quote:

Originally Posted by BW-userx

I seen something like that before.

Yes, I had as well, but I didn't know what it was used by. Thanks to Andy I now know it's doxygen.

ntubski · 06-25-2019, 05:27 AM

Quote:

Originally Posted by Andy Alt

Yes (and good question!), bufchk() has triggered for me. Not too recently as most of the program bugs are worked out. In the past, it would happen more often when I introduce some changes and accidentally muck something up. For testing the entire program before releases, after I make a lot of code changes I install rmw and use it as I normally would (also I implemented unit tests several months ago).

Okay, interesting. I was thinking about suggesting a safer API (e.g., see http://www.and.org/vstr/comparison and http://www.and.org/vstr/security), but on the other hand, if most of your string manipulation is just reading and writing fixed-length strings like timestamps it might not be worth the trouble.

Quote:

Code:

  fputs (_("buffer overrun (segmentation fault) prevented.\n"), stderr);

You haven't prevented the overrun, just detected it.

Andy Alt · 06-25-2019, 05:30 PM

Quote:

Originally Posted by ntubski

Okay, interesting. I was thinking about suggesting a safer API (e.g., see http://www.and.org/vstr/comparison and http://www.and.org/vstr/security), but on the other hand, if most of your string manipulation is just reading and writing fixed-length strings like timestamps it might not be worth the trouble.

Most of the strings I'd say are variable length, consisting of file and path names. But I wind up putting them in fixed-length arrays

Quote:

rmw (ReMove to Waste) is a cross-platform command-line "trash can" utility. It can send files to your "Desktop" trash, or a completely separate folder; restore files and append a unique string to the filenames so they won't be overwritten (duplication protection).

Code:

/** Some system don't define PATH_MAX. A limit needs to be set however,
 * and if it's not defined in a system header file, define it here
 */
#ifndef PATH_MAX /* for portability */
#  define PATH_MAX 256
#endif

#define MP (PATH_MAX + 1)

Not really sure why I'm not using "MP" below, but here's an as-is example of a "popular" structure used in the program...

Code:

typedef struct st_waste st_waste;

/** Each waste directory is added to a linked list and has the data
 * from this structure associated with it.
 */
struct st_waste{
  /** The parent directory, e.g. $HOME/.local/share/Trash */
  char parent[PATH_MAX + 1];

  /*! The info directory (where .trashinfo files are written) will be appended to the parent directory */
  char info[PATH_MAX + 1];

  /** Appended to the parent directory, where files are moved to when they are rmw'ed
   */
  char files[PATH_MAX + 1];

  /** The device number of the filesystem on which the file resides. rmw does
   * not copy files from one filesystem to another, but rather only moves them.
   * They must reside on the same filesystem as a WASTE folder specified in
   * the configuration file.
   */
  unsigned int dev_num;

  /** set to <tt>true</tt> if the parent directory is on a removable device,
   * <tt>false</tt> otherwise.
   */
  bool removable;

  /** Points to the previous WASTE directory in the linked list
   */
  st_waste *prev_node;

  /** Points to the next WASTE directory in the linked list
   */
  st_waste *next_node;
};

Quote:

fputs (_("buffer overrun (segmentation fault) prevented.\n"), stderr);

You haven't prevented the overrun, just detected it.

Hmm.. I think of it like... it's not detected because it hasn't happened. The program for some reason encountered a string that was longer than the destination array, so it's quitting instead of trying to complete the transaction. I'm not sure what better wording would be for that fputs() statement.

btw, here's my updated is_time_to_purge() function

Andy Alt · 06-25-2019, 06:22 PM

Quote:

Originally Posted by NevemTeve

Maybe it would be more robust if you stored unix-timestamp instead of just day-of-month. (Or you could go for human-readable timestamp (strftime("%Y%m%d%H%M%S")), but then you have to choose from UTC/localtime. I'd suggest UTC.)

Good idea, I've opened a ticket.