LinuxQuestions.org
05-14-2012, 04:58 PM   #1
hawkfan50
LQ Newbie
 
Registered: Apr 2012
Posts: 10

Rep: Reputation: Disabled
Comparing Directories


This program reads in two directories and then outputs which files are NEW and which have been MODIFIED. I can't figure out how to tell the difference between a subdirectory and a file in the modified directory. As it stands right now, if there's a subdirectory in moddir, the program treats it like a file and just calls it NEW. It's not supposed to do that. If the -R flag is on, the program should recurse through the subdirectories and display the corresponding information. Any suggestions would be great. I'm stumped.


Code:
#include <stdio.h>
#include <stdlib.h>
#include <dirent.h>
#include <sys/stat.h>
#include <string.h>


int main(int argc, char ** argv)
{

	int recurse = 0;	
	int diff = 0;		
	char * basedir = NULL;
	char * moddir = NULL;	

	int i;
	for (i = 1; i < argc; i++)
	{
		if (argv[i][0] == '-')
		{
			if (strcmp(argv[i], "-R") == 0)
			{
				recurse = 1;
			}
			else if (strcmp(argv[i], "-D") == 0)
			{
				diff = 1;
			}
			else
			{
				printf("error: invalid option %s\n", argv[i]);
				exit(-1);
			}
		}
		else
		{
			if (basedir == NULL)
			{
				basedir = argv[i];
			}
			else if (moddir == NULL)
			{
				moddir = argv[i];
			}
			else
			{
				printf("error: invalid argument %s\n", argv[i]);
				exit(-2);
			}
		}
	}

	if ((basedir == NULL) || (moddir == NULL))
	{
		printf("error: must pass both basedir and moddir\n");
		exit(-3);
	}

	DIR * moddir_contents = opendir(moddir);
	if (moddir_contents != NULL)
	{
		struct dirent * moddir_entry = readdir(moddir_contents);
		while (moddir_entry != NULL)
		{

			DIR * basedir_contents = opendir(basedir);
			if (basedir_contents != NULL)
			{
				int matched = 0;	
			
				struct dirent * basedir_entry = readdir(basedir_contents);
				while (basedir_entry != NULL)
				{

					if (strcmp((*moddir_entry).d_name, (*basedir_entry).d_name) == 0)
					{
						matched = 1;
						
						struct stat buf;

						time_t mod_stamp;
						char modfilename[1024];
						strcpy(modfilename, moddir);
						strcat(modfilename, "/");
						strcat(modfilename, (*moddir_entry).d_name);
						if (stat(modfilename, &buf) == 0)
						{
							mod_stamp = buf.st_mtime;
						}
						else
						{
							printf("error: failed to get modified time for file %s\n", (*moddir_entry).d_name);
			
							exit(-5);
						}

						time_t base_stamp;
						char basefilename[1024];
						strcpy(basefilename, basedir);
						strcat(basefilename, "/");
						strcat(basefilename, (*basedir_entry).d_name);
						if (stat(basefilename, &buf) == 0)
						{
							base_stamp = buf.st_mtime;
						}
						else
						{
							printf("error: failed to get modified time for file %s\n", (*basedir_entry).d_name);
							exit(-5);
						}

						if (mod_stamp > base_stamp)
						{
							printf("%s MODIFIED\n", (*moddir_entry).d_name);
						}
					}

					basedir_entry = readdir(basedir_contents);
				}

				if (matched == 0)
				{
					printf("%s NEW\n", (*moddir_entry).d_name);
				}

				closedir(basedir_contents);
			}
			else
			{
				printf("error: failed to open basedir %s\n", basedir);
	
				exit(-5);
			}

			moddir_entry = readdir(moddir_contents);
		}

		closedir(moddir_contents);
	}
	else
	{
		printf("error: failed to open moddir %s\n", moddir);
		exit(-4);
	}

	return 0;
}
 
05-14-2012, 07:38 PM   #2
Nominal Animal
Senior Member
 
Registered: Dec 2010
Location: Finland
Distribution: Xubuntu, CentOS, LFS
Posts: 1,723
Blog Entries: 3

Rep: Reputation: 948
Your general approach is faulty. You cannot do it recursively using a loop alone. Also, your approach of scanning the second directory for each match in the first is not only slow, but incomplete: what about files that only exist in the second directory?

Instead, I would recommend you write a function that compares the contents of two directories. For example, something along the lines of
Code:
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <dirent.h>
#include <errno.h>

struct list *compare(const char *const directory1,
                     const char *const directory2,
                     const int         flags)
{
    /* ... */
}
You will need to scan both directory1 and directory2 once. Whichever you scan first, you'll find entries that exist in that one alone, and in both. (Just try to get statistics for an entry by that name, under both directories.) Whichever you scan last, you are concerned only with the entries that do not exist in the first; remember, you already found all those.

When recursing into directories, you'll need to construct the new directory names, since the directory entries only contain the final segment (name), not the full path. Here is a helper function that returns the full path needed, as a dynamically allocated string:
Code:
#include <stdlib.h>
#include <string.h>
#include <errno.h>

char *pathto(const char *const dir, const char *const name)
{
    const size_t dirlen = (dir) ? strlen(dir) : 0;
    const size_t namelen = (name) ? strlen(name) : 0;
    const size_t size = dirlen + namelen + 2;
    size_t       len;
    char        *path;

    if (size < 3) {
        errno = EINVAL;
        return NULL;
    }

    path = malloc(size);
    if (!path) {
        errno = ENOMEM;
        return NULL;
    }

    len = dirlen;
    if (dirlen > 0) {
        memcpy(path, dir, dirlen);
        if (path[len-1] != '/')
            path[len++] = '/';
    }
    if (namelen > 0) {
        memcpy(path + len, name, namelen);
        len += namelen;
    }
    path[len] = '\0';

    return path;
}
After the recursive call has been done, using the dynamically allocated directory names, you need to free() them explicitly.
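As a rough sketch of that build, recurse, free pattern: the function below walks one directory, descends into subdirectories when asked, and releases each constructed path once it is no longer needed. Both list_files() and join_path() are hypothetical names of mine; join_path() is only a shortened stand-in for a path helper so that the example compiles on its own. I use lstat() here rather than stat() so that a symlink pointing back up the tree cannot cause endless recursion; that is a choice of this sketch, not a requirement.

```c
#include <assert.h>
#include <dirent.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>

/* Hypothetical stand-in for a path helper: returns malloc()ed "dir/name". */
char *join_path(const char *dir, const char *name)
{
    assert(dir != NULL && name != NULL);
    size_t size = strlen(dir) + strlen(name) + 2;  /* '/' and '\0' */
    char *path = malloc(size);
    if (path)
        snprintf(path, size, "%s/%s", dir, name);
    return path;
}

/* Print every non-directory entry under dir; descend when recurse != 0.
 * Returns the number of entries printed, or -1 on error. */
long list_files(const char *dir, int recurse)
{
    DIR *handle = opendir(dir);
    struct dirent *entry;
    long count = 0;

    if (!handle)
        return -1;

    while (count >= 0 && (entry = readdir(handle)) != NULL) {
        struct stat info;
        char *path;

        /* Skip "." and "..", or the recursion never terminates. */
        if (!strcmp(entry->d_name, ".") || !strcmp(entry->d_name, ".."))
            continue;

        path = join_path(dir, entry->d_name);
        if (!path || lstat(path, &info) == -1) {
            count = -1;
        } else if (S_ISDIR(info.st_mode)) {
            if (recurse) {
                long below = list_files(path, recurse);
                count = (below < 0) ? -1 : count + below;
            }
        } else {
            printf("%s\n", path);
            count++;
        }
        free(path);   /* release the name once the recursive call is done */
    }

    closedir(handle);
    return count;
}
```

Note that each level of recursion keeps one directory handle open until it returns, so very deep trees consume one descriptor per level.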
_ _ _ _ _ _ _ _ _ _

If this is part of a library or real application code, there is one deeply technical issue I'd like to bring up.

When handling directory structures, directories and files may be renamed at any point. Because of that, it is recommended that instead of relying solely on paths, applications should retain a descriptor to the directory, and use the fstatat(dirfd, name...) function (POSIX.1-2008, i.e. #define _POSIX_C_SOURCE 200809L). The descriptor will stay valid, even if the name of the underlying directory happened to change.
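For illustration, here is a small sketch of a check done relative to an already open directory handle instead of a path string. The helper name is_subdirectory() is an assumption of mine; dirfd() and fstatat() are the POSIX.1-2008 calls. I pass AT_SYMLINK_NOFOLLOW so that symlinks are examined rather than followed; pass 0 instead if you want stat()-like behaviour.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <dirent.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/stat.h>

/* Return 1 if "name" inside the open directory is itself a directory,
 * 0 if it is anything else, and -1 if fstatat() fails (errno is set). */
int is_subdirectory(DIR *dir, const char *name)
{
    struct stat info;

    assert(dir != NULL && name != NULL);

    /* The lookup is relative to the descriptor, so it keeps working
     * even if the directory itself is renamed in the meantime. */
    if (fstatat(dirfd(dir), name, &info, AT_SYMLINK_NOFOLLOW) == -1)
        return -1;

    return S_ISDIR(info.st_mode) ? 1 : 0;
}
```

The same fstatat() call also fills in st_mtime, so the modification time comparison can be done through the descriptor as well.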

I understand that this is something that is totally new and alien to programmers with only Windows experience. Let me assure you: directory names are surprisingly volatile (and for useful reasons) in other OSes. Do not let Windows-isms drag down the quality of your code. (In Linux and BSDs and derivatives, you can rename or delete open files, even executables. I've found that many with only Windows experience expect having the file open be some kind of lock that should forbid such actions. I've never understood the reasoning for that.)

To open a directory for the purpose of scanning its contents, you can use
Code:
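#define _POSIX_C_SOURCE 200809L
#include <sys/types.h>
#include <sys/stat.h>
#include <dirent.h>   /* DIR, fdopendir() */
#include <errno.h>    /* errno, EINTR */
#include <fcntl.h>    /* openat(), O_* flags */
#include <unistd.h>   /* close() */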

DIR *opendirat(int dirfd, const char *pathname)
{
    DIR *dir;
    int  fd, result, saved_errno;

    saved_errno = errno;

    do {
#ifdef O_NOCTTY
#ifdef O_DIRECTORY
        fd = openat(dirfd, pathname, O_RDONLY | O_DIRECTORY | O_NOCTTY);
#else
        fd = openat(dirfd, pathname, O_RDONLY | O_NOCTTY);
#endif
#else
#ifdef O_DIRECTORY
        fd = openat(dirfd, pathname, O_RDONLY | O_DIRECTORY);
#else
        fd = openat(dirfd, pathname, O_RDONLY);
#endif
#endif
    } while (fd == -1 && errno == EINTR);
    if (fd == -1)
        return NULL;

    do {
        dir = fdopendir(fd);
    } while (!dir && errno == EINTR);
    if (!dir) {
        saved_errno = errno;
        do {
            result = close(fd);
        } while (result == -1 && errno == EINTR);
        errno = saved_errno;
        return NULL;
    }

    /* fd is now incorporated into the dir handle;
     * it will be closed when the dir is closed.
    */

    errno = saved_errno;
    return dir;
}
In the general case, a process can acquire a descriptor to a directory it cannot read, as long as it can enter it. In this case, that is not necessary, because you cannot get the listing from such a directory anyway. Therefore the above simplified version is perfectly adequate, but only for when you need to scan the contents of that directory. It is not sufficient if you don't necessarily need to get a listing, but only need to enter said directory; although that sounds even simpler, for that you do need the general, rather complex version of the function.

In the most general case, an opendirat() implementation requires a child process that enters the desired directory and passes the descriptor back via an ancillary socket message, because otherwise all other threads and signal handlers (and library code) would see the current working directory flipping back and forth while the code is running. Because the current working directory is process-wide, you really need to use a separate process to enter the new directory, then pass back a reference to it. Fortunately, this mess can almost always be avoided. For example, you can use a mutex or an rwlock to protect any access that has to do with the current working directory.

The difference between the general case and the above implementation is that the above implementation will only work if the current user has read rights to the directory (i.e. can see the directory listing).
 