LinuxQuestions.org
Programming This forum is for all programming questions.
The question does not have to be directly related to Linux and any language is fair game.

Old 09-13-2004, 04:03 PM   #1
unholy
Member
 
Registered: Sep 2003
Location: Eire
Distribution: Ubuntu 7.10
Posts: 344

Rep: Reputation: 30
Parsing config files


I'm just wondering whether every programmer ends up writing the same code to parse a config file, or whether there is some code out there that already does it. (You can call me lazy, but I prefer to call it reuse.)

For example, parsing a semicolon-separated path list. The code also has to account for the possibility that the user has edited the file by hand, which means it must determine whether the file is valid and recover gracefully if it is not.

Thank you for listening,

unholy
 
Old 09-14-2004, 07:42 AM   #2
chrism01
LQ Guru
 
Registered: Aug 2004
Location: Sydney
Distribution: Rocky 9.2
Posts: 18,356

Rep: Reputation: 2751
Well, the following Perl allows for an optional cfg file specified on the cmd line; otherwise it looks for a file in the current dir, named after the program.
It also allows for 1 level of recursion, ie an entry in the cfg file can point to another cfg file to be included.
The basic entry is
var=value
and blank lines and lines beginning with '#' (ie comments) are allowed, but will be ignored by the routine.

Code:
#******************************************************************************
#
# Function      : get_cmd_line_params
#
# Description   : Get the cfg filename (optional) from the cmd line,
#                 then call get_config_params() to read either the file
#                 specified, or a default cfg file if no cmd line
#                 param supplied.
#
# Params        : none
#
# Returns       : none
#
#******************************************************************************
sub get_cmd_line_params
{
    # Get the cmd line options
    getopts('c:');

    # Check if a cfg filename was supplied via -c
    if( $Getopt::Std::opt_c )
    {
        # Store it in the global hash
        $config::params{'CMD_LINE_CFG'} = $Getopt::Std::opt_c;
    }

    # Call fn to get cfg values
    get_config_params();

    return;
}

#******************************************************************************
#
# Function      : get_config_params
#
# Description   : Get config params from associated cfg file.
#
# Params        : none
#
# Returns       : none
#
#******************************************************************************
sub get_config_params
{
    my (
        $config_file,   # eg program.cfg
        $prog_dir,      # program file dir
        $prog_ext,      # program file extension
        $config_rec,    # config file record
        $key,           # hash key from config rec
        $value          # hash value from config rec
        );

    # Derive config filename:
    # Start by checking for extra cfg filename specified in
    # cfg file and loaded into hash 1st time through this routine
    # ie this is 2nd time through routine
    if( $config::params{'INCLUDE_CFG'} )
    {
        $config_file = $config::params{'INCLUDE_CFG'};

        # Ensure we don't go into an infinite loop
        delete($config::params{'INCLUDE_CFG'});
    }
    elsif( $config::params{'CMD_LINE_CFG'} )
    {
        # Get cfg filename from cmd line param
        $config_file = $config::params{'CMD_LINE_CFG'};
    }
    else
    {
        # 1st time through this routine
        # Program private cfg filename is based on prog name
        ($config_file, $prog_dir, $prog_ext) = fileparse($0, '\..*');
        $config_file .= ".cfg";
    }

    # Open config file (3-arg open, so an odd filename can't change the mode)
    open( CONFIG_FILE, '<', $config_file ) or
            die "Can't open config file: $config_file: $!\n";

    # Process config file records
    while ( defined ( $config_rec = <CONFIG_FILE> ) )
    {
        # Remove unwanted chars
        chomp $config_rec;                  # newline
        $config_rec =~ s/#.*//;             # comments
        $config_rec =~ s/^\s+//;            # leading whitespace
        $config_rec =~ s/\s+$//;            # trailing whitespace


        next unless length($config_rec);    # anything left?

        # Split 'key=value' string
        ($key, $value) = split( /\s*=\s*/, $config_rec, 2);

        # Assign to global hash, forcing uppercase keys
        $config::params{uc($key)} = $value;
    }

    # Close config file
    close (CONFIG_FILE) or
            die "Can't close config file: $config_file: $!\n";

    # Check for included cfg ie extra cfg file specified
    # in prog private cfg file.
    if( exists $config::params{'INCLUDE_CFG'} )
    {
        # If key has a value, then try to get
        # params from included cfg file,
        # else, remove unwanted key
        if( $config::params{'INCLUDE_CFG'} )
        {
            get_config_params();
        }
        else
        {
            delete($config::params{'INCLUDE_CFG'});
        }
    }
}
It creates a global hash via a package
Code:
# Declare Config pkg so we can refer to it anywhere
{
    package config;

    # Config file params
    %config::params = ();
}
and you'll need:

Code:
use Getopt::Std;        # Get cmd line params
use locale;             # Ensure correct charset for eg 'uc()'
use File::Basename;     # Extract filepath components
use strict;             # Enforce declarations
When it comes to parsing specific values, eg PATH type lines etc, you'll need to write a specific bit of code for each type of value you are going to have.
Only you know what they are going to be...
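
For the semicolon-separated path list mentioned in the first post, a value-specific routine could look roughly like the C++ sketch below. The function name split_path_list and the choice to trim spaces and drop empty entries are assumptions for illustration, not something from the Perl above.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: split a ';'-separated value into its entries,
// trimming surrounding spaces/tabs and dropping empty entries.
std::vector<std::string> split_path_list(const std::string& value, char sep = ';')
{
    std::vector<std::string> paths;
    std::string::size_type start = 0;
    while (start <= value.size()) {
        // Find the end of the current entry (separator or end of string).
        std::string::size_type end = value.find(sep, start);
        if (end == std::string::npos)
            end = value.size();

        // Trim whitespace around this entry; skip it if nothing is left.
        const std::string::size_type first = value.find_first_not_of(" \t", start);
        if (first != std::string::npos && first < end) {
            const std::string::size_type last = value.find_last_not_of(" \t", end - 1);
            paths.push_back(value.substr(first, last - first + 1));
        }
        start = end + 1;
    }
    return paths;
}
```

Calling split_path_list("/first/path;/yet/another/path") gives two entries. Whether an empty entry should be an error instead of being silently dropped depends on the application.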

HTH
 
Old 09-17-2004, 01:42 PM   #3
unholy
Member
 
Registered: Sep 2003
Location: Eire
Distribution: Ubuntu 7.10
Posts: 344

Original Poster
Rep: Reputation: 30
Thanks Chris,

that's good to know. Basically I have to write my own code! I'm writing the program in C++, and I want to make sure that a malformed config file won't throw the application off course (eg a seg fault or something).

I've written a gui tool which always outputs a well formatted config file, but I know many people would prefer to edit the file directly. (That is of course, if anyone is interested in my app).
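
A minimal C++ sketch of that kind of defensive parsing, assuming the same var=value format with '#' comments as the Perl above. ConfigResult, trim and load_config are made-up names, and the idea of collecting error messages rather than aborting is one possible policy, not the only one.

```cpp
#include <cassert>
#include <istream>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Illustrative result type: the accepted settings plus one message
// for every line that had to be skipped.
struct ConfigResult {
    std::map<std::string, std::string> values;
    std::vector<std::string> errors;
};

// Remove leading and trailing whitespace.
static std::string trim(const std::string& s)
{
    const std::string::size_type first = s.find_first_not_of(" \t\r\n");
    if (first == std::string::npos)
        return "";
    const std::string::size_type last = s.find_last_not_of(" \t\r\n");
    return s.substr(first, last - first + 1);
}

// Read "key=value" lines; '#' starts a comment. Bad lines are recorded
// and skipped, so a hand-edited file cannot abort the load.
ConfigResult load_config(std::istream& in)
{
    ConfigResult result;
    std::string line;
    int line_number = 0;
    while (std::getline(in, line)) {
        ++line_number;
        line = trim(line.substr(0, line.find('#')));   // strip comment
        if (line.empty())
            continue;

        const std::string::size_type eq = line.find('=');
        const std::string key =
            (eq == std::string::npos) ? "" : trim(line.substr(0, eq));
        if (key.empty()) {
            result.errors.push_back("line " + std::to_string(line_number) +
                                    ": expected key=value");
            continue;
        }
        result.values[key] = trim(line.substr(eq + 1));
    }
    return result;
}
```

Because everything goes through std::string, there are no fixed-size buffers to overrun. The caller can then decide whether any entry in errors is fatal or just worth a warning.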

Thanks for getting me interested in Perl.

Regards,

unholy
 
Old 09-17-2004, 04:51 PM   #4
barisdemiray
Member
 
Registered: Sep 2003
Location: Ankara/Turkey
Distribution: Slackware
Posts: 155

Rep: Reputation: 30
I don't know whether there is an API or something similar, but you can use getline() and strtok() for config files. For example, I use them for a configuration file that has entries like

Code:
option value
as in:

Code:
/*
 * conf_line : .conf file line to parse
 * conf_attr : .conf file attribute to return
 * conf_value: .conf file attribute's value to return
 */
int ns_parse_conf_line(char *conf_line,
                       char *conf_attr,
                       char *conf_value)
{
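	/* Work on a copy, because strtok() modifies its input.
	 * conf_attr and conf_value must each have room for
	 * strlen(conf_line)+1 bytes; that cannot be checked here. */
	char *tmp = (char *)calloc(BYTE * (strlen(conf_line)+1), BYTE);
	char *tkn;	/* points into tmp, so it is never freed on its own */

	if (!tmp)
		return 0;	/* out of memory */

	/* clear the outputs so a short line never returns stale data */
	conf_attr[0]  = 0x00;
	conf_value[0] = 0x00;

	strncpy(tmp, conf_line, strlen(conf_line)+1);
	tkn = strtok(tmp, " \n\v\t");
	if (tkn)	{
		strncpy(conf_attr, tkn, strlen(tkn));
		conf_attr[strlen(tkn)] = 0x00;
	}
	tkn = strtok(NULL, " \n\v\t");
	if (tkn)	{
		strncpy(conf_value, tkn, strlen(tkn));
		conf_value[strlen(tkn)] = 0x00;
	}

	free(tmp);

	return 1;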
}

int ns_get_conf_value(char *attr,		/* Attribute name whose value is wanted */
                      int  *value_num,  /* filled if value is numeric, -1234 otherwise */
                      char *value_char) /* filled if value is character-based, NULL otherwise */
{
	FILE   * conf_file;
	char   * conf_line = NULL;		/* .conf line currently being read */
	size_t   conf_line_size = 0;	/* buffer size, filled in via the
					* *n parameter of getline */
	ssize_t  conf_readed;			/* length of the line just read,
							      * i.e. the return value of getline */
	int i;

	char *parsed_conf_attr  = (char *)calloc(BYTE * 40, BYTE);
	char *parsed_conf_value = (char *)calloc(BYTE * 40, BYTE);

	/* variables for config file format check */
	short int num_space   = 0;
	short int num_invalid = 0;
	/* variables for config line types */
	short int type_comment = 0;			/* line is a comment-line */
	short int type_empty   = 0;			/* line begins with space or newline */
	short int type_valid   = 1;			/* line is valid or not */
	int conf_line_number   = 1;			/* for appending line number to log messages */

	/* For locally generated errors */
	char *err_str_local = (char *)malloc(BYTE * 200);

	if ( (conf_file = fopen(CONF_FILE_PATH, "r")) == NULL )	{
		ns_log_event("configuration file cannot be opened!", LOG_ERR, \
					 NETSENTINEL_MODE);
		perror("\aERROR: ns.conf file cannot be opened!\nPlease check the log files");
		exit(1);
	}

	/* read the config file line by line */
	while ( (conf_readed = getline(&conf_line, &conf_line_size, conf_file)) != -1 )	{
		for (i=0; i<strlen(conf_line); ++i)	{ /* check the characters in line */
			if (conf_line[i] == 0x20)
				++num_space;
			/* '/' : Database/TableName separator ( NetSentinel/Packets )
			 * '_' : Word separator               ( here_is_a_variable  )
			 * '.' : Required for file names      ( ns_packet_logs.xml  )
			 * All other characters are invalid
			 */
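			/* cast to unsigned char: passing a negative char to the
			 * <ctype.h> functions is undefined behaviour */
			if (!isalnum((unsigned char)conf_line[i]) && \
				!isspace((unsigned char)conf_line[i]) && \
				conf_line[i] != '/'    && conf_line[i] != '_'    && \
				conf_line[i] != '.')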
				++num_invalid;
		}

		/* is current line a comment? */
		if (conf_line[0] == 0x23)	{
			printf("Line %02d: comment line found\n", conf_line_number);
			type_comment = 1;
			type_valid   = 0;
			goto fire_exit;
		}
		if (conf_line[0] == 0x20 || conf_line[0] == 0x0A) {
			printf("Line %02d: invalid line (starts with newline or space)\n", \
                   conf_line_number);
			type_valid = 0;
			goto fire_exit;
		}
		if (num_space != 1)	{
			/* zero spaces means a missing value, two or more means extra tokens */
			snprintf(err_str_local, 199, \
					"[.conf File Error at Line %d] Expected exactly 2 tokens separated by one space", \
					conf_line_number);
			ns_log_event(err_str_local, LOG_ERR, NETSENTINEL_MODE);
			printf("Line %02d: invalid line (expected exactly 2 tokens separated by one space)\n", \
                   conf_line_number);
			type_valid = 0;
			goto fire_exit;
		}
		if (num_invalid)	{
			snprintf(err_str_local, 199, \
					"[.conf File Error at Line %d] Invalid characters found", \
					conf_line_number);
			ns_log_event(err_str_local, LOG_ERR, NETSENTINEL_MODE);
			printf("Line %02d: invalid line (invalid characters found)\n", \
                   conf_line_number);
			type_valid = 0;
			goto fire_exit;
		}

		if ( type_valid )	{
			ns_parse_conf_line(conf_line, parsed_conf_attr, parsed_conf_value);
			if ( !strcmp(parsed_conf_attr, attr) )	{	/* we found it! */
				/* check the option's value type (numeric or character-based) */
				if ( isdigit(parsed_conf_value[0]) )	{
					*value_num = atoi(parsed_conf_value);
					goto mission_completed;
				} else {
					strncpy(value_char, parsed_conf_value, strlen(parsed_conf_value)+1);
					goto mission_completed;
				}
			}
		}

	fire_exit:
		type_comment = 0;
		type_empty   = 0;
		type_valid   = 1;
		num_space    = 0;
		num_invalid  = 0;
		++conf_line_number;
	}

  mission_completed:
	fclose(conf_file);	/* close the config file before returning */
	free(parsed_conf_attr);
	free(parsed_conf_value);
	free(err_str_local);
	free(conf_line);	/* free conf_line allocated
                         * by getline() internally */

	return 1;
}
There may well be errors in the code (highly likely, in fact), but it should give you the idea. ns_get_conf_value() takes the option name as its first argument and reads the file line by line. For each line it checks whether the syntax is correct, whether it is a comment line, etc., and if the line is valid it passes it to ns_parse_conf_line(), which splits the line and returns the option name and option value separately. If the name matches and the value is numeric, value_num is filled in; otherwise value_char is filled in. Note that the code as posted leaves the other output untouched rather than setting it to -1234 or NULL, so initialise both before calling.
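
Since the original poster is working in C++, here is a rough sketch of the same "option value" idea using std::string, so no fixed-size buffers are involved. The names ConfValue and parse_conf_line are made up for this example. Unlike the isdigit()/atoi() check above, a value only counts as numeric here if strtol() consumes the whole token, so something like 80a is kept as text.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdlib>
#include <sstream>
#include <string>

// Illustrative value type, not part of the C code above.
struct ConfValue {
    bool        is_numeric = false;
    long        number = 0;   // meaningful only if is_numeric is true
    std::string text;         // always holds the raw token
};

// Parse one "option value" line. Returns false for comments, blank
// lines and lines that do not hold exactly two tokens.
bool parse_conf_line(const std::string& line, std::string& option, ConfValue& value)
{
    std::istringstream in(line);
    std::string name, token, extra;
    if (!(in >> name >> token) || name[0] == '#' || (in >> extra))
        return false;

    ConfValue parsed;
    parsed.text = token;

    // Numeric only if strtol consumes every character of the token
    // and the result fits in a long.
    char* end = nullptr;
    errno = 0;
    const long n = std::strtol(token.c_str(), &end, 10);
    if (errno == 0 && end != token.c_str() && *end == '\0') {
        parsed.is_numeric = true;
        parsed.number = n;
    }

    option = name;
    value = parsed;
    return true;
}
```

The outputs are only written when the line is accepted, so a rejected line cannot leave half-filled results behind.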

I'd also like to know whether there is an existing interface for .conf files. There is no need to re-invent the wheel, I think.
 
Old 09-18-2004, 09:38 AM   #5
llama_meme
Member
 
Registered: Nov 2001
Location: London, England
Distribution: Gentoo, FreeBSD
Posts: 590

Rep: Reputation: 30
It might be a bit over the top for a simple configuration file, but you could try using the Spirit parser framework (http://spirit.sourceforge.net/). It allows you to write recursive descent grammars directly in C++, so you can parse (pretty much) anything that you could parse using Lex/Yacc, but without the hassle of code generation. Because malformed input simply fails to match the grammar, it should also help keep your parser from segfaulting.

Alex
 
Old 09-18-2004, 03:59 PM   #6
unholy
Member
 
Registered: Sep 2003
Location: Eire
Distribution: Ubuntu 7.10
Posts: 344

Original Poster
Rep: Reputation: 30
Thanks for the code barisdemiray,

and I agree that there is no point in reinventing the wheel. How many thousands, if not millions, of Unix / GNU programmers have written code to perform the same task? And, as you said, there could be bugs!

Config files (from the few I have seen) usually have the same 'grammar' regardless of the vocab, e.g.
Code:
[variable]=value;

[checked]=true;

paths=/first/path;/yet/another/path

[defaultapp]=frozenbubble
Maybe this task is considered too simple for a dedicated library, like llama_meme said.
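
For what it's worth, a hand-written C++ sketch that accepts the sample lines above is fairly short. It assumes each line has already been trimmed, that the brackets around the name are optional, and that a single trailing ';' is a terminator and not part of the value. Entry and parse_entry are hypothetical names, and this is plain string handling, not Spirit.

```cpp
#include <cassert>
#include <string>

// Hypothetical entry type for the sample grammar above.
struct Entry {
    std::string name;
    std::string value;
};

// Parse  name=value  or  [name]=value , with an optional trailing ';'.
// Returns false (leaving 'out' untouched) if the line does not fit.
bool parse_entry(const std::string& line, Entry& out)
{
    const std::string::size_type eq = line.find('=');
    if (eq == std::string::npos || eq == 0)
        return false;                  // no '=' or empty name

    std::string name = line.substr(0, eq);
    std::string value = line.substr(eq + 1);

    // Strip one pair of surrounding brackets from the name, if present.
    if (name.front() == '[') {
        if (name.size() < 3 || name.back() != ']')
            return false;              // "[name" or "[]": malformed
        name = name.substr(1, name.size() - 2);
    }

    // A single trailing ';' terminates the value and is not part of it.
    if (!value.empty() && value.back() == ';')
        value.pop_back();

    if (name.find_first_of("[] \t") != std::string::npos)
        return false;                  // stray bracket or space in the name

    out.name = name;
    out.value = value;
    return true;
}
```

Semicolons inside the value are left alone, so the paths line keeps its full list and can be split in a separate step.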
 
  


All times are GMT -5.
