Ruby: configuration file parsing utility

Posted 04-18-2011 at 10:12 AM by catkin

Netsearching found three options for Ruby to parse configuration files. I rolled my own instead and offer it here for community use. I was delaying publishing because it is not finished (there are TODOs in the code), but it works well enough and may never be finished, so it is better to publish it now.

The code generates detailed error messages if the config file is defective -- useful when creating config files.

It may be easier to understand the code when its functionality is clear so here is a commented sample configuration file:
Code:
# * Leading and trailing whitespace is discarded.
# * Lines beginning with # and empty lines are ignored.
# * Lines ending in \ are continuation lines. The \ is removed and the contents
#   of the following line appended (any whitespace before the \ is retained).
# * For string values: 
#   - What is left must be of the format keyword = value
#   - Whitespace after the keyword is discarded.
#   - Whitespace before the value is discarded.
# * For array values: 
#   - What is left must be of the format keyword = [
#   - Followed by none or more lines, each one being an array member string
#     These lines may be continued using a trailing \
#   - Followed by a line containing a single ]
# * For hash values: 
#   - What is left must be of the format keyword = {
#   - Followed by none or more lines, each one being a hash key and value like
#       my_key => my_value
#     These lines may be continued using a trailing \
#   - Followed by a line containing a single }

CollationRootDir = /srv/docoll_dev/

Database = { 
  host => localhost
  db_name => docoll_dev
  password => some_password
  port => 5432
  user => superuser
}

# LeadingDirsToStrip regexes in config files are case-insensitive
LeadingDirsToStrip = [
  /srv/hardcopy_indexes/
  /srv/old_KT/
  /srv/rsync/[^/]*/
  ^home/[^/]*/
  ^[A-Z]/
  .*/All Users/Documents/
  .*/My Documents/
  .*/My Music/
  .*/My Pictures/
]
When programming, the first thing to do is to provide a complete set of default parameters:
Code:
def InitialiseParameters
  $parameters = Hash.new

  # In alphabetical order ...

  $parameters[ "ConfigFile" ] = ""

  $parameters[ "CollationRootDir" ] = "/srv/docs/"

  $parameters[ "LeadingDirsToStrip" ] = [ \
    %r|/srv/rsync/| \
  ]

  $parameters[ "Database" ] = { \
    :db_name => "collation",
    :host => "localhost",
    :password => "whatever",
    :port => 5432,
    :user => "superuser"
  }

  $parameters[ "SourceRootDirs" ] = [ "/srv/rsync/" ]
end
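Note that values read from a config file arrive as plain strings (hash keys too), whereas the defaults above use a Regexp, an Integer and symbol keys. A caller will probably want to normalise parsed values before use. The following is only a sketch of one way to do that: the parsed hash is hand-written sample data standing in for what the parser would store, and the conversions are an assumption, not part of the utility.

```ruby
# Hand-written stand-in for values as stored after parsing (all strings).
parsed = {
  "Database" => { "host" => "localhost", "port" => "5432" },
  "LeadingDirsToStrip" => [ "/srv/rsync/[^/]*/", "^[A-Z]/" ]
}

# Convert the port to an Integer; Integer() raises ArgumentError on junk.
port = Integer( parsed[ "Database" ][ "port" ] )

# Symbolise the hash keys so they match the style used by the defaults.
database = {}
parsed[ "Database" ].each { |k, v| database[ k.to_sym ] = v }
database[ :port ] = port

# Compile the strings as case-insensitive regexes.
regexes = parsed[ "LeadingDirsToStrip" ].map do |pattern|
  Regexp.new( pattern, Regexp::IGNORECASE )
end
```

Doing the conversion in the caller keeps the parser itself free of any knowledge about individual keywords.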
If a config file is to be parsed (this example assumes it can only be named on the command line):
Code:
  # Parse any config file
  # Must do now so config file settings can be over-ridden by the command line
  x = ARGV.index( "--config" ) 
  if x != nil && ARGV[ x + 1 ] != nil
    config_file_error_msg = ParseConfigFile( ARGV[ x + 1 ], $parameters.keys )
  else
    config_file_error_msg = ''
  end
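The ordering matters because each later source overrides the earlier one: defaults, then config file, then command line. The utility does this by updating $parameters in place; purely to illustrate the precedence, here is a sketch using Hash#merge instead. The three hashes are made-up sample data and the command-line value is hypothetical.

```ruby
# Sketch only: layer settings so that later sources win.
defaults     = { "CollationRootDir" => "/srv/docs/",
                 "SourceRootDirs"   => [ "/srv/rsync/" ] }
from_config  = { "CollationRootDir" => "/srv/docoll_dev/" }  # as if read from a file
from_cmdline = { "SourceRootDirs"   => [ "/mnt/backup/" ] }  # hypothetical option value

# Hash#merge returns a new hash; on duplicate keys the argument's value wins.
effective = defaults.merge( from_config ).merge( from_cmdline )
```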
The ParseConfigFile method receives a list of valid keys and checks for empty values; more specific error trapping is left to the caller, which allows ParseConfigFile to be a library utility:
Code:
require 'English'   # provides $INPUT_LINE_NUMBER, the readable alias of $.

def ParseConfigFile( config_path, *valid_keywords )
  # TODO: change from valid_keywords to desired_keywords?
  # TODO: nice to pass name of variable to load with config data rather than assuming $parameters
  # TODO: support "=" in value

  begin
    fd = File.open( config_path, 'r' )
  rescue => error_info
    return "\n  Config file: #{ error_info }"
  end
  error_msg = ""
  while true
    key, value, get_error_msg = GetConfigFileData( fd )
    if get_error_msg == ''
      if key == ""; break end
      # Validate key
      valid = false
      valid_keywords[ 0 ].each \
      do |valid_keyword|
        if key == valid_keyword; valid = true; break; end
      end
      if valid
        $parameters[ key ] = value
      else
        error_msg += "\n  Invalid keyword '#{ key }' on line number #{ $INPUT_LINE_NUMBER }"
      end 
  
      # Validate value
      if value == ""
        error_msg += "\n  No value on line number #{ $INPUT_LINE_NUMBER }"
      end
    else
      error_msg += get_error_msg
      break
    end
  end
  if error_msg != ''
    error_msg = "Configuration file (#{ config_path }) error(s):" + error_msg
  end
  fd.close
  return error_msg
end
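The validation step collects a message for every problem rather than stopping at the first, so one run reports everything wrong with a config file. A minimal standalone sketch of that idea, using Array#include? in place of the explicit loop (validate_pairs is a hypothetical name, not part of the utility, and it takes ready-made key/value pairs instead of reading a file):

```ruby
# Hypothetical helper: returns '' when all pairs are acceptable,
# otherwise one message line per problem found.
def validate_pairs( pairs, valid_keywords )
  error_msg = ''
  pairs.each do |key, value|
    unless valid_keywords.include?( key )
      error_msg += "\n  Invalid keyword '#{ key }'"
    end
    if value == ''
      error_msg += "\n  No value for '#{ key }'"
    end
  end
  error_msg
end

valid = [ "CollationRootDir", "Database" ]
puts validate_pairs( [ [ "CollationRootDir", "" ], [ "Colour", "red" ] ], valid )
```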
The nitty-gritty of parsing is done by the GetConfigFileData method, called once for each configuration element:
Code:
def GetConfigFileData( fd )
  # TODO: don't treat # in quoted values as a comment
  # TODO: allow comments after data

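  # The *_start_line variables must exist before the error paths that use them
  array_start_line = ''
  continuing = false
  data = ''
  getting_array = false
  getting_hash = false
  hash_start_line = ''
  key = ''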
  begin
    while (line = fd.readline)
      line.strip!

      # Ignore comments and empty lines
      if line.start_with?( '#' ) || line == ''; next end

      # Gather key and value
      data += line
      continuing = false
      case data[ -1 ]
        when "\\"
          # Continuation: drop the \ but keep any whitespace before it
          continuing = true
          data.slice!( -1, 1 )
          next
        when "["
          if getting_array
            return nil, nil, \
              "\n  New array started before end of array started on line " \
              + array_start_line
          end
          if getting_hash
            return nil, nil, \
              "\n  Array started before end of hash started on line " \
              + hash_start_line
          end
          data.slice!( -1, 1 )
          data.rstrip!
          array_start_line = "#{ $INPUT_LINE_NUMBER }"
          if data[ -1 ] != "="
            return nil, nil, \
              "\n  Array start line not of format 'key = [' on line " \
              + array_start_line
          end
          getting_array = true
          array = Array.new
        when "]"
          if ! getting_array
            return nil, nil, \
              "\n  Array end on line #{ $INPUT_LINE_NUMBER } before array started"
          end
          data.slice!( -1, 1 )
          data.rstrip!
          if data != ''
            return nil, nil, \
              "\n  Data invalidly given before ] on line #{ $INPUT_LINE_NUMBER }"
          end
          return key, array, ""
        when "{"
          if getting_array
            return nil, nil, \
              "\n  Hash started before end of array started on line " \
              + array_start_line
          end
          if getting_hash
            return nil, nil, \
              "\n  New hash started before end of hash started on line " \
              + hash_start_line
          end
          data.slice!( -1, 1 )
          data.rstrip!
          hash_start_line = "#{ $INPUT_LINE_NUMBER }"
          if data[ -1 ] != "="
            return nil, nil, \
              "\n  Hash start line not of format 'key = {' on line " \
              + hash_start_line
          end
          getting_hash = true
          hash = Hash.new
        when "}"
          if ! getting_hash
            return nil, nil, \
              "\n  Hash end on line #{ $INPUT_LINE_NUMBER } before hash started"
          end
          data.slice!( -1, 1 )
          data.rstrip!
          if data != ''
            return nil, nil, \
              "\n  Data invalidly given before } on line #{ $INPUT_LINE_NUMBER }"
          end
          return key, hash, ""
      end
      if key == ''
        if ! data.include?( '=' )
          return nil, nil, "\n  No = in #{ data }"
        end
        key, data, rest = data.split( '=' )
        if rest != nil
          return nil, nil, "\n '=' not supported in value (" + data + rest + ')'
        end
        key.rstrip!
        if data == nil
          data = ""
        else
          data.lstrip!
        end
      end
      if data != nil && data != ''
        if getting_array
          array += [ data ]
          data = ""
        elsif getting_hash
          # TODO: would be nice to accept "key: value" too
          if ! data.include?( " => " )
            return nil, nil, \
              "\n  Hash key/value on line #{ $INPUT_LINE_NUMBER } does not have ' => '"
          end
          hash_key, hash_value = data.split( " => " )
          hash = hash.merge!( { hash_key.rstrip => hash_value.lstrip } )
          data = ""
        else
          return key, data, ""
        end
      end
    end
  rescue EOFError
    # An empty key with an empty message tells the caller the file ended cleanly
    error_msg = ''
    if continuing
      error_msg += "\n  End of file found when continuation line expected"
    end
    if getting_array
      error_msg += "\n  End of file found before end of array started on line " \
        + array_start_line
    end
    if getting_hash
      error_msg += "\n  End of file found before end of hash started on line " \
        + hash_start_line
    end
    return "", "", error_msg
  end
end
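The continuation-line rule is the fiddliest part to follow inside the larger method, so here it is in isolation. This is a cut-down sketch, not the utility itself: read_logical_line is a hypothetical name, it handles only comments, blank lines and trailing \ as described in the sample config file, and it reads from a StringIO so it can be tried without a file.

```ruby
require 'stringio'

# Hypothetical helper: returns the next logical line with continuation
# lines joined, or nil at end of input.
def read_logical_line( io )
  data = ''
  while (line = io.gets)
    line = line.strip
    # Ignore comments and empty lines
    next if line.empty? || line.start_with?( '#' )
    data += line
    if data.end_with?( "\\" )
      # Drop the \ (whitespace before it is kept) and read more
      data = data.chomp( "\\" )
      next
    end
    return data
  end
  data.empty? ? nil : data
end

io = StringIO.new( "# comment\nkey = first \\\n  second\n" )
puts read_logical_line( io )   # prints: key = first second
```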
For completeness, here is a simple example of error traps done after parsing the config file:
Code:
def CheckParameters( )
  error_msg = ''
  error_msg += CheckDir( $parameters[ "CollationRootDir" ], 'w' )
  $parameters[ "SourceRootDirs" ].each \
  do |source_dir|
    error_msg += CheckDir( source_dir, 'r' )
  end
  return error_msg
end
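CheckDir is not shown above. The version below is only a guess at what such a helper might look like, assuming it returns an empty string on success and an error message fragment otherwise, with 'r' and 'w' meaning the directory must be readable or writable:

```ruby
# Hypothetical CheckDir: returns '' when the directory is usable,
# otherwise an error message fragment. mode is 'r' or 'w'.
def CheckDir( path, mode )
  return "\n  #{ path } is not a directory" unless File.directory?( path )
  case mode
    when 'r'
      return "\n  #{ path } is not readable" unless File.readable?( path )
    when 'w'
      return "\n  #{ path } is not writable" unless File.writable?( path )
  end
  ''
end
```

Returning a string rather than raising fits the rest of the code, where messages are concatenated and reported together.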