[SOLVED] Have I found a bug in the GAWK asorti() function?

PTrenholme · 09-05-2016, 07:28 PM

I have an alias:

Code:

alias Mounted='gawk '\''/^\//{print gensub("\\\\040"," ","g",$2)}'\'' /etc/mtab | sort'

that I wanted to improve to include mount-point information. The alias I used for that:

Code:

alias mounted='mount | grep -v "^[^/]" | sort'

produced more than I wanted and seldom fits in a terminal window.

So I started to write a GAWK program (which I'll post later), but the output was not what I expected when I wanted to see my mounts listed by their mount-points. I finally figured out that asorti was not returning what I expected. I added code to my Mounted.gawk program to do what I expected, and made the part that uses the asorti function callable as a "test" option.

So, the program has three options: To display the file system name and mount sorted by name, device, or "test". Here is the output I got:

Code:

$ ./Mounted.gawk byname      
/                                       (/dev/md127)
/Backups                                (/dev/sdc1)
/Debian                                 (/dev/sdb5)
/Fedora                                 (/dev/sdb1)
/SD                                     (/dev/sdd1)
/Win10                                  (/dev/sda2)
/Win10/HP_Recover                       (/dev/sda3)
/Win10/System                           (/dev/sda1)
/mnt/ISO/1635 - The Eastern Front       (/dev/loop1)
               "                        (/dev/loop0)
/mnt/ISO/Cryoburn                       (/dev/loop15)
/mnt/ISO/Invasion                       (/dev/loop11)
/mnt/ISO/Storm from the Shadows         (/dev/loop12)
/mnt/ISO/The Best of Jim Baens Universe (/dev/loop6)
/mnt/ISO/The Claws that Catch           (/dev/loop7)
/mnt/ISO/The Spider                     (/dev/loop13)
        "                               (/dev/loop14)
/mnt/ISO/This Septer'd Isle             (/dev/loop8)
/mnt/ISO/Unto the Breach                (/dev/loop4)
           "                            (/dev/loop5)
/mnt/ISO/When the Tide Rises            (/dev/loop3)
             "                          (/dev/loop2)
/mnt/ISO/Windrider's Oath               (/dev/loop9)
           "                            (/dev/loop10)

$ ./Mounted.gawk by-dev
/mnt/ISO/1635 - The Eastern Front       (/dev/loop0)
/mnt/ISO/1635 - The Eastern Front       (/dev/loop1)
/mnt/ISO/When the Tide Rises            (/dev/loop2)
/mnt/ISO/When the Tide Rises            (/dev/loop3)
/mnt/ISO/Unto the Breach                (/dev/loop4)
/mnt/ISO/Unto the Breach                (/dev/loop5)
/mnt/ISO/The Best of Jim Baens Universe (/dev/loop6)
/mnt/ISO/The Claws that Catch           (/dev/loop7)
/mnt/ISO/This Septer'd Isle             (/dev/loop8)
/mnt/ISO/Windrider's Oath               (/dev/loop9)
/mnt/ISO/Windrider's Oath               (/dev/loop10)
/mnt/ISO/Invasion                       (/dev/loop11)
/mnt/ISO/Storm from the Shadows         (/dev/loop12)
/mnt/ISO/The Spider                     (/dev/loop13)
/mnt/ISO/The Spider                     (/dev/loop14)
/mnt/ISO/Cryoburn                       (/dev/loop15)
/                                       (/dev/md127)
/Win10/System                           (/dev/sda1)
/Win10                                  (/dev/sda2)
/Win10/HP_Recover                       (/dev/sda3)
/Fedora                                 (/dev/sdb1)
/Debian                                 (/dev/sdb5)
/Backups                                (/dev/sdc1)
/SD                                     (/dev/sdd1)

$ ./Mounted.gawk test
/mnt/ISO/The Spider                     (/dev/loop14)
/mnt/ISO/When the Tide Rises            (/dev/loop3)
/mnt/ISO/Cryoburn                       (/dev/loop15)
/mnt/ISO/Unto the Breach                (/dev/loop4)
/mnt/ISO/Unto the Breach                (/dev/loop5)
/mnt/ISO/The Best of Jim Baens Universe (/dev/loop6)
/mnt/ISO/The Claws that Catch           (/dev/loop7)
/mnt/ISO/This Septer'd Isle             (/dev/loop8)
/mnt/ISO/Windrider's Oath               (/dev/loop9)
/mnt/ISO/Windrider's Oath               (/dev/loop10)
/mnt/ISO/Invasion                       (/dev/loop11)
/mnt/ISO/1635 - The Eastern Front       (/dev/loop0)
/mnt/ISO/Storm from the Shadows         (/dev/loop12)
/mnt/ISO/1635 - The Eastern Front       (/dev/loop1)
/mnt/ISO/The Spider                     (/dev/loop13)
/mnt/ISO/When the Tide Rises            (/dev/loop2)
/                                       (/dev/md127)
/Debian                                 (/dev/sdb5)
/SD                                     (/dev/sdd1)
/Backups                                (/dev/sdc1)
/Win10/HP_Recover                       (/dev/sda3)
/Win10                                  (/dev/sda2)
/Fedora                                 (/dev/sdb1)
/Win10/System                           (/dev/sda1)

Note that the last one, the "test" run, has the device list in an (almost) random order. If you look at the code below, you'll see that case 2 and case 3 in the switch statements are (almost) identical. In fact, in the first switch statement (data preparation), case 2 and case 3 are identical. In the second switch statement, in the END section, the only difference is that case 2 calls my Qsorti instead of asorti, but with identical arguments.

So, after all that, have I found a "bug" in asorti, or just misunderstood what asorti was supposed to accomplish?

Here's the Mounted.gawk program:

Code:

#!/bin/gawk -f
###############################################################################
#
# Get the list of mounted file systems from /proc/self/mounts
# and the length of the longest mounted file system's name
#
# NOTE: Lines in /proc/self/mounts which do not represent
#       device mounts (i.e, do not have a "/" as the initial
#       character) are ignored.
#
# This section places "/proc/self/mounts" in the argument list if none of the
# arguments are readable file names. It also looks for any of the output
# specification fields (by_name, by_dev, or test), deletes them from the
# argument list and queues the last one found for execution.
#
# The default action, if none is specified is "by_name"
#
# If a readable file is found it will be read instead of /proc/self/mounts.
#
# NOTE: Action specifications are not case sensitive, and the underscore or
#       dash between the "by" and the key is optional. (You may also use a
#       blank between the "by" and the key if you enclose the phrase in
#       quotes, but that seems more work than would be warranted.)
#
###############################################################################
BEGIN {
  found_file=0
  order=1
  for (i=1; i<ARGC; ++i) {
    # Is this argument a file name?
    if (test("-e " ARGV[i])) {
      found_file=1
      continue
    }
    # Is it one of our sort options?
    switch (tolower(ARGV[i])) {
      case /^by[ _-]?name/: {
        order=1
        ARGV[i]=""
        break
      }
      case /^by[ _-]?dev/: {
        order=2
        ARGV[i]=""
        break
      }
      case /test/: {
        order=3
        ARGV[i]=""
        break
      }
    }
  }
  if (!found_file) {
    ARGV[ARGC]="/proc/self/mounts"
    ++ARGC
  }
  ln=0
  fs=0134 # The field separator (\) is (apparently) valid in Linux file system names . . .
}

# Read the input file, and process any line with an initial "/"
/^\// {
  mount_point = $1
  mount = $2
  # "Fix" the special characters permitted in file names
  # which are forbidden in /proc/mounts
  mount=gensub(/\\040/," ", "g",mount) # blank
  # NOTE: File names containing any of the next three
  # allowed characters may display somewhat weirdly.
  # NOTE: For this program, these characters are shown as
  #       their usual escape sequences.
  if (mount ~ /\\[01]/) {
    mount=gensub(/\\011/,"\\t", "g",mount) # tab
    mount=gensub(/\\012/,"\\n", "g",mount) # new line
    mount=gensub(/\\134/,"\\\\","g",mount) # field separator
  }
  #
  # If we have any other octal-coded characters, try to
  # replace them with the character they represent assuming
  # that they are UTF-8 characters, encoded as octals (without a
  # leading zero following the back slash.)
  #
  # (NOTE: As far as I know, this is not necessary, but (IIRC) the Linux posix
  #        specifications, at one time, restricted file names to valid
  #        ASCII characters so it's conceivable that UTF-8 characters might
  #        be octal-encoded in /proc/mounts.)
  #
  if (mount ~ /\\[0-7]/) {
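    # split() takes (string, array, separator, seps); strtonum() reads a
    # string with a leading "0" as octal.
    n = split(mount, part, /\\[0-7]+/, sep)
    mount = part[1]
    for (i = 2; i<=n; ++i) {
      mount = (mount sprintf("%c", strtonum("0" substr(sep[i-1],2))) part[i])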
    }
  }
  #
  # Get the maximum mount_point name length
  ln = ((m = length(mount)) > ln)?m:ln
  #
  # Post the mount_point and device information to mtab in the format needed
  switch (order) {
    # For a list by name, the index of mtab is the name and the data is
    # a colon-delimited list of devices to which it is attached.
    case 1:
    {
      mtab[mount]=(((mount in mtab)?mtab[mount]":":"") mount_point)
      break
    }
    # For a list by device, the index of mtab is a device, and the value
    # is a colon-delimited list of file system names attached to that device.
    # (I.e.: The primary mount and any additional bind mounts.)
    case 2:
    case 3:
    {
      mtab[mount_point]=((mount_point in mtab)?":":"") mount
      break
    }
  }
}

END {
  part[""]=""
  delete part[""]
  fmt=("%-" ln "s (%s)\n")
  switch (order) {
    case 1: # List mounts by name
    {
      n=asorti(mtab,mtabi)
      for (i=1;i<=n;++i) {
        name = mtabi[i]
        k = split(mtab[mtabi[i]], part, /:/)
        for (j=1; j<= k; ++j) {
          if (j == 2) {
            l = int(length(name)/2)
            name = ""
            for (m=1; m<l; ++m) {
              name = (name " ")
            }
            name = (name "\"")
          }
          printf(fmt, name, part[j])
        }
      }
      break
    }
    case 2: # List mount by device using Qsorti(), an asorti "clone" based on
            # the "quicksort" program shipped with gawk.
    {
      n = Qsorti(mtab, mtabi, "cmp_numeric_suffix_within_non_numeric_prefix")
      print_values(mtab, mtabi, n, fmt)
      break
    }
    case 3: # List the same data using the asorti function
            # (I'm puzzled that asorti, using the same comparison function,
            # seems to produce a different result.)
    {
      n = asorti(mtab, mtabi, "cmp_numeric_suffix_within_non_numeric_prefix")
      print_values(mtab, mtabi, n, fmt)
      break
    }
  }
}

# Function to return "true" if the system "test" executable so evaluates
function test(expression)
{
  return !system(("env test " expression))
}

# Function used to output the values for both the "bydev" and "test" cases
function print_values(\
  mtab,     # Data values
  mtabi,    # Array of index values in the order that mtab should be printed
  n,        # Length of the mtabi array
  fmt,      # Format to use whilst printing
  # Locals:
  i,j,k,    # Integers
  ind,      # Current index value (The "mount_point".)
  part)     # Array to hold the split() output
{
  for (i=1;i<=n;++i) {
    ind = mtabi[i]
    if (isarray(mtab[ind])) {
      print "mtab[\""ind"\"] is an array. Skipping print."
    }
    else
    {
      k = split(mtab[ind], part, /:/)
      for (j=1; j<= k; ++j) {
        printf(fmt, part[j], ind)
      }
    }
  }
}
###############################################################################
#
# Generic function to order two values of the form "ccccdddd" where "c" is
# a character and "d" is a digit. The "c" preceding the first "d", if any, can
# not (of course) be a digit.
#
# NOTE: Only trailing digits are used for the "numeric" part. (E.g.: The
#       "numeric part" of 1.3e-15 is "15" which may not be an expected value.)
#
# NOTE: See the code below for the treatment of the special cases of missing
#       "ccc" and/or "ddd" parts in any of the input arguments.
#
###############################################################################
function cmp_numeric_suffix_within_non_numeric_prefix(\
  v1,       # Value of element one
  v2,       # Value of element two
  # Local variables:
  prefix1,  # Part of v1 preceding num1
  num1,     # Trailing numeric part of v1
  prefix2,  # Part of v2 preceding num2
  num2,     # Trailing numeric part of v2
  part,     # Array in which decompose() returns the prefix and suffix values
  rv)       # Return value (Not really needed, but a help in debugging.)
{
  # NOTE: This function might be called when either v1 or v2 is an array.
  #       The standard is that arrays sort after any non-array values,
  #       and that all array values are "equal" when compared, so their final
  #       order, other than "after non-array values" is undefined.
  if (isarray(v1) || isarray(v2)) {
    # At least one of v1 and v2 must be an array.
    if (isarray(v1)) {
      # O.K., v1 is an array. Return 0 if v2 is also an array, otherwise 1
      return ((isarray(v2))?0:1)
    }
    # If we get here, v1 is not an array, so v2 must be one.
    return -1 # therefore v1 must precede v2
  }
  # If we get here neither v1 nor v2 is an array.
  #
  # Break v1 and v2 into their component parts.
  delete part
  decompose(v1, part)
  prefix1 = part[1]
  num1    = part[2]
  decompose(v2, part)
  prefix2 = part[1]
  num2    = part[2]
  # Do we have prefix values for both input values?
  if (prefix1 && prefix2) {
    # If the prefix values are different, just compare them.
    if (prefix1 != prefix2) {
      rv = ((prefix1 < prefix2)? -1: 1)
      return rv
    }
    # If we get here, the prefix values are the same. Compare
    # the suffix values.
    rv = ((num1 < num2)? -1: (num1 != num2))
    return rv
  }
  # If we are here, at least one of the prefix values is null.
  if (prefix1 && (! prefix2)) { # null precedes non-null
    return 1
  }
  if ((! prefix1) && prefix2) { # null precedes non-null
    return -1
  }
  # Both null? Then just compare the numeric values.
  # NOTE: This assumes that an empty file name will never occur in /proc/mounts.
  rv = ((num1 < num2)? -1: (num1 != num2))
  return rv
}
# Break a string into the part preceding any terminal digit string and that string.
function decompose(\
  v,    # Value to decompose
  part, # Array in which to return the components
  # Local variable:
  n)    # Start of the terminal digit string
{
  n = match(v, /[[:digit:]]+$/)
  if (n) {
    part[1] = substr(v, 1, n-1)
    part[2] = 0 + substr(v, n)
    return 1
  }
  part[1] = v
  part[2] = 0
  return 0
}
###############################################################################
#
# Replacement of the asorti function that (seems to) fail to work properly.
#
###############################################################################
function Qsorti(\
  inArray,    # The values to be sorted
  outArray,   # The array containing the sorted data
  # Optional variables:
  compare,    # Name of a function that returns a negative value if inArray[i] precedes inArray[j]
  # Recursive function control variables:
  left,       # The starting index of the sub-array being sorted (initially 1)
  right,      # The ending index of the sub-array being sorted (initially length(inArray))
  # Local variables:
  i,          # integer (loop index)
  j,          # integer (temporary index holder)
  temp,       # temporary data holder
  last)       # endpoint position
{
  # Sanity checks:
  # Do we have any input?
  if (!isarray(inArray)) {
    return 0
  }
  # Do we have a comparison function?
  if (compare=="" || (!(compare in FUNCTAB))) {
    # Has a default ordering for the "in" operator been defined?
    if (PROCINFO["sorted_in"]) {
      # Since the "sorted_in" function controls the "in"
      # operator's return order, all we need to do is populate
      # outArray from inArray.
      last=0
      for (i in inArray) {
        ++last
        outArray[last] = i
      }
      return last
    }
    else {
      # No default ordering. Use a "order by string, ascending" default function.
      compare = "cmp_str_val"
    }
  }
  # Initialization
  if (!left) {
    left = 1
    right = 0
    for (i in inArray) {
      ++right
      outArray[right] = i
    }
  }
  if (left >= right) {  # Nothing to do.
    return length(outArray)
  }
  # Swap the first element of the sub-array with the middle one
  j = int((left + right) / 2)
  temp = outArray[left]
  outArray[left] = outArray[j]
  outArray[j] = temp
  last = left
  for (i = left + 1; i <= right; i++) {
    if (@compare(outArray[i], outArray[left])<0) {
      ++last
      temp=outArray[i]
      outArray[i]=outArray[last]
      outArray[last]=temp
    }
  }
  temp=outArray[left]
  outArray[left]=outArray[last]
  outArray[last]=temp
  Qsorti(inArray, outArray, compare, left, last - 1)
  Qsorti(inArray, outArray, compare, last + 1, right)
  return length(outArray)
}
# Default index sort function
#
# NOTE: This is a slightly modified copy of the function with this name in the
# gawk info file. (The modification is to "handle" arrays as v1 or v2. NOTE: untested.)
#
function cmp_str_val(v1, v2)
{
  # Order any array values after any other values, and do not impose any order
  # on any such array values.
  if (isarray(v1) || isarray(v2)) {
    # At least one of v1 and v2 must be an array . . .
    if (isarray(v1)) {
      # O.K., v1 is an array. Return 0 if v2 is also an array, otherwise 1
      return ((isarray(v2))?0:1)
    }
    # If we get here, v1 is not an array, so v2 must be one.
    return -1 # therefore v1 must precede v2
  }
  # string value comparison, ascending order
  v1 = v1 ""
  v2 = v2 ""
  if (v1 < v2) {
    return -1
  }
  return (v1 != v2)
}

mpapet · 09-06-2016, 11:27 AM

Don't assume it's a bug.
You need to ask the question either in a bug report or on a mailing list.
https://www.gnu.org/software/gawk/

http://savannah.gnu.org/mail/?group=gawk

grail · 09-06-2016, 12:39 PM

I do not have an answer at this point, but out of curiosity, you have created your own sort function and your own comparison function but then blame the builtin function for performing incorrectly.
What happens if you call your sort and the builtin with default behaviour? (ie. not influencing the comparison as well)

I am not disagreeing that the 'test' option output seems a bit all over the place, but perhaps that is due to your comparison? As an example, I printed the values of 'v1' and 'v2' for both the call to 'test' and 'by-dev', and the output of the print statement is radically different.
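
A rough sketch of that kind of check, for anyone who wants to repeat it: give asorti() a comparison function that prints every argument it is handed. The name trace_cmp and the two sample mtab entries are made up for illustration; only asorti() itself is real gawk, and the shell function wrapper is just there to make the snippet easy to run.

```shell
# Print the arguments gawk passes to an asorti() comparison function.
show_cmp_args() {
  gawk '
    # Hypothetical tracing comparator. It declares four parameters so
    # that every argument handed over by asorti() can be printed.
    function trace_cmp(a, b, c, d) {
      printf "arg1=%s arg2=%s arg3=%s arg4=%s\n", a, b, c, d
      # Order by the first and third arguments (the two indices).
      return (a < c) ? -1 : (a != c)
    }
    BEGIN {
      # Sample data only: device as index, mount name as value.
      mtab["/dev/sda1"] = "/boot"
      mtab["/dev/sdb1"] = "/home"
      asorti(mtab, mtabi, "trace_cmp")
    }
  '
}

show_cmp_args
```

Each trace line should show an index followed by its own value, then the second index followed by its value. A function that looks only at its first two parameters would therefore be comparing a device name with its own mount name, not two device names.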

PTrenholme · 09-06-2016, 04:59 PM

Thank you, grail. I'm an idiot!

The info file PLAINLY tells me that a comparison function takes FOUR arguments, not TWO. But, since I was focused only on the index values, well...

I'll post the corrected code here as soon as I get it to run.

PTrenholme · 09-06-2016, 06:55 PM

O.K., here's my new, improved code: (The two include files are below the first one.)

Note that I changed the multiple mounts logic to use two-dimensional arrays instead of colon-delimited strings.

Code:

#!/bin/gawk -f
@include "test.awk"
@include "cmp_by_numeric_suffix_within_alpha_prefix.gawk"
###############################################################################
#
# Get the list of mounted file systems from /proc/self/mounts
# and the length of the longest mounted file system's name
#
# NOTE: Lines in /proc/self/mounts which do not represent
#       device mounts (i.e, do not have a "/" as the initial
#       character) are ignored.
#
# This section places "/proc/self/mounts" in the argument list if none of the
# arguments are readable file names. It also looks for any of the output
# specification fields (by_name, by_dev, or test), deletes them from the
# argument list and queues the last one found for execution.
#
# The default action, if none is specified is "by_name"
#
# If a readable file is found it will be read instead of /proc/self/mounts.
#
# NOTE: Action specifications are not case sensitive, and the underscore or
#       dash between the "by" and the key is optional. (You may also use a
#       blank between the "by" and the key if you enclose the phrase in
#       quotes, but that seems more work than would be warranted.)
#
###############################################################################
BEGIN {
  found_file=0
  order=1
  for (i=1; i<ARGC; ++i) {
    # Is this argument a file name?
    if (test("-e " ARGV[i])) {
      found_file=1
      continue
    }
    # Is it one of our sort options?
    switch (tolower(ARGV[i])) {
      case /^by[ _-]?name/: {
        order=1
        ARGV[i]=""
        break
      }
      case /^by[ _-]?dev/: {
        order=2
        ARGV[i]=""
        break
      }
    }
  }
  if (!found_file) {
    ARGV[ARGC]="/proc/self/mounts"
    ++ARGC
  }
  ln=0
  fs=0134 # The field separator (\) is (apparently) valid in Linux file system names . . .
}

# Read the input file, and process any line with an initial "/"
/^\// {
  mount_point = $1
  mount = $2
  # "Fix" the special characters permitted in file names
  # which are forbidden in /proc/mounts
  mount=gensub(/\\040/," ", "g",mount) # blank
  # NOTE: File names containing any of the next three
  # allowed characters may display somewhat weirdly.
  # NOTE: For this program, these characters are shown as
  #       their usual escape sequences.
  if (mount ~ /\\[01]/) {
    mount=gensub(/\\011/,"\\t", "g",mount) # tab
    mount=gensub(/\\012/,"\\n", "g",mount) # new line
    mount=gensub(/\\134/,"\\\\","g",mount) # field separator
  }
  #
  # If we have any other octal-coded characters, try to
  # replace them with the character they represent assuming
  # that they are UTF-8 characters, encoded as octals (without a
  # leading zero following the back slash.)
  #
  # (NOTE: As far as I know, this is not necessary, but (IIRC) the Linux posix
  #        specifications, at one time, restricted file names to valid
  #        ASCII characters so it's conceivable that UTF-8 characters might
  #        be octal-encoded in /proc/mounts.)
  #
  if (mount ~ /\\[0-7]/) {
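    # split() takes (string, array, separator, seps); strtonum() reads a
    # string with a leading "0" as octal.
    n = split(mount, part, /\\[0-7]+/, sep)
    mount = part[1]
    for (i = 2; i<=n; ++i) {
      mount = (mount sprintf("%c", strtonum("0" substr(sep[i-1],2))) part[i])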
    }
  }
  #
  # Get the maximum mount_point name length
  ln = ((m = length(mount)) > ln)?m:ln
  #
  # Post the mount_point and device information to mtab in the format needed
  switch (order) {
    # For a list by name, the index of mtab is the name and the data is
    # an array of devices to which it is attached.
    case 1:
    {
      if (mount in mtab) {
        mtab[mount][1+length(mtab[mount])] = mount_point
      } else {
        mtab[mount][1] = mount_point
      }
      break
    }
    # For a list by device, the index of mtab is a device, and the value
    # is an array of file system names attached to that device.
    # (I.e.: The primary mount and any additional bind mounts.)
    case 2:
    {
      if (mount_point in mtab) {
        mtab[mount_point][1+length(mtab[mount_point])] = mount
      } else {
        mtab[mount_point][1] = mount
      }
      break
    }
  }
}

END {
  fmt=("%-" ln "s (%s)\n")
  switch (order) {
    case 1: # List mount_point by name
    {
      n=asorti(mtab,mtabi)
      for (i=1;i<=n;++i) {
        name = mtabi[i]
        for (j=1; j<= length(mtab[mtabi[i]]); ++j) {
          if (j == 2) {
            l = int(length(name)/2)
            name = ""
            for (m=1; m<l; ++m) {
              name = (name " ")
            }
            name = (name "\"")
          }
          printf(fmt, name, mtab[mtabi[i]][j])
        }
      }
      break
    }
    case 2: # List of names by mount_point
    {
      n = asorti(mtab, mtabi, "cmp_ind_numeric_suffix_within_alpha_prefix")
      for (i=1;i<=n;++i) {
        for (j=1; j<= length(mtab[mtabi[i]]); ++j) {
          printf(fmt, mtab[mtabi[i]][j], mtabi[i])
        }
      }
      break
    }
  }
}

Here's the test.awk include file:

Code:

$ cat /usr/local/awk/test.awk 
#############################################################################################
#
# Call the test function, return true if the expression so evaluates.
#
# Typical usage:
#
# if (!test(" -e "file_name" -a -r "file_name)) {
#   print "\"" file_name "\" is not readable." > "/dev/stderr"
# }
#
#############################################################################################
function test(expression)
{
  return !system(("env test " expression))
}
#############################################################################################
#
# Call the test function as a superuser, return true if the expression so evaluates.
#
# Example usage:
#
# if (!su_test(" -e "file_name" -a -r "file_name) ) {
#   print "\"" file_name "\" is not readable by \"root.\"" > "/dev/stderr"
# }
#
# NOTE: This works best if you are in the /etc/sudoers file with a NOPASSWD attribute.
#
#############################################################################################
function su_test(expression)
{
  return !system(("sudo env test " expression))
}

And, finally, the corrected sorting functions:

Code:

$ cat /usr/local/awk/cmp_by_numeric_suffix_within_alpha_prefix.gawk 
# Comparison function for use by asorti
function cmp_ind_numeric_suffix_within_alpha_prefix(i1, v1, i2, v2) {
  return compare_numeric_suffix_within_alpha_prefix(i1, i2)
}
# Comparison function for use by asort
function cmp_val_numeric_suffix_within_alpha_prefix(i1, v1, i2, v2) {
  return compare_numeric_suffix_within_alpha_prefix(v1, v2)
}
###############################################################################
#
# Generic function to order two values of the form "ccccdddd" where "c" is
# a character and "d" is a digit. The "c" preceding the first "d", if any, can
# not (of course) be a digit.
#
# NOTE: Only trailing digits are used for the "numeric" part. (E.g.: The
#       "numeric part" of 1.3e-15 is "15" which may not be an expected value.)
#
# NOTE: See the code below for the treatment of the special cases of missing
#       "ccc" and/or "ddd" parts in any of the input arguments.
#
###############################################################################
function compare_numeric_suffix_within_alpha_prefix(\
v1,       # Value of element one
v2,       # Value of element two
# Local variables:
prefix1,  # Part of v1 preceding num1
num1,     # Trailing numeric part of v1
prefix2,  # Part of v2 preceding num2
num2,     # Trailing numeric part of v2
part)     # Array in which decompose() returns the prefix and suffix values
{
  # NOTE: This function might be called when either v1 or v2 is an array.
  #       The standard is that arrays sort after any non-array values,
  #       and that all array values are "equal" when compared, so their final
  #       order, other than "after non-array values" is undefined.
  if (isarray(v1) || isarray(v2)) {
    # At least one of v1 and v2 must be an array.
    if (isarray(v1)) {
      # O.K., v1 is an array. Return 0 if v2 is also an array, otherwise 1
      return ((isarray(v2))?0:1)
    }
    # If we get here, v1 is not an array, so v2 must be one.
    return -1 # therefore v1 must precede v2
  }
  # Break v1 and v2 into their component parts.
  delete part
  decompose(v1, part)
  prefix1 = part[1]
  num1    = part[2]
  decompose(v2, part)
  prefix2 = part[1]
  num2    = part[2]
  # Do we have prefix values for both input values?
  if (prefix1 && prefix2) {
    # If the prefix values are different, just compare them.
    if (prefix1 != prefix2) {
      return ((prefix1 < prefix2)? -1: 1)
    }
    # If we get here, the prefix values are the same. Compare
    # the suffix values.
    return ((num1 < num2)? -1: (num1 != num2))
  }
  # If we are here, at least one of the prefix values is null.
  if (prefix1 && (! prefix2)) { # null precedes non-null
    return 1
  }
  if ((! prefix1) && prefix2) { # null precedes non-null
    return -1
  }
  # Both null? Then just compare the numeric values.
  return ((num1 < num2)? -1: (num1 != num2))
}
###############################################################################
#
# Break a string into the part preceding any terminal digit string,
# and that string as elements of the array passed as the second argument.
#
# Return true if a numeric suffix was found, false if not.
#
###############################################################################
function decompose(\
  v,    # Value to decompose
  part, # Array in which to return the components
  # Local variable:
  n)    # Start of the terminal digit string
{
  # Force part to be an array if it's undefined
  # NOTE: This will abort GAWK if "part" is defined and not an array.
  part[""]=""
  # and empty it.
  delete part
  n = match(v, /[[:digit:]]+$/)
  if (n) {
    part[1] = substr(v, 1, n-1)
    part[2] = 0 + substr(v, n)
    return 1
  }
  # If we get here there was no terminal digit string. Assume 0.
  part[1] = v
  part[2] = 0
  return 0
}

PTrenholme · 09-06-2016, 07:09 PM

Oh, if anyone's interested, here's my new, improved, output:
(The Mounted alias is now alias Mounted='gawk -f /usr/local/awk/Mounted.gawk ')

Code:

$ Mounted 
/                                       (/dev/md127)
/Backups                                (/dev/sdc1)
/Debian                                 (/dev/sdb5)
/Fedora                                 (/dev/sdb1)
/SD                                     (/dev/sdd1)
/Samba/OFFICE/Books                     (//OFFICE/Books/)
/Samba/OFFICE/Documents                 (//OFFICE/Documents/)
/Samba/OFFICE/Downloads                 (//OFFICE/Downloads/)
/Samba/OFFICE/Public                    (//OFFICE/Public/)
/Samba/OFFICE/Users                     (//OFFICE/Users/)
/Win10                                  (/dev/sda2)
/Win10/HP_Recover                       (/dev/sda3)
/Win10/System                           (/dev/sda1)
/mnt/ISO/1635 - The Eastern Front       (/dev/loop0)
/mnt/ISO/Cryoburn                       (/dev/loop10)
/mnt/ISO/Invasion                       (/dev/loop7)
/mnt/ISO/Storm from the Shadows         (/dev/loop8)
/mnt/ISO/The Best of Jim Baens Universe (/dev/loop3)
/mnt/ISO/The Claws that Catch           (/dev/loop4)
/mnt/ISO/The Spider                     (/dev/loop9)
/mnt/ISO/This Septer'd Isle             (/dev/loop5)
/mnt/ISO/Unto the Breach                (/dev/loop2)
/mnt/ISO/When the Tide Rises            (/dev/loop1)
/mnt/ISO/Windrider's Oath               (/dev/loop6)

$ Mounted by-dev
/Samba/OFFICE/Books                     (//OFFICE/Books/)
/Samba/OFFICE/Documents                 (//OFFICE/Documents/)
/Samba/OFFICE/Downloads                 (//OFFICE/Downloads/)
/Samba/OFFICE/Public                    (//OFFICE/Public/)
/Samba/OFFICE/Users                     (//OFFICE/Users/)
/mnt/ISO/1635 - The Eastern Front       (/dev/loop0)
/mnt/ISO/When the Tide Rises            (/dev/loop1)
/mnt/ISO/Unto the Breach                (/dev/loop2)
/mnt/ISO/The Best of Jim Baens Universe (/dev/loop3)
/mnt/ISO/The Claws that Catch           (/dev/loop4)
/mnt/ISO/This Septer'd Isle             (/dev/loop5)
/mnt/ISO/Windrider's Oath               (/dev/loop6)
/mnt/ISO/Invasion                       (/dev/loop7)
/mnt/ISO/Storm from the Shadows         (/dev/loop8)
/mnt/ISO/The Spider                     (/dev/loop9)
/mnt/ISO/Cryoburn                       (/dev/loop10)
/                                       (/dev/md127)
/Win10/System                           (/dev/sda1)
/Win10                                  (/dev/sda2)
/Win10/HP_Recover                       (/dev/sda3)
/Fedora                                 (/dev/sdb1)
/Debian                                 (/dev/sdb5)
/Backups                                (/dev/sdc1)
/SD                                     (/dev/sdd1)

grail · 09-06-2016, 08:53 PM

Glad you got there

Something else to note: my system has some NFS mounts, and hence some of my device names get repeated, but the first solution did not seem to account for this, and I ended up with a number of blank lines next to a device name. I will try the new version some time today and let you know if that has improved.

PTrenholme · 09-07-2016, 01:58 PM

Strange.

Are you sure that the lines were blank? The way I coded the "byname" section (case 1), a name that is duplicated on different devices is intended to be listed with the repeated name replaced by a quote mark at about the midpoint of the name's length. (That logic is a little kludgy, and I intend to change it soon, but it does seem to work.)

I don't have any NFS mounts, just a few Samba connections, none of them duplicated, so I can't (easily) replicate your problem.

grail · 09-07-2016, 02:18 PM

I haven't had a chance to test the new code yet.

Here is the output of the current code:

Code:

# by-name
/                        (/dev/sdc3)
/boot                    (/dev/sdc1)
/home                    (/dev/sdc4)
/home/grail/Downloads    (/dev/sda5)
/opt/multimedia/movies   (/dev/sdb2)
/opt/multimedia/music    (/dev/sda2)
/opt/multimedia/pictures (/dev/sda3)
/opt/multimedia/tv       (/dev/sdb1)
/run/media/grail/Coven   (/dev/sde1)
/run/media/grail/Dozer   (/dev/sdd1)
/srv/nfs4/craig_movies   (/dev/sda5)
/srv/nfs4/exercises      (/dev/sdc4)
/srv/nfs4/kristy         (/dev/sdb1)
/srv/nfs4/movies         (/dev/sdb2)
/srv/nfs4/music          (/dev/sdc3)
      "                  (/dev/sda2)
/var                     (/dev/sda1)

# by-dev
/var                     (/dev/sda1)
                         (/dev/sda2)
/srv/nfs4/music          (/dev/sda2)
/opt/multimedia/pictures (/dev/sda3)
                         (/dev/sda5)
/srv/nfs4/craig_movies   (/dev/sda5)
                         (/dev/sdb1)
/srv/nfs4/kristy         (/dev/sdb1)
                         (/dev/sdb2)
/srv/nfs4/movies         (/dev/sdb2)
/boot                    (/dev/sdc1)
                         (/dev/sdc3)
/srv/nfs4/music          (/dev/sdc3)
                         (/dev/sdc4)
/srv/nfs4/exercises      (/dev/sdc4)
/run/media/grail/Dozer   (/dev/sdd1)
/run/media/grail/Coven   (/dev/sde1)

# test
/var                     (/dev/sda1)
                         (/dev/sda2)
/srv/nfs4/music          (/dev/sda2)
                         (/dev/sdb1)
/srv/nfs4/kristy         (/dev/sdb1)
/opt/multimedia/pictures (/dev/sda3)
                         (/dev/sdb2)
/srv/nfs4/movies         (/dev/sdb2)
/run/media/grail/Dozer   (/dev/sdd1)
/run/media/grail/Coven   (/dev/sde1)
                         (/dev/sda5)
/srv/nfs4/craig_movies   (/dev/sda5)
                         (/dev/sdc3)
/srv/nfs4/music          (/dev/sdc3)
                         (/dev/sdc4)
/srv/nfs4/exercises      (/dev/sdc4)
/boot                    (/dev/sdc1)

And here is my mtab output with the relevant sections:

Code:

$ awk '/^\//{print $1,$2}' /etc/mtab 
/dev/sdc3 /
/dev/sdc1 /boot
/dev/sdc4 /home
/dev/sdc4 /srv/nfs4/exercises
/dev/sdc3 /srv/nfs4/music
/dev/sdb1 /opt/multimedia/tv
/dev/sdb1 /srv/nfs4/kristy
/dev/sdb2 /opt/multimedia/movies
/dev/sdb2 /srv/nfs4/movies
/dev/sda1 /var
/dev/sda2 /opt/multimedia/music
/dev/sda2 /srv/nfs4/music
/dev/sda3 /opt/multimedia/pictures
/dev/sda5 /home/grail/Downloads
/dev/sda5 /srv/nfs4/craig_movies
/dev/sde1 /run/media/grail/Coven
/dev/sdd1 /run/media/grail/Dozer

Will get back to you on the new code
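
For what it's worth, a quick way to see which devices show up more than once in that list is something like the following. It is only a sketch: the function name repeated_devices is made up for this post, and it assumes the usual mtab layout with the device in the first field.

Code:

```shell
# List every device that is mounted more than once, with its mount count.
# Assumption: field 1 is the device; the file defaults to /etc/mtab.
repeated_devices() {
  awk '/^\// { count[$1]++ }
       END   { for (dev in count) if (count[dev] > 1) print dev, count[dev] }' \
      "${1:-/etc/mtab}" | LC_ALL=C sort
}
```

Run against the listing above, it should report the six devices (sda2, sda5, sdb1, sdb2, sdc3 and sdc4) that each appear twice.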

PTrenholme · 09-07-2016, 07:50 PM

Well, here's what I get with

Code:

$ cat gail.mtab
/dev/sdc3 /
/dev/sdc1 /boot
/dev/sdc4 /home
/dev/sdc4 /srv/nfs4/exercises
/dev/sdc3 /srv/nfs4/music
/dev/sdb1 /opt/multimedia/tv
/dev/sdb1 /srv/nfs4/kristy
/dev/sdb2 /opt/multimedia/movies
/dev/sdb2 /srv/nfs4/movies
/dev/sda1 /var
/dev/sda2 /opt/multimedia/music
/dev/sda2 /srv/nfs4/music
/dev/sda3 /opt/multimedia/pictures
/dev/sda5 /home/grail/Downloads
/dev/sda5 /srv/nfs4/craig_movies
/dev/sde1 /run/media/grail/Coven
/dev/sdd1 /run/media/grail/Dozer

as input.

Code:

$ Mounted ./gail.mtab by-name

Total 17

File System                 Device
------------------------ ---------
/                        (/dev/sdc3)
/boot                    (/dev/sdc1)
/home                    (/dev/sdc4)
/home/grail/Downloads    (/dev/sda5)
/opt/multimedia/movies   (/dev/sdb2)
/opt/multimedia/music    (/dev/sda2)
/opt/multimedia/pictures (/dev/sda3)
/opt/multimedia/tv       (/dev/sdb1)
/run/media/grail/Coven   (/dev/sde1)
/run/media/grail/Dozer   (/dev/sdd1)
/srv/nfs4/craig_movies   (/dev/sda5)
/srv/nfs4/exercises      (/dev/sdc4)
/srv/nfs4/kristy         (/dev/sdb1)
/srv/nfs4/movies         (/dev/sdb2)
/srv/nfs4/music          (/dev/sdc3)
       "                 (/dev/sda2)
/var                     (/dev/sda1)

$ Mounted by-dev gail.mtab

Total 17

File System                 Device
------------------------ ---------
/var                     (/dev/sda1)
/opt/multimedia/music    (/dev/sda2)
/srv/nfs4/music          (    "    )
/opt/multimedia/pictures (/dev/sda3)
/home/grail/Downloads    (/dev/sda5)
/srv/nfs4/craig_movies   (    "    )
/opt/multimedia/tv       (/dev/sdb1)
/srv/nfs4/kristy         (    "    )
/opt/multimedia/movies   (/dev/sdb2)
/srv/nfs4/movies         (    "    )
/boot                    (/dev/sdc1)
/                        (/dev/sdc3)
/srv/nfs4/music          (    "    )
/home                    (/dev/sdc4)
/srv/nfs4/exercises      (    "    )
/run/media/grail/Dozer   (/dev/sdd1)
/run/media/grail/Coven   (/dev/sde1)

and here's my less kludgy end section

Code:

END {
  # Print the header line
  print ("\nTotal " N "\n")
  fmt=("%-"ln_m"s %"ln_p"s\n")
  dashes=blanks=substr(sprintf(fmt,"",""),1,((ln_p < ln_m)?ln_m:ln_p))
  gsub(/ /,"-",dashes)
  printf(fmt, "File System", "Device")
  printf(fmt, substr(dashes,1,ln_m), substr(dashes,1,ln_p))
  # Set the output format
  fmt=("%-"ln_m"s (%"ln_p"s)\n")
  # Print the results in the desired order
  switch (order) {
    case 1: # List of mount_points by name
    {
      # Sort the index values (the file system names)
      n=asorti(mtab,mtabi)
      # Print the names and their mount points
      for (i=1;i<=n;++i) {
        name = mtabi[i]
        for (j=1; j<= length(mtab[mtabi[i]]); ++j) {
          if (j==2) {
            name = quote_for_name(name, blanks)
          }
          printf(fmt, name, mtab[mtabi[i]][j])
        }
      }
      break
    }
    case 2: # List of names by mount_point
    {
      # Sort the index values (the mount points) using the named comparison function
      n = asorti(mtab, mtabi, "cmp_ind_numeric_suffix_within_alpha_prefix")
      for (i=1;i<=n;++i) {
        name = mtabi[i]
        for (j=1; j<= length(mtab[mtabi[i]]); ++j) {
          if (j==2) {
            name = quote_for_name(name, blanks)
          }
          printf(fmt, mtab[mtabi[i]][j], name)
        }
      }
      break
    }
  }
}
# For a repeated name, replace the name with a quote mark at about
# the center of the printed name.
function quote_for_name(name, blanks,    k, l) { # k and l are locals
  switch (length(name)) {
    case 0: { # This should never happen. Ignore it.
      break
    }
    case 1: {
      name = "\""
      break
    }
    case 2: {
      name = "\" "
      break
    }
    default: {
      k=int(length(name)/2)
      l=int(length(name) - k - 1)
      name = (substr(blanks,1, k) "\"" substr(blanks,1,l))
    }
  }
  return name
}
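
The END section above hands asorti() a comparison function by name. For readers who want to try that call on its own, here is a rough sketch of what a function with that name could look like. It is an assumption based on the by-dev listings (alphabetic prefix first, then the trailing number compared numerically, so /dev/loop10 follows /dev/loop9), not the code from Mounted.gawk. The helpers alpha_prefix and numeric_suffix and the wrapper sort_devices are hypothetical names, and the three-argument asorti() needs gawk 4.0 or later.

Code:

```shell
# Sort the first field of the input using a custom asorti() comparison.
# Assumption: this mimics the ordering seen in the by-dev output; the
# real comparison function in Mounted.gawk may differ.
sort_devices() {
  gawk '
    # Text before the trailing digits, or the whole string if there are none.
    function alpha_prefix(s) {
      return match(s, /[0-9]+$/) ? substr(s, 1, RSTART - 1) : s
    }
    # Trailing digits as a number, or -1 if the string has no numeric suffix.
    function numeric_suffix(s) {
      return match(s, /[0-9]+$/) ? substr(s, RSTART) + 0 : -1
    }
    # gawk calls this with two index/value pairs; only the indices are used.
    function cmp_ind_numeric_suffix_within_alpha_prefix(i1, v1, i2, v2,    p1, p2) {
      p1 = alpha_prefix(i1) ""      # appending "" forces a string comparison
      p2 = alpha_prefix(i2) ""
      if (p1 != p2) return (p1 < p2) ? -1 : 1
      return numeric_suffix(i1) - numeric_suffix(i2)
    }
    NF { seen[$1] = 1 }
    END {
      n = asorti(seen, sorted, "cmp_ind_numeric_suffix_within_alpha_prefix")
      for (i = 1; i <= n; ++i) print sorted[i]
    }
  ' "$@"
}
```

For example, printf '%s\n' /dev/loop10 /dev/loop9 /dev/sda1 | sort_devices lists /dev/loop9 before /dev/loop10, where the default string ordering would put loop10 first.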

PTrenholme · 09-07-2016, 08:05 PM

Oh, sorry, I should have mentioned that I replaced the computation of ln (for "length of name") with computations of ln_m (length of the mounted-as name) and ln_p (length of the mounted-on name).

In the BEGIN section, the ln=0 should be replaced by

Code:

  ln_m=13
  ln_p=8

(Those numbers are the minimums if the headers are to print properly.)

and in the process section, the computation of ln should be replaced by

Code:

  ln_m = ((m =       length(mount)) > ln_m)?m:ln_m
  ln_p = ((m = length(mount_point)) > ln_p)?m:ln_p
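
Put together, the width bookkeeping starts from the header minimums and keeps a running maximum for each column. Here is a stripped-down sketch of just that part. It assumes two-column "device path" input like the gail.mtab listing, and it assumes mount holds the mounted path (field 2) and mount_point holds the device (field 1), which is a guess from the printed columns since the process section is not shown here. The wrapper name column_widths is made up for the example.

Code:

```shell
# Print the two column widths the END section would use.
column_widths() {
  awk '
    BEGIN { ln_m = 13; ln_p = 8 }    # header minimums from the post
    /^\// {
      mount_point = $1               # assumption: device is field 1
      mount       = $2               # assumption: mounted path is field 2
      ln_m = ((m =       length(mount)) > ln_m) ? m : ln_m
      ln_p = ((m = length(mount_point)) > ln_p) ? m : ln_p
    }
    END { print ln_m, ln_p }
  ' "$@"
}
```

Fed the gail.mtab listing, column_widths gail.mtab should print 24 9, which matches the 24 and 9 dashes under the headers in the output above.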