Re-creating a column byte-based file

kenosando · 11-05-2009, 01:33 PM

Greetings,
I searched the forums for what I am trying to accomplish, but it was hard describing it and getting any relevant results. Here is what I am trying to accomplish:

I have a file that stores employee login IDs, names, types, and permissions. Our software reads the information based on byte-columns, so it reads a column as any ASCII character (spaces, letters, numbers, punctuation, etc.). I want to create a web-interface for adding and removing users, and storing the data in a MySQL database. However, if I am creating the files from the MySQL output, I need a way to write to specific column locations in the file ...

User ID: Columns 1-4
User Name: Columns 6-30
Type: 32-40
Permissions: 42-45

I want to use a scripting language, preferably C-Shell, to call MySQL for the data and write the data to the correct columns of the file. I wrote a script that takes the data from the file, and dumps it into the MySQL table, so maybe I can pad the remaining space in the table column to fill with spaces ... any suggestions?

MensaWater · 11-05-2009, 01:57 PM

You can use the "cut" command -c option to get positions.

The way I'd do extract the data in bash/ksh is:

Code:

while read line
do USER=`echo $line |cut -c1-4`
NAME=`echo $line |cut -c6-30`
TYPE=`echo $line |cut -c32-40`
PERMS=`echo $line |cut -c42-45`
echo User ID is $USER User Name is $NAME Type is $TYPE Permissions are $PERMS
done <FILE

Where FILE is the file that contains the original data.

Of course instead of just doing the final echo line you would want to put your routine that writes the data the way you want.

kenosando · 11-06-2009, 09:10 AM

That is what I used to get the data out of the file and into MySQL, using cut, but since I will need to re-write any changes made to the MySQL table back to this file, I need it formatted as before.

kenosando · 11-06-2009, 09:17 AM

Code:

while [ $i -le $lines ]
do

        id=`range $i $i $file | cut -b 1-4`
        name=`range $i $i $file | cut -b 6-30`
        types=`range $i $i $file | cut -b 32-40`
        permits=`range $i $i $file | cut -b 42-45`
        team=`range $i $i $file | cut -b 140-147`
        i=$(($i+1))

        echo 'insert into test.employee () values ("'$id'", "'$name'", "'$types'", "'$permits'", "'$team'");' >> .insert
done

This is what I had to do, because when you stream a line of a text file to a variable (in your case $line) it doesn't have columns anymore, as I tried the exact code you did before and it wouldn't read the many spaces as columns. the range command is simply:

Code:

awk -v start="$1" -v end="$2" '{if(NR >= start && NR <= end) {print}}' $3

where arg1 is the starting point and arg2 is the end, and arg3 is the file. This was the only way I could get the columns to be preserve is actually streaming the file, and not a variable that captures the line of the file.

But I digress, just pointing that out ...

MensaWater · 11-06-2009, 09:27 AM

Sorry I misread - I thought you were trying to figure out how to get positions out of text into MySQL. You were going the other way.

Unfortunately I don't work with MySQL much so can't really guide you.

This link talks about writing to a text file from MySQL:
http://www.wellho.net/forum/The-MySQ...text-file.html

You can use awk to pad fields. An example of that would be this test script:

Code:

#!/bin/bash
# Example script I wrote for LQ to show use of padding in awk with printf.
# 07-Jul-2009 jlightne
#
while read label address
do echo $address |awk -F. '{printf "fixed-address %03d.%03d.%03d.%03d\n",$1,$2,$3,$4}'
done <awkprintf.test

awkprintf.test is the following:

Code:

fixed-address 1.2.3.4
fixed-address 22.33.123.234

ghostdog74 · 11-06-2009, 09:35 AM

using a while read loop and calling external cut 6 times for each line is a terribly slow way of doing file parsing. Use awk for efficiency (or use bash's internal string functions).

kenosando · 11-06-2009, 09:36 AM

Of course, I didn't think of awk being able to printf and pad, thanks a lot!

MensaWater · 11-06-2009, 09:41 AM

Quote:

Originally Posted by ghostdog74

using a while read loop and calling external cut 6 times for each line is a terribly slow way of doing file parsing. Use awk for efficiency (or use bash's internal string functions).

Maybe providing an actual example routine to do this would be helpful rather than simply saying the cut routine provided isn't.

kenosando · 11-06-2009, 09:46 AM

@ghostdog74 - it is slow, however, the cut process only needed to be done once in my case. I just needed to get existing data into the MySQL table, after that, the MySQL call will feed into the text file.

What is the field delimiter to use for a single byte in awk, as in cut -b is for byte?

ghostdog74 · 11-06-2009, 09:57 AM

you use substr() in awk.

kenosando · 11-17-2009, 04:28 PM

the syntax for substr() is:

Code:

awk '{ print substr(a,b,c)}' file

where...
a is the field (or $0 for the entire line, in my case)
b is the starting position
c is the length from the starting position

chrism01 · 11-17-2009, 07:18 PM

Why don't you just keep all the data in one place eg MySQL. Copying back and forth means you'll always have some time when the 2 are not synced.

Anyway, to extract from MySQL

1. if the data is always same length (in the DB), the use the concat() fn to extract.
2. if the data is of variable lengths (in the DB), then use rpad()
http://dev.mysql.com/doc/refman/5.0/...functions.html

kenosando · 11-18-2009, 10:22 AM

Quote:

Originally Posted by chrism01

Why don't you just keep all the data in one place eg MySQL. Copying back and forth means you'll always have some time when the 2 are not synced.

That is the goal, but the software that runs read a column-delimited file that has to be formatted per its standards.

The easiest way for me was using awk's printf and then speciftying the length of the field and then the data to write to it. The data is not the same length, as there are name strings. Also, since this will translate into a web-based tool, PHP does fwrite() which also has extensive formatting options.