LinuxQuestions.org
Old 03-08-2014, 06:35 AM   #1
suneelbabu.etl
LQ Newbie
 
Registered: Mar 2014
Posts: 6

Rep: Reputation: Disabled
find max length of characters in a particular field


can u tell me how to find max length of characters in a perticular field.
my input file is
s.no,sname
1,sd
35,jtud
here max length is 2 right ..
i want output is
s.no,sname
01,sd
35,jtud
like this ..
here we just add the zero's before number if <2..
plz help me/...
 
Old 03-08-2014, 07:03 AM   #2
Ser Olmy
Senior Member
 
Registered: Jan 2012
Distribution: Slackware
Posts: 3,339

Rep: Reputation: Disabled
The title of your post seems to indicate that you want to find the largest number of characters in a column/field, while in the post itself you seem to want to pad a numeric field with leading zeroes. Which is it?
 
1 members found this post helpful.
Old 03-08-2014, 07:52 AM   #3
TB0ne
LQ Guru
 
Registered: Jul 2003
Location: Birmingham, Alabama
Distribution: SuSE, RedHat, Slack,CentOS
Posts: 26,634

Rep: Reputation: 7965
Quote:
Originally Posted by suneelbabu.etl
can u tell me how to find max length of characters in a perticular field.
my input file is
s.no,sname
1,sd
35,jtud
here max length is 2 right ..
i want output is
s.no,sname
01,sd
35,jtud
like this ..
here we just add the zero's before number if <2..
plz help me/...
We will be happy to help...but you need to spell out your words, and quit using text-speak, and you need to show us what you've written/tried so far, along with answering Ser Olmy's question. Not exactly sure what you're looking for/needing here.

We will NOT write your scripts for you, but will be happy to help.
 
Old 03-08-2014, 11:40 AM   #4
danielbmartin
Senior Member
 
Registered: Apr 2010
Location: Apex, NC, USA
Distribution: Mint 17.3
Posts: 1,881

Rep: Reputation: 660
This snippet doesn't solve the problem but provides an idea to guide you.

With this comma-delimited InFile ...
Code:
audi,bentley,bmw
chevrolet,dodge,ford,honda
mazda,nissan,subaru,toyota
... this awk ...
Code:
awk -F, '{for (j=1;j<=NF;j++) {print "In record",NR," field",j,
" name="$j," length of name=",length($j)}}' $InFile >$OutFile
... produced this OutFile ...
Code:
In record 1  field 1  name=audi  length of name= 4
In record 1  field 2  name=bentley  length of name= 7
In record 1  field 3  name=bmw  length of name= 3
In record 2  field 1  name=chevrolet  length of name= 9
In record 2  field 2  name=dodge  length of name= 5
In record 2  field 3  name=ford  length of name= 4
In record 2  field 4  name=honda  length of name= 5
In record 3  field 1  name=mazda  length of name= 5
In record 3  field 2  name=nissan  length of name= 6
In record 3  field 3  name=subaru  length of name= 6
In record 3  field 4  name=toyota  length of name= 6
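
Building on the same length() call, here is a minimal sketch of finding the longest value in field 1 only. The sample data and the temporary file are made up for illustration; adapt the field number to your own file.

```shell
# Hypothetical sample data written to a temporary file for this sketch.
sample=$(mktemp)
printf '%s\n' 'audi,bentley,bmw' 'chevrolet,dodge,ford' 'mazda,nissan,subaru' > "$sample"

# Remember the longest first field seen so far; print it once the input ends.
max_len=$(awk -F, 'length($1) > max { max = length($1) } END { print max + 0 }' "$sample")

echo "$max_len"
rm -f "$sample"
```

With this sample the longest first field is "chevrolet", so the sketch prints 9.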
Daniel B. Martin
 
Old 03-08-2014, 06:17 PM   #5
schneidz
LQ Guru
 
Registered: May 2005
Location: boston, usa
Distribution: fedora-35
Posts: 5,313

Rep: Reputation: 918
Quote:
Originally Posted by danielbmartin
Code:
audi,bentley,bmw
chevrolet,dodge,ford,honda
mazda,nissan,subaru,toyota
doesn't honda belong in the third line ?
 
Old 03-09-2014, 04:54 AM   #6
suneelbabu.etl
LQ Newbie
 
Registered: Mar 2014
Posts: 6

Original Poster
Rep: Reputation: Disabled
Hi Daniel B. Martin,

I don't want to display length of the every field,
Quote:
audi,bentley,bmw
chevrolet,dodge,ford
mazda,nissan,subaru
in this example take first field, max length is 9 right.
it check the every line and if it is <9 then add 0(zero) before the first field.
the Out-file is:
Quote:
00000audi,bentley,bmw
chevrolet,dodge,ford
0000mazda,nissan,subaru
like this..
u got my point right,..
 
Old 03-09-2014, 10:09 AM   #7
grail
LQ Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 10,007

Rep: Reputation: 3191
So we get your point, where is your attempt to either alter this script to do as you require or your own script that attempts to do the same?

As said earlier, we are not here to write the scripts for you.
 
Old 03-09-2014, 10:34 AM   #8
Ser Olmy
Senior Member
 
Registered: Jan 2012
Distribution: Slackware
Posts: 3,339

Rep: Reputation: Disabled
Do you have any previous experience with creating scripts? As grail said, users in this forum will be happy to help you debug and develop a script, but we'd rather not just write one for you.

This is a general "Programming" forum, and your problem could be solved with many different programming/scripting languages. It would be helpful to know which one you'd prefer.

(People just looking for a working solution rather than help with solving the problem, should preferably pay someone to create and implement that solution.)
 
Old 03-09-2014, 10:39 AM   #9
TB0ne
LQ Guru
 
Registered: Jul 2003
Location: Birmingham, Alabama
Distribution: SuSE, RedHat, Slack,CentOS
Posts: 26,634

Rep: Reputation: 7965
Quote:
Originally Posted by suneelbabu.etl
I don't want to display length of the every field,

in this example take first field, max length is 9 right. it check the every line and if it is <9 then add 0(zero) before the first field. the Out-file is:

like this..u got my point right,..
Again, you need to SPELL OUT YOUR WORDS, and stop using text-speak, and you need to show us what you have done/tried on your own. We will be happy to HELP you, but we ARE NOT going to write your scripts for you.

You've been offered a great hint/solution, but have shown zero effort of your own to implement it, or think about how to modify it to do what you want. Show some effort, and we can help. Show no effort, and there's no point in posting.
 
Old 03-09-2014, 10:59 AM   #10
suneelbabu.etl
LQ Newbie
 
Registered: Mar 2014
Posts: 6

Original Poster
Rep: Reputation: Disabled
I don't have experience to write scripting, i am Beginner.
I tried the code is:
Quote:
#!/bin/sh
cd /TESTING/DATA
#cat`ls -lrt *.csv | perl -lane 'print $F[-1]' | tail -1` #recently modified file name
awk -F, '
NR == 1
NR > 1 {
data[NR] = $0
w1[NR] = length($1)
if (length($1) > max) max = length($1)
}
END {
for (i = 2; i <= NR; ++i) {
w = max - w1[i]
if (w > 0) printf "%0" w "d", 0
print data[i]
}
}' test.csv #this code append only one zero
 
Old 03-09-2014, 11:08 AM   #11
Ser Olmy
Senior Member
 
Registered: Jan 2012
Distribution: Slackware
Posts: 3,339

Rep: Reputation: Disabled
OK, from the snippet you posted it seems that you want to do this:
  1. Parse the file to find the currently largest number of characters in field/column 1
  2. Pad column 1 in each row/line with zeroes to make them all of equal length
Is that correct?

Also, it seems you want the script to process only the most recent .csv file in a given directory, is that so?
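
For reference, the two steps listed above can be sketched in a single awk call by naming the same file twice, so that awk reads it once to measure and once to pad. This is only one possible approach, not necessarily the best one; the sample data mirrors the first post and the temporary file is an assumption of the sketch.

```shell
# Hypothetical copy of the sample input from post #1.
csv=$(mktemp)
printf '%s\n' 's.no,sname' '1,sd' '35,jtud' > "$csv"

padded=$(awk -F, -v OFS=, '
  NR == FNR {                      # first pass over the file: measure only
    if (FNR > 1 && length($1) > max) max = length($1)
    next
  }
  FNR > 1 {                        # second pass: left-pad field 1 with zeros
    while (length($1) < max) $1 = "0" $1
  }
  { print }                        # header row passes through unchanged
' "$csv" "$csv")

printf '%s\n' "$padded"
rm -f "$csv"
```

NR counts lines across both reads while FNR restarts for each file argument, so NR == FNR is true only during the first read. The header row is excluded from the measurement and printed as-is.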
 
Old 03-09-2014, 11:24 AM   #12
suneelbabu.etl
LQ Newbie
 
Registered: Mar 2014
Posts: 6

Original Poster
Rep: Reputation: Disabled
exactly .. that is only..
 
Old 03-09-2014, 12:19 PM   #13
danielbmartin
Senior Member
 
Registered: Apr 2010
Location: Apex, NC, USA
Distribution: Mint 17.3
Posts: 1,881

Rep: Reputation: 660
Quote:
Originally Posted by schneidz
doesn't honda belong in the third line ?
You are right. That embarrassing oversight has been corrected.

A previous post showed OP how to use length(string). This post goes one step further, showing how to "left pad" a field with length(string) and substr(string)... yet refrains from writing the script for him. Let him read, digest, and adapt this code to his application.

With this InFile ...
Code:
audi,bentley,bmw
chevrolet,dodge,ford
honda,mazda,nissan,subaru,toyota
... this awk ...
Code:
awk -F, '{for (j=1;j<=NF;j++) { 
 pnr="00000000"NR;
 pnr=substr(pnr,length(pnr)-4);
print "In record",pnr", field number",j,"contains",$j}}'  \
 $InFile >$OutFile
... produced this OutFile ...
Code:
In record 00001, field number 1 contains audi
In record 00001, field number 2 contains bentley
In record 00001, field number 3 contains bmw
In record 00002, field number 1 contains chevrolet
In record 00002, field number 2 contains dodge
In record 00002, field number 3 contains ford
In record 00003, field number 1 contains honda
In record 00003, field number 2 contains mazda
In record 00003, field number 3 contains nissan
In record 00003, field number 4 contains subaru
In record 00003, field number 5 contains toyota
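
As a small aside, here is a sketch of the two usual ways to left-pad in awk. A printf format such as %05d covers numbers, while the prepend-and-substr trick shown above also works for text. The width of 9 and the value "audi" are just illustrative choices.

```shell
# Numbers: let printf do the zero padding.
num=$(awk 'BEGIN { printf "%05d\n", 42 }')

# Text: prepend plenty of zeros, then keep only the rightmost "width" characters.
txt=$(awk 'BEGIN {
  width = 9
  s = "000000000" "audi"
  print substr(s, length(s) - width + 1)
}')

echo "$num"
echo "$txt"
```

The first line printed is 00042 and the second is 00000audi. The substr start position length(s) - width + 1 is the general form of the length(pnr)-4 used above for a width of 5.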
Daniel B. Martin
 
Old 03-09-2014, 12:23 PM   #14
Ser Olmy
Senior Member
 
Registered: Jan 2012
Distribution: Slackware
Posts: 3,339

Rep: Reputation: Disabled
Right, then. First, you can indeed get the most recently modified file with ls -lrt *.csv | tail -n 1 (or ls -lt *.csv | head -n 1 for that matter), but there's really no need to invoke perl when cut can do the job just as well:
Code:
#!/bin/sh
input_dir=/TESTING/DATA
input_file=`ls -lt $input_dir/*.csv | head -n 1 | tr -s " " | cut -d " " -f 9`
tr is used to "squeeze" the spaces to make sure the file name is really in column 9. Note that neither the cut version nor the perl version will handle file names containing spaces.
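
If file names with spaces matter, one possible alternative is to drop the long listing and let ls -t print one name per line. This is a sketch only: the directory and file names below are made up, and names containing newlines would still break it.

```shell
# Hypothetical directory with two files whose names contain spaces.
input_dir=$(mktemp -d)
touch -t 202001010000 "$input_dir/old file.csv"
touch -t 202101010000 "$input_dir/new file.csv"

# ls -t sorts newest first; without -l each line holds just the file name.
input_file=$(ls -t "$input_dir"/*.csv | head -n 1)

echo "$input_file"
```

Because the glob includes the directory, the result is a full path that can be handed to awk in double quotes, as in "$input_file".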

The remainder of your script is written in awk, a language I'm sadly unfamiliar with, but I'll do my best. The first part of the program attempts to check if the length of field 1 is greater than a variable called "max" and updates the variable if it is, but the second part just prints the input line verbatim.

The problem with the current program is twofold:
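  • this operation requires two passes; one to determine the field length, and one to modify the field; while an awk program reads its input only once (unless the lines are buffered in an array, as your script does, or the file is named twice on the command line)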
  • the code as it stands doesn't do what it's supposed to
As far as I've been able to determine, awk program blocks are executed against each line of the input file, unless you specify a "pattern" (such as BEGIN, which gets executed before any data is processed; and END, which is run when the program runs out of input data) before the code block. The first part can thus be replaced by this:
Code:
max=`awk -F, '
BEGIN \
{
  max = 0
}
NR > 1 {
  # skip the header row so "s.no" is not counted
  if (max < length($1)) max = length($1)
}
END \
{
  print max
}' "$input_file"`
The script simply returns the value of "max" after having scanned through the entire file, line by line. I've chosen to initialize the "max" variable in a BEGIN block, but I'm not sure if that's necessary or not in awk.
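
On the question of whether the BEGIN block is needed: in awk an unset variable acts as 0 in numeric context and as an empty string in string context, so the explicit initialization is optional. A quick check, using a throwaway variable x:

```shell
# x is never assigned; it compares equal to both 0 and "" and has length 0.
uninit=$(awk 'BEGIN { print (x == 0), (x == ""), length(x) }')

echo "$uninit"
```

This prints 1 1 0. Keeping the BEGIN block anyway does no harm and makes the intent explicit.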

Since the entire thing is in backticks (`), the output is captured by the shell and stored in a shell variable also (confusingly) called "max". You can use this inside a second awk program as $max, as long as you use double quotes instead of single quotes around the awk program, AND escape all other dollar signs (like this: \$).

Edit: A much better way would be to simply end the quoting right before the shell variable and restart it afterwards, like this: awk '{print "'$variable'" }'. *goes back to reading awk articles and howtos*
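
Another option that avoids the quoting tricks altogether is awk's -v flag, which assigns a shell value to an awk variable before the program starts. A minimal sketch, with a made-up width of 4 and two sample numbers fed in on standard input:

```shell
max=4

# width is set from the shell variable; the format string becomes "%04d\n".
out=$(printf '%s\n' 7 123 | awk -v width="$max" '{ printf "%0" width "d\n", $1 }')

printf '%s\n' "$out"
```

This prints 0007 and 0123 on separate lines. The awk program stays in single quotes the whole time, so no dollar signs need escaping.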

Now try and create the second part of the program.

Last edited by Ser Olmy; 03-09-2014 at 12:40 PM.
 
  

