LinuxQuestions.org
Old 08-28-2012, 03:03 AM   #1
arn2025
LQ Newbie
 
Registered: Feb 2012
Posts: 26
Blog Entries: 1

Rep: Reputation: Disabled
Manipulate data


I have a long text file with:
[QUOTE]
  1. a 1 200 10-22-2012
  2. a 2 350 11-20-2012
  3. a 3 222 12-16-2012
  4. b 1 123 01-17-2014
  5. b 2 345
  6. b 3 432 02-02-2012
  7. c 1 675
  8. c 2 0 07-09-2012
  9. c 3 12778 03-08-2012
[/QUOTE]

How do I change the file to:
Quote:
  1. n 1 2 3 D1 D2 D3
  2. a 200 350 222 10-22-2012 11-20-2012 12-16-2012
  3. b 123 345 432 01-17-2014 02-02-2012
  4. c 675 0 12778 07-09-2012 03-08-2012

Last edited by arn2025; 08-28-2012 at 06:23 AM.
 
Old 08-28-2012, 06:00 AM   #2
Snark1994
Senior Member
 
Registered: Sep 2010
Location: Wales, UK
Distribution: Arch
Posts: 1,632
Blog Entries: 3

Rep: Reputation: 345
I'm afraid what you've given us is nothing but a long list of numbers, letters and dates. How are you going from one set of data to the other? None of my guesses seem to be consistent with the data you've given.

In any case, it looks like it's going to be a more complicated parsing job than just a simple sed/awk script (though I'm sure it could be done that way) so you'd be better off with python or perl or something like that.
 
Old 08-28-2012, 06:11 AM   #3
arn2025
LQ Newbie
 
Registered: Feb 2012
Posts: 26
Blog Entries: 1

Original Poster
Rep: Reputation: Disabled
Please note the pattern: the content of the first column repeats. I just want all the fields from the rows that share the same first-column value to end up in the same row.
 
Old 08-28-2012, 06:15 AM   #4
Snark1994
Senior Member
 
Registered: Sep 2010
Location: Wales, UK
Distribution: Arch
Posts: 1,632
Blog Entries: 3

Rep: Reputation: 345
Ah, is the 2nd date on line 2 now meant to be 11-20-2012? If so, I'll have a look at it.

Are all the 'a' lines going to be consecutive (same for b, c, etc.)? Are spaces always the delimiter?

EDIT: If I was correct about all the above guesses, then I think this programme does what you want:

Code:
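#!/usr/bin/env python3
# Collapse consecutive rows that share the same first column into one row:
# the key, then all of its numbers, then all of its dates.

import sys

if len(sys.argv) != 2:
    print("Usage:", sys.argv[0], "<infile>")
    sys.exit(1)
try:
    infile = open(sys.argv[1], "r")
except IOError:
    print("Error: cannot open", sys.argv[1])
    sys.exit(2)

print("n 1 2 3 D1 D2 D3")  # header is hard-coded for three rows per key
token = None
numbers = []
dates = []
with infile:
    for line in infile:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if fields[0] != token:
            # first column changed: print the group collected so far
            if token is not None:
                print(token, ' '.join(numbers), ' '.join(dates))
            token = fields[0]
            numbers = []
            dates = []
        # fields are: key, index, number, optional date
        if len(fields) > 2:
            numbers.append(fields[2])
        if len(fields) > 3:
            dates.append(fields[3])
# print the final group, if the file had any rows at all
if token is not None:
    print(token, ' '.join(numbers), ' '.join(dates))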

Last edited by Snark1994; 08-28-2012 at 06:30 AM.
 
Old 08-28-2012, 06:25 AM   #5
arn2025
LQ Newbie
 
Registered: Feb 2012
Posts: 26
Blog Entries: 1

Original Poster
Rep: Reputation: Disabled
Sorry, I have corrected that. Yes, they are all going to be consecutive and spaces are the delimiters. It's just a long list with d's, e's and so on.
 
Old 08-28-2012, 06:31 AM   #6
Snark1994
Senior Member
 
Registered: Sep 2010
Location: Wales, UK
Distribution: Arch
Posts: 1,632
Blog Entries: 3

Rep: Reputation: 345
Ah, sorry, didn't see your latest post - I have edited my last post to include some code that I believe does what you want.

Hope this helps,
 
Old 08-28-2012, 07:49 AM   #7
arn2025
LQ Newbie
 
Registered: Feb 2012
Posts: 26
Blog Entries: 1

Original Poster
Rep: Reputation: Disabled
Thanks, though when I look at the code it seems to assume the first column changes every 3 rows, whereas at some points it changes after four or five rows, i.e. there could be 5 a's.
 
Old 08-28-2012, 09:29 AM   #8
cristalp
Member
 
Registered: Aug 2011
Distribution: Linux Mint
Posts: 103

Rep: Reputation: Disabled
This can be done with a short piece of AWK code. I believe it is a much lighter fit than the longer Python code.

It is also more general than the previous solution from Snark1994, since the code generates the first line from the file itself rather than hard-coding it as "n 1 2 3 D1 D2 D3".

It does not matter if your file has more than 3 lines for a, b or c. The code below adjusts to the input file.

Code:
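awk 'NF {
 key=$2                                      # $1 is the line number, $2 the group letter
 if (!(key in nums)) order[++n]=key          # remember groups in first-seen order
 nums[key]=nums[key] $4 " "                  # collect the numbers
 if ($5 != "") dates[key]=dates[key] $5 " "  # collect the dates, skipping missing ones
 cols[key]=cols[key] $3 " "                  # collect the column indices for the header
}
END {
 if (!n) exit                                # nothing to print for an empty file
 cnt=split(cols[order[n]],d," ")             # the header is built from the last group read
 printf "1. n %s", cols[order[n]]
 for (i=1; i<=cnt; i++) printf "D%s ", d[i]
 printf "\n"
 for (j=1; j<=n; j++)
  printf "%d. %s %s%s\n", j+1, order[j], nums[order[j]], dates[order[j]]
}
' YOURINPUTFILE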
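
For comparison, the associative-array idea can be sketched in Python with a dictionary of lists. This is only an illustration, not part of the original post: the function name `collect` is made up, and it assumes rows without a leading line number (key, index, number, optional date). Because the dictionary is keyed on the first field, it also copes with rows of the same group that are not adjacent.

```python
def collect(lines):
    """Group whitespace-separated rows by their first field.

    Assumes each row looks like: key index number [date].
    Returns a dict mapping key -> (numbers, dates), in first-seen key order.
    """
    groups = {}
    for line in lines:
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or incomplete rows
        numbers, dates = groups.setdefault(fields[0], ([], []))
        numbers.append(fields[2])
        if len(fields) > 3:  # the date column is optional
            dates.append(fields[3])
    return groups


# Rows taken from the thread, deliberately interleaved.
sample = [
    "a 1 200 10-22-2012",
    "b 1 123 01-17-2014",
    "a 2 350 11-20-2012",
    "b 2 345",
]
for key, (numbers, dates) in collect(sample).items():
    print(key, " ".join(numbers), " ".join(dates))
```

Python dictionaries keep insertion order (3.7 and later), so the groups come out in the order they first appear in the file.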

Last edited by cristalp; 08-28-2012 at 09:52 AM.
 
Old 08-28-2012, 10:53 AM   #9
grail
LQ Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 9,424

Rep: Reputation: 2823
Well I am glad everyone else picked up on the fact that the data included a header ... had me baffled
Anyhoo:
Code:
ruby -ane 'BEGIN{l=[]};next if $F.empty?;if ! l.empty? && l[0] != $F[0];puts l.join(" ");l.clear;end;l<<$F[0] if l.empty?;l<<$F[3] if $F[3];l.insert((l.count/2.0).ceil,$F[2]);END{puts l.join(" ") unless l.empty?}' file
 
Old 08-29-2012, 05:05 AM   #10
Snark1994
Senior Member
 
Registered: Sep 2010
Location: Wales, UK
Distribution: Arch
Posts: 1,632
Blog Entries: 3

Rep: Reputation: 345
It always makes me cry on the inside a little when you do that, grail.

Sorry, arn2025, I didn't quite understand where you got your header line from - my code would work fine (it would print 4 or 5 numbers on the line) but it would only put "1 2 3" in the header. It would be easy to change it to do this properly too, but seeing as you've got two other solutions, I'll leave you with those.
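
As a rough sketch of what that change might look like, the header can be sized to the largest group instead of being fixed at three columns. The function name `regroup` is hypothetical, the fourth 'a' row in the sample is invented purely to show a longer group, and rows are assumed to be consecutive per key with no leading line number.

```python
from itertools import groupby


def regroup(lines):
    """Return output rows, header first, for consecutive-key input rows.

    Assumes each row looks like: key index number [date].
    """
    rows = [line.split() for line in lines if line.split()]
    groups = []
    for key, members in groupby(rows, key=lambda fields: fields[0]):
        members = list(members)
        numbers = [f[2] for f in members if len(f) > 2]
        dates = [f[3] for f in members if len(f) > 3]
        groups.append((key, numbers, dates))
    # Size the header to the largest group instead of hard-coding 3.
    width = max((len(numbers) for _, numbers, _ in groups), default=0)
    header = ["n"] + [str(i) for i in range(1, width + 1)]
    header += ["D%d" % i for i in range(1, width + 1)]
    out = [" ".join(header)]
    for key, numbers, dates in groups:
        out.append(" ".join([key] + numbers + dates))
    return out


# The "a 4 ..." row is made up for illustration; the rest is from the thread.
sample = ["a 1 200 10-22-2012", "a 2 350 11-20-2012", "a 3 222 12-16-2012",
          "a 4 17 01-05-2013", "b 1 123 01-17-2014", "b 2 345"]
print("\n".join(regroup(sample)))
```

With this sample the header becomes "n 1 2 3 4 D1 D2 D3 D4", because the widest group has four rows.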

(Also, if you look, neither grail nor I found it easy to work out what exactly you needed doing - even when the gist of it was clear, the fact that you might get 4 or 5 similar rows, or the presence of a header, was not obvious. Next time you post a thread, perhaps give a bit of thought to explaining what you want clearly and precisely at the start. Just a heads up, I hope I'm not lecturing )

If you consider this problem to be solved, can you mark the thread as 'SOLVED' please? Thank you.

Last edited by Snark1994; 08-29-2012 at 05:08 AM.
 
Old 08-29-2012, 05:12 AM   #11
grail
LQ Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 9,424

Rep: Reputation: 2823
Quote:
It always makes me cry on the inside a little when you do that, grail.
Get baffled? Happens to me all the time with how some of the questions are phrased
 
Old 08-30-2012, 04:59 AM   #12
Snark1994
Senior Member
 
Registered: Sep 2010
Location: Wales, UK
Distribution: Arch
Posts: 1,632
Blog Entries: 3

Rep: Reputation: 345
Quote:
Originally Posted by grail View Post
Get baffled? Happens to me all the time with how some of the questions are phrased
Goodness no, I meant post a completely cryptic one-liner which baffles me, after I write a 20-line script to do it :L
 
Old 08-30-2012, 10:51 AM   #13
grail
LQ Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 9,424

Rep: Reputation: 2823
Quote:
Goodness no, I meant post a completely cryptic one-liner which baffles me, after I write a 20-line script to do it :L
Well I did start to teach myself python3 a while back, and do still quite enjoy its benefits, but since having been shown a little bit of ruby from another LQ member I got stuck in and really enjoy it

You are right though, I need to remember to explain them a bit better
Code:
-a - Split each line read on the default delimiter into the global array $F
-n - Loop over each line of the input file
-e - The following is a script to be interpreted

BEGIN{l=[]} - initialize the 'l' array (BEGIN here is the same as in awk, i.e. run only once)

next if $F.empty?              # skip blank lines
if ! l.empty? && l[0] != $F[0] # l array not empty and first elements of the l and $F arrays differ
  puts l.join(" ")             # Display the contents of l array separated by a space
  l.clear                      # Reset l array
end
l<<$F[0] if l.empty?           # like perl, simple tests may come after the action. << means append to array
l<<$F[3] if $F[3]              # append the date when the row has one
l.insert((l.count/2.0).ceil,$F[2]) # insert the number at the midpoint. ceil rounds up to the nearest whole number

END{puts l.join(" ") unless l.empty?} - print the last group once the input is exhausted
Hope that helps explain a bit better
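
The append-then-insert trick may be easier to follow in isolation. Below is a small Python sketch of the same idea, assuming every row carries a date; the helper name `add_row` is made up for illustration. After the date is appended, the flat list holds the key, k-1 numbers and k dates, so its midpoint is exactly the slot just before the first date.

```python
def add_row(row, key, number, date):
    """Mimic the append-then-insert trick on a flat [key, numbers..., dates...] list."""
    if not row:
        row.append(key)                 # a new group starts with its key
    row.append(date)                    # dates accumulate at the end
    row.insert(len(row) // 2, number)   # numbers go just before the first date
    return row


row = []
add_row(row, "a", "200", "10-22-2012")
add_row(row, "a", "350", "11-20-2012")
add_row(row, "a", "222", "12-16-2012")
print(" ".join(row))
# prints: a 200 350 222 10-22-2012 11-20-2012 12-16-2012
```

The midpoint stops being the right position once a group is missing more than the odd date, in which case keeping two separate lists for numbers and dates is the safer design.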
 
Old 08-30-2012, 11:05 AM   #14
Snark1994
Senior Member
 
Registered: Sep 2010
Location: Wales, UK
Distribution: Arch
Posts: 1,632
Blog Entries: 3

Rep: Reputation: 345
Mm, I do like ruby, mostly because of its support of functional programming - it bridged the gap between haskell and python, because I got to do a lot of the neat haskell tricks without having the hassle of very strict type checking and pure functions.

Yeah, definitely very nifty code

@arn2025, have we solved your problem? If so, please remember to mark the thread as solved.
 
  


All times are GMT -5. The time now is 03:08 AM.
