LinuxQuestions.org
Old 05-04-2009, 08:57 PM   #1
fono
LQ Newbie
 
Registered: May 2009
Posts: 6

Rep: Reputation: 0
Script to convert logs columns to rows


Hi all, I have a bunch of log files that need to be imported into a database for analysis, but the logs come in the following format:

****************************CDR 1****************************
CDR_TYPE: CallRec
callingNumber: 7590405
calledNumber: 7500820
answerTime: 00:03:39
releaseTime: 00:07:34
callDuration: 235
*************************************************************

****************************CDR 2****************************
CDR_TYPE: MOSMSRecord
serviceCentre: (91)685750004
recordingEntity: (91)685750007
location: (LAC: 00 0A CELLID: 75 49)
messageReference: 31
originationTime: 2007-09-01 00:07:34 - 0B00
destinationNumber: (91)6857513394
*************************************************************

****************************CDR 3****************************
CDR_TYPE: CallRec
callingNumber: 7583021
calledNumber: 7552525
answerTime: 00:04:39
releaseTime: 00:07:37
callDuration: 178
*************************************************************

I need a script to output the data in rows instead of columns, filtering only records with CDR_TYPE: CallRec

The output should be :

CDR_TYPE callingNumber calledNumber answerTime releaseTime callDuration
CallRec 7590405 7500820 00:03:39 00:07:34 235
CallRec 7583021 7552525 00:04:39 00:07:37 178
......


Thanks in advance
fono
 
Old 05-04-2009, 09:56 PM   #2
BrianK
Senior Member
 
Registered: Mar 2002
Location: Los Angeles, CA
Distribution: Debian, Ubuntu
Posts: 1,334

Rep: Reputation: 51
Argh... misread the problem. This will not solve it.
If you know Python, you can probably use what's below as a guide, but you will need to add more logic. If you don't know Python, this is likely no help. At the end of the day, you just need to see if the call is of type "CallRec" & print out everything after the first ':' on each line until you see a bunch of '*', then start over. I may try to come back & fix this script up, but for now, I'm going to go eat dinner.



Again, what's below is my initial response & does not address your problem correctly.

Not too terribly difficult in Python:

Code:
#!/usr/bin/python2.5

import sys
import string

class LQ_Parser(object):
    def __init__(self,some_file):
        """
        Parses a file, keying on the stuff before the ':'
        """

        # read the file, storing each line in an array:
        contents = open(some_file,'r').readlines()

        # set up some containers:
        self.data_dict = {}   # this dictionary will allow one key for multiple vals
        self.ret_string = ""  # this string will hold the return value

        for line in map(string.strip,contents):
            parts = line.split(':')
            if len(parts) < 2:
                continue
            key = parts[0]
            val = string.join(parts[1:],':').strip()
            if not self.data_dict.has_key(key):
                self.data_dict[key] = [val]
            else:
                self.data_dict[key].append(val)

        for dk in sorted(self.data_dict.keys()):
            self.ret_string += "%s %s\n" % (dk,string.join(self.data_dict[dk],' '))

        print self.ret_string


if __name__ == "__main__":
    if len(sys.argv) < 2:
        print >>sys.stderr, "\nMust give a file name.\n"
        sys.exit(1)
    LQ_Parser(sys.argv[1])
runs like so:
Code:
akane:/tmp/lq>./parser.py data.txt
CDR_TYPE CallRec MOSMSRecord CallRec
answerTime 00:03:39 00:04:39
callDuration 235 178
calledNumber 7500820 7552525
callingNumber 7590405 7583021
destinationNumber (91)6857513394
location (LAC: 00 0A CELLID: 75 49)
messageReference 31
originationTime 2007-09-01 00:07:34 - 0B00
recordingEntity (91)685750007
releaseTime 00:07:34 00:07:37
serviceCentre (91)685750004
Edited to add: I made the variables "self." so that they could be used in other functions...
I made the whole thing a class so that it could be easily used as a module later, in another application.
The theory is you could make some other accessor function like "show_results" that returns the self.ret_string, then some other Python app could do something like:

Code:
from parser import LQ_Parser
db_stuff = LQ_Parser(data_file).show_results()
for line in db_stuff.split('\n'):
   insert_into_db(line)
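
For reference, the approach described at the top of this post (check whether the record is of type "CallRec", keep what follows the first ':' on each line, and start over at each row of '*') could be sketched roughly as below. This is only a sketch under assumptions: it is written for Python 3 rather than the Python 2.5 used above, and the names parse_callrecs, FIELDS and SAMPLE are made up for illustration.

```python
# Python 3 sketch. parse_callrecs, FIELDS and SAMPLE are illustrative names,
# not something from the thread.
FIELDS = ["CDR_TYPE", "callingNumber", "calledNumber",
          "answerTime", "releaseTime", "callDuration"]

SAMPLE = """\
****************************CDR 1****************************
CDR_TYPE: CallRec
callingNumber: 7590405
calledNumber: 7500820
answerTime: 00:03:39
releaseTime:00:07:34
callDuration: 235
*************************************************************

****************************CDR 2****************************
CDR_TYPE: MOSMSRecord
serviceCentre: (91)685750004
messageReference: 31
*************************************************************
"""


def parse_callrecs(lines):
    """Return one list of values per CallRec record, in FIELDS order."""
    rows = []
    record = {}
    for line in lines:
        line = line.strip()
        if line.startswith("*"):
            # A row of '*' opens or closes a record: emit what was
            # collected (if it is a CallRec), then start over.
            if record.get("CDR_TYPE") == "CallRec":
                rows.append([record.get(name, "") for name in FIELDS])
            record = {}
        elif ":" in line:
            # Split on the first ':' only, so times like 00:03:39 stay
            # intact; strip() tolerates a missing space after the colon.
            key, value = line.split(":", 1)
            record[key.strip()] = value.strip()
    return rows


if __name__ == "__main__":
    print(" ".join(FIELDS))
    for row in parse_callrecs(SAMPLE.splitlines()):
        print(" ".join(row))
```

To run it against a real log, the lines of an open file can be passed in place of SAMPLE.splitlines(). A record that is not closed by a row of '*' is silently dropped in this sketch.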

Last edited by BrianK; 05-04-2009 at 10:11 PM.
 
Old 05-04-2009, 11:12 PM   #3
ghostdog74
Senior Member
 
Registered: Aug 2006
Posts: 2,695
Blog Entries: 5

Rep: Reputation: 241
If you have Python and can use it:
Code:
data=open("file").read().split("\n\n")
print "CDR_TYPE callingNumber calledNumber answerTime releaseTime callDuration"
for items in data:
    if "callingNumber" in items:
        s=""
        items=items.split("\n")
        for i in items[1:-1]:
            s=s+" "+i.split(": ")[-1]
        print s
output:
Code:
# ./test.py
CDR_TYPE callingNumber calledNumber answerTime releaseTime callDuration
 CallRec 7590405 7500820 00:03:39 00:07:34 235
 CallRec 7583021 7552525 00:04:39 00:07:37 178
 
Old 05-05-2009, 01:08 AM   #4
fono
LQ Newbie
 
Registered: May 2009
Posts: 6

Original Poster
Rep: Reputation: 0
Thanks guys,
I don't really know Python, but I'll try, since I come from a C++/Visual Basic background.

Thanks for the help anyway.
 
Old 05-18-2009, 05:44 PM   #5
fono
LQ Newbie
 
Registered: May 2009
Posts: 6

Original Poster
Rep: Reputation: 0
ghostdog74

Sorry, wrong question.

Last edited by fono; 05-19-2009 at 03:13 PM.
 
Old 05-19-2009, 03:11 PM   #6
fono
LQ Newbie
 
Registered: May 2009
Posts: 6

Original Poster
Rep: Reputation: 0
ghostdog74's code

Thanks ghostdog74

But one more question please: suppose I want to read in and print out only 3 fields, callingNumber, calledNumber and callDuration. Where do I make that decision within the code you posted?

data=open("file").read().split("\n\n")
print "CDR_TYPE callingNumber calledNumber answerTime releaseTime callDuration"
for items in data:
if "callingNumber" in items:
s=""
items=items.split("\n")
for i in items[1:-1]:
s=s+" "+i.split(": ")[-1]
print s
 
Old 05-19-2009, 04:02 PM   #7
osor
HCL Maintainer
 
Registered: Jan 2006
Distribution: (H)LFS, Gentoo
Posts: 2,450

Rep: Reputation: 70
Quote:
Originally Posted by fono
data=open("file").read().split("\n\n")
print "CDR_TYPE callingNumber calledNumber answerTime releaseTime callDuration"
for items in data:
if "callingNumber" in items:
s=""
items=items.split("\n")
for i in items[1:-1]:
s=s+" "+i.split(": ")[-1]
print s
The code you highlighted is only the heading, and doesn’t actually do the “hard part” of your program. The relevant part is in the line:
Code:
        for i in items[1:-1]:
You can change this to be any subsequence of the list items that you wish, either by index or by matching to your criteria (or you might even remove the offending elements before getting to this line).

P.S.
When quoting code (especially with spacing-sensitive languages such as Python), use “[code] Code goes here! [/code]” to preserve indentation.

P.P.S.
Just for fun, here is the original request in a perl one-liner:
Code:
perl -00naF'\n' -e 'print"@{[map{s/^.*: //;$_}grep/:/,@F]}\n"if/CallRec/' file
Or, for the selective request:
Code:
perl -00naF'\n' -e 'print"@{[map{s/^.*: //;$_}grep/callingNumber|calledNumber|callDuration/,@F]}\n"if/CallRec/' file
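
As an illustration of the "matching to your criteria" option mentioned above, here is a minimal Python sketch that picks fields by name instead of by position. It assumes Python 3 (the code earlier in the thread is Python 2), and the names WANTED and pick_fields are hypothetical.

```python
# Python 3 sketch. WANTED and pick_fields are illustrative names,
# not something from the thread.
WANTED = ["callingNumber", "calledNumber", "callDuration"]


def pick_fields(block, wanted=WANTED):
    """Return the values of the wanted fields from one record, by name."""
    values = {}
    for line in block.split("\n"):
        # partition() splits on the first ':' only; for lines without
        # a ':' (the rows of '*') the separator comes back empty.
        key, sep, value = line.partition(":")
        if sep:
            values[key.strip()] = value.strip()
    return [values[name] for name in wanted if name in values]


record = """****************************CDR 1****************************
CDR_TYPE: CallRec
callingNumber: 7590405
calledNumber: 7500820
answerTime: 00:03:39
releaseTime: 00:07:34
callDuration: 235
*************************************************************"""

print(" ".join(pick_fields(record)))
```

Selecting by name keeps working if the order of the lines in a record changes, which selecting by index does not.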
 
Old 05-19-2009, 05:04 PM   #8
fono
LQ Newbie
 
Registered: May 2009
Posts: 6

Original Poster
Rep: Reputation: 0
thanks again

But in response to ghostdog74,
the output I'm after should be:

# ./test.py
callingNumber calledNumber callDuration
7590405 7500820 235
7583021 7552525 178

with the other fields stripped off.
Is this possible using: import re

and then re.match() to filter out only the needed fields?
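
For what it's worth, a regular-expression version is possible. The sketch below is one way to do it, not the only one. It assumes Python 3 and records separated by a blank line, as in the sample log. It uses findall() with re.MULTILINE rather than re.match(), because re.match() only matches at the start of the string it is given. The names FIELD_RE, CALLREC_RE, callrec_rows and SAMPLE are made up for this example.

```python
import re

# Python 3 sketch. FIELD_RE, CALLREC_RE, callrec_rows and SAMPLE are
# illustrative names, not something from the thread.

# With re.MULTILINE, ^ and $ match at the start and end of every line.
FIELD_RE = re.compile(
    r"^(?:callingNumber|calledNumber|callDuration):[ \t]*(.*?)[ \t]*$",
    re.MULTILINE,
)
CALLREC_RE = re.compile(r"^CDR_TYPE:[ \t]*CallRec[ \t]*$", re.MULTILINE)

SAMPLE = """\
****************************CDR 1****************************
CDR_TYPE: CallRec
callingNumber: 7590405
calledNumber: 7500820
answerTime: 00:03:39
releaseTime: 00:07:34
callDuration: 235
*************************************************************

****************************CDR 2****************************
CDR_TYPE: MOSMSRecord
messageReference: 31
destinationNumber: (91)6857513394
*************************************************************
"""


def callrec_rows(text):
    """Return the three wanted values for every CallRec record in text."""
    rows = []
    for block in text.split("\n\n"):
        if CALLREC_RE.search(block):
            # findall() returns the captured value of each matching line,
            # in the order the lines appear in the record.
            rows.append(FIELD_RE.findall(block))
    return rows


if __name__ == "__main__":
    print("callingNumber calledNumber callDuration")
    for row in callrec_rows(SAMPLE):
        print(" ".join(row))
```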



 
Old 05-19-2009, 07:06 PM   #9
ghostdog74
Senior Member
 
Registered: Aug 2006
Posts: 2,695
Blog Entries: 5

Rep: Reputation: 241
Quote:
Originally Posted by fono
Thanks ghostdog74

But one more question please: suppose I want to read in and print out only 3 fields, callingNumber, calledNumber and callDuration. Where do I make that decision within the code you posted?

data=open("file").read().split("\n\n")
print "CDR_TYPE callingNumber calledNumber answerTime releaseTime callDuration"
for items in data:
if "callingNumber" in items:
s=""
items=items.split("\n")
for i in items[1:-1]:
s=s+" "+i.split(": ")[-1]
print s
Put Python code in code tags as suggested by osor. Python uses indentation, so if I can't see the indentation at the relevant parts of your code, it's hard to see what went wrong...
Code:
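#!/usr/bin/env python
import sys  # needed for sys.exit() at the end

# records are separated by a blank line
data=open("file").read().split("\n\n")
print "callingNumber calledNumber callDuration"
for items in data:
    if "callingNumber" in items:
        s=""
        items=items.split("\n")
        # lines 2 and 3 are callingNumber and calledNumber,
        # the second-to-last line is callDuration
        items=items[2:4]+items[-2:-1]
        for i in items:
            # keep only the value after ": "
            s=s+" "+i.split(": ")[-1]
        print s
sys.exit()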
output
Code:
# ./test.py
callingNumber calledNumber callDuration
 7590405 7500820 235
 7583021 7552525 178
Python lists work by indexing. Please read the Python docs to get an understanding of how it works (see my sig).
 
Old 05-19-2009, 08:19 PM   #10
fono
LQ Newbie
 
Registered: May 2009
Posts: 6

Original Poster
Rep: Reputation: 0
ghostdog74

Thank you so much ghostdog74,

exactly what I needed. Thanks again. Just for my sake, could you please briefly explain this line:

items=items[2:4]+items[-2:-1]

cheers


 
Old 05-19-2009, 08:29 PM   #11
ghostdog74
Senior Member
 
Registered: Aug 2006
Posts: 2,695
Blog Entries: 5

Rep: Reputation: 241
Quote:
Originally Posted by fono
thank you so much ghostdog74,

exactly what i needed.. thanks again just for my sake, could you please brief on line:

items=items[2:4]+items[-2:-1]

cheers
Please play with the Python interpreter by typing "python" at the command prompt:
Code:
# python
ActivePython 2.6.2.2 (ActiveState Software Inc.) based on
Python 2.6.2 (r262:71600, Apr 21 2009, 14:36:21)
[GCC 3.3.1 (SuSE Linux)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> alist = [1,2,3,4,5,6,7,8,9,10]
>>> alist[2:4]
[3, 4]
>>> alist[-2:-1]
[9]
>>> alist[2:4] + alist[-2:-1]
[3, 4, 9]
Do I need to explain more? Please go to the Python site and read its tutorial (see my sig).
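
Applied to one record from the log rather than a list of numbers, the same slicing looks like the sketch below. It is written for Python 3 and assumes the record has exactly the eight lines shown, with no trailing newline. The variable names are only for illustration.

```python
# Python 3 sketch; the variable names are illustrative only.
record = """****************************CDR 1****************************
CDR_TYPE: CallRec
callingNumber: 7590405
calledNumber: 7500820
answerTime: 00:03:39
releaseTime: 00:07:34
callDuration: 235
*************************************************************"""

items = record.split("\n")
# items[0]  -> opening row of '*'
# items[1]  -> CDR_TYPE
# items[2]  -> callingNumber   \ items[2:4] takes indexes 2 and 3
# items[3]  -> calledNumber    / (the end index 4 is excluded)
# items[-2] -> callDuration      items[-2:-1] takes only the second-to-last
# items[-1] -> closing row of '*'
picked = items[2:4] + items[-2:-1]
print(picked)

# Keep only the part after ": " on each picked line.
values = [i.split(": ")[-1] for i in picked]
print(" ".join(values))
```

Note that the negative indexes count from the end of the list, so a record that ends with an extra newline gains an empty last element and items[-2:-1] then lands on the closing row of '*' instead of callDuration.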
 
  

