Old 02-15-2011, 09:29 AM   #1
saurabhmehan
Want to improve the performance of script


Hi All,

I have written the script below, but it takes a very long time to run: it searches a log file of approximately 12 GB for only 3,500 records taken from an input file.
The script reads a CSV input file with two fields per line, transaction_id and mobile_number, searches the log file for lines that contain both values plus the static string "CustomCDRInterceptor", and then formats the matching data in a prescribed format.

The script is as follows:
Code:
#!/bin/bash
if [ $# -ne 2 ]
then
    echo "Error in $0 - Invalid Argument Count"
    echo "Syntax: $0 input_file output_file"
    exit
fi

awk -F"," '{print $1 , $2}' $1 |
while read a b
do
  output=`cat $2 | grep "CustomCDRInterceptor" | grep "$a" | grep "$b" | cut -d"|" -f6 | awk -F"," '{print $4,",",$28,",",$27,",",$17,",",$12,","$21,",",$11,",",$26,",",$14,",",$6,",",$30,",",$31,",",$19,",",$5,",",$22,",",$10,",",$9,",",$20,",",$15,",",$29,",",substr($32,1,match($32,/\]/)-1),",",$23,",",$18,",",$24,",",$7,",",$13,",",$2,",",$25,",",$16,",",$8,",",$1,",",$3,","}'`
  #echo $output
  echo $output | perl -F, -lane 's/^\s*[- \w\[]+:(.*?)\s*$/$1/ foreach @F; print join ",", @F'
done
Sample data from the input file looks like this:
Code:
8273518145,SDP-DM-152281623
9062995078,SDP-DM-152281631
7870856010,SDP-DM-152281650
8445208702,SDP-DM-152281662
8923084825,SDP-DM-152281668
9061161091,SDP-DM-152281712
8401832603,SDP-DM-152281733
8273522929,SDP-DM-152281837
8341646298,SDP-DM-152281851
9062930630,SDP-DM-152281868
Sample data from the log file is as follows:
Code:
15-Feb-2011 20:56:36,538|8401131793|subscription_app|-23e57aa%3A12e29c422b8%3A502a|ChargeAmount|REInterceptor -  Is already Rated [Yes] RatedPrice [0.0]
15-Feb-2011 20:56:36,538|8401131793|subscription_app|-23e57aa%3A12e29c422b8%3A502a|ChargeAmount|ChargingInterceptor - subscriber details processed sucessfully- {arg0.referenceCode=balanceEnquiry:true;subsChannel:Unknown;channelType:Subscription;transactionId:-23e57aa%3A12e29c422b8%3A502a;pricePtAvl:true;eventType:subscription;contentId:4945;serviceId:CR03;Circle_Name:GJ;Circle_ID:5;isRated:Yes;productName:VAS0003ALL;basePrice:0.0;subsType:RECURRING;Sub_Profile:Pre-Paid, arg0.endUserIdentifier=8401131793, arg0.charge.description= Retrieve-Balance , arg0.charge.currency=INR, arg0.charge.code=, arg0.charge.amount=0.0}
15-Feb-2011 20:56:36,539|8445862834|subscription_app|5a1fa24a%3A12e29cb5fb3%3A1d75|ChargeAmount|CustomCDRInterceptor - CDR Info[Optional_Field1:,Subscription_Channel:Unknown,Optional_Field2:,Transaction_ID:,Content_ID:4945,IMEI:,Product_Name:VAS0003ALL,PPL_FLAG:,Charge_Code:,Base_Price:0.0,CustomerID:B_55822315,Circle_Name:UK,Sender_MSISDN:,IMSI:405818123375012,Content_Status:,Location:UK,Circle_ID:18,Original_Content_Owner_ID:,CPNAME:default_provider,Content_Price:0.0,Zone:,Content_Name:,Static_ID:UK#37453052,External_Correlation_Id:5a1fa24a%3A12e29cb5fb3%3A1d75,Subscription_Type:RECURRING,MSISDN:8445862834,Transaction_Mode:Subscription,Transaction_DateTime:2011-02-15 20:56:36 GMT+05:30,Content_Type:,Sub_Profile:Pre-Paid,CPID:,Other_Info:]
15-Feb-2011 20:56:36,539|8401131793|subscription_app|-23e57aa%3A12e29c422b8%3A502a|ChargeAmount|CustomCDRInterceptor - CDR Info[Optional_Field1:,Subscription_Channel:Unknown,Optional_Field2:,Transaction_ID:,Content_ID:4945,IMEI:,Product_Name:VAS0003ALL,PPL_FLAG:,Charge_Code:,Base_Price:0.0,CustomerID:B_44445354,Circle_Name:GJ,Sender_MSISDN:,IMSI:405927121139030,Content_Status:,Location:GJ,Circle_ID:5,Original_Content_Owner_ID:,CPNAME:default_provider,Content_Price:0.0,Zone:Default,Content_Name:,Static_ID:GJ#32697724,External_Correlation_Id:-23e57aa%3A12e29c422b8%3A502a,Subscription_Type:RECURRING,MSISDN:8401131793,Transaction_Mode:Subscription,Transaction_DateTime:2011-02-15 20:56:36 GMT+05:30,Content_Type:,Sub_Profile:Pre-Paid,CPID:,Other_Info:]
15-Feb-2011 20:56:36,540|8445862834|subscription_app|5a1fa24a%3A12e29cb5fb3%3A1d75|ChargeAmount|GetBalance|PaymentPlugin-Request -  Get User Balance of: 8445862834
15-Feb-2011 20:56:36,540|8401131793|subscription_app|-23e57aa%3A12e29c422b8%3A502a|ChargeAmount|GetBalance|PaymentPlugin-Request -  Get User Balance of: 8401131793
15-Feb-2011 20:56:36,545|8445862834|subscription_app|5a1fa24a%3A12e29cb5fb3%3A1d75|ChargeAmount|GetBalance|PaymentPlugin-Response -  Retrieved Balance Bucket: 1;20091003;20110810;21.50;|
 
Old 02-15-2011, 09:51 AM   #2
i92guboj
For starters, I'd use either awk, bash, or perl, but not all three interpreters together. That alone will probably cut down on forking, RAM usage, and context switching, and will avoid moving that much data from one process to another. This matters all the more because you are running awk and perl inside a loop, so they will probably be invoked thousands of times.
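
To make the single-interpreter idea concrete, here is a minimal sketch in which one awk process loads the input file into an array and then reads the log only once. It assumes (this is not confirmed in the thread) that each input line is mobile_number,transaction_id as in the sample data, and that the mobile number is always the second |-separated field of a log line, as in the sample log. The function name match_cdr_lines is hypothetical, and the final reordering of the CDR fields is left out.

```shell
#!/bin/bash
# Sketch: one awk process, one pass over the log, instead of one
# cat|grep|grep|grep|cut|awk|perl pipeline per input record.
# Assumptions (not confirmed in the thread): each input line is
# "mobile_number,transaction_id", and the mobile number is the second
# "|"-separated field of every log line.

match_cdr_lines() {   # hypothetical helper: match_cdr_lines input.csv logfile
    awk -F'|' '
        # First file: remember the transaction ids wanted for each mobile number.
        NR == FNR {
            split($0, rec, ",")
            if (rec[1] != "" && rec[2] != "")
                wanted[rec[1]] = wanted[rec[1]] rec[2] "\n"
            next
        }
        # Second file: cheap hash lookup first, substring checks only on hits.
        ($2 in wanted) && index($0, "CustomCDRInterceptor") {
            n = split(wanted[$2], ids, "\n")
            for (i = 1; i < n; i++)
                if (index($0, ids[i])) { print $6; break }
        }
    ' "$1" "$2"
}
```

It could be called as match_cdr_lines input.csv logfile > output.csv. The log is read once regardless of how many records the input file holds, which is where most of the saving should come from. Note that the NR == FNR idiom relies on the input file not being empty.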
 
Old 02-15-2011, 09:57 AM   #3
grail
Well, my first two observations are these:

1. Why use cat, grep, cut, awk and perl to do a job that could easily be done entirely in perl, and probably even in awk?

2. As $2 is listed in your 'Syntax' as the 'output' file, why is it being used as input and never receiving any output?
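
On the second point, a usage message that matches what the script does with its arguments might look like the sketch below. The three-argument layout (input file, log file, output file) and the function name check_args are assumptions for illustration only, not the original poster's design.

```shell
#!/bin/bash
# Sketch of an argument check whose usage text names the log file
# explicitly. The three-argument layout and the function name are
# assumptions, not taken from the original script.

check_args() {   # hypothetical helper: check_args "$@"
    if [ "$#" -ne 3 ]; then
        echo "Error: expected 3 arguments, got $#" >&2
        echo "Syntax: $0 input_file log_file output_file" >&2
        return 1
    fi
    # Both files that are read must exist and be readable.
    if [ ! -r "$1" ] || [ ! -r "$2" ]; then
        echo "Error: cannot read input file or log file" >&2
        return 1
    fi
}
```

A script would call check_args "$@" || exit 1 near the top, so that a wrong invocation ends with a non-zero exit status rather than the bare exit in the original.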
 
Old 02-15-2011, 10:03 AM   #4
kirtimaan_bkn
Perl is the tool you need for parsing large files.
 
  

