Linux - Newbie: This Linux forum is for members that are new to Linux.
Just starting out and have a question?
If it is not in the man pages or the how-to's this is the place!
Hello Experts,
I am new to the website and am seeking your help with a specific need. I need to parse statistics into CSV format. The statistics come from a log file (see below) :
Format of the output file :
TARGET_TABLE_NAME,INSERTED_APPLIED_ROWS,INSERTED_AFFECTED_ROWS,INSERTED_REJECTED_ROWS,UPDATED_APPLIED_ROWS,UPDATED_AFFECTED_ROWS,UPDATED_REJECTED_ROWS,DELETED_APPLIED_ROWS,DELETED_AFFECTED_ROWS,DELETED_REJECTED_ROWS
And the result should be (3 lines) :
CFG,2,2,1,0,0,0,0,0,0
PORTUG_ALL,695,695,0,4224,695,0,0,0,0
REMOTE_DTL,1228,1228,0,5,5,0,7,5,0
The numbers/statistics should be extracted from the latest SOURCE DATE (i.e. Tue Oct 18 09:35:54 2016).
I tried many options (cut, sed, awk, ...) but it does not work.
I would appreciate any help/suggestion on the matter.
Please read the "Question Guidelines" link in my posting signature. Without knowing what you have done/tried or you posting your actual code, we can't tell you much. You say "tried many options", but don't tell us WHICH ONES, or give us details about what your actual goal is, or how often you need to do this. Solutions for a one-time fix will be different than something that's meant to be run numerous times a week/day.
Also, where is this data coming FROM? Could be there is already an option to save it in CSV format the way you want it.
If you know how to use awk and sed and all that, then this might help you get started. This will read in 14 lines as a record, with each line treated as a separate variable, so you can massage the data into what you need:
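A minimal sketch of that approach, not necessarily the snippet originally posted: awk buffers each line into an array and treats the buffer as one record once a fixed number of lines has been read. The function name `group_records`, its line-count argument, and the `|` separator are illustrative assumptions. The real log would presumably use 14 lines per record and pick out specific values instead of joining every line.

```shell
#!/bin/sh
# Hypothetical helper: treat every n consecutive input lines as one record.
# Inside awk, the buffered lines are available as line[1] .. line[n].
group_records() {
    awk -v n="$1" '
        { line[++count] = $0 }          # buffer the current line
        count == n {                    # a full record has been collected
            out = line[1]
            for (i = 2; i <= n; i++) out = out "|" line[i]
            print out                   # replace with real field extraction
            count = 0                   # start buffering the next record
        }
    '
}

# Demo with 3-line records instead of 14 to keep the example short.
printf '%s\n' a b c d e f | group_records 3
```

Note that a trailing group with fewer than n lines is silently dropped in this sketch, which may or may not be what you want for a log that is still being written.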
If this is a seriously large log, and you are planning to process many of them, you WILL DEFINITELY want to use a programming language like Python or C++ to do this. Doing it in bash will cost you performance over the long run.
Last edited by szboardstretcher; 10-18-2016 at 11:42 AM.
Let's break the problem up into bite-sized pieces. A file is made up of lines, and lines are made up of fields (or words) divided by white space (tab, space). In awk, each field is referred to as $1, $2, and so on. For example, if I have a file named somefile that contains:
cat bat
cat mat
cat sat
If I use
Code:
$ grep cat somefile
and the output is:
Code:
cat bat
cat mat
cat sat
then
Code:
$ grep cat somefile | awk '{print $2}'
will yield output:
Code:
bat
mat
sat
Whereas
Code:
$ grep cat somefile | awk '{print $2,$1}'
will yield output:
Code:
bat cat
mat cat
sat cat
But there is no awk one-liner that will do everything you want to do. You'll need an awk--or better yet PERL--script to do it all. But AWK is easier than PERL, albeit less powerful.
Thanks.
PERL? Do you mean it is not easily doable?
Perl IS easy...and thank you for not replying to my questions. I had asked you where this data was coming FROM, since there is a possibility that it can be outputted into CSV natively, and I also asked you to show us what you've done/tried on your own. Why do you ignore these things?
Quote:
I can't parse the following line :
APP_9889 Inserted rows - Requested: 1218 Applied: 1218 Rejected: 0 Affected: 1218
and generate an output file like : 1218;1218;0;1218
Why?? Again, what DID YOU DO to attempt to parse this line??? Just saying "I can't parse the following line", tells us nothing about your efforts. Incidentally, I'm able to parse this almost down to what you need with a few sed statements, but your input and output are NOT MATCHING. For example, you posted this:
And the result should be (3 lines) :
CFG,2,2,1,0,0,0,0,0,0
PORTUG_ALL,695,695,0,4224,695,0,0,0,0
REMOTE_DTL,1228,1228,0,5,5,0,7,5,0
??? Look at the CFG input data...it has 4 numbers...2, 2, 0, and 2. For your 'required' output, you have 2,2,1,0,0,0,0,0,0. Where is that coming from, or are you saying you need the data padded??? And if this is for a database input, again we will ask where the data is coming from.
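The sed statements mentioned above were not posted, so the following is only one possible sketch of that idea, using the sample line quoted in this thread. The function name `extract_counts` is hypothetical, and the `-E` flag (extended regular expressions) is assumed to be available, as it is in GNU and BSD sed.

```shell
#!/bin/sh
# Hypothetical helper: keep only the number that follows each "Label:" and
# join the numbers with semicolons.
extract_counts() {
    # First expression: replace everything up to and including "Label: N"
    # with "N;", repeated along the line. Second: drop the trailing ";".
    sed -E 's/[^:]*: *([0-9]+)/\1;/g; s/;$//'
}

echo 'APP_9889 Inserted rows - Requested: 1218 Applied: 1218 Rejected: 0 Affected: 1218' | extract_counts
```

This pattern also matches other colon-number pairs, such as the time in a date line, so on a real log the input would likely need to be filtered first (for example with grep) down to the statistics lines.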
dfco, please write where the data is coming from. You might have more options than you are aware of. Though the fall-back option is some simple manipulation with perl.
APP_9889 Inserted rows - Requested: 1218 Applied: 1218 Rejected: 0 Affected: 1218
would generate the following line
1218;1218;0;1218
There are several go-to tools for system administration. "grep", "sed", "awk", and "perl" top the list. Learning what they can do is a first step. "awk" seems most relevant here based on your sample, if the number of words is always the same.
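As a rough sketch of the awk route for the sample line above, and assuming the number of words really is always the same: the four numbers are then fields 6, 8, 10 and 12. The function name `rows_to_csv` and the ` rows - ` match pattern are illustrative choices, not something taken from the real log.

```shell
#!/bin/sh
# Hypothetical helper: for lines containing " rows - ", print fields
# 6, 8, 10 and 12 (Requested, Applied, Rejected, Affected) joined by ";".
# Assumes every matching line has the same word count as the sample.
rows_to_csv() {
    awk '/ rows - / { print $6 ";" $8 ";" $10 ";" $12 }'
}

echo 'APP_9889 Inserted rows - Requested: 1218 Applied: 1218 Rejected: 0 Affected: 1218' | rows_to_csv
```

Lines that do not contain ` rows - ` are skipped, so the same function can be fed a whole file. If the word count varies between lines, the field numbers will no longer line up and a label-based match would be safer.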
Maybe I should have mentioned that I am a beginner and that complex Linux scripting is new to me.
The log is a csv file and is generated by an application which inserts/updates/deletes records in a table.
I tried to simplify my need with the last example I gave :
APP_9889 Inserted rows - Requested: 1218 Applied: 1218 Rejected: 0 Affected: 1218
would generate the following line
1218;1218;0;1218
Thanks.
No one is asking you to write anything complex, but we are asking to see your attempt, because at present it appears you are making none and waiting for someone to tell you how to do it ... which is not the LQ way.
You mention that the log file is in CSV format, but on examining the data you have shown I am unable to find even one comma (CSV meaning comma-separated values).
Is the name of the application not allowed to be known? This may be the case, but you should at least say so instead of avoiding the question.
Yes, your simplified example is much easier, although any solution that works on this simple example will more than likely not work on the real one due to the extra data, and again, you have shown no effort as to what you might have tried.
The general response to no effort is to point you at the manuals for sed/awk/perl or the like and ask you to come back when you have had a go.
So, as asked by many others, please answer the questions asked so that you may be provided with the most appropriate response.