LinuxQuestions.org


smritisingh03 01-20-2011 12:47 PM

Checking if a table name is repeated in a file and, if so, removing it
 
Hi All

I have an output file that looks like this:

Quote:

status for ACCOUNT_MISSING_FRM_RCIS_LINK- mismatch
status for ACCOUNT_MISSING_FRM_RCIS_LINK is ACCOUNT_MISSING_FRM_RCIS_LINK- does not exist in DB
status for ADP_COMMENT- mismatch
status for ADP_CONFIG- match
status for ADP_FIELD- mismatch
status for ADP_HEADER- match
status for ADP_INDEX- mismatch
status for ADP_JOIN- mismatch
status for ADP_LANGUAGE- match
status for ADP_NATIVE_SQL- match
status for ADP_OBJECT- match
status for ADP_OBJECT_NEW- mismatch
status for ADP_OBJECT_NEW is ADP_OBJECT_NEW- does not exist in DB

Now, if you look at the first two lines:

status for ACCOUNT_MISSING_FRM_RCIS_LINK- mismatch
status for ACCOUNT_MISSING_FRM_RCIS_LINK is ACCOUNT_MISSING_FRM_RCIS_LINK- does not exist in DB


This should appear just once, as:

Quote:

status for ACCOUNT_MISSING_FRM_RCIS_LINK- does not exist in DB

The same goes for the last two lines.

For further information: ACCOUNT_MISSING_FRM_RCIS_LINK is a table name. Its row count is taken from a log, and the database is then checked for the row count to see whether it is a match, a mismatch, or the table does not exist.

I am getting the desired output; I just need to do something to this output file.

Any help will be greatly appreciated. Thank you!

druuna 01-20-2011 12:53 PM

Hi,

This could work, depending on the rest of the file:

Code:

sed 's/status for .* is /status for /' infile
Hope this helps.
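
As a quick check of what this substitution does, here is a small sketch run against three lines copied from the first post (the variable names `sample` and `result` are only for illustration). It collapses the `NAME is NAME-` form into `NAME-`, but sed looks at one line at a time here, so the `mismatch` line for the same table is still present afterwards.

```shell
#!/bin/sh
# Three lines taken from the sample in the first post.
sample='status for ADP_OBJECT- match
status for ADP_OBJECT_NEW- mismatch
status for ADP_OBJECT_NEW is ADP_OBJECT_NEW- does not exist in DB'

# The substitution rewrites only lines that contain " is ".
result=$(printf '%s\n' "$sample" | sed 's/status for .* is /status for /')
printf '%s\n' "$result"
```

With this sample the last line becomes `status for ADP_OBJECT_NEW- does not exist in DB`, and the two lines above it pass through unchanged.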

smritisingh03 01-20-2011 01:24 PM

Hi druuna

I tried ur sed, but it does only part of the job. I am posting the output.

Quote:

status for TABLE_X_UDP_LVL1_VW_BASE_TMP- mismatch

status for TABLE_X_UDP_LVL2_VW_BASE_TMP- mismatch
status for TABLE_X_UDP_LVL2_VW_BASE_TMP- does not exist in DB

status for TABLE_X_UDP_LVL3_VW_BASE_TMP- mismatch
status for TABLE_X_UDP_LVL4_VW_BASE_TMP- mismatch
status for TABLE_X_UDP_LVL4_VW_BASE_TMP- does not exist in DB

status for TABLE_X_UDP_LVL5_VW_BASE_TMP- mismatch
status for TABLE_X_UDP_LVL5_VW_BASE_TMP- does not exist in DB

I want the output file to look like:


status for TABLE_X_UDP_LVL2_VW_BASE_TMP- does not exist in DB
status for TABLE_X_UDP_LVL4_VW_BASE_TMP- does not exist in DB
status for TABLE_X_UDP_LVL5_VW_BASE_TMP- does not exist in DB

This means that, where two lines have the same table name, the first line should be deleted.

Thank you so much. I really appreciate it!

TB0ne 01-20-2011 01:47 PM

Quote:

Originally Posted by smritisingh03 (Post 4231987)
Hi druuna
I tried ur sed, but it does only part of the job. I am posting the output. I want the output file to look like:

status for TABLE_X_UDP_LVL2_VW_BASE_TMP- does not exist in DB
status for TABLE_X_UDP_LVL4_VW_BASE_TMP- does not exist in DB
status for TABLE_X_UDP_LVL5_VW_BASE_TMP- does not exist in DB

This means that, where two lines have the same table name, the first line should be deleted.

Thank you so much. I really appreciate it!

Spell your words out, please. And now that we know what you want, can you post what you've written/tried?

smritisingh03 01-20-2011 01:58 PM

Hi TB0ne

The input file to this script is:

Quote:

status for ACCOUNT_MISSING_FRM_RCIS_LINK- mismatch
status for ACCOUNT_MISSING_FRM_RCIS_LINK is ACCOUNT_MISSING_FRM_RCIS_LINK- does not exist in DB

status for ADP_COMMENT- mismatch
status for ADP_CONFIG- match
status for ADP_FIELD- mismatch
status for ADP_HEADER- match
status for ADP_INDEX- mismatch
status for ADP_JOIN- mismatch
status for ADP_LANGUAGE- match
status for ADP_NATIVE_SQL- match
status for ADP_OBJECT- match
status for ADP_OBJECT_NEW- mismatch
status for ADP_OBJECT_NEW is ADP_OBJECT_NEW- does not exist in DB

status for ADP_RELATION- match
status for ADP_TBL_OID- mismatch
status for ADP_TBL_OID_UNUSED- match
status for ADP_UPGRADE_OPS- match
status for ADP_VIEW_FIELD- mismatch
status for AUTHEN_NE_CON_BUS_PROD- mismatch
status for AUTHEN_NE_CON_BUS_PROD is AUTHEN_NE_CON_BUS_PROD- does not exist in DB

And what I tried is:

Quote:

#!/bin/sh



#sed '/is/d' statusOP


#awk '{
#if ($0 in stored_lines)
# x=1
# else
# print
# stored_lines[$0]=1
# }' statusOP

# #i > fileout

#awk '!x[$0]++' statusOP > statusOPfile.new

sed 's/status for .* is /status for /' statusOP

The output of this looks like:

Quote:

status for TABLE_X_UDP_LVL2_VW_BASE_TMP- mismatch
status for TABLE_X_UDP_LVL2_VW_BASE_TMP- does not exist in DB

status for TABLE_X_UDP_LVL3_VW_BASE_TMP- mismatch
status for TABLE_X_UDP_LVL4_VW_BASE_TMP- mismatch
status for TABLE_X_UDP_LVL4_VW_BASE_TMP- does not exist in DB

status for TABLE_X_UDP_LVL5_VW_BASE_TMP- mismatch
status for TABLE_X_UDP_LVL5_VW_BASE_TMP- does not exist in DB

status for TEMP_ADDRESS_WO_ACC- mismatch
status for TEMP_ADDRESS_WO_ACC- does not exist in DB

So what I want is:

status for TABLE_X_UDP_LVL2_VW_BASE_TMP- does not exist in DB
status for TABLE_X_UDP_LVL3_VW_BASE_TMP- mismatch
status for TABLE_X_UDP_LVL4_VW_BASE_TMP- does not exist in DB
status for TABLE_X_UDP_LVL5_VW_BASE_TMP- does not exist in DB
status for TEMP_ADDRESS_WO_ACC- does not exist in DB

Any help would be greatly appreciated!
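
For what it is worth, the two commented-out attempts in the script above can be tried on a tiny sample to see why they fall short. This is only a sketch, and the variable names are made up for the demonstration. `/is/` is an unanchored pattern, so it also matches the "is" inside "mismatch" and "exist", while `!x[$0]++` drops only lines that are identical in full, which the `mismatch` and `does not exist in DB` lines for one table never are.

```shell
#!/bin/sh
sample='status for ADP_CONFIG- match
status for ADP_OBJECT_NEW- mismatch
status for ADP_OBJECT_NEW- does not exist in DB'

# Attempt 1: delete every line containing "is" anywhere.
after_sed=$(printf '%s\n' "$sample" | sed '/is/d')

# Attempt 2: keep only the first occurrence of each whole line.
after_awk=$(printf '%s\n' "$sample" | awk '!x[$0]++')

printf 'sed /is/d keeps:\n%s\n\n' "$after_sed"
printf 'awk !x[$0]++ keeps:\n%s\n' "$after_awk"
```

On this sample the sed attempt keeps only the `match` line, and the awk attempt keeps all three lines.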

druuna 01-20-2011 02:05 PM

Hi,

Would it be an idea to remove all the status for XYZ - does not exist in DB lines (trying to keep it simple)?
Or are there lines like that that do _not_ have a previous line with the same table name?

This should work, assuming that the lines for a duplicated table come right after each other:
Code:

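#!/bin/bash

inFile="$1"

# Fields are split on single spaces and hyphens, so field 3 is the table name.
awk -F"[ -]" '
BEGIN { seen="" }
{ if( $3 == seen ){
    oldseen = $3          # no output for a repeated table name
  } else { print }        # first line seen for this table name
  seen = $3
}' "$inFile"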

Run it and provide an infile (somename.sh name.of.infile).

Hope this helps.
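
To see why `$3` can stand in for the table name, this small sketch prints the third field for a few of the posted lines. With `-F"[ -]"` every single space or hyphen separates fields, so the name lands in field 3 for both line shapes. It assumes table names contain no spaces or hyphens, which holds for the samples in this thread; the variable names are only for illustration.

```shell
#!/bin/sh
sample='status for ADP_OBJECT_NEW- mismatch
status for ADP_OBJECT_NEW is ADP_OBJECT_NEW- does not exist in DB
status for ADP_RELATION- match'

# Print only the field that the script compares between lines.
names=$(printf '%s\n' "$sample" | awk -F'[ -]' '{ print $3 }')
printf '%s\n' "$names"
```

The first two lines give the same name, which is what lets the script treat them as one table.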

smritisingh03 01-20-2011 02:37 PM

Quote:

Would it be an idea to remove all the status for XYZ - does not exist in DB lines (trying to keep it simple)?
Or are there lines like that that do _not_ have a previous line with the same table name?

Actually no, as I have to report the status of the row count in the log file against the database. So basically, when I say

ABC match

it means that the row count for table ABC is the same in the log file and in the database.

ABC mismatch

means that the row count for ABC is different in the database.

ABC does not exist means that the table ABC does not exist in the database. So there is a difference between MISMATCH and DOES NOT EXIST, which is why we cannot remove all the status for XYZ - does not exist in DB lines.
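
The three statuses described above can be written down as a small decision function. This is only a sketch of the rule as stated in this post, not the script that produced the output file: the function name `status_for`, its arguments, the row counts, and the convention that an empty database count means "table not found" are all assumptions made for illustration.

```shell
#!/bin/sh
# status_for TABLE LOG_COUNT DB_COUNT
# DB_COUNT is passed as an empty string when the table was not found
# in the database (an assumed convention for this sketch).
status_for() {
    table=$1 log_count=$2 db_count=$3
    if [ -z "$db_count" ]; then
        printf 'status for %s- does not exist in DB\n' "$table"
    elif [ "$log_count" -eq "$db_count" ]; then
        printf 'status for %s- match\n' "$table"
    else
        printf 'status for %s- mismatch\n' "$table"
    fi
}

# The row counts below are invented for the example.
status_for ADP_CONFIG 120 120
status_for ADP_FIELD 120 95
status_for ADP_OBJECT_NEW 120 ''
```

Checking for the missing table first means only one status is produced per table, so a `mismatch` line is never printed for a table that does not exist.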

Quote:

Or are there lines like that that do _not_ have a previous line with the same table name?

Yes, it is the same pattern throughout, i.e.,

status for table ABC mismatch
status for table ABC does not exist in database

So I just want to get rid of status for table ABC mismatch.

What I was thinking is to write a sed one-liner which finds the pattern does not exist in database and deletes the line above. I think this should work, and I tried

awk -v RS='[^\n]*\n*pattern\n[^\n]*' '{print}' ORS="" infile

but it is not working. Please help.
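
The idea of finding the "does not exist" line and deleting the line above it can be sketched in awk by holding each line back until the next one has been read. This is a minimal sketch, assuming the lines have already been through the `sed` substitution posted earlier and that a "does not exist in DB" line always refers to the line directly above it; the variable names `prev` and `held` are arbitrary.

```shell
#!/bin/sh
sample='status for TABLE_X_UDP_LVL2_VW_BASE_TMP- mismatch
status for TABLE_X_UDP_LVL2_VW_BASE_TMP- does not exist in DB
status for TABLE_X_UDP_LVL3_VW_BASE_TMP- mismatch'

result=$(printf '%s\n' "$sample" | awk '
    /does not exist in DB/ { held = 0 }   # drop the line being held back
    held                   { print prev } # otherwise release it
                           { prev = $0; held = 1 }
    END                    { if (held) print prev }
')
printf '%s\n' "$result"
```

On this sample the LVL2 `mismatch` line is dropped, while the LVL2 `does not exist in DB` line and the LVL3 `mismatch` line are kept.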

druuna 01-20-2011 02:43 PM

Hi,

Ignore, see my next post.

My previous answer had 2 parts, the second (the awk shell script) should do what you want. It is not related to the previous paragraph.

I'm talking about this part:

Code:

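#!/bin/bash

inFile="$1"

# Fields are split on single spaces and hyphens, so field 3 is the table name.
awk -F"[ -]" '
BEGIN { seen="" }
{ if( $3 == seen ){
    oldseen = $3          # no output for a repeated table name
  } else { print }        # first line seen for this table name
  seen = $3
}' "$inFile"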

Run it and provide an infile (somename.sh name.of.infile).


EDIT: Hold on... I seem to have misread something. Be right back ;)

druuna 01-20-2011 02:54 PM

Hi,

Please don't mix your wanted output...

From post #1:
Quote:

this should appear just once as :

status for ACCOUNT_MISSING_FRM_RCIS_LINK- does not exist in DB

From post #7:
Quote:

so i just want to get rid of status for table ABC mismatch

Assuming that post #7 is correct:
Code:

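#!/bin/bash

inFile="$1"

# Fields are split on single spaces and hyphens, so field 3 is the table name.
awk -F"[ -]" '
BEGIN { seen="" }
{ if( $3 != seen ){
    oldseen = $3 ; print  # first line seen for this table name
  }
  seen = $3
}' "$inFile"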

Assuming post #1 is what you want:
Code:

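#!/bin/bash

inFile="$1"

# Fields are split on single spaces and hyphens, so field 3 is the table name.
awk -F"[ -]" '
BEGIN { seen="" }
{ if( $3 == seen ){
    oldseen = $3          # no output for a repeated table name
  } else { print }        # first line seen for this table name
  seen = $3
}' "$inFile"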
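
One way to compare the two scripts is to run both on the same sample. In this sketch (sample lines taken from the input file posted earlier, variable names invented, awk bodies condensed to the comparison itself), the `!=` test and the `==` test with `print` in the `else` branch take the same path for every line. Both therefore print the first line of each run of identical table names, which here is the `mismatch` line and not the `does not exist in DB` line.

```shell
#!/bin/sh
sample='status for ADP_OBJECT- match
status for ADP_OBJECT_NEW- mismatch
status for ADP_OBJECT_NEW is ADP_OBJECT_NEW- does not exist in DB
status for ADP_RELATION- match'

# Variant using "!=".
out_ne=$(printf '%s\n' "$sample" | awk -F'[ -]' '
    {
        if ($3 != seen) print
        seen = $3
    }')

# Variant using "==" with the print in the else branch.
out_eq=$(printf '%s\n' "$sample" | awk -F'[ -]' '
    {
        if ($3 == seen) { } else { print }
        seen = $3
    }')

printf '%s\n' "$out_ne"
[ "$out_ne" = "$out_eq" ] && echo "both variants print the same lines"
```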


smritisingh03 01-20-2011 03:30 PM

It doesn't work!

druuna 01-20-2011 03:36 PM

"It" does work on my side.........

grail 01-21-2011 02:08 AM

Maybe something like:
Code:

awk -F"[ -]+" 'NR == 1{x = $0;y = $3;next}{if(y != $3){print x;x = $0;y = $3}else{$3=$4="";x = $0}}END{print x}' file
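
The same buffering idea can be written out in a longer form. This is a sketch rather than a drop-in replacement, under the assumptions that the lines for one table are adjacent and that table names contain no spaces or hyphens; the variable names are invented. It remembers the most recent line for each table name, prints it once the name changes, and flushes the final one in an `END` block so the last table is not lost.

```shell
#!/bin/sh
sample='status for ADP_OBJECT- match
status for ADP_OBJECT_NEW- mismatch
status for ADP_OBJECT_NEW is ADP_OBJECT_NEW- does not exist in DB
status for AUTHEN_NE_CON_BUS_PROD- mismatch
status for AUTHEN_NE_CON_BUS_PROD is AUTHEN_NE_CON_BUS_PROD- does not exist in DB'

result=$(printf '%s\n' "$sample" | awk -F'[ -]+' '
    NF == 0 { next }                        # skip empty lines
    {
        name = $3
        line = $0
        sub(/ is [^ ]+- /, "- ", line)      # "X is X- ..." becomes "X- ..."
        if (have && name != last) print kept
        kept = line; last = name; have = 1
    }
    END { if (have) print kept }
')
printf '%s\n' "$result"
```

Working on a copy of the line (`line`) instead of assigning to `$3` and `$4` leaves the original spacing and the trailing hyphen after the table name intact.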

smritisingh03 01-24-2011 03:53 PM

Thank you all for helping me solve this one... hats off to this forum!

sumeet inani 01-25-2011 03:27 AM

Don't forget to thank members for their useful replies.


All times are GMT -5.