LinuxQuestions.org
Linux - Newbie This Linux forum is for members that are new to Linux.
Old 09-06-2010, 09:58 AM   #31
crts
Senior Member
 
Registered: Jan 2010
Posts: 1,606

Rep: Reputation: 448

Quote:
Originally Posted by grail
I have come a bit late to this party but would like to submit an option that seems to work with the data from post #22 (i.e. I get the same output):
Code:
awk '!f{getline line < "file2";f=1}$0 == line{f=0}1;END{while(getline < "file2")print}' file1
Hi,

using the simplified data from post #25, the command produces
Code:
mars
jupiter
saturn
neptune
deimos
but in this case file2 should simply have been appended to file1. As the OP states, the logs will overlap most of the time, but there is still the possibility of 'clean' splits, i.e. no overlapping tails/heads.
So any solution should be able to recognize a clean split whenever it can be identified as such. Due to the nature of the problem, that will not always be possible.

This is pretty much the worst-case scenario.
 
Old 09-06-2010, 10:14 AM   #32
grail
LQ Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 9,255

Rep: Reputation: 2686
I did find this helped produce an error, which I fixed, but I am not sure I understand where you are coming from?
The data from post #25:
Code:
#file1
mars
jupiter
saturn
neptune

#file2
jupiter
saturn
uranus
deimos
Now I must be missing something as both files contain jupiter and saturn so there is overlap??
And with the slight change to my code:
Code:
awk '!f{getline line < "file2";f=1}$0 == line{f=0}1;END{if(f)print line;while(getline < "file2")print}' file1
I now get the output I expect:
Code:
mars
jupiter
saturn
neptune
uranus
deimos
You may need to help me identify how this information is wrong?

If there is no overlap at all:

Code:
#file1
mars
jupiter
saturn
neptune

#file2
venus
earth
uranus
deimos
I get:
Code:
mars
jupiter
saturn
neptune
venus
earth
uranus
deimos
ie. all lines from both files with file2 appended to file1
 
Old 09-06-2010, 10:40 AM   #33
makyo
Member
 
Registered: Aug 2006
Location: Saint Paul, MN, USA
Distribution: {Free,Open}BSD, CentOS, Debian, Fedora, Solaris, SuSE
Posts: 727

Rep: Reputation: 74
Hi.

I'm happy to see that other responders were in sync with the OP and that issue was resolved. I was far out in left field on this ... cheers, makyo

PS: For the problem I thought I was solving, the awk code in post #14 to do uniq without sorting (and a union across many files) can be replaced by the far more succinct:
Code:
awk ' !x[$field]++ ' data1 data2
adapted from a number of posts at unix.com
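
For readers who want to try the idiom, here is a minimal sketch. Note that `field` is never assigned in the one-liner as posted, so `$field` evaluates to `$0` and the key is the whole line. The helper name `uniq_unsorted`, the `FIELD` shell variable, and the sample data are assumptions made for illustration. Because it drops every repeated line, legitimately repeated log entries would also disappear, so it does not address the overlap problem discussed in the rest of the thread.

```shell
#!/bin/sh
# uniq_unsorted [FILE...]  (hypothetical helper name, not from the thread)
# Prints the first occurrence of each key, preserving input order.
# FIELD selects the column used as the key; 0 (the default here) means
# the whole line, which is what the posted one-liner effectively does.
uniq_unsorted() {
    awk -v field="${FIELD:-0}" '!seen[$field]++' "$@"
}

# Demo with made-up sample data.
tmp=$(mktemp -d)
printf '%s\n' mars jupiter mars > "$tmp/data1"
printf '%s\n' jupiter saturn > "$tmp/data2"
uniq_unsorted "$tmp/data1" "$tmp/data2"
# prints:
# mars
# jupiter
# saturn
rm -rf "$tmp"
```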

Last edited by makyo; 09-06-2010 at 10:43 AM.
 
Old 09-06-2010, 01:05 PM   #34
crts
Senior Member
 
Registered: Jan 2010
Posts: 1,606

Rep: Reputation: 448
Quote:
Originally Posted by grail
I did find this helped produce an error, which I fixed, but I am not sure I understand where you are coming from?
Hi,
read this post first
http://www.linuxquestions.org/questi...3/#post4088742

I also did not understand it at first, but the tail of file1 and the head of file2 in post #25 do not overlap.
Code:
mars
jupiter
saturn     jupiter
neptune    saturn
           uranus
           deimos
If they were to overlap then the third item of file2 would have to be neptune. The constellation in file1 and file2, however, indicates a 'clean' split of the logfile.

An example which does qualify as overlap would be
Code:
mars
neptune
jupiter     jupiter
saturn      saturn
            uranus
            deimos
Think of it like 'sliding' the data of file2 up until you get the biggest possible overlap of the *tail* of file1 with the *head* of file2.
This is not possible with the data in post #25. If you slide up until jupiter and saturn match, then it would look like this
Code:
mars
jupiter     jupiter
saturn      saturn
neptune     uranus     <-- tail of file1 does not match head of file2
            deimos
neptune mismatches uranus in this case, hence the tail does not match the head.
When this thread started, I also thought that this problem could be solved by a one- or two-liner. Right now, I am not sure about that. I guess it can be done with awk, but an awk script would still need a couple of lines. However, an awk solution *might* yield some advantage in execution time.
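
The 'sliding' idea can be sketched in a few lines of awk. This is only one possible approach, not a solution posted in the thread: it holds both files in memory, tries the longest candidate overlap first, and shrinks it until the last k lines of file1 equal the first k lines of file2. The helper name `merge_overlap` is hypothetical, and the two file paths are assumed to differ.

```shell
#!/bin/sh
# merge_overlap FILE1 FILE2  (hypothetical helper, not from the thread)
# Prints FILE1 followed by the part of FILE2 that is not already
# present as the tail of FILE1.
merge_overlap() {
    awk '
        # Lines of the first file go into a[1..n].
        FILENAME == ARGV[1] { a[++n] = $0; next }
        # Lines of the second file go into b[1..m].
        { b[++m] = $0 }
        END {
            # Try the longest candidate overlap first and shrink it until
            # the last k lines of file1 equal the first k lines of file2.
            # Appending "" forces a string comparison for numeric-looking lines.
            for (k = (n < m ? n : m); k > 0; k--) {
                ok = 1
                for (i = 1; i <= k; i++)
                    if ((a[n - k + i] "") != (b[i] "")) { ok = 0; break }
                if (ok) break
            }
            # k is now the overlap length; 0 means a clean split.
            for (i = 1; i <= n; i++) print a[i]
            for (i = k + 1; i <= m; i++) print b[i]
        }
    ' "$1" "$2"
}

# Demo with the data from post #25: no tail/head overlap, so file2 is
# appended whole and all eight lines are printed.
tmp=$(mktemp -d)
printf '%s\n' mars jupiter saturn neptune > "$tmp/file1"
printf '%s\n' jupiter saturn uranus deimos > "$tmp/file2"
merge_overlap "$tmp/file1" "$tmp/file2"
rm -rf "$tmp"
```

As noted earlier in the thread, a clean split cannot always be told apart from a genuine overlap; this sketch always treats the longest matching tail/head run as overlap.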
 
Old 09-06-2010, 09:18 PM   #35
grail
LQ Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 9,255

Rep: Reputation: 2686
Hey crts, thanks for the explanation. I can see where I was getting lost now, although the requirement does seem a little obtuse (but each to their own).

So it would seem you need to check the reverse line sort of file1 against file2 prior to processing (ugly).
I'll put my thinking cap back on

Thanks again
 
Old 09-06-2010, 10:13 PM   #36
Tinkster
Moderator
 
Registered: Apr 2002
Location: in a fallen world
Distribution: slackware by choice, others too :} ... android.
Posts: 23,066
Blog Entries: 11

Rep: Reputation: 910
Quote:
Originally Posted by bonzer21
I'll bring the example home a little more - the system I'm working with is embedded and proprietary, I can't view the whole log (I doubt the device even keeps the whole log), but I can view the last ~70 or so entries of it at any given time. It's as if the log were scrolling up like the credits of a movie but with varying speed, depending on how busy the web server is, and once an entry has scrolled off the top it's gone forever. In an effort to produce a much more useful, browsable log, I am dumping each "screen" of data every few seconds to text files. To avoid missing any entries, I'm capturing at a rate that is faster than the log is ever likely to scroll, which means that the text files I'm creating often heavily overlap.
Does the appliance give you shell access? Maybe
you *can* find a better way of handling this ...
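
If shell access is not an option, the capture-and-merge workflow described in the quote could be folded over any number of dumps in one pass. The following is a rough sketch under stated assumptions: the helper name `merge_captures` and the `capture-N.txt` file names are hypothetical, the files are passed in chronological order, and any tail/head run that matches between consecutive captures is treated as overlap rather than as genuinely repeated entries.

```shell
#!/bin/sh
# merge_captures FILE...  (hypothetical helper, not from the thread)
# Folds consecutive captures into one log, dropping the overlapping
# head of each capture.
merge_captures() {
    awk '
        # Append cur[1..m] to out[1..n], skipping the longest head of cur
        # that equals the current tail of out.
        function fold(    k, i, ok) {
            for (k = (n < m ? n : m); k > 0; k--) {
                ok = 1
                for (i = 1; i <= k; i++)
                    if ((out[n - k + i] "") != (cur[i] "")) { ok = 0; break }
                if (ok) break
            }
            for (i = k + 1; i <= m; i++) out[++n] = cur[i]
            m = 0
        }
        # A new input file starts: fold the previous capture first.
        FNR == 1 && NR > 1 { fold() }
        { cur[++m] = $0 }
        END { fold(); for (i = 1; i <= n; i++) print out[i] }
    ' "$@"
}

# Demo with three made-up captures that overlap by two lines and one line.
tmp=$(mktemp -d)
printf '%s\n' a b c > "$tmp/capture-1.txt"
printf '%s\n' b c d > "$tmp/capture-2.txt"
printf '%s\n' d e f > "$tmp/capture-3.txt"
merge_captures "$tmp/capture-1.txt" "$tmp/capture-2.txt" "$tmp/capture-3.txt"
# prints a, b, c, d, e, f on separate lines
rm -rf "$tmp"
```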
 
Old 09-06-2010, 10:19 PM   #37
grail
LQ Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 9,255

Rep: Reputation: 2686
Take 3 (I think):
Code:
awk '!f{getline line < "file2";f=1}line == $0{c++;f=0}c && line != $0{c=0;close("file2")}1;END{close("file2");while(c--)getline < "file2";while(getline < "file2")print}' file1
 
Old 09-06-2010, 10:41 PM   #38
crts
Senior Member
 
Registered: Jan 2010
Posts: 1,606

Rep: Reputation: 448
Quote:
Originally Posted by grail
Take 3 (I think):
Code:
awk '!f{getline line < "file2";f=1}line == $0{c++;f=0}c && line != $0{c=0;close("file2")}1;END{close("file2");while(c--)getline < "file2";while(getline < "file2")print}' file1
Close. Try it with file1 and file2 from post #11. This data set resembles the real scenario more closely.
file1
Code:
mercury
uranus
venus
jupiter
earth
mars
jupiter
saturn
file2
Code:
jupiter
saturn
uranus
neptune
mars
result
Code:
$ awk '!f{getline line < "file2";f=1}line == $0{c++;f=0}c && line != $0{c=0;close("file2")}1;END{close("file2");while(c--)getline < "file2";while(getline < "file2")print}' file1
mercury
uranus
venus
jupiter
earth
mars
jupiter
saturn
saturn   <-- one saturn too many
uranus
neptune
mars
 
Old 09-06-2010, 10:49 PM   #39
grail
LQ Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 9,255

Rep: Reputation: 2686
ahhh ... my bad, forgot to reset f when I close the file
Code:
awk '!f{getline line < "file2";f=1}line == $0{c++;f=0}c && line != $0{c=0;close("file2");f=0}1;END{close("file2");while(c--)getline < "file2";while(getline < "file2")print}' file1
 
  

