A sed conundrum!
The stream editor (sed) is one of those programs that make you wonder how you survived before it (when you were using other, less powerful OSes).
I've tried to get sed to solve a rather unusual problem I have, and so far I haven't found an efficient solution. Can anyone help out? I'm quite sure there's an answer ...

THE PROBLEM

I have a scorecard, which typically might look like this:

StartScores
Alice 90 points
Barry 12 points
Christabel 23 points
Derrick 17 points
Erica 12 points
Derrick 4 points
Flora 8 points
Barry 12 points
EndScores

Notice 'Barry 12 points' occurs twice!

Score lists like the one above are kept in small text files, and there are lots of them. Unfortunately, the person taking down the scores made a few mistakes and appended a second copy to some of the files, like so:

StartScores
Alice 90 points
Barry 12 points
Christabel 23 points
Derrick 17 points
Erica 12 points
Derrick 4 points
Flora 8 points
Barry 12 points
End Scores
Start Scores
Alice 90 points
Barry 12 points
Christabel 23 points
Derrick 17 points
Erica 12 points
Derrick 4 points
Flora 8 points
Barry 12 points
EndScores

This is highly undesirable: unless the proprietary program that reads the scores suddenly becomes more intelligent than it is, it will assume the appended scores are valid, which they are not.

THE SOLUTION

Well, before we look at the solution, I'll just point out some things about the problem - Barry scored 12 points twice in the original scorecard. This means that doing a:

Code:
sed -n 'G; s/\n/&&/; /^\([ -~]*\n\).*\n\1/d; s/\n//; h; P'

Code:
sed '/StartScores/,/EndScores/d'

Eeek. Any ideas anyone? I'm trying to convince people that sed is designed for this problem, so any help would be really appreciated.
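For anyone reading along, here is a rough sketch of why neither attempt fits, as far as I can tell. The sample data is a shortened, made-up card; only the two sed commands come from the post. The first command removes duplicate, nonconsecutive lines, so it also drops Barry's legitimate second entry. The second deletes the whole range from StartScores to EndScores, leaving nothing.

```shell
# Shortened sample card (hypothetical data) with a legitimate repeated line.
card=$(printf '%s\n' 'StartScores' 'Barry 12 points' 'Erica 12 points' 'Barry 12 points' 'EndScores')

# Attempt 1: delete duplicate, nonconsecutive lines.
# LC_ALL=C keeps the [ -~] range meaning "printable ASCII".
deduped=$(printf '%s\n' "$card" | LC_ALL=C sed -n 'G; s/\n/&&/; /^\([ -~]*\n\).*\n\1/d; s/\n//; h; P')

# Attempt 2: delete every line from StartScores through EndScores.
ranged=$(printf '%s\n' "$card" | sed '/StartScores/,/EndScores/d')

printf 'after dedupe:\n%s\n' "$deduped"
printf 'after range delete: [%s]\n' "$ranged"
```

The dedupe pass cannot tell a genuine repeat from an appended copy, which is why it is the wrong tool here.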
I'm not exactly sure what you want; I'm assuming you want to grab the first 'scorecard' and quit once it has been printed. If that is the case:
Code:
sed '/End Scores/q' input.file

The input file contains the following (your example plus an extra token for checking):

StartScores
1 Alice 90 points
1 Barry 12 points
1 Christabel 23 points
1 Derrick 17 points
1 Erica 12 points
1 Derrick 4 points
1 Flora 8 points
1 Barry 12 points
End Scores
Start Scores
2 Alice 90 points
2 Barry 12 points
2 Christabel 23 points
2 Derrick 17 points
2 Erica 12 points
2 Derrick 4 points
2 Flora 8 points
2 Barry 12 points
EndScores

$ sed '/End Scores/q' input.file
StartScores
1 Alice 90 points
1 Barry 12 points
1 Christabel 23 points
1 Derrick 17 points
1 Erica 12 points
1 Derrick 4 points
1 Flora 8 points
1 Barry 12 points
End Scores

As shown, it prints only the first 'scorecard'.
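One possible refinement, offered as a sketch rather than a tested fix for the real files: the examples in this thread spell the end marker both as "EndScores" and "End Scores", so a pattern with an optional space should stop at whichever comes first. The sample data and the variable names below are made up for illustration.

```shell
# Build a small sample file (hypothetical data modelled on the example above).
tmpfile=$(mktemp)
cat > "$tmpfile" <<'EOF'
StartScores
1 Alice 90 points
1 Barry 12 points
End Scores
Start Scores
2 Alice 90 points
2 Barry 12 points
EndScores
EOF

# Print up to and including the first end marker, then quit.
# ' *' matches zero or more spaces, so "EndScores" and "End Scores" both match.
first_card=$(sed '/End *Scores/q' "$tmpfile")
printf '%s\n' "$first_card"
rm -f "$tmpfile"
```

Because q prints the current line before quitting, the end marker itself is kept and nothing after it is read.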
That looks like it might be the solution - thanks druuna. I'll try it out and let you know here if it worked.
Notice the simplicity of your solution compared to my overly complex attempts! I keep forgetting that sed works on a stream, not a buffer in memory!
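Since there are lots of these small files, the same command could be run in a loop. This is only a sketch under assumptions: the directory names (scores, cleaned) and the file names are invented for the demo, and it writes cleaned copies instead of editing in place so the originals stay untouched.

```shell
# Hypothetical layout: originals in scores/, cleaned copies in cleaned/.
workdir=$(mktemp -d)
mkdir "$workdir/scores" "$workdir/cleaned"

# a.txt has an appended duplicate card; b.txt is already correct.
printf '%s\n' 'StartScores' 'Alice 90 points' 'End Scores' \
              'Start Scores' 'Alice 90 points' 'EndScores' > "$workdir/scores/a.txt"
printf '%s\n' 'StartScores' 'Barry 12 points' 'EndScores' > "$workdir/scores/b.txt"

# Keep only the first card of each file; correct files pass through unchanged.
for f in "$workdir"/scores/*.txt; do
    sed '/End *Scores/q' "$f" > "$workdir/cleaned/$(basename "$f")"
done

a_lines=$(wc -l < "$workdir/cleaned/a.txt" | tr -d ' ')
b_lines=$(wc -l < "$workdir/cleaned/b.txt" | tr -d ' ')
echo "a.txt: $a_lines lines, b.txt: $b_lines lines"
rm -rf "$workdir"
```

Running sed once per file matters here: q ends the whole sed run, so passing several files to a single sed invocation would stop after the first card of the first file.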