How to pull sequential lines from a file?

WingnutOne · 09-06-2007, 01:51 PM

I have a text file that contains an inventory list of backup tapes stored in an automated tape storage bank. The file contains an entry for each storage location and each entry is 16 lines long.

I need to separate out each 16-line entry so that I can then manipulate the contents.

What function(s) would be best for this?
(The header line for each entry reads “Slot Address 1234”, “Slot Address 1235”, etc.)

Any advice would be appreciated.

AlucardZero · 09-06-2007, 02:13 PM

In what framework? Are you thinking of writing a program to do this, or do you want a command-line tool?

visaris · 09-06-2007, 02:13 PM

Quote:

Originally Posted by WingnutOne

I need to separate out each 16-line entry so that I can then manipulate the contents.

I have to say I'm confused. Can't you manipulate the entries in a normal text editor? Do you want these entries to be in separate files? Do you need them in memory/variables so you can use them for an app/project? If so, what language do you want? Bash, Java, C++, C, Perl, Python, etc?

I've said this a few times today, but I guess I need to say it again. I'm really sorry if I'm coming across as mean-spirited, but you haven't really given us enough information to help you. As written, we have no idea what exactly you want, what form/language you want an answer in, or what exactly you are even trying to do.

Please make another post with LOTS! HEAPS! MOUNTAINS! of detail, and then we can direct you to a proper solution. Thanks!

trashbird1240 · 09-06-2007, 03:31 PM

I would suggest grep with the --context option. You can also use --before-context and --after-context to give you exactly the number of lines you need.

For instance

Code:

grep --after-context=16  -e "“Slot Address 1234”, “Slot Address 1235”" filename > results

Should give you what you want.

Even AWK would be overkill in this case; Perl or C would be over-massacre.

Joel

WingnutOne · 09-06-2007, 04:00 PM

Details coming up!!!

LOL! This'll be a refreshing change. On most of the other boards I've tried posting to, I've tried giving lots of details and gotten no replies at all because no one wants to take the time to read & process it all!

1. I'm running Red Hat Enterprise 4 on kernel 2.6.9-55.EL (required for the drivers that operate the tape unit, an IBM TS3100).

2. I'm using Bash, and need to write this into a Bash script.

3. I originally said the source was a "file" to try to keep it simple.
The actual source of this data is the output of an inventory check command (which is part of the software package that came with the TS3100). I'll probably pipe the output directly from the inventory command, but could dump it into a file and manipulate it from there if required.

4. I need to generate this inventory output and have the script dig through it on a daily basis.

5. The ultimate purpose of this is to compare the data on the "Slot Address" line (the numbers 4097 & 4098 below) with the tape that's actually in each storage slot and make sure that all of the tapes are where they're supposed to be. The tape's ID number in the first example below is 000102L3 (near the end of the 11th line down).
The second example is of a Slot that was empty when the inventory check was run.
(The tapes can be moved manually; otherwise I wouldn't bother.)

Here are two sample items from the output of the inventory check command:

Slot Address 4097
Slot State ..................... Normal
ASC/ASCQ ....................... 0000
Media Present .................. Yes
Robot Access Allowed ........... Yes
Source Element Address ......... 4097
Media Inverted ................. No

Volume Tag, Length 36

0 1 2 3 4 5 6 7 8 9 A B C D E F 0123456789ABCDEF
0000 - 3030 3031 3032 4C33 2020 2020 2020 2020 [000102L3 ]
0010 - 2020 2020 2020 2020 2020 2020 2020 2020 [ ]
0020 - 0000 0000 [.... ]

Slot Address 4098
Slot State ..................... Normal
ASC/ASCQ ....................... 0000
Media Present .................. No
Robot Access Allowed ........... Yes
Source Element Address Valid ... No
Media Inverted ................. No

Volume Tag, Length 36

0 1 2 3 4 5 6 7 8 9 A B C D E F 0123456789ABCDEF
0000 - 0000 0000 0000 0000 0000 0000 0000 0000 [................]
0010 - 0000 0000 0000 0000 0000 0000 0000 0000 [................]
0020 - 0000 0000 [.... ]

-----END OF EXAMPLE------

6. I have a script already made which cross-references the names of the tapes with the slot where they're supposed to be stored.

Hope that's not too much info. If you need more specifics, let me know.
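
A minimal sketch of the splitting step in Bash, assuming (as in the sample above) that every entry begins with a line starting "Slot Address". The names split_records and RECORDS are made up for illustration, and the sample fed to it is shortened. It keys on the header line rather than counting 16 lines, so it should still work if an entry turns out longer or shorter than expected.

```shell
#!/bin/bash
# Sketch: split inventory output into one multi-line record per slot.
# Assumption: every record starts with a line beginning "Slot Address".
# split_records and RECORDS are hypothetical names, not part of the tape software.
# Written without the += operator so that it also runs on older Bash releases.

split_records() {
  RECORDS=()
  local line current=""
  while IFS= read -r line; do
    if [[ $line == "Slot Address "* ]]; then
      # New header: store the record collected so far, if there is one.
      if [[ -n $current ]]; then
        RECORDS[${#RECORDS[@]}]=$current
      fi
      current=$line
    elif [[ -n $current ]]; then
      # Any other line belongs to the record currently being collected.
      current="$current"$'\n'"$line"
    fi
  done
  # Store the last record (there is no following header to trigger it).
  if [[ -n $current ]]; then
    RECORDS[${#RECORDS[@]}]=$current
  fi
}

# Demo on a shortened two-slot sample (real records have more lines).
split_records <<'EOF'
Slot Address 4097
Media Present .................. Yes
0000 - 3030 3031 3032 4C33 2020 2020 2020 2020 [000102L3 ]

Slot Address 4098
Media Present .................. No
0000 - 0000 0000 0000 0000 0000 0000 0000 0000 [................]
EOF

echo "records found: ${#RECORDS[@]}"
printf '%s\n' "${RECORDS[1]}" | head -n 1
```

For real data the function would be called as split_records < inventory.txt, or split_records < <(your_inventory_command), where both names are placeholders. Piping into the function would run it in a subshell and the array would be lost.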

AlucardZero · 09-06-2007, 04:53 PM

In that case, as mentioned by trashbird, perhaps something like:

Code:

grep --after-context=16  -e "^Slot Address"

?

WingnutOne · 09-06-2007, 05:00 PM

Quote:

Originally Posted by trashbird1240

I would suggest grep with the --context option. You can also use --before-context and --after-context to give you exactly the number of lines you need.

For instance

Code:

grep --after-context=16  -e "“Slot Address 1234”, “Slot Address 1235”" filename > results

Should give you what you want.

Even AWK would be overkill in this case; Perl or C would be over-massacre.

Joel

I entered your command line as written but didn't quite get the result I expected. The output included every line in the file from the first instance of "Slot Address" to the end.

I wonder if I might not be reading/writing some of the quote marks wrong? (I'm fairly new at this so that's entirely possible.) The inner double quotes around “Slot Address 1234” look different from the outer set, but I don't know where to find the set that's different than "standard".

Any ideas here?

Thanks again,
Doug
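
One plausible explanation for getting everything back, assuming the pattern ended up matching every "Slot Address" header: each match also prints the lines after it, and if that context is as long as an entry, it reaches the next header, so the printed blocks run together into the whole file. The sketch below shows the effect on a made-up miniature file (3 entries of 4 lines each, not the real 16) and then narrows the match to one slot, using only plain ASCII quotes. The file contents are stand-ins.

```shell
#!/bin/bash
# Build a stand-in inventory file: 3 records of 4 lines each.
# (Real records are longer; the sizes here are only for illustration.)
inventory=$(mktemp)
for slot in 4097 4098 4099; do
  printf 'Slot Address %s\nSlot State ..... Normal\nMedia Present .. Yes\n\n' "$slot"
done > "$inventory"

total_lines=$(grep -c '' "$inventory")

# Every header matches and the context is as long as a record,
# so the context blocks run together and the whole file comes back.
all_lines=$(grep --after-context=4 -e '^Slot Address' "$inventory" | grep -c '')

# One specific header, with context = record length minus one.
# The trailing $ assumes nothing follows the slot number on the header line.
one=$(grep --after-context=3 -e '^Slot Address 4098$' "$inventory")
one_lines=$(grep --after-context=3 -e '^Slot Address 4098$' "$inventory" | grep -c '')

echo "every header matched: $all_lines of $total_lines lines returned"
echo "one header matched:   $one_lines lines returned"
printf '%s\n' "$one"

rm -f "$inventory"
```

For the real output the context count would be the entry length minus one, since the header line itself is always printed in addition to the context.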

choogendyk · 09-06-2007, 08:55 PM

That's a pretty heavy duty tape unit to be using bundled software that can't even give you reports on what's up. Are you backing up a number of computers?

I realize it's not the question you asked, but I might be inclined to dump the bundled software and use a decent open source backup scheduler like amanda (http://www.linuxquestions.org/bookmarks/tags/amanda). It not only might manage your backups better, but the built in reporting capabilities would make it easy to do things like inventorying your tapes. You've got what, 24 slots and something like LTO 3 or 4? It seems kind of silly to have that kind of hardware and be messing around trying to get the information out of it that you want.

syg00 · 09-06-2007, 10:27 PM

I'm sure the awk-ophytes could come up with something, but I used perl. Output was:

Code:

4097    000102L3
4098    ................
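
For comparison, a rough awk sketch that should produce the same kind of slot/tape listing from inside a Bash script. It assumes the layout in the sample above: the slot number is the third word of the "Slot Address" line, and the tape ID sits between the square brackets on the row starting "0000 - ". The function name slot_tape_pairs is invented for the example.

```shell
#!/bin/bash
# Sketch: print "slot<TAB>tape" for each record in the inventory output.
# slot_tape_pairs is a hypothetical helper name.
slot_tape_pairs() {
  awk '
    /^Slot Address / { slot = $3 }     # remember the current slot number
    /^0000 - / {                       # first hex-dump row carries the tag text
      tag = $0
      sub(/^.*\[/, "", tag)            # drop everything up to the opening bracket
      sub(/ *\]$/, "", tag)            # drop space padding and the closing bracket
      print slot "\t" tag
    }
  '
}

# Shortened sample taken from the two records posted earlier.
sample='Slot Address 4097
Media Present .................. Yes
0000 - 3030 3031 3032 4C33 2020 2020 2020 2020 [000102L3 ]
0010 - 2020 2020 2020 2020 2020 2020 2020 2020 [ ]

Slot Address 4098
Media Present .................. No
0000 - 0000 0000 0000 0000 0000 0000 0000 0000 [................]'

pairs=$(printf '%s\n' "$sample" | slot_tape_pairs)
printf '%s\n' "$pairs"
```

Since it reads standard input, the inventory command's output could be piped straight into it.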

WingnutOne · 09-07-2007, 08:44 AM

Quote:

Originally Posted by choogendyk

That's a pretty heavy duty tape unit to be using bundled software that can't even give you reports on what's up. Are you backing up a number of computers?

I realize it's not the question you asked, but I might be inclined to dump the bundled software and use a decent open source backup scheduler like amanda (http://www.linuxquestions.org/bookmarks/tags/amanda). It not only might manage your backups better, but the built in reporting capabilities would make it easy to do things like inventorying your tapes. You've got what, 24 slots and something like LTO 3 or 4? It seems kind of silly to have that kind of hardware and be messing around trying to get the information out of it that you want.

At this point, I'm backing up about a dozen remotely mounted iSCSI drives, but may add more in the future. Since my backup machine has no other purpose than to make backups of other servers' drives and all of the drives are mounted to the same directory, I'm using a simple tar command to get them all at once.
From the /mnt/backups/ dir:

Code:

tar -czvf /dev/IBMtape0 .

As for Amanda, I looked at it before I started this but, given the simplicity of the actual backup command, I discarded the idea because it seemed like overkill. I didn't know it could inventory the tapes though, so I'll take another look at it.

Thanks again!

trashbird1240 · 09-07-2007, 10:15 AM

Quote:

Originally Posted by WingnutOne

I entered your command line as written but didn't quite get the result I expected. The output included every line in the file from the first instance of "Slot Address" to the end.

I wonder if I might not be reading/writing some of the quote marks wrong? (I'm fairly new at this so that's entirely possible.) The inner double quotes around “Slot Address 1234” look different from the outer set, but I don't know where to find the set that's different than "standard".

Any ideas here?

Thanks again,
Doug

Hi, I'm sorry I didn't make it clear that it was an example. My intention was to point you in the right direction; you'll have to figure out the appropriate regular expression to use. AlucardZero had the right idea. If you know how long the entries are and a unique expression on the first line, then you have all the info you need.

My best advice is:

Code:

man grep

and get your hands on Classic Shell Scripting, which is a great guidebook from beginning to end. It covers regular expressions for grep in great detail.

Joel

syg00 · 09-07-2007, 10:37 AM

Try something like this - might be easier to use your current setup than install something new. Shouldn't be too hard to organise the two lists so they can be compared.

Code:

#!/usr/bin/perl
use strict ;
use warnings ;
# Parse slot address followed by tape number
# presumes input data is well-formed.
my $pattn1 = '(^Slot Add.*\s)(\d+$)' ;     # $2 = slot number
my $pattn2 = '(^0000\s.*\[)(.*)\]$' ;      # $2 = text between the brackets
my $five_dots = '^\.{5}' ;
my ($slot, $tape_no) ;
while (<>) {
  if (/$pattn1/) {
    $slot = $2 ;
  } elsif (/$pattn2/) {
    $tape_no = $2 ;
    $tape_no =~ s/\s+$// ;                 # drop the space padding after the tape ID
#   $tape_no = "Empty" if ($tape_no =~ $five_dots) ;
    print "$slot\t$tape_no\n" ;
  }
}

Just feed your data into it as STDIN. The commented line is if you want a better label than a pile of dots ...
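
A sketch of how the two lists might then be compared in Bash. Everything here is an assumption about layout: both lists are taken to be files with one "slot<TAB>tape" pair per line, the file names are placeholders, compare_slots is an invented helper, and the tape ID 000103L3 is made up for the demo.

```shell
#!/bin/bash
# compare_slots EXPECTED ACTUAL   (hypothetical helper and file layout)
# Both files hold one "slot<TAB>tape" pair per line.
# Prints one line per mismatch and returns 1 if any were found.
compare_slots() {
  awk -F '\t' '
    NR == FNR { want[$1] = $2; next }        # first file: expected tape per slot
    ($1 in want) && (want[$1] "") != ($2 "") {   # "" forces a string comparison
      printf "slot %s: expected %s, found %s\n", $1, want[$1], $2
      bad = 1
    }
    END { exit bad }
  ' "$1" "$2"
}

# Demo with stand-in data; the actual list would come from something like
#   your_inventory_command | ./parse_slots.pl > actual.txt   (placeholder names)
workdir=$(mktemp -d)
printf '4097\t000102L3\n4098\t000103L3\n' > "$workdir/expected.txt"
printf '4097\t000102L3\n4098\t................\n' > "$workdir/actual.txt"

if report=$(compare_slots "$workdir/expected.txt" "$workdir/actual.txt"); then
  status=0
else
  status=$?
fi
echo "$report"
echo "exit status: $status"

rm -rf "$workdir"
```

The non-zero exit status is there so a daily cron job could act on a mismatch, for example by mailing the report.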