LinuxQuestions.org
Old 08-21-2010, 06:11 AM   #1
Goni
LQ Newbie
 
Registered: Sep 2005
Posts: 26

Rep: Reputation: 15
Perl read file and parse blocks


Hello,
I am trying to write a Perl script that reads data from a file and parses it. The data in the file has the following format:
Code:
    Device Physical Name     : Not Visible
    Device Symmetrix Name    : 1234
    Device Serial ID         : N/A
    Attached BCV Device      : N/A
    Attached VDEV TGT Device : N/A
    Device Capacity
        {
        Cylinders            :       5120
        Tracks               :      76800
        512-byte Blocks      :   10485760
        MegaBytes            :       5120
        KiloBytes            :    5242880
        }

    Device Physical Name     : Not Visible
    Device Symmetrix Name    : 4567
    Device Serial ID         : N/A
    Device Capacity
        {
        Cylinders            :       5120
        Tracks               :      76800
        512-byte Blocks      :   10485760
        MegaBytes            :       5120
        KiloBytes            :    5242880
        }
Each record starts with "Device Physical Name", so a record is the set of lines from one "Device Physical Name" line up to the next "Device Physical Name" line. I want to read the file one such record at a time.

Of course the FS (field separator) is ":". For now I just want to print the fields, and later put them in a CSV file. I would appreciate any help.

Goni
 
Old 08-21-2010, 06:17 AM   #2
grail
LQ Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 10,005

Rep: Reputation: 3191
Well, this sounds very achievable. Where are you stuck? What have you tried?
 
Old 08-21-2010, 06:22 AM   #3
Goni
LQ Newbie
 
Registered: Sep 2005
Posts: 26

Original Poster
Rep: Reputation: 15
Quote:
Originally Posted by grail View Post
Well, this sounds very achievable. Where are you stuck? What have you tried?
I am new to Perl, which is why I asked for help.
Code:
#!/usr/bin/perl
$data_file="dev";
open(DAT, $data_file) || die("Could not open file!");
@raw_data=<DAT>;
foreach $device (@raw_data)
{
chomp $device;
($d1, $d2)=split(/\:/,$device);
while (($d1) = "Device Physical Name") {
print "$d1";
}
}
I am trying to get the field name into $d1 and its value into $d2. Later I can work with both $d1 and its value.
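For reference, the line while (($d1) = "Device Physical Name") in the script above assigns the string to $d1 instead of comparing against it. The assignment is always true, so the loop never ends. A minimal sketch of the intended check, assuming the same "name : value" layout as the sample, uses if with the string comparison operator eq. The sub name physical_names and the in-memory sample are illustrative only and are not part of the original script.

```perl
use strict;
use warnings;

# Return the value of every "Device Physical Name" line read from $fh.
sub physical_names {
    my ($fh) = @_;
    my @names;
    while (my $line = <$fh>) {
        chomp $line;
        next unless $line =~ /:/;                   # skip "{", "}" and blank lines
        my ($d1, $d2) = split /\s*:\s*/, $line, 2;  # field name, value
        $d1 =~ s/^\s+//;                            # drop leading indentation
        if ($d1 eq "Device Physical Name") {        # eq compares; = would assign
            push @names, $d2;
        }
    }
    return @names;
}

# Demo on an in-memory copy of two sample lines instead of the real "dev" file.
my $sample = "    Device Physical Name     : Not Visible\n"
           . "    Device Symmetrix Name    : 1234\n";
open(my $fh, '<', \$sample) or die "Could not open sample: $!";
print "$_\n" for physical_names($fh);
close($fh);
```

To run it against the real data, open the file with open(my $fh, '<', 'dev') instead of the in-memory string.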
 
Old 08-21-2010, 06:32 AM   #4
grail
LQ Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 10,005

Rep: Reputation: 3191
OK, I see where you are going. If you search around, you will see that a common way of reading from a file is as follows:
Code:
open(HANDLE, "file_name") || die "couldn't open the file!";

while($line = <HANDLE>){ # This is also seen a lot as while(<HANDLE>) but you can look that up
    print $line;
}

close(HANDLE);
Obviously you can split the line or do other things with it instead of printing.

See if that helps.
 
Old 08-21-2010, 06:35 AM   #5
konsolebox
Senior Member
 
Registered: Oct 2005
Distribution: Gentoo, Slackware, LFS
Posts: 2,248
Blog Entries: 8

Rep: Reputation: 235
Are the contents of the file uniform? It looks like the main record delimiter is a blank line. In awk that is easily handled with RS = "" and FS = ":". But I'm interested in learning how to do this in Perl.

Edit: I think FS = ":" alone won't do. Anyway, how do you intend to save the values as CSV? Note that some records have more attributes than others: Attached BCV Device, Attached VDEV TGT Device...

Last edited by konsolebox; 08-21-2010 at 06:45 AM.
 
Old 08-21-2010, 06:56 AM   #6
Goni
LQ Newbie
 
Registered: Sep 2005
Posts: 26

Original Poster
Rep: Reputation: 15
Quote:
Originally Posted by konsolebox View Post
Are the contents of the file uniform? It looks like the main record delimiter is a blank line. In awk that is easily handled with RS = "" and FS = ":". But I'm interested in learning how to do this in Perl.

Edit: I think FS = ":" alone won't do. Anyway, how do you intend to save the values as CSV? Note that some records have more attributes than others: Attached BCV Device, Attached VDEV TGT Device...
The contents of the file are all uniform. None of the item values is empty. In the CSV, the first field becomes the item heading. For example, list all devices with their size and the rest of their properties/info. Each item becomes one heading, and each heading has more than one value.

Goni
 
Old 08-21-2010, 06:59 AM   #7
Goni
LQ Newbie
 
Registered: Sep 2005
Posts: 26

Original Poster
Rep: Reputation: 15
Quote:
Originally Posted by grail View Post
OK, I see where you are going. If you search around, you will see that a common way of reading from a file is as follows:
Code:
open(HANDLE, "file_name") || die "couldn't open the file!";

while($line = <HANDLE>){ # This is also seen a lot as while(<HANDLE>) but you can look that up
    print $line;
}

close(HANDLE);
Obviously you can split the line or do other things with it instead of printing.

See if that helps.
Reading the file is not the problem. How can I process a block of information when the number of lines per block varies? One block may have 10 lines, another 14.

If we do a split,
Code:
my @arr=split("\:",$line);
the first element of the array holds all the values instead of just one, so a test like
Code:
if( ($arr[0] eq "Device Physical Name" ) )
...
won't help.
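A likely reason the eq test above fails, assuming the lines are padded as in the posted sample: the text before the colon keeps its leading indentation and trailing alignment spaces, so it is never exactly equal to "Device Physical Name". A minimal sketch of two workarounds follows; the helper name trimmed is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: return a copy of the string without
# leading or trailing whitespace.
sub trimmed {
    my ($s) = @_;
    $s =~ s/^\s+|\s+$//g;
    return $s;
}

my $line = "    Device Physical Name     : Not Visible";
my @arr  = split /:/, $line, 2;   # limit 2 keeps any later colons in the value

# Brackets make the padding in the first element visible.
print "[$arr[0]]\n";

# Exact comparison fails because of the padding.
print "raw eq matches\n" if $arr[0] eq "Device Physical Name";

# Workaround 1: trim before comparing.
print "trimmed eq matches\n" if trimmed($arr[0]) eq "Device Physical Name";

# Workaround 2: match the line with a regex that tolerates the padding.
print "regex matches\n" if $line =~ /^\s*Device Physical Name\s*:/;
```

Only the last two messages are printed for this sample line.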
 
Old 08-21-2010, 07:28 AM   #8
konsolebox
Senior Member
 
Registered: Oct 2005
Distribution: Gentoo, Slackware, LFS
Posts: 2,248
Blog Entries: 8

Rep: Reputation: 235
Quote:
Originally Posted by Goni View Post
The contents of the file are all uniform. None of the item values is empty. In the CSV, the first field becomes the item heading. For example, list all devices with their size and the rest of their properties/info. Each item becomes one heading, and each heading has more than one value.

Goni
Honestly, I can't parse that. Can you give us an example of the output you intend, with the headers(?), based on at least the two entries above? I think it would make things much clearer. You can add comments using # if you like.
 
Old 08-21-2010, 07:34 AM   #9
Goni
LQ Newbie
 
Registered: Sep 2005
Posts: 26

Original Poster
Rep: Reputation: 15
OK, here is what the CSV output would look like:

Code:
Device Physical Name,Device Symmetrix Name,Symmetrix ID.....
Not Visible,1234,1234567
Not Visible,3456,1234567
Not Visible,8726,1234567
Not Visible,0000,1234567
Not Visible,1234,1234567
Does that help?
 
Old 08-21-2010, 07:48 AM   #10
konsolebox
Senior Member
 
Registered: Oct 2005
Distribution: Gentoo, Slackware, LFS
Posts: 2,248
Blog Entries: 8

Rep: Reputation: 235
Well, based on that and on the two entries, you would get output like this:
Code:
Not Visible,1234,N/A,N/A,N/A,5120,76800,10485760,5120,5242880
Not Visible,4567,N/A,5120,76800,10485760,5120,5242880
Notice that the two rows have different numbers of columns.

That can only mean you need to parse the file in a more structured, XML-like way rather than purely linearly. For that, the set of attributes that may occur has to be determined first.
 
Old 08-21-2010, 10:54 AM   #11
grail
LQ Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 10,005

Rep: Reputation: 3191
I will put my hand up to say I am an extreme noob when it comes to Perl. With that in mind ... have a look:
Code:
#!/usr/bin/perl

use warnings;

open(HANDLE, "file1") || die "Unable to open file1";

$counter = 0;

while($line = <HANDLE>){
    chomp($line);
    if ($line =~ /Device Physical Name/)
    {
        $counter++;
        $records = {};
    }

    if ($line ne "" && $line =~ /:/)
    {
        ($field,$value) = split(/:/, $line);
        $records->{trim($field)} = trim($value);
    }

    push @array, $records if ($line eq "") ;
}

push @array, $records if (--$counter != $#array);

for $href ( @array ) {
    print "{ ";
    for $role ( keys %$href ) {
         print "$role=$href->{$role} ";
    }
    print "}\n";
}

close(HANDLE);

sub trim
{
    my $string = shift;

    $string =~ s/^\s+//;
    $string =~ s/\s+$//;

    return $string;
}
 
Old 08-21-2010, 11:58 AM   #12
theNbomr
LQ 5k Club
 
Registered: Aug 2005
Distribution: OpenSuse, Fedora, Redhat, Debian
Posts: 5,399
Blog Entries: 2

Rep: Reputation: 908
I would suggest letting Perl do the first level of disassembly of the file, by using the blank line as a record delimiter:
Code:
    $/="\n\n";
Having done that, each scalar read from the file will be a block of data, which can readily be split (hint) on newlines. From there, it looks pretty easy to create a hash of field names/values by splitting on ':'s.

--- rod.
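The approach described above can be sketched as follows, assuming records are separated by exactly one blank line as in the posted sample. The sub name read_blocks and the in-memory sample are illustrative. If the real file can contain runs of several blank lines, setting $/ to "" (Perl's paragraph mode) treats each run as a single separator.

```perl
use strict;
use warnings;

# Read blank-line-separated blocks from $fh and return one hash
# reference (field name => value) per block.
sub read_blocks {
    my ($fh) = @_;
    local $/ = "\n\n";               # each read now returns a whole block
    my @records;
    while (my $block = <$fh>) {
        my %rec;
        for my $line (split /\n/, $block) {
            next unless $line =~ /:/;            # skip "{", "}", "Device Capacity"
            my ($name, $value) = split /:/, $line, 2;
            s/^\s+|\s+$//g for $name, $value;    # trim padding on both
            $rec{$name} = $value;
        }
        push @records, \%rec if %rec;            # ignore blocks with no fields
    }
    return @records;
}

# Demo on an in-memory copy of part of the sample data.
my $sample = join "\n",
    "    Device Physical Name     : Not Visible",
    "    Device Symmetrix Name    : 1234",
    "",
    "    Device Physical Name     : Not Visible",
    "    Device Symmetrix Name    : 4567",
    "";
open(my $fh, '<', \$sample) or die "Could not open sample: $!";
for my $rec (read_blocks($fh)) {
    print "$rec->{'Device Symmetrix Name'}\n";
}
close($fh);
```

Using local keeps the changed record separator confined to the sub, so other reads in the same script still work line by line.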
 
Old 08-21-2010, 05:43 PM   #13
konsolebox
Senior Member
 
Registered: Oct 2005
Distribution: Gentoo, Slackware, LFS
Posts: 2,248
Blog Entries: 8

Rep: Reputation: 235
Still, are these the only attributes?
Code:
Device Physical Name
Device Symmetrix Name
Device Serial ID
Attached BCV Device
Attached VDEV TGT Device
Device Capacity  # ignored
Cylinders
Tracks
512-byte Blocks
MegaBytes
KiloBytes
 
Old 08-21-2010, 07:33 PM   #14
Goni
LQ Newbie
 
Registered: Sep 2005
Posts: 26

Original Poster
Rep: Reputation: 15
Quote:
Originally Posted by konsolebox View Post
Still, are these the only attributes?
Code:
Device Physical Name
Device Symmetrix Name
Device Serial ID
Attached BCV Device
Attached VDEV TGT Device
Device Capacity  # ignored
Cylinders
Tracks
512-byte Blocks
MegaBytes
KiloBytes
No, there are some additional ones, but they all use the same FS. I think if it works for two, it will work for all.

grail, it depends on your definition of a noob, but your code seems to loop over each item more than once. I tried it and have yet to debug it, but it prints 86 entries, while "Device Physical Name" is supposed to show up only 4 times.
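One possible explanation, assuming the real file contains more blank lines than the posted sample: the script in post #11 pushes the current record onto the array every time it reads an empty line, so each extra blank line adds another copy of the same record. A minimal sketch that instead stores a record exactly once, at the moment its "Device Physical Name" line is seen, could look like this. The sub name parse_records and the demo lines are illustrative.

```perl
use strict;
use warnings;

# Parse lines into a list of hash references, one per device record.
# A record starts at each "Device Physical Name" line; blank lines
# and lines without a colon are ignored.
sub parse_records {
    my (@lines) = @_;
    my @records;
    my $current;
    for my $line (@lines) {
        chomp $line;
        next unless $line =~ /:/;
        my ($field, $value) = split /:/, $line, 2;
        s/^\s+|\s+$//g for $field, $value;       # trim padding on both
        if ($field eq "Device Physical Name") {
            $current = {};
            push @records, $current;             # stored once per record
        }
        $current->{$field} = $value if $current;
    }
    return @records;
}

# Demo: the first record contains extra blank lines.
my @demo = (
    "Device Physical Name : Not Visible\n",
    "Device Symmetrix Name : 1234\n",
    "\n",
    "Device Capacity\n",
    "    {\n",
    "    Cylinders : 5120\n",
    "    }\n",
    "\n",
    "Device Physical Name : Not Visible\n",
    "Device Symmetrix Name : 4567\n",
);
my @records = parse_records(@demo);
print scalar(@records), " records\n";   # 2 records
```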
 
Old 08-21-2010, 08:06 PM   #15
konsolebox
Senior Member
 
Registered: Oct 2005
Distribution: Gentoo, Slackware, LFS
Posts: 2,248
Blog Entries: 8

Rep: Reputation: 235
@Goni It's really important to determine all the possible attributes first, since the code will depend on them. If you know the possible attributes, you can determine in advance where to place the values and where to reserve places for null values. Instead of the earlier example:
Code:
Not Visible,1234,N/A,N/A,N/A,5120,76800,10485760,5120,5242880
Not Visible,4567,N/A,5120,76800,10485760,5120,5242880
you can expect output like this:
Code:
Not Visible,1234,N/A,N/A,N/A,5120,76800,10485760,5120,5242880
Not Visible,4567,N/A,,,5120,76800,10485760,5120,5242880
That way the code is simpler, since you don't have to collect all of the headers (attributes) and data first, then derive a column order from the collected headers, and only then print the data.

With the simpler version you can immediately print the data for each entry since you already know the order and where to reserve the null values.
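The simpler, fixed-order version can be sketched as follows. It assumes the attribute list posted earlier in this thread is complete, which Goni says it is not, so the @columns list would need extending for the real file. The names @columns and csv_row are illustrative. Values are joined as-is, which is only safe while no value contains a comma or a quote; a CSV module such as Text::CSV from CPAN handles quoting properly.

```perl
use strict;
use warnings;

# Column order fixed up front; names taken from the list in this thread.
my @columns = (
    "Device Physical Name", "Device Symmetrix Name", "Device Serial ID",
    "Attached BCV Device", "Attached VDEV TGT Device",
    "Cylinders", "Tracks", "512-byte Blocks", "MegaBytes", "KiloBytes",
);

# Turn one record (hash reference) into a CSV row.
# Attributes missing from the record become empty fields.
sub csv_row {
    my ($rec) = @_;
    return join ",", map { defined $rec->{$_} ? $rec->{$_} : "" } @columns;
}

# The second sample record, which lacks the two "Attached ..." attributes.
my %second = (
    "Device Physical Name"  => "Not Visible",
    "Device Symmetrix Name" => "4567",
    "Device Serial ID"      => "N/A",
    "Cylinders"             => 5120,
    "Tracks"                => 76800,
    "512-byte Blocks"       => 10485760,
    "MegaBytes"             => 5120,
    "KiloBytes"             => 5242880,
);

print join(",", @columns), "\n";   # header line
print csv_row(\%second), "\n";
# Not Visible,4567,N/A,,,5120,76800,10485760,5120,5242880
```

Because the order is known in advance, each row can be printed as soon as its record has been read.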

If the attributes can't be determined in advance, there's no choice but to write the harder version.
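A minimal sketch of the harder version: make one pass to collect every attribute name in first-seen order, then a second pass to print the rows. It assumes each record was parsed into a list of [name, value] pairs so that the original order survives; that representation and the name collect_headers are illustrative choices, not something from this thread.

```perl
use strict;
use warnings;

# Collect every attribute name seen in any record, in first-seen order.
# Each record is an array reference of [name, value] pairs.
sub collect_headers {
    my (@records) = @_;
    my (%seen, @headers);
    for my $rec (@records) {
        for my $pair (@$rec) {
            my $name = $pair->[0];
            push @headers, $name unless $seen{$name}++;
        }
    }
    return @headers;
}

# Two shortened demo records; the second lacks "Attached BCV Device".
my @records = (
    [ ["Device Symmetrix Name", "1234"], ["Attached BCV Device", "N/A"], ["Cylinders", 5120] ],
    [ ["Device Symmetrix Name", "4567"], ["Cylinders", 5120] ],
);

# Pass 1: derive the column order. Pass 2: print header and rows.
my @headers = collect_headers(@records);
print join(",", @headers), "\n";
for my $rec (@records) {
    my %value = map { ($_->[0], $_->[1]) } @$rec;
    print join(",", map { defined $value{$_} ? $value{$_} : "" } @headers), "\n";
}
# Device Symmetrix Name,Attached BCV Device,Cylinders
# 1234,N/A,5120
# 4567,,5120
```

An attribute that first appears in a later record ends up as the last column, so the resulting order depends on the order of records in the file.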

@grail I thought this might be a good reference since you already know much about awk: http://perldoc.perl.org/perltrap.html

Last edited by konsolebox; 08-21-2010 at 08:14 PM.
 
  

