LinuxQuestions.org
Old 06-29-2008, 12:44 AM   #1
telecom_is_me
Member
 
Registered: Jun 2008
Location: Upstate NY
Distribution: Fedora on the desk / Gentoo in the Racks
Posts: 36

Rep: Reputation: 15
Huge Data Set Analysis, Shell Script to copy specific HEX Pairs into a separate file


I've got a huge data set that I'm working with. The data is dumped from another script into a file in real time and consists of 10 rows of 16 groupings of 2 hex characters; then the sequence repeats with different values. This happens over and over again, multiple GB a day.

Here's an example of what I'm looking at.

Code:
5f 30 59 ed ea e1 8c 61 57 d7 36 5b e6 40 90 8c
a1 16 a2 78 7b fe 48 be 25 65 15 a3 7d ae c6 c8
de 05 df b2 b3 71 00 00 00 00 55 aa d1 9c 3c 04
5f 30 59 ed ea e1 8c 61 57 d7 36 5b e6 40 90 8c
a1 16 a2 78 7b fe 48 be 25 65 15 a3 7d ae c6 c8
de 05 df b2 b3 71 aa d1 9c 3c 04 00 00 00 00 55
5f 30 59 ed ea e1 8c 61 57 d7 36 5b e6 40 90 8c
a1 16 a2 78 7b fe 48 be 25 65 15 a3 7d ae c6 c8
de 05 df b2 b3 71 aa d1 9c 3c 04 5f 30 59 ed ea
00 00 00 00 55 e1 8c 61 57 d7 36 5b e6 40 90 8c

a1 16 a2 78 7b fe 48 be 25 65 15 a3 7d ae c6 c8
de 05 df b2 b3 71 aa d1 9c 3c 04 5f 30 59 ed ea
e1 8c 61 57 d7 00 00 00 00 55 36 5b e6 40 90 8c
a1 16 a2 78 7b fe 00 be 25 65 15 a3 7d ae 9a 00
00 00 00 00 00 00 48 9d c6 c8 de 05 df b2 b3 71
aa d1 9c 3c 04 5f 30 59 ed ea e1 8c 61 57 d7 36
5b e6 40 90 00 00 00 00 55 8c a1 16 a2 78 7b fe
48 be 25 65 15 a3 7d ae c6 c8 de 05 df b2 b3 71
aa d1 9c 3c 04 5f 30 59 ed ea e1 8c 61 57 d7 36
5b e6 40 90 8c a1 16 a2 78 00 00 00 00 55 7b fe
What I'm looking for in this case is to copy the hex value at row 5, column 7 of each repetition into a new file.

So if I break it down I know the following:

- Each grouping is 10 rows long
- Each grouping is 16 columns of HEX pairs wide + spaces
- There is one blank line between every 10 rows
- The specific HEX Pair I need to output is row 5 x column 7
- I need to copy the HEX pair out to a new file (so redirect the output to a file with >)
- I need the output file to be historic, so in this case I don't want to overwrite it; I want to append (>>)

So what I can infer:

- I need to use something like the cut command to get the right column
- I need another command to grab the correct line the first time
- I need some sort of a loop that can increment the line grab by 11 each time it loops.
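
For what it's worth, the approach inferred above can be sketched in a couple of lines of shell. This is only a rough illustration, not something posted in the thread: it assumes the layout never drifts (exactly 10 rows followed by one truly empty line, so 11 lines per repetition), and the function name and file names are made up for the example.

```shell
# Fixed-stride sketch: 10 data rows + 1 blank line = 11 lines per repetition,
# so row 5 of every repetition is any line whose number satisfies NR % 11 == 5.
# Assumption: the spacing never drifts (one truly empty separator line).
extract_fixed_stride() {
    # awk keeps only row 5 of each repetition; cut picks the 7th
    # space-separated pair from those lines.
    awk 'NR % 11 == 5' "$@" | cut -d ' ' -f 7
}

# Hypothetical usage (file names are placeholders); >> appends, so earlier
# results are kept:
# extract_fixed_stride dump.txt >> row5_col7.log
```

No explicit loop is needed because awk already visits every line; the modulo test plays the role of "increment the line grab by 11". If a separator line ever goes missing or a row is duplicated, everything after that point is read from the wrong row, which is the main weakness of counting lines this way.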

---

Any suggestions?
 
Old 06-29-2008, 01:13 AM   #2
Mr. C.
Senior Member
 
Registered: Jun 2008
Posts: 2,529

Rep: Reputation: 63
Code:
$ cat hex.pl
#!/usr/bin/perl

$row=1;
while (<>) {
    chomp;
    if (/^$/) {
        $row = 1;
        next;
    }
    print "$1\n" if $row == 5 and /^(?:[[:xdigit:]]{2} ){6}([[:xdigit:]]{2})/;
    $row++;
}

$ hex.pl hex
48
48
Pipe your output to hex.pl, redirect as you see fit.
 
Old 06-29-2008, 09:07 AM   #3
telecom_is_me
Member
 
Registered: Jun 2008
Location: Upstate NY
Distribution: Fedora on the desk / Gentoo in the Racks
Posts: 36

Original Poster
Rep: Reputation: 15
Quote:
Originally Posted by Mr. C. View Post
Code:
$ cat hex.pl
#!/usr/bin/perl

$row=1;
while (<>) {
    chomp;
    if (/^$/) {
        $row = 1;
        next;
    }
    print "$1\n" if $row == 5 and /^(?:[[:xdigit:]]{2} ){6}([[:xdigit:]]{2})/;
    $row++;
}

$ hex.pl hex
48
48
Pipe your output to hex.pl, redirect as you see fit.
If I'm reading this correctly (and I'm not familiar with Perl, so bear with me), the script does the following:

Sets a starting point of row 1
counts down to row 5
counts over to the correct grouping
cuts that grouping
then starts the entire process over again based on the next blank line

Does that sound about right?
 
Old 06-29-2008, 09:32 AM   #4
pixellany
LQ Veteran
 
Registered: Nov 2005
Location: Annapolis, MD
Distribution: Mint
Posts: 17,809

Rep: Reputation: 743
If you look at the data, there seem to be repeating patterns that are not in sync with the 10-line formatting. Are you sure which data you need to extract?
 
Old 06-29-2008, 11:33 AM   #5
Mr. C.
Senior Member
 
Registered: Jun 2008
Posts: 2,529

Rep: Reputation: 63
Quote:
Originally Posted by telecom_is_me View Post
If I'm reading this correctly (and I'm not familiar with Perl, so bear with me), the script does the following:

Sets a starting point of row 1
counts down to row 5
counts over to the correct grouping
cuts that grouping
then starts the entire process over again based on the next blank line

Does that sound about right?
Yes, it counts to row 5 and matches six space-separated hex pairs followed by a seventh, which the pattern captures as $1. It then does nothing more until a blank line is seen, at which point the row count restarts at 1.
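
For readers more at home in the shell than in Perl, the same logic can be approximated with awk. Treat this as a sketch rather than a drop-in replacement: the function name is invented here, and awk splits fields on any run of whitespace, so it is slightly more lenient than the regex, which insists on exactly one space between pairs.

```shell
# Row counter that restarts at every empty line, mirroring hex.pl.
# (extract_row5_col7 is a hypothetical name used only for this sketch.)
extract_row5_col7() {
    awk '
        BEGIN { row = 1 }
        /^$/  { row = 1; next }          # empty line: a new group starts
        row == 5 {
            ok = (NF >= 7)
            for (i = 1; ok && i <= 7; i++)   # only the first seven fields are checked
                if ($i !~ /^[0-9A-Fa-f][0-9A-Fa-f]$/) ok = 0
            if (ok) print $7             # the seventh pair, like $1 in the Perl regex
        }
        { row++ }
    ' "$@"
}

# Hypothetical usage: extract_row5_col7 hex >> history.txt
```

As in the Perl version, a row 5 that does not start with seven valid hex pairs is skipped silently, and anything after the seventh pair is ignored.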
 
Old 06-29-2008, 12:37 PM   #6
nx5000
Senior Member
 
Registered: Sep 2005
Location: Out
Posts: 3,307

Rep: Reputation: 57
Or by using paragraph read:
Code:
#!/usr/bin/perl
$/ = '';
while (<>) {
	chomp;
	print substr($_ , 210,2)."\n" ;
}
Quote:
$ ./aa.pl aie
48
48
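
In case the 210 looks like a magic number: each row is 16 pairs plus 15 spaces plus a newline, i.e. 48 characters, so row 5, column 7 starts at (5 - 1) * 48 + (7 - 1) * 3 = 210 when the whole group is read as one string. Below is a hedged shell/awk sketch of the same idea; the function names are invented for illustration, and it assumes every row is exactly 47 characters wide with truly empty separator lines.

```shell
# 0-based offset of a pair inside one group read as a single string
# (same convention as Perl substr). Each row occupies 48 characters.
pair_offset() {
    echo $(( ($1 - 1) * 48 + ($2 - 1) * 3 ))
}

# Paragraph mode in awk: RS = "" makes each blank-line-separated group one
# record. awk substr is 1-based, hence the + 1 on the offset.
extract_by_offset() {
    off=$(pair_offset 5 7)
    awk -v start="$((off + 1))" 'BEGIN { RS = "" } { print substr($0, start, 2) }' "$@"
}

pair_offset 5 7    # prints 210
```

The offset method is compact, but it depends entirely on fixed-width rows: a single short or long line shifts every later position within that group.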

Last edited by nx5000; 06-29-2008 at 12:39 PM.
 
Old 06-29-2008, 09:05 PM   #7
telecom_is_me
Member
 
Registered: Jun 2008
Location: Upstate NY
Distribution: Fedora on the desk / Gentoo in the Racks
Posts: 36

Original Poster
Rep: Reputation: 15
Quote:
Originally Posted by nx5000 View Post
Or by using paragraph read:
Code:
#!/usr/bin/perl
$/ = '';
while (<>) {
	chomp;
	print substr($_ , 210,2)."\n" ;
}
Mr. C., thank you again for your help. However, in this case I think I'm going to go with nx5000's script: it's elegant in its counting method and much easier for me to wrap my head around.

Both methods do work though so thank you both for a quick solution.
 
Old 06-29-2008, 09:28 PM   #8
Mr. C.
Senior Member
 
Registered: Jun 2008
Posts: 2,529

Rep: Reputation: 63
It is indeed simple, and there are many ways to perform a task.

I chose what I thought might be more self-explanatory to you, where you could change the row and column easily. I personally also think some input validation is worthwhile, as well as being able to ignore garbage that might follow after the 16th column. But to each his own.
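
To illustrate those two points (an easily changed row and column, plus a bit of validation), here is a rough shell/awk sketch. It is not code from this thread: the helper name, its argument order, and the wording of the warning are all assumptions made for the example.

```shell
# usage: extract_pair ROW COL [FILE...]   (hypothetical helper)
extract_pair() {
    want_row=$1 want_col=$2
    shift 2
    awk -v want_row="$want_row" -v want_col="$want_col" '
        BEGIN { row = 1 }
        /^$/  { row = 1; next }              # empty line: a new group starts
        row == want_row {
            if ($want_col ~ /^[0-9A-Fa-f][0-9A-Fa-f]$/)
                print $want_col              # fields after COL are never inspected
            else
                print "line " NR ": no hex pair in column " want_col > "/dev/stderr"
        }
        { row++ }
    ' "$@"
}

# Hypothetical usage: extract_pair 5 7 hex >> history.txt
```

Malformed rows produce a warning on stderr instead of a bogus value in the output file, and trailing garbage after the wanted column does no harm because only that one field is examined.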

Cheers.
 
Old 06-29-2008, 09:55 PM   #9
syg00
LQ Veteran
 
Registered: Aug 2003
Location: Australia
Distribution: Lots ...
Posts: 21,139

Rep: Reputation: 4122
Well formed data can pardon a multitude of sins.
What if the "blank line" does in fact contain whitespace?
 
Old 06-29-2008, 10:00 PM   #10
Mr. C.
Senior Member
 
Registered: Jun 2008
Posts: 2,529

Rep: Reputation: 63
A trivial fix. Change:

if (/^$/) {

to:

if (/^\s*$/) {
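
The same concern applies to any shell/awk variant of the script. With the strict /^$/ test, a separator line holding a stray space or tab is counted as an ordinary row, the counter never resets, and every later group is missed. A minimal awk sketch of the tolerant version follows; the function name is hypothetical, and it relies on awk's default field splitting, under which a whitespace-only line has zero fields.

```shell
# Treat empty AND whitespace-only lines as group separators.
# NF == 0 is the awk counterpart of the Perl test /^\s*$/ here.
extract_tolerant() {
    awk '
        BEGIN   { row = 1 }
        NF == 0 { row = 1; next }    # no fields: empty or only spaces/tabs
        row == 5 { print $7 }
        { row++ }
    ' "$@"
}

# Hypothetical usage: extract_tolerant hex >> history.txt
```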
 
Old 06-29-2008, 10:08 PM   #11
sundialsvcs
LQ Guru
 
Registered: Feb 2004
Location: SE Tennessee, USA
Distribution: Gentoo, LFS
Posts: 10,671
Blog Entries: 4

Rep: Reputation: 3945
The bottom line of this thread so far is that:
  • Bash scripting is not the "power tool" that you need to do this job. You're wasting your time using a tool that is too weak for the purpose you intend.
  • Perl is a "power tool" that is very well suited to this purpose. (It's not the only such tool in the Linux world, to be sure, but it happens to be a damn good one.)
  • Ergo, faced with this problem, invest in a short interlude to become cursorily familiar with Perl ... and do it just as quickly as you can.
One of the hallmarks of Linux/Unix environments ... "what all the fuss is really about," if you will ... is that there is a cornucopia of power-tools available here, and you can "write a script" in any one (or many!) of them.

In such an environment, therefore, it is very easy to make the rueful discovery that ... you're making things much harder on yourself than you actually needed to, a-n-d, you didn't even know it.

Hey... "no harm, no foul!" If, say, "you cut your teeth on Microsoft Windows," where tools other than Visual Basic (heh...) are essentially non-existent, then you might be entirely unaccustomed to find yourself in the "embarrassment of riches" that is Unix/Linux. No problem... I am not making fun at your expense. Welcome aboard! Now you know "what all the fuss is about!"
 
Old 06-29-2008, 10:48 PM   #12
syg00
LQ Veteran
 
Registered: Aug 2003
Location: Australia
Distribution: Lots ...
Posts: 21,139

Rep: Reputation: 4122
Quote:
Originally Posted by Mr. C. View Post
A trivial fix:
Indeed - I always prefer this test (if applicable). Been caught by "corner cases" too often.
 
  

