LinuxQuestions.org
03-15-2009, 10:06 AM   #1
mcgao07
LQ Newbie
 
Registered: May 2008
Posts: 15

Rep: Reputation: 0
How to find a specific data block in a huge file and then do algebra on it?


Hello folks,

First I need to locate a specific data block in a huge file (I mean hundreds of thousands of lines). That block starts with "Frequency" in the first row and ends with "End" in the last row. I need to get the data between these two rows, one value per row.

Then I need to compute the product of all the values in that block. How can I write a script to do this?

Thank you so much!

Michael
 
03-15-2009, 11:14 AM   #2
onebuck
Moderator
 
Registered: Jan 2005
Location: Midwest USA, Central Illinois
Distribution: Slackware®
Posts: 11,270
Blog Entries: 3

Rep: Reputation: 1445
Hi,

What have you done to date?

You could look at the 'Advanced Bash-Scripting Guide'.

This link and others are available from 'Slackware-Links'. More than just Slackware® links!
 
03-15-2009, 12:15 PM   #3
PTrenholme
Senior Member
 
Registered: Dec 2004
Location: Olympia, WA, USA
Distribution: Fedora, (K)Ubuntu
Posts: 4,151

Rep: Reputation: 331
Can you provide more details?

1) Where are the words "Frequency" and "End" located in the lines?
2) Is the case of the words significant?
3) Do those words occur other places in the data stream?
4) What format is used for the data values of which you wish to compute the product? (Shell script arithmetic is, basically, integer only.)
5) Do any "frequency" numbers occur on the same line(s) as the key words?

For my own curiosity, why do you need the product of the frequencies? If the "frequencies" were, for example, event probabilities and those events were independent, then you'd be computing the probability of all those events occurring at the same time. (Although the assumption of independence is seldom justified.) If, instead, you're looking at radiation (sound, light, power) frequencies, I can't think of anything to which the product would relate.
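
To illustrate point 4 above, here is a minimal sketch of the difference between bash's built-in integer arithmetic and handing the multiplication to awk, which works in double-precision floating point. The function names int_product and float_product are made up for this example and are not part of any tool.

```shell
#!/bin/bash
# Hypothetical helpers, for illustration only.

# Product using bash arithmetic: whole numbers only, and large
# results silently wrap because bash does no overflow check.
int_product() {
  local p=1 v
  for v in "$@"; do
    p=$((p * v))
  done
  echo "$p"
}

# Product delegated to awk, which accepts decimal values
# (double precision, so roughly 15 significant digits).
float_product() {
  printf '%s\n' "$@" | awk 'BEGIN { p = 1 } { p *= $1 } END { printf "%.6g\n", p }'
}
```

Calling int_product with a value such as 1.5 fails with a bash syntax error, while float_product accepts it; that is the practical meaning of "integer only".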
 
03-16-2009, 11:23 AM   #4
mcgao07
LQ Newbie
 
Registered: May 2008
Posts: 15

Original Poster
Rep: Reputation: 0
Quote:
Originally Posted by PTrenholme
Can you provide more details?

1) Where are the words "Frequency" and "End" located in the lines?
2) Is the case of the words significant?
3) Do those words occur other places in the data stream?
4) What format is used for the data values of which you wish to compute the product? (Shell script arithmetic is, basically, integer only.)
5) Do any "frequency" numbers occur on the same line(s) as the key words?

For my own curiosity, why do you need the product of the frequencies? If the "frequencies" were, for example, event probabilities and those events were independent, then you'd be computing the probability of all those events occurring at the same time. (Although the assumption of independence is seldom justified.) If, instead, you're looking at radiation (sound, light, power) frequencies, I can't think of anything to which the product would relate.

Hi,

I need the data between the line "Phonon frequencies:" and the line "end". "Phonon frequencies:" occurs only once. "end" occurs several times in the file, and the one I mean marks the end of the file.
I need to compute the product of these phonon frequencies, which are real numbers.

The file looks like this:
...
Phonon frequencies:
+8168468677723.75879
+8173254220737.11621
...
+22835577550655.71484
end

Thank you.

Michael
 
03-16-2009, 11:33 AM   #5
pixellany
LQ Veteran
 
Registered: Nov 2005
Location: Annapolis, MD
Distribution: Arch/XFCE
Posts: 17,802

Rep: Reputation: 728
Is this homework?

As requested by onebuck, please show what work you have done and tell us specifically where you are stuck.

It's also helpful to post a sample of the data, and a sample of the desired output.
 
03-16-2009, 12:21 PM   #6
malekmustaq
Senior Member
 
Registered: Dec 2008
Location: /root
Distribution: Slackware & BSD
Posts: 1,218

Rep: Reputation: 231
======================
First I need to locate a specific data block in a huge file (I mean hundreds of thousands of lines). That block starts with "Frequency" in the first row and ends with "End" in the last row. I need to get the data between these two rows, one value per row.

Then I need to compute the product of all the values in that block. How can I write a script to do this?
=======================

mcgao07:

If the huge file is a simple text file, you can do it yourself. Read some tutorials on bash, cat, and pipes. Try searching Google for "Bash Scripting". The answer is well within your reach.

If you have made a start and need to refine your script, please post it here so that everyone can help you. Most of all, post your full objective, including the parameters and the method or formula you want applied, not just the product.

Good luck.
 
03-16-2009, 07:21 PM   #7
PTrenholme
Senior Member
 
Registered: Dec 2004
Location: Olympia, WA, USA
Distribution: Fedora, (K)Ubuntu
Posts: 4,151

Rep: Reputation: 331
Quote:
Originally Posted by mcgao07
Hi,

I need the data between the line "Phonon frequencies:" and the line "end". "Phonon frequencies:" occurs only once. "end" occurs several times in the file, and the one I mean marks the end of the file.
I need to compute the product of these phonon frequencies, which are real numbers.

The file looks like this:
...
Phonon frequencies:
+8168468677723.75879
+8173254220737.11621
...
+22835577550655.71484
end

Thank you.

Michael
The product of numbers of that magnitude will require a very large number of digits. Here's a shell script that does it, but bash arithmetic is integer-only and makes no overflow check, so the answer you'll get is meaningless.
Code:
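#!/bin/bash
# Usage: pass the data file as the first argument.
# Multiplies the values between "Phonon frequencies" and the next "end" line.
# Caveat: bash arithmetic is integer-only with no overflow check, so the
# decimal point is stripped and the result wraps around; it is not meaningful.
data=($(sed -n '/Phonon frequencies/,/end/p' "$1"))
p=1
for d in "${data[@]}"; do
  # Skip every word that is not a number (the header words, "end", elisions)
  [[ $d =~ ^[+]?[0-9]+([.][0-9]*)?$ ]] || continue
  v=${d//[+.]/}        # drop the sign and the decimal point
  p=$((p * 10#$v))     # 10# forces base 10 even with leading zeros
done
echo "Product: $p"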
Perhaps you'd like to re-phrase your question so that someone can suggest a solution using, e.g., Octave or some other language that can handle very large numbers without loss of precision.

The above program could be modified to extract the numbers you want from the file:
Code:
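#!/bin/bash
# Usage: pass the data file as the first argument.
# Prints only the numeric values between "Phonon frequencies" and "end",
# one per line.
data=($(sed -n '/Phonon frequencies/,/end/p' "$1"))
for d in "${data[@]}"; do
  # Skip every word that is not a number
  [[ $d =~ ^[+]?[0-9]+([.][0-9]*)?$ ]] || continue
  echo "$d"
done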
which you could use as an input file to your product computing program.
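
As one possible "product computing program", here is a rough sketch that stays in the shell and lets awk do the arithmetic. It is not exact: it adds up log10 of each value instead of multiplying, so the running total never overflows, and then reports the product in scientific notation, good to roughly the precision of a double at best. The function name sci_product is made up for this example, and it assumes one positive number per line on standard input.

```shell
#!/bin/bash
# Hypothetical helper: reads one positive number per line on stdin and
# prints the product as mantissa and base-10 exponent.
sci_product() {
  awk '
    # Accumulate log10 of each value; blank or non-numeric lines evaluate to 0 and are skipped
    $1 + 0 > 0 { sum += log($1) / log(10); n++ }
    END {
      e = int(sum); if (sum < e) e--     # e = floor(sum), the base-10 exponent
      printf "%d values, product ~ %.6fe%d\n", n, 10 ^ (sum - e), e
    }
  '
}
```

If the extraction script were saved as, say, extract.sh (a name assumed here), the two could be chained as ./extract.sh datafile | sci_product. For an exact result with every digit, an arbitrary-precision tool would still be needed.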
 
  


All times are GMT -5.