LinuxQuestions.org
Old 09-13-2012, 04:49 AM   #1
vk4led
LQ Newbie
 
Registered: Dec 2011
Posts: 18

Rep: Reputation: Disabled
Extracting Text from file... Weather information


Howdy Users....

I have been browsing the forums; however, I cannot get my head around extracting the correct information from a .txt file. I have tried most commands - being new it is getting confusing but I am persisting with it....

I have a .txt file from which I want to extract most of the information but exclude some of it. I would say I want 95% of it and am happy to exclude the other 5%.

Following is an example:..........






IDQ20035

Australian Government Bureau of Meteorology

Queensland









FIRE WEATHER WARNING

For the Channel Country and Maranoa and Warrego districts, and parts of the

North West, Central West, and Darling Downs and Granite Belt districts.

Issued at 4:59pm EST on Thursday the 13th of September 2012

for Thursday.



A vigorous surface trough is moving northeast through southern and western

Queensland. Very dry and gusty S to SW winds behind this trough are resulting

in:



Severe Fire Danger for the Channel Country and Maranoa and Warrego districts,

the Northwest district southwest of Mount Isa, Central West district southwest

of Blackall, and the Darling Downs and Granite Belt district southwest of Dalby.





Temperatures up to 32 degrees, relative humidity down to 10% and winds to 40

km/h are expected.



Fire dangers are expected to decrease during the evening as temperatures cool.



The next warning will be issued by 11 pm AEST Thursday.



For more information on Fire Bans and how to Prepare. Act. Survive. Visit the

Rural Fire Service web page at http://www.ruralfire.qld.gov.au or call the

Hotline.



For the latest weather information, listen to your local radio station or visit

the Bureau of Meteorology web page at http://www.bom.gov.au or call

for recorded Land and Weather Warnings.
Copyright Commonwealth of Australia 2011, Bureau of Meteorology . Users of these web pages are deemed to have read and accepted the
conditions described in the Copyright, Disclaimer, and Privacy statements
(http://www.bom.gov.au/other/copyright.shtml).




To start with, I am hoping to extract the data starting at FIRE WEATHER WARNING and finishing at Land and Weather Warnings. I don't want the Copyright etc.

I would like to extract the weather warning information to a .txt file

I am having problems in figuring out exactly what commands to use to either extract the information that is wanted or exclude the information that is not wanted....

Thanks in advance for your help .... I am sure that the solution is simple - but this black duck is having problems in figuring it out.

Cheers
 
Old 09-13-2012, 04:56 AM   #2
sycamorex
LQ Veteran
 
Registered: Nov 2005
Location: London
Distribution: Slackware64-current
Posts: 5,823
Blog Entries: 1

Rep: Reputation: 1218
Hi.

Can you place your text in the code tags? It increases its readability.
What have you done so far? Can you post your code?

See if that helps:
http://www.cyberciti.biz/faq/sed-display-text/
 
1 members found this post helpful.
Old 09-13-2012, 05:02 AM   #3
chrism01
LQ Guru
 
Registered: Aug 2004
Location: Sydney
Distribution: Centos 6.9, Centos 7.3
Posts: 17,396

Rep: Reputation: 2395
Try this
Code:
sed -n "/FIRE/,/Warn/p" weather.txt
eg http://stackoverflow.com/questions/6...-words-in-unix & see http://www.grymoire.com/Unix/Sed.html
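
As a minimal sketch of what that range address does, the snippet below builds a small sample file and prints only the lines from the first match of /FIRE/ through the next line that matches /Warn/. The file name sample.txt and its shortened contents are made up for illustration, loosely based on the warning in post #1.

```shell
# Hypothetical working directory and sample file, for illustration only.
workdir=$(mktemp -d)
cat > "$workdir/sample.txt" <<'EOF'
IDQ20035
Queensland
FIRE WEATHER WARNING
The next warning will be issued by 11 pm AEST Thursday.
for recorded Land and Weather Warnings.
Copyright Commonwealth of Australia 2011
EOF

# -n suppresses sed's default output; p prints only lines inside the range.
result=$(sed -n '/FIRE/,/Warn/p' "$workdir/sample.txt")
printf '%s\n' "$result"
```

With this sample, the header lines and the Copyright line should be left out, because they fall outside the range.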
 
1 members found this post helpful.
Old 09-13-2012, 05:12 AM   #4
vk4led
LQ Newbie
 
Registered: Dec 2011
Posts: 18

Original Poster
Rep: Reputation: Disabled
Thanks for your reply

I have tried the following:
The .txt files are located in $warnings - warning.txt being my raw info and warning_filter.txt being the filtered output for the script line below

sed -n $warnings/warning.txt "/FIRE/,/Weather Warnings/p" $warnings/warning_filter.txt

Something doesn't seem to be right with my coding - maybe I am getting cross-eyed!!??!!
 
Old 09-13-2012, 05:16 AM   #5
vk4led
LQ Newbie
 
Registered: Dec 2011
Posts: 18

Original Poster
Rep: Reputation: Disabled
On another topic of similar context - how would I be able to capture all the text but EXCLUDE the section below the copyright?
 
Old 09-13-2012, 05:19 AM   #6
chrism01
LQ Guru
 
Registered: Aug 2004
Location: Sydney
Distribution: Centos 6.9, Centos 7.3
Posts: 17,396

Rep: Reputation: 2395
I have to admit I don't get your follow-up post #4...
What you said was
Quote:
At the start, I am hoping to extract the data starting at FIRE WEATHER WARNING and finish the extraction at Land and Weather Warnings. I dont want the Copyright etc.
and that's what my code did.

If that's not what you wanted, please explain again with example before and after text blocks.
 
1 members found this post helpful.
Old 09-13-2012, 05:23 AM   #7
pixellany
LQ Veteran
 
Registered: Nov 2005
Location: Annapolis, MD
Distribution: Arch/XFCE
Posts: 17,802

Rep: Reputation: 738
The sed syntax has the command string first, and then the file name to operate on. To write the modified data to a new file, the typical construct would be:
Code:
sed 'command string' oldfile > newfile
If $warnings is the path to your files, and the old file is "warning.txt", and the new file is "warning_filter.txt", then do this:
Code:
sed -n '/start/,/stop/p' "$warnings/warning.txt" > "$warnings/warning_filter.txt"
where "start" and "stop" are replaced by the proper regex to define the address range.
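
Filled in for the weather file from post #1, that construct might look like the sketch below. It assumes $warnings holds the directory with the raw file; a temporary directory and a shortened sample stand in for the real data here, and the two patterns are simply the start and end phrases the original poster named.

```shell
# Assumption: $warnings is the directory holding warning.txt.
# A temporary directory and a shortened sample are used for illustration.
warnings=$(mktemp -d)
cat > "$warnings/warning.txt" <<'EOF'
Australian Government Bureau of Meteorology
FIRE WEATHER WARNING
Issued at 4:59pm EST on Thursday the 13th of September 2012
for recorded Land and Weather Warnings.
Copyright Commonwealth of Australia 2011, Bureau of Meteorology .
EOF

# Script first, then the input file; > sends the result to the new file.
sed -n '/FIRE WEATHER WARNING/,/Land and Weather Warnings/p' \
    "$warnings/warning.txt" > "$warnings/warning_filter.txt"

cat "$warnings/warning_filter.txt"
```

Quoting "$warnings/..." is a precaution in case the path ever contains spaces.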

Last edited by pixellany; 09-13-2012 at 05:24 AM.
 
1 members found this post helpful.
Old 09-13-2012, 05:25 AM   #8
vk4led
LQ Newbie
 
Registered: Dec 2011
Posts: 18

Original Poster
Rep: Reputation: Disabled
I read your coding as stopping at the 1st match with "warn" in it. If that is the case, it would stop early and miss some information; that's why I changed my coding to end with "Weather Warnings" - I could be wrong in my theory though...

Maybe I have the coding wrong for sourcing the .txt file from within my machine. The location is correct - however maybe the place where I tell the code line where to source the file is incorrect??!!

Maybe I am trying to outdo myself - hence my follow-up question: if I could exclude all information PRIOR to the copyright tag, then I would be a happy chap...

Something is better than none - I am starting to have no hair left...

Appreciate your input

Last edited by vk4led; 09-13-2012 at 05:27 AM. Reason: additional text
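
The worry about stopping early can be checked with a small experiment. sed patterns are case sensitive, and the end pattern of a range is only tested on lines after the one that started the range. The sketch below uses a made-up three-line sample to compare /Warn/ with a pattern that accepts either case; the variable names are only for illustration.

```shell
# Hypothetical three-line sample based on the warning text.
workdir=$(mktemp -d)
cat > "$workdir/sample.txt" <<'EOF'
FIRE WEATHER WARNING
The next warning will be issued by 11 pm AEST Thursday.
for recorded Land and Weather Warnings.
EOF

# Case-sensitive end pattern: lowercase "warning" on line 2 does not match.
full=$(sed -n '/FIRE/,/Warn/p' "$workdir/sample.txt" | wc -l)

# Accepting either case ends the range early, at line 2.
early=$(sed -n '/FIRE/,/[Ww]arn/p' "$workdir/sample.txt" | wc -l)

echo "case-sensitive: $full lines, either case: $early lines"
```

So /Warn/ only reaches the end of this sample because no earlier line contains "Warn" with a capital W; a longer end pattern such as "Weather Warnings" is the safer choice.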
 
Old 09-13-2012, 05:25 AM   #9
pixellany
LQ Veteran
 
Registered: Nov 2005
Location: Annapolis, MD
Distribution: Arch/XFCE
Posts: 17,802

Rep: Reputation: 738
PS:
Note the single quotes on the command string: this is recommended practice unless you need double quotes, e.g. for a variable expansion inside the command string
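
A rough illustration of the difference, using a made-up input file and hypothetical variable names start and stop:

```shell
# Hypothetical sample input.
workdir=$(mktemp -d)
printf '%s\n' 'header' 'FIRE WEATHER WARNING' 'details' \
    'Land and Weather Warnings.' 'footer' > "$workdir/in.txt"

# Single quotes: the shell passes the script to sed untouched.
fixed=$(sed -n '/FIRE/,/Warnings/p' "$workdir/in.txt")

# Double quotes: the shell expands $start and $stop before sed sees the script.
start='FIRE'
stop='Warnings'
expanded=$(sed -n "/$start/,/$stop/p" "$workdir/in.txt")

printf '%s\n' "$fixed"
```

Both commands should select the same three lines here. If a variable's value contains a / or regex special characters, it would need escaping before being dropped into the script.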
 
Old 09-13-2012, 05:35 AM   #10
chrism01
LQ Guru
 
Registered: Aug 2004
Location: Sydney
Distribution: Centos 6.9, Centos 7.3
Posts: 17,396

Rep: Reputation: 2395
For your request to get everything except the lines with Copyright or copyright
Code:
grep -v -e Copyr -e copyr  weather.txt
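
A small sketch of how that filter behaves, using the tail of the sample from post #1 as stand-in data (the temporary path is hypothetical):

```shell
# Stand-in data: the last few lines of the sample in post #1.
workdir=$(mktemp -d)
cat > "$workdir/weather.txt" <<'EOF'
for recorded Land and Weather Warnings.
Copyright Commonwealth of Australia 2011, Bureau of Meteorology .
conditions described in the Copyright, Disclaimer, and Privacy statements
(http://www.bom.gov.au/other/copyright.shtml).
EOF

# -v inverts the match; each -e adds a pattern, so a line matching
# either "Copyr" or "copyr" is dropped.
kept=$(grep -v -e Copyr -e copyr "$workdir/weather.txt")
printf '%s\n' "$kept"
```

grep works line by line, so this removes the whole footer only because every footer line happens to contain the word. A footer line without it would be kept.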
 
1 members found this post helpful.
Old 09-13-2012, 05:37 AM   #11
vk4led
LQ Newbie
 
Registered: Dec 2011
Posts: 18

Original Poster
Rep: Reputation: Disabled
Ok - I have the output now - Thanks - problem #1 solved

Problem #2 - If I want to capture everything in the text file but EXCLUDE all text below (and including) the copyright statement, this script line would not work. What would be the easiest way to exclude everything from "copyright...." on?

The output is strictly for personal use - heading into the storm season - prior warning in my circumstance is of great assistance to me (fire/flood/rain)

EDIT:

I just noticed your reply - I will try it out and get back to you.

Thanks

Last edited by vk4led; 09-13-2012 at 05:38 AM. Reason: Late addition
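
One possible way to drop everything from the copyright line onward, not suggested elsewhere in this thread, is a sed range that runs to the last line ($) combined with the d (delete) command. The sketch below uses a shortened, made-up copy of the file; the output name weather_trimmed.txt is an assumption.

```shell
# Hypothetical shortened copy of the weather file.
workdir=$(mktemp -d)
cat > "$workdir/weather.txt" <<'EOF'
IDQ20035
FIRE WEATHER WARNING
for recorded Land and Weather Warnings.
Copyright Commonwealth of Australia 2011, Bureau of Meteorology .
conditions described in the Copyright, Disclaimer, and Privacy statements
(http://www.bom.gov.au/other/copyright.shtml).
EOF

# Delete from the first line matching /Copyright/ through the last line ($).
# Without -n, every line that is not deleted is printed as usual.
sed '/Copyright/,$d' "$workdir/weather.txt" > "$workdir/weather_trimmed.txt"
cat "$workdir/weather_trimmed.txt"
```

Unlike a line-by-line filter, this does not depend on every footer line containing the word "Copyright".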
 
Old 09-13-2012, 05:42 AM   #12
vk4led
LQ Newbie
 
Registered: Dec 2011
Posts: 18

Original Poster
Rep: Reputation: Disabled
Thanks for your reply - looks like all solved - so simple yet so complex for this text file.

Appreciate your help - truly do!
 
Old 09-13-2012, 05:54 AM   #13
pixellany
LQ Veteran
 
Registered: Nov 2005
Location: Annapolis, MD
Distribution: Arch/XFCE
Posts: 17,802

Rep: Reputation: 738
Have you read about address ranges in the "grymoire" link? Note that these can be nested. For example, suppose you want to print everything in the range of "START" to "STOP", EXCEPT things in the sub-range "start" to "stop":
Code:
sed -n '/START/,/STOP/{/start/,/stop/!p}' oldfile > newfile
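
To see the nesting at work, here is a small sketch on made-up input; the file names and line contents are placeholders. It adds a ; before the closing brace, which GNU sed does not require but some other sed implementations expect.

```shell
# Hypothetical input: an outer START..STOP block with an inner start..stop block.
workdir=$(mktemp -d)
printf '%s\n' 'before' 'START' 'keep 1' 'start' 'skip' 'stop' \
    'keep 2' 'STOP' 'after' > "$workdir/oldfile"

# Outer range selects START..STOP; inside it, print lines NOT in start..stop.
# Patterns are case sensitive, so /start/ does not match the line "START".
sed -n '/START/,/STOP/{/start/,/stop/!p;}' "$workdir/oldfile" > "$workdir/newfile"
cat "$workdir/newfile"
```

On this input the result should contain START, keep 1, keep 2 and STOP, with the inner block and the lines outside the outer range left out.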
 
2 members found this post helpful.
Old 09-13-2012, 05:58 AM   #14
vk4led
LQ Newbie
 
Registered: Dec 2011
Posts: 18

Original Poster
Rep: Reputation: Disabled
I was not aware of that function - appreciate your help. I am heading back to the script now to try it out.

Greatly appreciate the additional information.
 
  

