[SOLVED] Extracting information from a 'file' using shell script
I'm trying to extract certain information from the output of a dvbsnoop command using a shell script, and I'm stuck.
Code:
------------------------------------------------------------
TS-Packet: 00000001 PID: 768 (0x0300), Length: 188 (0x00bc)
from file: ../maestreamclips/306Q@S25.mpg
------------------------------------------------------------
some uninteresting data
------------------------------------------------------------
TS-Packet: 00000002 PID: 768 (0x0300), Length: 188 (0x00bc)
from file: ../maestreamclips/306Q@S25.mpg
------------------------------------------------------------
some uninteresting data
------------------------------------------------------------
TS-Packet: 00000003 PID: 768 (0x0300), Length: 188 (0x00bc)
from file: ../maestreamclips/306Q@S25.mpg
------------------------------------------------------------
some uninteresting data
first occurrence of something interesting (one or more consecutive lines)
some more uninteresting data
------------------------------------------------------------
TS-Packet: 00000004 PID: 768 (0x0300), Length: 188 (0x00bc)
from file: ../maestreamclips/306Q@S25.mpg
------------------------------------------------------------
some uninteresting data
------------------------------------------------------------
TS-Packet: 00000005 PID: 768 (0x0300), Length: 188 (0x00bc)
from file: ../maestreamclips/306Q@S25.mpg
------------------------------------------------------------
some uninteresting data
------------------------------------------------------------
TS-Packet: 00000006 PID: 768 (0x0300), Length: 188 (0x00bc)
from file: ../maestreamclips/306Q@S25.mpg
------------------------------------------------------------
some uninteresting data
something interesting (one or more consecutive lines)
some more uninteresting data
------------------------------------------------------------
TS-Packet: 00000007 PID: 768 (0x0300), Length: 188 (0x00bc)
from file: ../maestreamclips/306Q@S25.mpg
------------------------------------------------------------
some uninteresting data
last occurrence of something interesting (one or more consecutive lines)
some more uninteresting data
------------------------------------------------------------
TS-Packet: 00000008 PID: 768 (0x0300), Length: 188 (0x00bc)
from file: ../maestreamclips/306Q@S25.mpg
------------------------------------------------------------
some uninteresting data
I need to extract the first and last values of 'something interesting', as well as the numbers of the packets they occur in.
grep with the -A or -B option does not work because the number of lines between 'TS-Packet' and 'something interesting' is variable. sed also does not work because it is too eager (command used: sed -n '/TS-Packet/,/==> PTS/p').
Any suggestions on how to approach this in a shell script are appreciated.
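To illustrate why the sed range is too eager, here is a minimal sketch on made-up data in the same shape as the sample above. The data and the names `sample` and `sed_range_demo` are assumptions for illustration only, not real dvbsnoop output. A range `/start/,/end/` opens at the first line matching the start pattern and closes only at the next line matching the end pattern, so every packet in between is printed as well.

```shell
#!/bin/bash
# Illustrative data only: three packets, the marker appears in the third.
sample() {
cat <<'EOF'
TS-Packet: 00000001 PID: 768
some uninteresting data
TS-Packet: 00000002 PID: 768
some uninteresting data
TS-Packet: 00000003 PID: 768
something interesting
some more uninteresting data
EOF
}

# The range opens at the first TS-Packet line and stays open until the
# first line containing the marker, so packets 1 and 2 are printed too.
sed_range_demo() {
    sample | sed -n '/TS-Packet/,/something interesting/p'
}

sed_range_demo
```

The output contains all three TS-Packet lines, not just the one that belongs to the marker.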
Below is a real-life example. I need to extract the first and the last PTS and/or the first and the last GOP information (both marked in red bold in the example below) and the associated TS packet numbers.
The result I would like to get (in the end) is, for example, like this (for the PTS):
Unfortunately your script shows every single packet number, which is not quite what I hoped for. After a few failed attempts to assign the packet number to a temporary variable and only print it when, for example, the PTS is there, I gave up on that.
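For reference, the temporary-variable idea can be made to work in a single awk pass. The following is only a sketch under assumptions: `first_last` is a hypothetical helper name, the input is made-up data in the shape of the sample at the top of the thread, and each matching line is treated on its own (a multi-line block of interest would need extra handling).

```shell
#!/bin/bash
# Print "<packet> <matching line>" for the first and the last line that
# matches a pattern. Input is read from stdin; $1 is the pattern.
# first_last is a hypothetical helper name, not part of dvbsnoop.
first_last() {
    awk -v pat="$1" '
        /^TS-Packet:/ { pckt = $2 }    # remember the current packet number
        $0 ~ pat {
            if (!seen) { first = pckt " " $0; seen = 1 }
            last = pckt " " $0         # overwritten on every match
        }
        END { if (seen) { print first; print last } }
    '
}

# Made-up input in the shape of the sample at the top of the thread.
first_last 'something interesting' <<'EOF'
TS-Packet: 00000003 PID: 768 (0x0300), Length: 188 (0x00bc)
some uninteresting data
first occurrence of something interesting
TS-Packet: 00000004 PID: 768 (0x0300), Length: 188 (0x00bc)
some uninteresting data
TS-Packet: 00000006 PID: 768 (0x0300), Length: 188 (0x00bc)
something interesting
TS-Packet: 00000007 PID: 768 (0x0300), Length: 188 (0x00bc)
last occurrence of something interesting
some more uninteresting data
EOF
```

On this input the sketch prints two lines, one starting with 00000003 and one starting with 00000007. Nothing is printed if the pattern never matches.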
Your pointer to awk, however, made me search the web for 'awk extracting a block of text', which led me to http://stackoverflow.com/questions/1...ck-from-a-file (something I did not find when searching for the same thing with sed instead of awk).
This is now solved and will be marked as such. There are still some things to sort out (extracting the GOP time in hh:mm:ss:ff format, and error checking), but I think I can manage that. If not, it will probably be the subject of another thread.
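As a rough sketch of the two open items, under stated assumptions: `format_timecode` assumes the hour, minute, second and frame values have already been pulled out of the GOP block as plain integers (how dvbsnoop prints them is not shown in this thread), and `check_input` is a hypothetical helper for basic argument checking. Both names are made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper: format four integers as hh:mm:ss:ff.
# awk converts the values to numbers, so 08 or 09 are not a problem.
format_timecode() {
    awk -v h="$1" -v m="$2" -v s="$3" -v f="$4" \
        'BEGIN { printf "%02d:%02d:%02d:%02d\n", h, m, s, f }'
}

# Hypothetical helper: succeed only if a readable file name was given.
check_input() {
    if [ -z "$1" ]; then
        echo "usage: script <transport stream file>" >&2
        return 1
    fi
    if [ ! -r "$1" ]; then
        echo "cannot read: $1" >&2
        return 1
    fi
}

format_timecode 1 2 3 4
format_timecode 08 09 10 24
```

In the script, something like `check_input "$1" || exit 1` near the top would stop early on a missing or unreadable file.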
The complete program so far is below (the part that this thread was about is marked in red). You can pass it a single-program transport stream.
Code:
#! /bin/bash
MediaFile=$1
echo "File to process: $MediaFile"
# get pmt pid from pat so we can get pmt
result=$(dvbsnoop -s ts -pd 4 -nph -tssubdecode -if "$MediaFile" -N 2 0 | grep "Program_map_PID:")
# keep the value after ": " and drop the trailing " (0x....)" part
PMTpid=$(echo "$result" | sed -e 's/.*: //' | sed -e 's/ *(.*)//')
echo "PMT pid: $PMTpid"
# get pcr pid from pmt
result=$(dvbsnoop -s ts -pd 4 -nph -tssubdecode -if "$MediaFile" -N 2 "$PMTpid")
PCRpid=$(echo "$result" | grep "PCR PID: " | sed -e 's/\s*PCR PID:\s*//' | cut -d ' ' -f 1)
echo "PCR pid: $PCRpid"
echo
# skip everything till the stream_type loop
result=$(echo "$result" | sed -e 's/.*Stream_type loop//')
OLDIFS=$IFS
IFS=$'\n'
streamtype=( $(echo "$result" | grep Stream_type) )
elpid=( $(echo "$result" | grep Elementary_PID) )
echo "Elementary streams:"
index=0
length=${#streamtype[@]}
while [ "$index" -ne "$length" ]
do
    # extract the type and pid of this elementary stream
    t=$(echo "${streamtype[$index]}" | sed -e 's/\s*Stream_type:\s*//' | cut -d ' ' -f 1)
    echo "type = $t"
    p=$(echo "${elpid[$index]}" | sed -e 's/\s*Elementary_PID:\s*//' | cut -d ' ' -f 1)
    echo "pid = $p"
    # save to temporary file
    dvbsnoop -s ts -pd 4 -nph -tssubdecode -if "$MediaFile" "$p" > "$t.tmp"
    # get first pts: one record per packet, stop at the first record with a PTS
    echo "first packet PTS"
    rs=$(awk 'BEGIN { RS="TS-Packet"; } /==> PTS/ { print RS $0; exit; }' "$t.tmp")
    pckt=$(echo "$rs" | awk '/TS-Packet/{ print $2 }')
    pts=$(echo "$rs" | awk '/==> PTS/{ sub(/\]/,"",$NF); print $NF }')
    echo "$pckt, $pts"
    # get last pts: remember every record with a PTS, print the last one
    echo "last packet PTS"
    rs=$(awk 'BEGIN { RS="TS-Packet"; } /==> PTS/ { last = $0 } END { print RS last; }' "$t.tmp")
    pckt=$(echo "$rs" | awk '/TS-Packet/{ print $2 }')
    pts=$(echo "$rs" | awk '/==> PTS/{ sub(/\]/,"",$NF); print $NF }')
    echo "$pckt, $pts"
    # for video, extract gop
    if [ "$t" -eq 2 ]
    then
        # get first gop
        echo "first packet GOP"
        rs=$(awk 'BEGIN { RS="TS-Packet"; } /group_start_code/ { print RS $0; exit; }' "$t.tmp")
        pckt=$(echo "$rs" | awk '/TS-Packet/{ print $2 }')
        # print the lines from group_start_code up to and including broken_link
        gop=$(echo "$rs" | awk '
            /group_start_code/{GSC=1};
            GSC == "1"{print};
            /broken_link:/{GSC=0}' )
        echo "$pckt, $gop"
        # get last gop
        echo "last packet GOP"
        rs=$(awk 'BEGIN { RS="TS-Packet"; } /group_start_code/ { last = $0 } END { print RS last; }' "$t.tmp")
        pckt=$(echo "$rs" | awk '/TS-Packet/{ print $2 }')
        gop=$(echo "$rs" | awk '
            /group_start_code/{GSC=1};
            GSC == "1"{print};
            /broken_link:/{GSC=0}' )
        echo "$pckt, $gop"
    fi
    echo ""
    # next elementary stream
    ((index++))
done
IFS=$OLDIFS
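The PID lookups in the script chain grep, sed and cut. As a possible alternative, a single awk call can return the first word after a label. This is only a sketch: `field_value` is a hypothetical helper name, and the sample line is an assumed shape modelled on the "PID: 768 (0x0300)" format shown in the packet headers, not captured dvbsnoop output.

```shell
#!/bin/bash
# Hypothetical helper: print the first whitespace-separated word that
# follows a given label on the first input line containing that label.
field_value() {
    awk -v label="$1" '
        index($0, label) {
            rest = substr($0, index($0, label) + length(label))
            split(rest, parts, " ")    # " " splits on runs of whitespace
            print parts[1]
            exit
        }
    '
}

# Assumed sample line, for illustration only.
printf '    Program_map_PID: 256 (0x0100)\n' | field_value 'Program_map_PID:'
```

Used in the script, this would look like `PMTpid=$(echo "$result" | field_value 'Program_map_PID:')`. The label is matched as a plain string, so labels containing spaces such as 'PCR PID:' work as well.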
Quote:
Originally Posted by olau
.... and the dirty way is to "while" through the file line by line and start/stop echoing it to another file.... or perl
I have considered it, with C or Tcl as the language, as I don't have the time to learn another language at this stage.