LinuxQuestions.org
Old 08-17-2013, 01:48 PM   #1
Wim Sturkenboom
Senior Member
 
Registered: Jan 2005
Location: Roodepoort, South Africa
Distribution: Ubuntu 12.04, Antix19.3
Posts: 3,794

Rep: Reputation: 282
Extracting information from a 'file' using shell script


I'm trying to extract certain information from the result of a dvbsnoop command using a shell script and I'm stuck.

Code:
------------------------------------------------------------
TS-Packet: 00000001   PID: 768 (0x0300), Length: 188 (0x00bc)
from file: ../maestreamclips/306Q@S25.mpg
------------------------------------------------------------
some uninteresting data

------------------------------------------------------------
TS-Packet: 00000002   PID: 768 (0x0300), Length: 188 (0x00bc)
from file: ../maestreamclips/306Q@S25.mpg
------------------------------------------------------------
some uninteresting data

------------------------------------------------------------
TS-Packet: 00000003   PID: 768 (0x0300), Length: 188 (0x00bc)
from file: ../maestreamclips/306Q@S25.mpg
------------------------------------------------------------
some uninteresting data
first occurrence of something interesting (one or more consecutive lines)
some more uninteresting data

------------------------------------------------------------
TS-Packet: 00000004   PID: 768 (0x0300), Length: 188 (0x00bc)
from file: ../maestreamclips/306Q@S25.mpg
------------------------------------------------------------
some uninteresting data

------------------------------------------------------------
TS-Packet: 00000005   PID: 768 (0x0300), Length: 188 (0x00bc)
from file: ../maestreamclips/306Q@S25.mpg
------------------------------------------------------------
some uninteresting data

------------------------------------------------------------
TS-Packet: 00000006   PID: 768 (0x0300), Length: 188 (0x00bc)
from file: ../maestreamclips/306Q@S25.mpg
------------------------------------------------------------
some uninteresting data
something interesting (one or more consecutive lines)
some more uninteresting data

------------------------------------------------------------
TS-Packet: 00000007   PID: 768 (0x0300), Length: 188 (0x00bc)
from file: ../maestreamclips/306Q@S25.mpg
------------------------------------------------------------
some uninteresting data
last occurrence of something interesting (one or more consecutive lines)
some more uninteresting data

------------------------------------------------------------
TS-Packet: 00000008   PID: 768 (0x0300), Length: 188 (0x00bc)
from file: ../maestreamclips/306Q@S25.mpg
------------------------------------------------------------
some uninteresting data
I need to extract the first and last values of 'something interesting' as well as the packet number.

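Grep with the -A or -B option does not work as the number of lines between 'TS-Packet' and 'something interesting' is variable. Sed also does not work as it's too eager: the range starts at the first 'TS-Packet' line and runs through to the first PTS, so every packet in between is printed as well (command used: sed -n '/TS-Packet/,/==> PTS/p').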

Any suggestions how to approach this in a shell script are appreciated.


Below is a real-life example. I need to extract the first and the last PTS and/or the first and the last gop information (both marked in red bold in the example below) and the associated TS packet numbers.

The result I would like to get (at the end) is, e.g., like this (for the PTS):
Code:
00000245, 5:27:52.7274
00080676, 5:28:22.6874
Code:
------------------------------------------------------------
TS-Packet: 00000245   PID: 768 (0x0300), Length: 188 (0x00bc)
from file: ../maestreamclips/306Q@S25.mpg
------------------------------------------------------------
Sync-Byte 0x47: 71 (0x47)
Transport_error_indicator: 0 (0x00)  [= packet ok]
Payload_unit_start_indicator: 0 (0x00)  [= Packet data continues]
transport_priority: 0 (0x00)
PID: 768 (0x0300)  [= ]
transport_scrambling_control: 0 (0x00)  [= No scrambling of TS packet payload]
adaptation_field_control: 3 (0x03)  [= adaptation_field followed by payload]
continuity_counter: 14 (0x0e)  [= (sequence ok)]
    Adaptation_field: 
        adaptation_field_length: 120 (0x78)
        discontinuity_indicator: 0 (0x00)
        random_access_indicator: 0 (0x00)
        elementary_stream_priotity_indicator: 0 (0x00)
        PCR_flag: 0 (0x00)
        OPCR_flag: 0 (0x00)
        splicing_point_flag: 0 (0x00)
        transport_private_data_flag: 0 (0x00)
        adaptation_field_extension_flag: 0 (0x00)
        (Stuffing_bytes length: 119) 
        Stuffing bytes:
              0000:  ff ff ff ff ff ff ff ff  ff ff ff ff ff ff ff ff   ................
              0010:  ff ff ff ff ff ff ff ff  ff ff ff ff ff ff ff ff   ................
>>>
>>>
              0060:  ff ff ff ff ff ff ff ff  ff ff ff ff ff ff ff ff   ................
              0070:  ff ff ff ff ff ff ff                               .......
    Payload: (len: 63)
==========================================================


    TS sub-decoding (245 packet(s) stored for PID 0x0300):
    =====================================================
    TS contains PES/PS stream...
    PS/PES packet (length=20): 
        Packet_start_code_prefix: 0x000001
        Stream_id: 224 (0xe0)  [= ITU-T Rec. H.262 | ISO/IEC 13818-2 or ISO/IEC 11172-2 video stream]
        PES_packet_length: 0 (0x0000)
         ==> unbound video elementary stream... 

            PES_scrambling_control: 0 (0x00)  [= not scrambled]
            PES_priority: 1 (0x01)
            data_alignment_indicator: 0 (0x00)
            copyright: 1 (0x01)
            original_or_copy: 1 (0x01)
            PTS_DTS_flags: 3 (0x03)
            ES_rate_flag: 0 (0x00)
            additional_copy_info_flag: 0 (0x00)
            PES_CRC_flag: 0 (0x00)
            PES_extension_flag: 0 (0x00)
            PES_header_data_length: 11 (0x0b)
            PTS: 
               Fixed: 3 (0x03)
               PTS:
                  bit[32..30]: 1 (0x01)
                  marker_bit: 1 (0x01)
                  bit[29..15]: 21264 (0x5310)
                  marker_bit: 1 (0x01)
                  bit[14..0]: 24893 (0x613d)
                  marker_bit: 1 (0x01)
                   ==> PTS: 1770545469 (0x6988613d)  [= 90 kHz-Timestamp: 5:27:52.7274]
            DTS: 
               Fixed: 1 (0x01)
               DTS:
                  bit[32..30]: 1 (0x01)
                  marker_bit: 1 (0x01)
                  bit[29..15]: 21264 (0x5310)
                  marker_bit: 1 (0x01)
                  bit[14..0]: 21293 (0x532d)
                  marker_bit: 1 (0x01)
                   ==> DTS: 1770541869 (0x6988532d)  [= 90 kHz-Timestamp: 5:27:52.6874]
            stuffing bytes:

    PS/PES packet (length=76): 
        Packet_start_code_prefix: 0x000001
        Stream_id: 179 (0xb3)  [= sequence_header_code]
            horizontal_size_value: 720 (0x02d0)
            vertical_size_value: 576 (0x0240)
            aspect_ratio_information: 2 (0x02)  [= 3:4]
            frame_rate_code: 3 (0x03)  [= 25]
            bit_rate_value: 37500 (0x00927c)  [= * 400 bit/s]
            marker_bit: 1 (0x01)
            vbv_buffer_size_value: 112 (0x0070)
            contraint_parameters_flag: 0 (0x00)
            load_intra_quantiser_matrix: 0 (0x00)
            load_non_intra_quantiser_matrix: 1 (0x01)
            non_intra_quantiser_matrix:  (8 x 64)
               16 (0x10)   [= 00010000]
               17 (0x11)   [= 00010001]
>>>
>>>
               40 (0x28)   [= 00101000]
               43 (0x2b)   [= 00101011]

    PS/PES packet (length=10): 
        Packet_start_code_prefix: 0x000001
        Stream_id: 181 (0xb5)  [= extension_start_code]
            extension_start_code_identifier: 1 (0x01)  [= Sequence Extension ID]
            profile_and_level_indication: 
               escape_bit: 0 (0x00)
               profile_indication: 4 (0x04)  [= Main]
               level_indication: 8 (0x08)  [= Main]
            progressive_sequence: 0 (0x00)
            chroma_format: 1 (0x01)  [= 4:2:0]
            horizontal_size_extension: 0 (0x00)
            vertical_size_extension: 0 (0x00)
            bit_rate_extension: 0 (0x0000)
            marker_bit: 1 (0x01)
            vbv_buffer_size_extension: 0 (0x00)
            low_delay: 0 (0x00)
            frame_rate_extension_n: 0 (0x00)
            frame_rate_extension_d: 0 (0x00)

    PS/PES packet (length=8): 
        Packet_start_code_prefix: 0x000001
        Stream_id: 184 (0xb8)  [= group_start_code]
            time_code:
               drop_frame_flag: 0 (0x00)
               time_code_hours: 1 (0x01)
               time_code_minutes: 24 (0x18)
               marker_bit: 1 (0x01)
               time_code_seconds: 0 (0x00)
               time_code_pictures: 0 (0x00)
            closed_gop: 1 (0x01)
            broken_link: 0 (0x00)

    PS/PES packet (length=8): 
        Packet_start_code_prefix: 0x000001
        Stream_id: 0 (0x00)  [= picture_start_code]
            temporal_reference: 0 (0x0000)
            picture_coding_type: 1 (0x01)  [= intra-coded (I)]
            vbv_delay: 65535 (0xffff)
            extra_bit_picture: 1 (0x01)
            extra_information_picture: 255 (0xff)
            extra_bit_picture: 1 (0x01)
            extra_information_picture: 224 (0xe0)
            extra_bit_picture: 0 (0x00)

    PS/PES packet (length=9): 
        Packet_start_code_prefix: 0x000001
        Stream_id: 181 (0xb5)  [= extension_start_code]
            extension_start_code_identifier: 8 (0x08)  [= Picture Coding Extension ID]
            f_code[forward][horizontal]: 15 (0x0f)
            f_code[forward][vertical]: 15 (0x0f)
            f_code[backward][horizontal]: 15 (0x0f)
            f_code[backward][vertical]: 15 (0x0f)
            intra_dc_precision: 0 (0x00)  [= 8 bits]
            picture_structure: 3 (0x03)  [= frame picture]
            top_field_first: 1 (0x01)
            frame_pred_frame_dct: 0 (0x00)
            concealment_motion_vectors: 0 (0x00)
            q_scale_type: 1 (0x01)
            intra_vlc_format: 1 (0x01)
            alternate_scan: 1 (0x01)
            repeat_first_field: 0 (0x00)
            chroma_420_type: 0 (0x00)
            progressive_frame: 0 (0x00)
            composite_display_flag: 0 (0x00)

    PS/PES packet (length=3120): 
        Packet_start_code_prefix: 0x000001
        Stream_id: 1 (0x01)  [= slice_start_code]

    PS/PES packet (length=1440): 
        Packet_start_code_prefix: 0x000001
        Stream_id: 2 (0x02)  [= slice_start_code]

...
...

    PS/PES packet (length=1404): 
        Packet_start_code_prefix: 0x000001
        Stream_id: 34 (0x22)  [= slice_start_code]

    PS/PES packet (length=2828): 
        Packet_start_code_prefix: 0x000001
        Stream_id: 35 (0x23)  [= slice_start_code]

    PS/PES packet (length=920): 
        Packet_start_code_prefix: 0x000001
        Stream_id: 36 (0x24)  [= slice_start_code]
Thanks in advance

Note:
Not every packet containing a PTS will contain gop information; the example has both by coincidence.

Last edited by Wim Sturkenboom; 08-17-2013 at 02:05 PM. Reason: added 'shell script' to title
 
Old 08-17-2013, 02:24 PM   #2
Firerat
Senior Member
 
Registered: Oct 2008
Distribution: Debian sid
Posts: 2,683

Rep: Reputation: 783
awk?
Code:
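awk '/TS-Packet/{printf "%s, ", $2};      # packet number, no newline yet
     /==> PTS/{sub(/]/,"",$NF);           # strip the closing bracket
               print $NF                  # timestamp is the last field
              };
     /group_start_code/{GSC=1};           # start of the gop block
     GSC == "1"{print};                   # echo lines while inside the block
     /broken_link:/{GSC=0}' Input.file    # last line of the gop block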
 
1 member found this post helpful.
Old 08-17-2013, 10:59 PM   #3
Wim Sturkenboom
Senior Member
 
Registered: Jan 2005
Location: Roodepoort, South Africa
Distribution: Ubuntu 12.04, Antix19.3
Posts: 3,794

Original Poster
Rep: Reputation: 282
awk did not look like the right tool to me and hence I did not try it. I will analyze your example, give it a shot and give feedback.

Thanks for the answer.
 
Old 08-18-2013, 03:11 AM   #4
Wim Sturkenboom
Senior Member
 
Registered: Jan 2005
Location: Roodepoort, South Africa
Distribution: Ubuntu 12.04, Antix19.3
Posts: 3,794

Original Poster
Rep: Reputation: 282
Some feedback

Unfortunately your script shows every single packet number, which is not quite what I hoped for. After a few failed attempts to assign the packet number to a temporary variable and only print it when e.g. the PTS is there, I gave up on that.

Your pointer to awk, however, made me search the web for 'awk extracting a block of text', which led to http://stackoverflow.com/questions/1...ck-from-a-file (something that I did not find when searching for the same with sed instead of awk).

Code:
echo first
rs=$(awk 'BEGIN { RS="TS-Packet"; } /==> PTS/ { print RS $0; exit; }' 2.tmp)
#echo "$rs"
pckt=$(echo "$rs" |grep "TS-Packet" | cut -d ' ' -f 2)
pts=$(echo "$rs" |grep "==> PTS" | cut -d '[' -f 2 | cut -d ' ' -f 4)
echo $pckt, $pts
Code:
00000245, 5:27:52.7274]
Except for the closing bracket at the end, it's what I'm looking for for the first occurrence.

Next steps are now to extract the last occurrence and to create the same script incorporating your idea of extracting the gop information.

I'll get back with the total solution or if I get stuck again.

PS
If there is a pure awk way of doing the above script, I'm interested.
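
One possible pure awk approach, as a minimal sketch rather than a definitive answer: remember the most recent packet number on every 'TS-Packet' line, and when a '==> PTS' line shows up, record the pair. The first pair is stored once, the last pair is overwritten on every match, and both are printed at the end. The function name first_last_pts is made up for illustration; it assumes the dvbsnoop output has been saved to a file, as with 2.tmp above.

```shell
# Sketch only: first_last_pts is a hypothetical helper name.
# Usage: first_last_pts <saved dvbsnoop output>
first_last_pts() {
  awk '
    /^TS-Packet:/ { pkt = $2 }               # remember the most recent packet number
    /==> PTS:/ {
      ts = $NF                               # last field, e.g. "5:27:52.7274]"
      sub(/\]$/, "", ts)                     # drop the closing bracket
      if (first == "") first = pkt ", " ts   # set only on the first match
      last = pkt ", " ts                     # overwritten on every match
    }
    END {
      if (first != "") { print first; print last }
    }
  ' "$1"
}
```

Called as first_last_pts 2.tmp, this should print two lines in the "packet, timestamp" form asked for in post #1. If the stream contains only one PTS, the same line is printed twice.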
 
Old 08-18-2013, 06:26 AM   #5
Firerat
Senior Member
 
Registered: Oct 2008
Distribution: Debian sid
Posts: 2,683

Rep: Reputation: 783
This does what your awk, grep, cut pipeline does:
Code:
awk '/TS-Packet/{printf "%s, ", $2};
     /==> PTS/{sub(/]/,"",$NF);
     print $NF
     }' Input.file

and this might be closer to what you want

Code:
awk '/TS-Packet/{
         # only the hardcoded packets are treated as interesting
         if ($2 == "00000245" || $2 == "00000247") {
             printf "%s, ", $2; Interesting=1
         } else {
             Interesting=0
         }
     };
     /==> PTS/ && Interesting == "1"{
         sub(/]/,"",$NF);
         print $NF
     };
     /group_start_code/ && Interesting == "1"{GSC=1};
     GSC == "1"{print};
     /broken_link:/{GSC=0};
' Input
I hardcoded your 'interesting packets';
with a bit of work you could use an array.
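
The array idea could look roughly like the sketch below. It is only one way to do it: the packet numbers are passed in as a comma-separated list and turned into an awk lookup table, so nothing is hardcoded in the program itself. The names show_packets, wanted, interesting, keep and gsc are invented for this example.

```shell
# Sketch only: show_packets is a hypothetical helper name.
# Usage: show_packets "00000245,00000247" <saved dvbsnoop output>
show_packets() {
  awk -v wanted="$1" '
    BEGIN {
      n = split(wanted, list, ",")
      for (i = 1; i <= n; i++) interesting[list[i]] = 1   # build a lookup table
    }
    /^TS-Packet:/ {
      keep = ($2 in interesting)     # 1 if this packet was asked for, else 0
      gsc = 0
      if (keep) printf "%s, ", $2    # packet number, no newline yet
    }
    keep && /==> PTS:/ { ts = $NF; sub(/\]$/, "", ts); print ts }
    keep && /group_start_code/ { gsc = 1 }   # start of the gop block
    gsc { print }                            # echo lines while inside the block
    /broken_link:/ { gsc = 0 }               # last line of the gop block
  ' "$2"
}
```

As with the hardcoded version, a wanted packet that carries no PTS leaves its number on the line without a newline, so this is best used with packets already known to contain one.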
 
Old 08-18-2013, 07:28 AM   #6
olau
LQ Newbie
 
Registered: Jun 2006
Distribution: Debian Sarge ( lol I should generalize to: Debian Testing )
Posts: 15

Rep: Reputation: 1
.... and the dirty way is to "while" through the file line by line and start/stop echoing it to another file.... or perl
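
For reference, the "while" approach might be sketched like this in bash. It is an illustration under assumptions, not a tested tool: it walks the saved output line by line, remembers the latest packet number, starts echoing at the first group_start_code line and stops after the broken_link line. The function name first_gop_shell is made up, and a plain shell loop like this is likely to be much slower than awk on a large dump.

```shell
# Sketch only: first_gop_shell is a hypothetical helper name.
# Usage: first_gop_shell <saved dvbsnoop output> [> other.file]
first_gop_shell() {
  local line pkt="" echoing=0
  while IFS= read -r line; do
    case $line in
      "TS-Packet: "*)
        pkt=${line#TS-Packet: }    # strip the label
        pkt=${pkt%% *}             # keep everything up to the first space
        ;;
      *group_start_code*)
        echoing=1                  # start echoing from this line on
        printf '%s\n' "$pkt"
        ;;
    esac
    if [ "$echoing" -eq 1 ]; then
      printf '%s\n' "$line"
      case $line in
        *broken_link:*) return 0 ;;   # stop after the end of the gop block
      esac
    fi
  done < "$1"
  return 1                         # no gop block found
}
```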
 
Old 08-18-2013, 10:18 AM   #7
Wim Sturkenboom
Senior Member
 
Registered: Jan 2005
Location: Roodepoort, South Africa
Distribution: Ubuntu 12.04, Antix19.3
Posts: 3,794

Original Poster
Rep: Reputation: 282
This is now solved and will be marked as such. There are still some things to sort out (extracting the gop time in hh:mm:ss:ff format, and error checking) but I think that I can manage that. If not, it will probably be the subject of another thread.

The complete program so far; the part that this thread was about is in red. You can pass it a single-program transport stream.

Code:
#! /bin/bash

MediaFile=$1
echo "File to process: $MediaFile"

# get pmt pid from pat so we can get pmt
result=$(dvbsnoop -s ts -pd 4 -nph -tssubdecode -if $1 -N 2 0 |grep "Program_map_PID:")
PMTpid=$(echo "$result" | sed -e 's/.*: //' | sed -e 's/(.*)//')
echo "PMT pid: $PMTpid"

# get pcr pid from pmt
result=$(dvbsnoop -s ts -pd 4 -nph -tssubdecode -if $1 -N 2 $PMTpid)
PCRpid=$(echo "$result" | grep "PCR PID: " | sed -e 's/\s*PCR PID:\s*//' | cut -d ' ' -f 1)
echo "PCR pid: $PCRpid"
echo

# skip everything till the stream_type loop
result=$(echo "$result" | sed -e 's/.*Stream_type loop//')

OLDIFS=$IFS
IFS=$'\n'
streamtype=( $(echo "$result" |  grep Stream_type) )
elpid=( $(echo "$result" |  grep Elementary_PID) )

echo "Elementary streams:"
index=0
length=${#streamtype[@]}
while [ $index -ne $length ]
do
  # create array of types and pids
  t=$(echo ${streamtype[$index]} | sed -e 's/\s*Stream_type:\s*//' | cut -d ' ' -f 1)
  echo "type = $t"
  p=$(echo ${elpid[$index]} | sed -e 's/\s*Elementary_PID:\s*//' | cut -d ' ' -f 1)
  echo "pid =  $p"

  # save to temporary file
  dvbsnoop -s ts -pd 4 -nph -tssubdecode -if $1 $p > $t.tmp

  # get first pts
  echo "first packet PTS"
  rs=$(awk 'BEGIN { RS="TS-Packet"; } /==> PTS/ { print RS $0; exit; }' $t.tmp)
  pckt=$(echo "$rs" | awk '/TS-Packet/{ print $2 }')
  pts=$(echo "$rs" | awk '/==> PTS/{ sub(/\]/,"",$NF); print $NF }')
  echo $pckt, $pts

  # get last pts
  echo "last packet PTS"
  rs=$(awk 'BEGIN { RS="TS-Packet"; } /==> PTS/ { last = $0 } END { print RS last; }' $t.tmp)
  pckt=$(echo "$rs" | awk '/TS-Packet/{ print $2 }')
  pts=$(echo "$rs" | awk '/==> PTS/{ sub(/\]/,"",$NF); print $NF }')
  echo $pckt, $pts

  # for video, extract gop
  if [ $t -eq 2 ]
    then
    # get first gop
    echo "first packet GOP"
    rs=$(awk 'BEGIN { RS="TS-Packet"; } /group_start_code/ { print RS $0; exit; }' $t.tmp)
    pckt=$(echo "$rs" | awk '/TS-Packet/{ print $2 }')
    gop=$(echo "$rs" | awk '
       /group_start_code/{GSC=1};
       GSC == "1"{print};
       /broken_link:/{GSC=0}' )
    echo "$pckt, $gop"

    #get last gop
    echo "last packet GOP"
    rs=$(awk 'BEGIN { RS="TS-Packet"; } /group_start_code/ { last = $0 } END { print RS last; }' $t.tmp)
    pckt=$(echo "$rs" | awk '/TS-Packet/{ print $2 }')
    gop=$(echo "$rs" | awk '
       /group_start_code/{GSC=1};
       GSC == "1"{print};
       /broken_link:/{GSC=0}' )
    echo "$pckt, $gop"
  fi
  echo ""


  # next elementary stream
  ((index++))
done
IFS=$OLDIFS
Quote:
Originally Posted by olau View Post
.... and the dirty way is to "while" through the file line by line and start/stop echoing it to another file.... or perl
I have considered it, with C or Tcl as the language, as I don't have the time to learn another language at this stage.
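
For the gop time in hh:mm:ss:ff format that is still to be sorted out, a small awk filter might be enough. This is a sketch under the assumption that the time_code lines always appear in the order shown in the dvbsnoop output in this thread (hours, minutes, seconds, pictures); the function name gop_timecode is invented for the example.

```shell
# Sketch only: gop_timecode is a hypothetical helper name.
# Reads a gop block on stdin and prints its time code as hh:mm:ss:ff.
gop_timecode() {
  awk '
    /time_code_hours:/    { h = $2 }   # second field is the decimal value
    /time_code_minutes:/  { m = $2 }
    /time_code_seconds:/  { s = $2 }
    /time_code_pictures:/ { printf "%02d:%02d:%02d:%02d\n", h, m, s, $2 }
  '
}
```

In the script above it could replace the multi-line gop extraction, e.g. gop=$(echo "$rs" | gop_timecode). A block with hours 1, minutes 24, seconds 29 and pictures 5 would then come out as 01:24:29:05.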
 
Old 08-18-2013, 10:20 AM   #8
Wim Sturkenboom
Senior Member
 
Registered: Jan 2005
Location: Roodepoort, South Africa
Distribution: Ubuntu 12.04, Antix19.3
Posts: 3,794

Original Poster
Rep: Reputation: 282
Oh, and the output

Code:
File to process: ../maestreamclips/306Q@S25.mpg
PMT pid: 257 
PCR pid: 768

Elementary streams:
type = 2
pid =  768
first packet PTS
00000245, 5:27:52.7274
last packet PTS
00080676, 5:28:22.6874
first packet GOP
00000245,         Stream_id: 184 (0xb8)  [= group_start_code]
            time_code:
               drop_frame_flag: 0 (0x00)
               time_code_hours: 1 (0x01)
               time_code_minutes: 24 (0x18)
               marker_bit: 1 (0x01)
               time_code_seconds: 0 (0x00)
               time_code_pictures: 0 (0x00)
            closed_gop: 1 (0x01)
            broken_link: 0 (0x00)
last packet GOP
00079005,         Stream_id: 184 (0xb8)  [= group_start_code]
            time_code:
               drop_frame_flag: 0 (0x00)
               time_code_hours: 1 (0x01)
               time_code_minutes: 24 (0x18)
               marker_bit: 1 (0x01)
               time_code_seconds: 29 (0x1d)
               time_code_pictures: 5 (0x05)
            closed_gop: 1 (0x01)
            broken_link: 0 (0x00)

type = 4
pid =  769
first packet PTS
00000022, 5:27:52.2960
last packet PTS
00003959, 5:28:22.2480

type = 6
pid =  770
first packet PTS
00000001, 5:27:52.2873
last packet PTS
00000750, 5:28:22.2473
 
  


All times are GMT -5.