LinuxQuestions.org
Linux - Newbie This Linux forum is for members that are new to Linux.
Just starting out and have a question? If it is not in the man pages or the how-to's this is the place!

Old 11-10-2012, 08:07 AM   #1
dragonix
Member
 
Registered: Nov 2012
Location: Belgium
Distribution: Ubuntu 12.04
Posts: 69
Blog Entries: 6

Rep: Reputation: 1
Editing SRT files


Hi all

For most of the people here, this will be basic stuff. But I'm still learning, so bear with me.

I have an SRT file (subtitles) and it doesn't match the movie. The subs are 3 seconds too late, so I will need to add 3 seconds to the start and end timestamps.

I used the following commands so far.
Code:
grep -P "[0-9].:..:.." The.Glenn.Miller.Story.srt | cut -c7-8
So now I get the seconds and I will need to add 3 to it.

First question, how can I do that easily?
Second question, how can I manage the minutes? When I add 3 seconds to 58, I will also need to increment the minutes...

Or do you guys have a better way?
Thanks in advance!

BR

Dragonix

This is an example of what the SRT looks like.
Code:
115
00:11:31,167 --> 00:11:35,365
blablabla

116
00:11:35,407 --> 00:11:37,557
blablabla

117
00:11:37,607 --> 00:11:41,122
blablabla
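
One way to sidestep the minute (and hour) carry asked about in the second question is to convert each timestamp to milliseconds, add the offset, and convert back. The following is a minimal bash sketch under that assumption; the function names ts_to_ms, ms_to_ts and shift_ts are made up for illustration and are not from the thread.

```shell
#!/bin/bash
# Convert an SRT timestamp (HH:MM:SS,mmm) to milliseconds.
ts_to_ms() {
  local h m s ms
  IFS=':,' read -r h m s ms <<< "$1"
  # 10# forces base 10 so that "08" and "09" are not rejected as invalid octal.
  echo $(( (10#$h * 3600 + 10#$m * 60 + 10#$s) * 1000 + 10#$ms ))
}

# Convert milliseconds back to HH:MM:SS,mmm.
ms_to_ts() {
  local t=$1
  printf '%02d:%02d:%02d,%03d\n' \
    $(( t / 3600000 )) $(( t / 60000 % 60 )) $(( t / 1000 % 60 )) $(( t % 1000 ))
}

# Shift one timestamp by a signed number of milliseconds (never below zero).
shift_ts() {
  local t=$(( $(ts_to_ms "$1") + $2 ))
  (( t < 0 )) && t=0
  ms_to_ts "$t"
}

shift_ts 00:11:58,167 3000    # prints 00:12:01,167
shift_ts 00:59:59,900 3000    # prints 01:00:02,900
```

Because everything is a single number while the offset is applied, seconds rolling over into minutes and minutes into hours need no special cases.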
 
Old 11-10-2012, 09:54 AM   #2
replica9000
Senior Member
 
Registered: Jul 2006
Distribution: Debian Unstable
Posts: 1,122
Blog Entries: 2

Rep: Reputation: 259
Are you encoding the subtitles into the movie, or just using them during playback? Either way, there is usually a way to set the delay through the player during playback.
 
Old 11-10-2012, 10:23 AM   #3
dragonix
Member
 
Registered: Nov 2012
Location: Belgium
Distribution: Ubuntu 12.04
Posts: 69

Original Poster
Blog Entries: 6

Rep: Reputation: 1
I'm not burning them into the video file (maybe not yet).
Indeed, that's a good point you're making, but then again it might be good practice to get me started.
 
Old 11-10-2012, 11:43 AM   #4
clifford227
Member
 
Registered: Dec 2009
Distribution: Slackware 14
Posts: 282

Rep: Reputation: 64
I find this pretty good:

http://home.gna.org/subtitleeditor/
 
Old 11-11-2012, 04:21 AM   #5
dragonix
Member
 
Registered: Nov 2012
Location: Belgium
Distribution: Ubuntu 12.04
Posts: 69

Original Poster
Blog Entries: 6

Rep: Reputation: 1
I believe you are missing the point here...
I am aware that there are programs capable of doing this without any effort, but I want to do it myself by creating a script for it (like I said before).

Rather than links to programs, I am asking for your help with this script.
I do appreciate the links and ready-made solutions...

So, again, can you help me out with the script?!
 
Old 11-11-2012, 05:30 AM   #6
catkin
LQ 5k Club
 
Registered: Dec 2008
Location: Tamil Nadu, India
Distribution: Debian
Posts: 8,578
Blog Entries: 31

Rep: Reputation: 1208
Here's proof of concept using an awk script
Code:
#!/usr/bin/awk -f

BEGIN {
    incr = 30
}

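# Match only the timing lines, e.g. "00:11:31,167 --> 00:11:35,365"
/^[0-9][0-9]:[0-5][0-9]:[0-5][0-9],[0-9]+ --> [0-9][0-9]:[0-5][0-9]:[0-5][0-9],[0-9]+$/ {
    # Split the start time into hours, minutes, seconds and milliseconds
    split($1, start_t, ":|," )
    start_t[3] += incr
    # Carry overflowing seconds into minutes, and minutes into hours
    if ( start_t[3] > 59 )
    {
        start_t[3] -= 60
        start_t[2] += 1
        if ( start_t[2] > 59 )
        {
            start_t[2] -= 60
            start_t[1] += 1
        }
    }
    # %03d keeps leading zeros in the milliseconds (e.g. ",007")
    printf "%02d:%02d:%02d,%03d %s %s\n", start_t[1], start_t[2], start_t[3], start_t[4], $2, $3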

    next
}

{ print }
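It would be better to pass incr as a command line argument using awk's -v option (removing the assignment in BEGIN, which would otherwise override it) and to check that it is not greater than 59 or, better still, to modify the code to handle increments of more than 59 seconds.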

EDIT: and, of course, add handling of the second time string.
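
The suggestions above (offset passed with -v, offsets larger than 59 seconds, and the second time string) can be sketched together by doing the arithmetic in milliseconds. This is an illustrative variant, not catkin's script: the wrapper name shift_srt and the awk function shifted are hypothetical, and it assumes three-digit milliseconds and Unix line endings.

```shell
#!/bin/bash
# shift_srt OFFSET_MS: read an SRT file on stdin, write the shifted file to stdout.
shift_srt() {
  awk -v offset="$1" '
    # Turn HH:MM:SS,mmm into milliseconds, add the offset, and format it again.
    function shifted(ts,    p, t) {
      split(ts, p, "[:,]")
      t = ((p[1] * 60 + p[2]) * 60 + p[3]) * 1000 + p[4] + offset
      if (t < 0) t = 0
      return sprintf("%02d:%02d:%02d,%03d",
                     int(t / 3600000), int(t / 60000) % 60, int(t / 1000) % 60, t % 1000)
    }
    # Timing lines have the form "start --> end"; shift both ends.
    $2 == "-->" && $1 ~ /^[0-9]+:[0-9][0-9]:[0-9][0-9],[0-9]+$/ {
      print shifted($1), $2, shifted($3)
      next
    }
    # Counters, subtitle text and blank lines pass through unchanged.
    { print }
  '
}
```

It could be used as `shift_srt 3000 < file.srt > file_edit.srt`, with a negative value such as -3000 to make the subtitles appear earlier.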

Last edited by catkin; 11-11-2012 at 05:31 AM.
 
Old 11-11-2012, 08:12 AM   #7
colucix
LQ Guru
 
Registered: Sep 2003
Location: Bologna
Distribution: CentOS 6.5 OpenSuSE 12.3
Posts: 10,509

Rep: Reputation: 1983
Since %H:%M:%S,%N is a standard format accepted by the date command, you can use a script like this:
Code:
#!/bin/bash
#
delay="3 seconds"

while read start s end
do
  if [[ $start =~ ^..:..:..,... ]]
  then
    start=$(date -d "$start $delay" +%H:%M:%S,%N)
    end=$(date -d "$end $delay" +%H:%M:%S,%N)
    echo ${start:0:12} $s ${end:0:12}
  else
    echo $start $s $end
  fi
done < file.srt
%N is the format for nanoseconds, which you can easily truncate to milliseconds using shell substring expansion. Note that this is slower than the awk code suggested by catkin, since it runs the external date command twice for every timing line.
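
To see the truncation on a single timestamp, here is a small sketch. It assumes GNU date (the -d option and %N are GNU extensions); the function name shift_with_date and the TZ=UTC setting are additions for illustration, the latter to keep daylight-saving changes out of the arithmetic.

```shell
#!/bin/bash
# Shift one timestamp with GNU date and keep only the milliseconds.
shift_with_date() {
  local t
  # %N yields nine digits, e.g. 00:12:01,167000000
  t=$(TZ=UTC date -d "$1 $2" +%H:%M:%S,%N) || return
  # Keep the first 12 characters: HH:MM:SS, plus three fractional digits.
  echo "${t:0:12}"
}

shift_with_date 00:11:58,167 "3 seconds"
```

The carry from seconds into minutes is handled by date itself, which is the attraction of this approach despite the cost of starting a process per timestamp.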
 
Old 11-12-2012, 01:57 AM   #8
dragonix
Member
 
Registered: Nov 2012
Location: Belgium
Distribution: Ubuntu 12.04
Posts: 69

Original Poster
Blog Entries: 6

Rep: Reputation: 1
Holy smokes!
This is what I was talking about!

Now I need to decode it so I can understand it, haha!

I will most likely have some questions about this, so when I do, I will post them here!
 
Old 11-12-2012, 02:06 AM   #9
dragonix
Member
 
Registered: Nov 2012
Location: Belgium
Distribution: Ubuntu 12.04
Posts: 69

Original Poster
Blog Entries: 6

Rep: Reputation: 1
Quote:
Originally Posted by colucix View Post
Since %H:%M:%S,%N is a standard format accepted by the date command, you can use a script like this:
Code:
#!/bin/bash
#
delay="3 seconds"

while read start s end
do
  if [[ $start =~ ^..:..:..,... ]]
  then
    start=$(date -d "$start $delay" +%H:%M:%S,%N)
    end=$(date -d "$end $delay" +%H:%M:%S,%N)
    echo ${start:0:12} $s ${end:0:12}
  else
    echo $start $s $end
  fi
done < file.srt
%N is the format for nanoseconds, which you can easily truncate to milliseconds using shell substring expansion. Note that this is slower than the awk code suggested by catkin, since it runs the external date command twice for every timing line.
First few questions:
1. the while line, what does it do? how does the command know what to read?
2. What does =~ do?

Be aware, I'm kinda new to all of this..
So be patient haha
 
Old 11-12-2012, 02:38 AM   #10
catkin
LQ 5k Club
 
Registered: Dec 2008
Location: Tamil Nadu, India
Distribution: Debian
Posts: 8,578
Blog Entries: 31

Rep: Reputation: 1208
Quote:
Originally Posted by dragonix View Post
First few questions:
1. the while line, what does it do? how does the command know what to read?
2. What does =~ do?

Be aware, I'm kinda new to all of this..
So be patient haha
Don't worry -- we were all new to this once

The while ... done command loops while the command after while returns true.

The read command reads from stdin until there are no more lines to read. Each time there is a line to read it returns true (0). When stdin runs out it returns false (non-zero).

stdin is supplied to the while loop (and hence to the read command) by the input redirect from file "file.srt" on the end of the while loop, done < file.srt

So the read command is reading from file.srt line by line. It breaks the input at spaces and/or tabs. The first "word" is assigned to variable start, the second to variable s (presumably s for string) and all the rest to variable end.

=~ is the regular expression comparison operator. Details here in the [[...]] section (but the man 7 regex page is more helpful than the man 3 regex page cited).
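
The splitting described above can be observed directly. In this sketch a here-document stands in for file.srt, and the function name split_lines is hypothetical; read -r is used so that backslashes are kept as they are.

```shell
#!/bin/bash
# Show how read distributes the words of each input line over three variables.
split_lines() {
  local start s end
  # read returns true for every line and false at end of input, ending the loop.
  while read -r start s end
  do
    echo "start=[$start] s=[$s] end=[$end]"
  done
}

split_lines <<'EOF'
115
00:11:31,167 --> 00:11:35,365
some longer subtitle text
EOF
```

The counter line fills only start, the timing line fills all three variables, and on the text line everything after the second word lands in end.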

Last edited by catkin; 11-12-2012 at 02:40 AM.
 
1 members found this post helpful.
Old 11-12-2012, 02:52 AM   #11
dragonix
Member
 
Registered: Nov 2012
Location: Belgium
Distribution: Ubuntu 12.04
Posts: 69

Original Poster
Blog Entries: 6

Rep: Reputation: 1
Nice!
Indeed, I've been doing some scripting with this.
Step by step, to get a feel for it and to see what it does.

But one thing seems to be failing (when I use the script).

Code:
echo ${start:0:12} $s ${end:0:12}
I get the following result (example)
Code:
01:+53:+02,+ --> 01:+54:+02,+
The times don't matter here; I mean the format.
How do I get rid of the '+' and how do I get my nanoseconds back?
Is this some advanced scripting with echo?
Where can I find an explanation of this?

EDIT
I increased the substring length in the echo from 12 to 15, so I have the nanoseconds back.
That is one question solved

Last edited by dragonix; 11-12-2012 at 02:57 AM.
 
Old 11-12-2012, 02:53 AM   #12
colucix
LQ Guru
 
Registered: Sep 2003
Location: Bologna
Distribution: CentOS 6.5 OpenSuSE 12.3
Posts: 10,509

Rep: Reputation: 1983
Quote:
Originally Posted by dragonix View Post
1. the while line, what does it do? how does the command know what to read?
It doesn't know. Since you're interested in lines that have three fields separated by spaces, read assigns each of them to the three variables start, s and end respectively. If a line has fewer than three fields, a null string is assigned to the remaining variable(s). If a line has more than three fields, the extra fields are all assigned to the last variable together. In any case the line is printed out as it is (provided its fields were originally separated by single blank spaces).

Quote:
Originally Posted by dragonix View Post
2. What does =~ do?
It is the regular expression match operator. The string on the right-hand side of the operator is interpreted as a regular expression. The condition is true if the string on the left-hand side matches the regular expression. Basically it selects only those lines that contain a time specification in the format used by SRT files. It should select the desired lines in 99.99% of cases, since it is unlikely that a subtitle contains a string in that format.

Therefore, the time delay is applied to the start and ending time only for the related lines, whereas the other ones are printed untouched. You can easily verify the result using diff between the original file and the newly created one.
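
A short sketch of the =~ test in isolation. The helper name is_timestamp is hypothetical, and the pattern is deliberately stricter than ^..:..:..,... from the script: it only accepts digits and is anchored at both ends.

```shell
#!/bin/bash
# Succeeds only when the whole word looks like HH:MM:SS,mmm.
is_timestamp() {
  [[ $1 =~ ^[0-9]{2}:[0-5][0-9]:[0-5][0-9],[0-9]{3}$ ]]
}

for word in 00:11:31,167 115 "at 00:11:31,167 sharp" ab:cd:ef,ghi
do
  if is_timestamp "$word"; then
    echo "match:    $word"
  else
    echo "no match: $word"
  fi
done
```

Only the first word matches here. The looser dot-based pattern would also accept ab:cd:ef,ghi, because a dot stands for any character.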

Edit: sorry, I didn't see previous replies before posting!

Last edited by colucix; 11-12-2012 at 02:56 AM.
 
1 members found this post helpful.
Old 11-12-2012, 02:58 AM   #14
dragonix
Member
 
Registered: Nov 2012
Location: Belgium
Distribution: Ubuntu 12.04
Posts: 69

Original Poster
Blog Entries: 6

Rep: Reputation: 1
No prob!
The more info, the better

And also, everybody explains in a different way. So it's nice to have different approaches to the same problem!
Thanks

EDIT
In the meantime, I found this about the echo command

Quote:
Another expansion that exists is to extract substrings from the expanded value using the form ${VAR:offset:length}. This works in the expected form: offsets start at zero, and if you don't specify a length it goes to the end of the string. For example:

str=abcdefgh
echo ${str:0:1}
echo ${str:1}

outputs "a" and "bcdefgh".
So that makes sense now, but now I need to get rid of the '+'...

EDIT 2
I did it like this (atm)

Code:
#!/bin/bash
#
delay="3 seconds"

while read start s end
do
  if [[ $start =~ ^..:..:..,... ]]
  then
    start=$(date -d "$start $delay" +%H:%M:%S,%N)
    end=$(date -d "$end $delay" +%H:%M:%S,%N)
    echo ${start:0:15} $s ${end:0:15} | tr -d '+'
  else
    echo $start $s $end
  fi
done < file.srt
Is that a proper solution? Or is there an easier or better way to do this?
And how do I decrease the seconds by 3? (I made a mistake: instead of adding, I should be subtracting.)

EDIT 3
W00t!!
I guess I found it

Code:
#!/bin/bash
#
delay="3 seconds ago"

while read start s end
do
  if [[ $start =~ ^..:..:..,... ]]
  then
    start=$(date -d "$start $delay" +%H:%M:%S,%N)
    end=$(date -d "$end $delay" +%H:%M:%S,%N)
    echo ${start:0:15} $s ${end:0:15} | tr -d '+'
  else
    echo $start $s $end
  fi
done < file.srt
-------------------------------------
Final Edit
This is my final code for the script. And it works
Happy me!!!

I ran it like
Code:
# ./script.sh > file_edit.srt
Code:
#!/bin/bash
#
# To decrease the amount of time by 3 seconds
#

# Changing the values
delay="3 seconds ago"

while read start s end
do
	if [[ $start =~ ^..:..:..,... ]]
	then
		start=$(date -d "$start $delay" +%H:+%M:+%S,+%N)
		end=$(date -d "$end $delay" +%H:+%M:+%S,+%N)
		echo ${start:0:15} $s ${end:0:15} | tr -d '+'
	else
		echo $start $s $end
	fi
done < file.srt
If there are any improvements possible, let me know!!
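
On the improvements: in date's syntax only the first + introduces the output format, and any further + is printed literally, which is where the extra plus signs in the output came from. With a single + the tr step and the length of 15 are no longer needed. The sketch below is one possible tidy-up, not a definitive version; the function name shift_srt_date is hypothetical, it assumes GNU date and Unix line endings, and TZ=UTC is an added precaution against daylight-saving jumps. It also reads whole lines with IFS= read -r and prints them with printf, so subtitle text keeps its spacing and characters such as * are not expanded by the shell.

```shell
#!/bin/bash
# shift_srt_date DELAY: filter an SRT file from stdin to stdout.
# DELAY is any relative item GNU date understands, e.g. "3 seconds ago".
shift_srt_date() {
  local delay=$1 line start arrow end
  # The || part also handles a last line that lacks a trailing newline.
  while IFS= read -r line || [[ -n $line ]]
  do
    read -r start arrow end <<< "$line"
    if [[ $arrow == '-->' && $start =~ ^[0-9]{2}:[0-9]{2}:[0-9]{2},[0-9]{3}$ ]]
    then
      # A single + introduces the format, so no tr is needed afterwards.
      start=$(TZ=UTC date -d "$start $delay" +%H:%M:%S,%N)
      end=$(TZ=UTC date -d "$end $delay" +%H:%M:%S,%N)
      printf '%s --> %s\n' "${start:0:12}" "${end:0:12}"
    else
      # Every other line is passed through untouched, spacing included.
      printf '%s\n' "$line"
    fi
  done
}
```

It could be run as `shift_srt_date "3 seconds ago" < file.srt > file_edit.srt`. Timestamps smaller than the delay would wrap around midnight (for example to 23:59:58), so very early subtitles need checking by hand.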

Last edited by dragonix; 11-12-2012 at 05:49 AM.
 
  

