LinuxQuestions.org


OutsiderFilms 11-16-2012 01:18 AM

Plain text subtitle converter script
 
Hello All,

I have a subtitle file (a plain text file) that contains lines like this:

---------------------------------------------
Start subtitles

00:01:25:05 00:01:28:07
Any stuff that does not sell at all?

00:01:30:24 00:01:32:15
Here is one!
You got it?

00:01:34:05 00:01:37:06
Of course I have!
Why shouldn't I bring this?

and so forth

---------------------------------------------

It needs to be converted to a text file like this:

---------------------------------------------
Start subtitles

1
00:01:25,166 --> 00:01:28,233
Any stuff that does not sell at all?

2
00:01:30,800 --> 00:01:32,500
Here is one!
You got it?

3
00:01:34,166 --> 00:01:37,200
Of course I have!
Why shouldn't I bring this?

and so forth

---------------------------------------------

Details:

The original file has the following format:

LINEBREAK
n1:n2:n3:n4 m1:m2:m3:m4LINEBREAK
textLINEBREAK
text (OPTIONAL, some subtitles have only one line of text)
LINEBREAK

The converted file should have the following format. The last number of each timecode (n4 or m4) is multiplied by 33.33 and rounded to an integer; rounding either up or down is acceptable:

SUBTITLENUMBER
n1:n2:n3,ROUND(n4*33.33) --> m1:m2:m3,ROUND(m4*33.33)
textLINEBREAK
text (IF THERE'S A SECOND LINE OF TEXT)
LINEBREAK
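
A minimal Python sketch of the timecode calculation described above. It assumes the 33.33 factor means 30 frames per second (1000 ms / 30), which the post does not state explicitly. Note that round(5 * 33.33) gives 167, while the sample output shows 166; rounding down with integer arithmetic (frames * 1000 // 30) reproduces every value in the sample. The function names are made up for illustration.

```python
def frames_to_ms(frames: int, fps: int = 30) -> int:
    """Milliseconds for a frame count, rounded down using integer arithmetic.

    fps=30 is an assumption derived from the 33.33 ms-per-frame factor.
    """
    return frames * 1000 // fps


def to_srt_time(timecode: str, fps: int = 30) -> str:
    """Turn 'hh:mm:ss:ff' into 'hh:mm:ss,mmm'."""
    hms, frames = timecode.rsplit(":", 1)   # split off the last field only
    # SubRip expects exactly three millisecond digits, hence the zero padding.
    return f"{hms},{frames_to_ms(int(frames), fps):03d}"


for tc in ("00:01:25:05", "00:01:28:07", "00:01:30:24", "00:01:32:15"):
    print(tc, "->", to_srt_time(tc))
# 00:01:25:05 -> 00:01:25,166
# 00:01:28:07 -> 00:01:28,233
# 00:01:30:24 -> 00:01:30,800
# 00:01:32:15 -> 00:01:32,500
```

If the footage runs at a different frame rate, the fps argument would need to change accordingly.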

----------------------------------------------

I guess the algorithm would be:

1 open original file
2 create a new file
3 initiate a line number counter at 1
4 write the line number value to the new file
5 write a linebreak
6 read the chunk of timecode till you hit SPACE (gives n1:n2:n3:n4)
7 perform the calculation ROUND(n4*33.33)
8 write the new timecode chunk n1:n2:n3,ROUND(n4*33.33) to file
9 write "SPACE-->SPACE"
10 read the next chunk of timecode till hit LINEBREAK (gives m1:m2:m3:m4)
11 perform the calculation in step 7
12 write the new timecode chunk m1:m2:m3,ROUND(m4*33.33) to file
13 write a linebreak
14 read the text till you reach a linebreak
15 copy it to the new file
16 increment the line number counter and repeat from step 4
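
The numbered steps above could be sketched in Python roughly like this. It is only one possible reading: it works on blank-line-separated blocks rather than reading character by character, it assumes 30 frames per second for the 33.33 factor, and it rounds the milliseconds down. The names convert_blocks and TIMECODE_LINE are invented for this sketch.

```python
import re

# Assumption: 33.33 ms per frame corresponds to 30 frames per second.
FPS = 30

# A timecode line: two "hh:mm:ss:ff" values separated by whitespace.
TIMECODE_LINE = re.compile(
    r"^(\d+:\d+:\d+):(\d+)\s+(\d+:\d+:\d+):(\d+)\s*$"
)


def convert_blocks(source: str) -> str:
    """Convert the plain-text subtitle format to SubRip, block by block."""
    out = []
    counter = 1                                         # step 3
    for block in re.split(r"\n\s*\n", source.strip()):
        lines = block.splitlines()
        if not lines:                                   # empty input
            continue
        match = TIMECODE_LINE.match(lines[0])
        if match is None:
            out.extend(lines)                           # e.g. a header: copy as is
        else:
            start, start_frames, end, end_frames = match.groups()
            start_ms = int(start_frames) * 1000 // FPS  # step 7
            end_ms = int(end_frames) * 1000 // FPS      # step 11
            out.append(str(counter))                    # steps 4-5
            out.append(                                 # steps 8-13
                f"{start},{start_ms:03d} --> {end},{end_ms:03d}"
            )
            out.extend(lines[1:])                       # steps 14-15
            counter += 1                                # next subtitle number
        out.append("")                                  # blank line after each block
    return "\n".join(out)
```

Calling convert_blocks on the contents of the source file returns the converted text as one string, which can then be written to the new file.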


I suppose it could also be done this way:

1) Check if the first character on the line is a number. If it is, then, in the new file:
   write the line number
   insert a linebreak
   do the calculation and write the result
   insert a linebreak
2) If the first character on the line is text,
   just copy the text and linebreaks as is
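
A line-by-line sketch of this second approach in Python. It relies on the guarantee that only timecode lines begin with a digit, assumes 30 frames per second for the 33.33 factor, and rounds the milliseconds down. The helper names and the UTF-8 default in convert_file are assumptions, not something the source file format is known to guarantee.

```python
# Assumption: 33.33 ms per frame corresponds to 30 frames per second.
FPS = 30


def srt_time(timecode):
    """'hh:mm:ss:ff' -> 'hh:mm:ss,mmm' (milliseconds rounded down)."""
    hms, frames = timecode.rsplit(":", 1)
    return f"{hms},{int(frames) * 1000 // FPS:03d}"


def convert_lines(lines):
    """Yield converted lines (without line endings) one at a time."""
    counter = 0
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line[:1].isdigit():          # first character is a number: timecode line
            counter += 1
            start, end = line.split()   # exactly two timecodes are expected
            yield str(counter)
            yield f"{srt_time(start)} --> {srt_time(end)}"
        else:                           # text or blank line: copy as is
            yield line


def convert_file(src_path, dst_path, encoding="utf-8"):
    """Hypothetical wrapper; the UTF-8 default is an assumption."""
    with open(src_path, encoding=encoding) as src, \
         open(dst_path, "w", encoding=encoding) as dst:
        for line in convert_lines(src):
            dst.write(line + "\n")
```

Because it handles one line at a time, this version never needs to know whether a subtitle has one or two lines of text. If the editing application exports in a different encoding, the encoding argument would have to match it for accented characters to survive.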

NOTES:

1 The text lines might contain all sorts of characters and punctuation like ,.:!?()[]/"' and so forth or even accented characters

2 The text lines will NEVER begin with a number and ALWAYS end with a linebreak

3 The timecode lines ALWAYS begin with a number and ALWAYS end with a linebreak

Thanks for reading all this!

The source text file is generated by an editing application (Avid Media Composer).
The output is a SubRip format text file, used by YouTube.
The finished script would be the first open source "Avid DS to SubRip converter script". None exists (for Linux) right now, as far as I know.

I'll share copies around and will be happy to call it the "LQ Avid to Youtube Subtitle Script" !

Cheers,

Amit

OutsiderFilms 11-16-2012 03:13 AM

Hello All,

I've managed to find a program that already does what I wanted (Subtitle Editor)
home.gna.org/subtitleeditor/

I had ignored this program earlier because it did not list Avid DS as a supported filetype. Some rooting around led to the discovery that the "plain text" mode can be used instead.

Thanks to everyone who tried (or is still trying) to work out the code.

I'm downloading a tutorial on Perl... Hopefully I'll manage to solve problems such as these myself!

Cheers,

Amit

