LinuxQuestions.org
Old 01-18-2009, 03:21 AM   #1
David the H.
Bash Guru
 
Registered: Jun 2004
Location: Osaka, Japan
Distribution: Arch + Xfce
Posts: 6,852

Rep: Reputation: 2037
Removing voices from soundtrack recording? (not karaoke)


I'm doing a class project where we want to overdub the original soundtrack of a video with new dialog, and I need to find some way to remove or mute the original voices.

I have the soundtrack converted to a .wav file (actually, the source is from a videotape--converted to dvd using some Windows recorder--from which I ripped and converted the ac3 audio). I've tried using the various "karaoke" and voice removal filters from audacity, mplayer, and the like, but those aren't working*. They just mute the whole soundtrack equally.

I've googled around but haven't found anything except professional-level (i.e. expensive) solutions. Does anyone have any suggestions for me? I have an old win2000 notebook available if there are any windows-only solutions, or I'd even be willing to go through the file and remove each voice manually, if I just knew how to isolate them from the rest of the soundtrack.

Thanks in advance for any advice.




[*These filters apparently work by inverting one channel and subtracting whatever appears on both sides, on the theory that most music recordings place the voice tracks in the middle of the stereo range. But this won't work for mono recordings or ones that have dialog outside of the middle range.]
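
A minimal sketch of the channel-subtraction idea described in the footnote, assuming samples are plain Python floats; the sample values and the names `voice` and `guitar` are made up for illustration. It also shows why a mono source (the same signal on both channels) ends up muted entirely.

```python
def remove_center(left, right):
    """Karaoke-style cancel: subtract the right channel from the left.

    Anything identical in both channels (the "center") cancels out.
    The result is halved so it cannot exceed the original range.
    """
    return [(l - r) / 2 for l, r in zip(left, right)]


# Hypothetical samples: 'voice' is identical in both channels (centered),
# 'guitar' is panned hard left.
voice = [0.5, -0.25, 0.125, 0.0]
guitar = [0.25, 0.25, -0.5, 0.75]

left = [v + g for v, g in zip(voice, guitar)]
right = list(voice)

# Stereo mix: the centered voice cancels, the guitar survives at half level.
stereo_result = remove_center(left, right)

# Mono source: both channels are equal, so everything cancels.
mono_result = remove_center(right, right)

print(stereo_result)
print(mono_result)
```

If the videotape audio is effectively mono, this would explain the filters muting the whole soundtrack equally.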
 
Old 01-18-2009, 05:51 AM   #2
ronlau9
Senior Member
 
Registered: Dec 2007
Location: In front of my LINUX OR MAC BOX
Distribution: Mandriva 2009 X86_64 suse 11.3 X86_64 Centos X86_64 Debian X86_64 Linux MInt 86_64 OS X
Posts: 2,369

Rep: Reputation: Disabled
Every sound, voices included, has a main (fundamental) frequency and harmonic frequencies.
What you need to do is remove or mute the main frequency and at least the first harmonic.
You will also lose any other sounds that use the same frequencies; there is nothing you can do to prevent this.
So what you have to do is build a filter that does the job.
Or buy or build a very good equalizer, so that you can manipulate the sound even more.
The bad thing is that a very good equalizer does not come cheap.
As for building one: how are your technical skills?
 
Old 01-18-2009, 08:51 AM   #3
jlinkels
LQ Guru
 
Registered: Oct 2003
Location: Bonaire, Leeuwarden
Distribution: Debian /Jessie/Stretch/Sid, Linux Mint DE
Posts: 5,195

Rep: Reputation: 1043
You are right about the principle of how Karaoke filters remove the voices. Unfortunately what Ronlau says is also true.

However, when you hear the dialog you can be sure that the music or background noise is not audible anymore. Note that I say "when you HEAR". Take that literally. When you hear the voice, the background sound is irrelevant and not heard by us. It is the principle behind MPEG audio compression (perceptual masking): removing all but the strongest sounds, as the weaker sounds are not perceived by us anyway.

So maybe you could combine two methods: filter out all human voice frequencies, but do so ONLY during dialogs, or during actual speech. You can also distinguish between male and female voices and apply a different filter depending on who is speaking. An advantage is that speech often starts and stops abruptly, so it is not too hard to define your mute window. The bad news is that it remains manual work to define the range to be muted.

Hopefully your newly dubbed over voice is not shorter than what you remove. But then again, since you only remove frequencies for the human voice (and those are limited, really, try 300-3000 Hz) there might remain plenty of original background noise to mask the silence.

I wonder if this idea could work.

jlinkels
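
A rough sketch of the idea above: cut the 300-3000 Hz band, but only inside a chosen time window. It assumes mono samples as floats in the range -1..1. The filter is a standard second-order (biquad) low-pass and high-pass pair summed together, which is an illustrative choice and not what Audacity or any other tool does internally; the function names are hypothetical.

```python
import math


def biquad(samples, rate, cutoff, kind, q=0.7071):
    """Second-order low-pass ("low") or high-pass ("high") filter."""
    w0 = 2 * math.pi * cutoff / rate
    cos_w0 = math.cos(w0)
    alpha = math.sin(w0) / (2 * q)
    if kind == "low":
        b0 = b2 = (1 - cos_w0) / 2
        b1 = 1 - cos_w0
    else:
        b0 = b2 = (1 + cos_w0) / 2
        b1 = -(1 + cos_w0)
    a0, a1, a2 = 1 + alpha, -2 * cos_w0, 1 - alpha
    out = []
    x1 = x2 = y1 = y2 = 0.0
    for x in samples:
        y = (b0 * x + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2) / a0
        out.append(y)
        x2, x1 = x1, x
        y2, y1 = y1, y
    return out


def cut_voice_band(samples, rate, start_s, end_s, low_hz=300.0, high_hz=3000.0):
    """Attenuate low_hz..high_hz, but only between start_s and end_s."""
    first = int(start_s * rate)
    last = int(end_s * rate)
    segment = samples[first:last]
    # Keep what lies below low_hz plus what lies above high_hz.
    lows = biquad(segment, rate, low_hz, "low")
    highs = biquad(segment, rate, high_hz, "high")
    filtered = [lo + hi for lo, hi in zip(lows, highs)]
    return samples[:first] + filtered + samples[last:]


def rms(samples):
    """Root-mean-square level of a block of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


# One second of a 1000 Hz test tone (inside the voice band) at 16 kHz.
rate = 16000
tone = [math.sin(2 * math.pi * 1000 * n / rate) for n in range(rate)]
result = cut_voice_band(tone, rate, 0.25, 0.75)

print("level outside window:", rms(result[:4000]))
print("level inside window: ", rms(result[6400:9600]))
```

The slopes of such a simple filter are gentle, so the tone is reduced rather than removed, and the hard switch at the window edges can click; a short crossfade at each edge would probably be needed on real audio.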
 
Old 01-22-2009, 09:30 AM   #4
David the H.
Bash Guru
 
Registered: Jun 2004
Location: Osaka, Japan
Distribution: Arch + Xfce
Posts: 6,852

Original Poster
Rep: Reputation: 2037
Thanks for the replies. Sorry to be late getting back; I was seriously busy the last couple of days.


Quote:
Originally Posted by jlinkels View Post
So maybe you could combine two methods: filter out all human voice frequencies, but do so ONLY during dialogs, or during actual speech. You can also distinguish between male and female voices and apply a different filter depending on who is speaking. An advantage is that speech often starts and stops abruptly, so it is not too hard to define your mute window. The bad news is that it remains manual work to define the range to be muted.
That's what I was kind of thinking needed to be done. The removal doesn't have to be perfect, just muted enough not to interfere with the dubbed-in overlay, especially since the overlay would mask a lot of the holes.

I'm willing to attempt to go in and manually do it (though I'm worried that it might be biting off more than I can chew). The problem is that I just don't know how to go about it. I've tried to work a bit with audacity, but I really don't know what I'm doing. I'd be very grateful if I could get an explanation of the basic steps necessary for isolating and removing a set frequency range. Once I know how to do the basics, I can play around and see if I can get it working.
 
Old 01-22-2009, 10:08 AM   #5
jlinkels
LQ Guru
 
Registered: Oct 2003
Location: Bonaire, Leeuwarden
Distribution: Debian /Jessie/Stretch/Sid, Linux Mint DE
Posts: 5,195

Rep: Reputation: 1043
You can experiment a bit with Audacity. Find a piece of audio where a vocal is present. Mark it. Go to Effects -> Equalizer. Create a curve which drops off at 400 Hz and comes back up at 3000 Hz. Apply it to the marked piece. Then play it and see if it is satisfactory. You can undo the equalization, shift the 400 - 3000 Hz dip a bit and apply again. Don't forget to undo, as I did at first; otherwise you won't hear the difference between one filter and the other.

I have noticed that the voice almost disappears; what remains is some kind of background noise, an indication that something is there.

Not good enough for Karaoke by any measure. But if you dub over a voice in the empty space it might not be bad. I think you can do that in Audacity as well.

jlinkels
 
Old 01-24-2009, 09:08 AM   #6
renjithrajasekaran
Member
 
Registered: Jan 2009
Posts: 29

Rep: Reputation: 15
"Audacity" is a good - and most importantly FREE - tool.

If you want something even more functionality laden - try "Cool Edit Pro"
(But, it might cost you a bit!)



Last edited by renjithrajasekaran; 01-25-2009 at 03:05 AM.
 
Old 01-25-2009, 03:42 AM   #7
David the H.
Bash Guru
 
Registered: Jun 2004
Location: Osaka, Japan
Distribution: Arch + Xfce
Posts: 6,852

Original Poster
Rep: Reputation: 2037
Thanks again. I'm playing around with the equalizer and I'm getting some partial success. It looks like most of the file won't be too much trouble because there's not much background audio for about 75% of the dialog, but there are a few parts, especially near the end, where there are significant sound effects and/or music, and I'm having a lot of trouble separating out the voice frequencies from the rest of it. The results tend to leave the remaining sounds a bit muddled (and the voices don't go away completely, but I guess that's expected). Is there any more "scientific" way of determining the frequencies involved, or am I just limited to trial and error here? I've tried out the "plot spectrum" analyzer, but all my samples show pretty much the same, fairly smooth, curve, starting high in the lower ranges, and then dropping off from the middle up.
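
One possibly more systematic approach, sketched here under assumptions: rather than looking at a single spectrum plot, compare a stretch that contains voice plus background against a nearby stretch with background only, and see at which frequencies the voice stretch is clearly louder. The segments below are synthetic test tones, and the function names and the 10 dB margin are hypothetical choices for illustration.

```python
import math


def tone_level_db(samples, rate, freq):
    """Level in dB of one frequency, via a single-bin discrete Fourier transform."""
    n = len(samples)
    re = sum(s * math.cos(2 * math.pi * freq * i / rate) for i, s in enumerate(samples))
    im = sum(s * math.sin(2 * math.pi * freq * i / rate) for i, s in enumerate(samples))
    amplitude = 2 * math.hypot(re, im) / n
    return 20 * math.log10(amplitude + 1e-9)  # small floor avoids log(0)


def voice_dominant(voice_seg, background_seg, rate, freqs, margin_db=10.0):
    """Return the frequencies where voice_seg is louder than background_seg by margin_db."""
    found = []
    for f in freqs:
        diff = tone_level_db(voice_seg, rate, f) - tone_level_db(background_seg, rate, f)
        if diff > margin_db:
            found.append(f)
    return found


rate = 16000
n = 1600  # 0.1 second per segment


def tone(freq, amp):
    """Synthetic sine tone standing in for real audio."""
    return [amp * math.sin(2 * math.pi * freq * i / rate) for i in range(n)]


# Made-up stand-ins: background has a 100 Hz hum and a 5000 Hz component,
# "speech" adds energy at 800 Hz and 1600 Hz on top of the same background.
background = [a + b for a, b in zip(tone(100, 0.5), tone(5000, 0.2))]
speech = [a + b for a, b in zip(tone(800, 0.4), tone(1600, 0.2))]
with_voice = [x + y for x, y in zip(background, speech)]

candidates = voice_dominant(with_voice, background, rate,
                            [100, 200, 400, 800, 1600, 3200, 5000])
print("frequencies to try cutting:", candidates)
```

On real recordings the background changes from moment to moment, so the comparison would only be a guide to where to place the equalizer dip, not an exact answer.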

BTW, is there some way I can select multiple points in the frequency curve at once? It's getting quite frustrating having to manually move them one at a time.

Last edited by David the H.; 01-25-2009 at 03:48 AM.
 
Old 01-25-2009, 03:48 AM   #8
David the H.
Bash Guru
 
Registered: Jun 2004
Location: Osaka, Japan
Distribution: Arch + Xfce
Posts: 6,852

Original Poster
Rep: Reputation: 2037
Oh, and while I'm at it, a question for future consideration. When we get the new voices recorded (we'll probably do them in a couple of weeks), what's the easiest way to merge/overlay them onto the existing soundtrack?
 
Old 01-25-2009, 04:40 AM   #9
Shadow_7
Senior Member
 
Registered: Feb 2003
Distribution: debian
Posts: 4,137
Blog Entries: 1

Rep: Reputation: 874
Karaoke works because the voice is the only track centered in the stereo image. Unfortunately you don't have anything like that with a soundtrack (most of the time). Outside of EQ-ing the voice out, which can't really get rid of it completely, you're SOL, unless you get/use some high-end plugin that can analyze / identify / delete the subject matter. Most likely that's not available for free, if at all, for Linux. But as they say in the audio world, the easiest way to remove a noise (like wind noise) from your recording is to NOT record it to start with.

$ sox -m original.wav voice_over.wav original_with_voice_over.wav
(assuming all tracks are the same format, of course)
(in older versions of sox, mixing was its own app, "soxmix")

Is there some reason you've got to keep the original soundtrack, or parts of it? It sounds to me like the project is to recreate the entire soundtrack and replace the original with the new track. Which IMO would be much quicker (and more of a sure thing) than trying to tweak the original. Plus you can change the lines / storyline and turn that sappy drama into a comedy.
 
Old 01-25-2009, 05:28 AM   #10
David the H.
Bash Guru
 
Registered: Jun 2004
Location: Osaka, Japan
Distribution: Arch + Xfce
Posts: 6,852

Original Poster
Rep: Reputation: 2037
Oh, I know it will be impossible to get it perfect. But I can't simply substitute our own recording because there are also sound effects and music that we won't be able to reproduce. Basically, I need something like the effect you get with a karaoke filter, on a track where the karaoke filters won't work.

I think that the equalizer technique will be "good enough", but I want to try to get it sounding as nice as I can in the time we have available. As you can probably tell though, I'm not exactly a master at audio mixing and this is the first time I've decided to try something this complex. I'm learning as I go.

I'll try sox for the mixing, but I might have to combine multiple segments into one file. I'll need to be able to control the timing. Perhaps I can try slicing the original into the same-sized segments, mix them, then recombine them.
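
As an alternative to slicing and recombining, each new voice clip could be mixed into the full track at its own start time. This is a minimal sketch assuming mono float samples in -1..1 at a shared sample rate; the `cues` list of (start time, clip) pairs and the tiny sample rate are hypothetical, chosen only to keep the example readable.

```python
def overlay(base, voice, rate, start_s, voice_gain=1.0):
    """Mix `voice` into a copy of `base`, starting at start_s seconds."""
    out = list(base)
    offset = round(start_s * rate)
    for i, v in enumerate(voice):
        pos = offset + i
        if pos >= len(out):
            break  # the clip runs past the end of the base track
        # Sum the two signals and clip to the valid range.
        out[pos] = max(-1.0, min(1.0, out[pos] + voice_gain * v))
    return out


def overlay_all(base, cues, rate):
    """Apply several (start_seconds, samples) cues to the same base track."""
    out = base
    for start_s, voice in cues:
        out = overlay(out, voice, rate, start_s)
    return out


# Toy data: 8 samples per second, one second of quiet background.
rate = 8
base = [0.25] * 8
cues = [
    (0.0, [0.5]),             # first line starts immediately
    (0.5, [0.5, 0.5, 1.0]),   # second line starts half a second in
]
mixed = overlay_all(base, cues, rate)
print(mixed)
```

Hard clipping like this distorts loud passages, so in practice lowering `voice_gain` or the base level before mixing would likely sound better.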
 
Old 01-25-2009, 01:31 PM   #11
Shadow_7
Senior Member
 
Registered: Feb 2003
Distribution: debian
Posts: 4,137
Blog Entries: 1

Rep: Reputation: 874
You can edit / sync audio in Audacity or Ardour, which are a bit more user-friendly than sox. But with sox you can also create silent (null) wav files and such to pad sound files to the right sync points. Without the -m option, sox concatenates its input wav files by default, as long as the wav formats already match: same bits, rate, channels and such.
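
The padding-and-concatenating idea can also be sketched with Python's standard `wave` module instead of sox. This assumes 16-bit mono PCM files; the file names, the 8000 Hz rate and the helper names are made up for the example, and the files are written to a temporary directory.

```python
import os
import struct
import tempfile
import wave


def write_wav(path, samples, rate=8000):
    """Write 16-bit mono PCM samples (ints in -32768..32767) to path."""
    with wave.open(path, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(rate)
        w.writeframes(struct.pack("<%dh" % len(samples), *samples))


def write_silence(path, seconds, rate=8000):
    """Write a silent padding file of the given length."""
    write_wav(path, [0] * round(seconds * rate), rate)


def concatenate(paths, out_path):
    """Join WAV files end to end; channels, sample width and rate must match."""
    params = None
    frames = []
    for p in paths:
        with wave.open(p, "rb") as w:
            fmt = (w.getnchannels(), w.getsampwidth(), w.getframerate())
            if params is None:
                params = fmt
            elif fmt != params:
                raise ValueError("format mismatch in " + p)
            frames.append(w.readframes(w.getnframes()))
    with wave.open(out_path, "wb") as out:
        out.setnchannels(params[0])
        out.setsampwidth(params[1])
        out.setframerate(params[2])
        out.writeframes(b"".join(frames))


workdir = tempfile.mkdtemp()
pad = os.path.join(workdir, "pad.wav")
line = os.path.join(workdir, "line.wav")
joined = os.path.join(workdir, "joined.wav")

write_silence(pad, 0.5)                # half a second of silence
write_wav(line, [1000, -1000] * 100)   # stand-in for a recorded line
concatenate([pad, line], joined)

with wave.open(joined, "rb") as w:
    total_frames = w.getnframes()
print("frames in joined file:", total_frames)
```

The joined file starts with the silence, so the recorded line begins half a second in; choosing the padding length is what sets the sync point.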

Sound effects can be recreated. Granted, it's not easy, and it's one more thing to sync to the source, but it can be done. Soundtracks, on the other hand, would be difficult to reproduce. Although if you buy the soundtrack album of the movie, it may have the same rendition of the audio in its entirety without voice-overs. Just one other possible option. In either case, the credits at the end of the movie should list the composer / performer for such things.
 
  

