I tend to demux and remux the streams, although I do it with avconv (Debian), which is a fork of ffmpeg. There are the -an (no audio), -sn (no subtitles), and -vn (no video) options, plus the -codec:a copy and -codec:v copy options, and avprobe (or ffprobe) to identify what the streams' codecs are. It's a bit of a storage waste to do this, as you can end up with 4x the usage: the original, the audio-only and video-only extracts from the original, the edited audio and video extracts, and the new remuxed media file.
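Since a stream copy keeps the original codec, the extract's file extension has to match whatever the probe reports. A rough sketch of that lookup follows; ext_for_codec is a made-up helper name, the codec names are the ones ffprobe commonly prints, and falling back to .mka (Matroska audio) for anything unrecognized is my own assumption:

```shell
#!/bin/sh
# ext_for_codec: hypothetical helper, not part of avconv/ffmpeg.
# Maps an audio codec name (as reported by the probe tool) to a
# file extension that suits a "-codec:a copy" extract.
ext_for_codec() {
    case "$1" in
        aac)    echo "aac" ;;
        mp3)    echo "mp3" ;;
        ac3)    echo "ac3" ;;
        flac)   echo "flac" ;;
        vorbis) echo "ogg" ;;
        opus)   echo "opus" ;;
        *)      echo "mka" ;;   # assumption: Matroska audio as a catch-all
    esac
}

# Possible usage (assumes ffprobe is installed; left commented out):
# codec=$(ffprobe -v error -select_streams a:0 \
#     -show_entries stream=codec_name \
#     -of default=noprint_wrappers=1:nokey=1 media.mkv)
# echo "audio_only.$(ext_for_codec "$codec")"
```

If the probe says the audio is something other than AAC, the faad step would not apply either, so treat this only as a starting point.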
The sox application can do various edits and effects on the CLI, but I tend to use Audacity, as you can see what the file needs before doing an edit. My flow, of sorts, might look like:
$ avprobe media.mkv
$ avconv -i media.mkv -an -sn -codec:v copy -y video_only.mkv
$ avconv -i media.mkv -vn -sn -codec:a copy -y audio_only.aac
$ faad -o audio.wav audio_only.aac
$ audacity audio.wav
$ avconv -i video_only.mkv -i edited_audio.wav \
-codec:v copy -codec:a aac -b:a 128k \
Dynamic range compression is pretty evil / unnatural. If it's only a few LOUD parts, you can select the quieter parts and amplify them one at a time for a more natural-sounding result. You can also use Audacity's Amplify with negative numbers to make things softer. Roughly: 0 dB is the loudest, -6 dB is what videos should be normalized to, and -50 dB is near silent. Or something like that with the dB / decibel stuff.
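For a rough feel of what those dB numbers mean, here is a small sketch, assuming the usual amplitude convention of ratio = 10^(dB/20). The helper names db_to_ratio and gain_needed are made up for illustration and are not part of Audacity or avconv:

```shell
#!/bin/sh
# db_to_ratio: hypothetical helper. Converts a dB value to a linear
# amplitude ratio using ratio = 10^(dB/20), where 0 dB is full scale.
db_to_ratio() {
    awk -v db="$1" 'BEGIN { printf "%.3f\n", exp(db * log(10) / 20) }'
}

# gain_needed: hypothetical helper. Given a selection's current peak
# and the target peak (both in dB), prints the amount to type into
# an amplify dialog: target minus current.
gain_needed() {
    awk -v peak="$1" -v target="$2" 'BEGIN { printf "%.1f\n", target - peak }'
}

# Examples:
#   db_to_ratio -6       prints 0.501  (about half of full scale)
#   gain_needed -14 -6   prints 8.0    (amplify the quiet part by 8 dB)
```

So a quiet passage peaking at -14 dB would need about +8 dB to sit at -6 dB, and a loud one peaking at 0 dB would need -6 dB.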