According to the mplex man page, there is a sync offset option, -O (capital letter O). You specify the offset with something like -O 10s for a 10-second offset. I think the offset delays the video by that amount relative to the audio, but I've never had to use it in mplex, so I'm not sure about that. However, I have had bad sync issues with Avidemux (with MPEG-TS files), and correcting them is pretty much trial and error: you have to match lip movement to spoken words, or sounds to actions, as closely as possible. It's a tedious and sometimes frustrating process, especially when the sync changes for EVERY file you work with.
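If you go the trial-and-error route with mplex, something like the rough sketch below could generate one command per candidate offset, so you can mux a few test files and compare them by eye. The file names and the list of offsets are made up for illustration, the helper name build_mplex_command is my own, and the "ms" unit suffix is what I remember from the man page, so check it (and which direction the offset shifts things) against your version before relying on it.

```python
import shlex


def build_mplex_command(video, audio, output, offset_ms):
    """Return the mplex argument list for one candidate sync offset.

    Assumption: -O accepts a number followed by a unit suffix such as "ms";
    verify this against the mplex man page on your system.
    """
    return ["mplex", "-O", f"{offset_ms}ms", "-o", output, video, audio]


if __name__ == "__main__":
    # Hypothetical input files and candidate offsets (in milliseconds).
    for offset_ms in (0, 100, 200, 300):
        cmd = build_mplex_command(
            "video.m2v", "audio.mp2", f"test_{offset_ms}ms.mpg", offset_ms
        )
        # Only print the command; run it yourself once it looks right.
        print(shlex.join(cmd))
```

This only prints the commands instead of running them, so you can look them over first and then watch each test file to see which offset lines up best.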
A much better solution would be to look at the program(s) that generated the separate A/V streams and check whether they have an option for better sync, or (as in my case) look for a better program.