There are several ways to do that, but none of them has the same workflow as comparable applications under Windows, so you have to accept a possibly steep learning curve.
First, to assemble single images into a video stream/container, there is "ImageMagick". It can do almost anything, but it's not that simple and you have to work on the command line.
Second, there are tools like "Kino" or "Cinelerra", which are real video editing software. I don't know whether they do what you want, but they exist.
Third, there's the successor of "Film Gimp" - "CinePaint".
Fourth, the "GIMP Animation Package" might help; it's called "GAP" for short.
For re-encoding or conversion of video formats, there's always "mencoder".
After checking your example site, I would start with Perl and ImageMagick (or whatever binding you prefer). Mark the three face areas of the example image and remember their positions. Then step from face area one to face area two, copying the appropriate area at each step, and do the same from face area two to face area three. Finally, assemble the single images into one animation, convert it and add an audio stream with mencoder.
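To make the stepping idea a bit more concrete, here is a rough sketch. It uses Python instead of Perl purely for illustration, and it only builds the command lines rather than running them. The file names (group.jpg, voice.mp3, out.avi), the face coordinates, the 320x240 frame size and all function names are made up for the example; you would substitute your own values. It assumes the ImageMagick "convert" tool and a mencoder build with lavc support are installed.

```python
import shlex


def interpolate_boxes(start, end, steps):
    """Return `steps` crop boxes (x, y, w, h) moving linearly from start to end."""
    if steps < 2:
        raise ValueError("steps must be at least 2")
    boxes = []
    for i in range(steps):
        t = i / (steps - 1)
        boxes.append(tuple(round(s + (e - s) * t) for s, e in zip(start, end)))
    return boxes


def pan_path(face_boxes, steps_per_leg):
    """Chain the legs face 1 -> face 2 -> face 3 into one list of crop boxes."""
    frames = []
    for a, b in zip(face_boxes, face_boxes[1:]):
        leg = interpolate_boxes(a, b, steps_per_leg)
        # Skip the first box of later legs so a face is not emitted twice.
        frames.extend(leg if not frames else leg[1:])
    return frames


def crop_command(source, box, index, size="320x240"):
    """ImageMagick call that cuts one box out of the source image."""
    x, y, w, h = box
    return ["convert", source, "-crop", f"{w}x{h}+{x}+{y}", "+repage",
            "-resize", f"{size}!", f"frame_{index:04d}.png"]


def encode_command(audio, output, fps=25):
    """mencoder call that joins the frames and muxes in the audio file."""
    return ["mencoder", "mf://frame_*.png", "-mf", f"fps={fps}:type=png",
            "-ovc", "lavc", "-lavcopts", "vcodec=mpeg4",
            "-oac", "copy", "-audiofile", audio, "-o", output]


if __name__ == "__main__":
    # Hypothetical face areas as (x, y, width, height) in the source image.
    faces = [(40, 60, 160, 120), (300, 80, 160, 120), (560, 50, 160, 120)]
    for i, box in enumerate(pan_path(faces, steps_per_leg=25)):
        print(shlex.join(crop_command("group.jpg", box, i)))
    print(shlex.join(encode_command("voice.mp3", "out.avi")))
```

Running it prints one convert command per frame followed by a single mencoder command, so you can check them before pasting them into a shell or a script. Holding still on each face for a moment would just mean repeating that face's box a few times in the list.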
But I don't know of a nice "click me here and here and press start" tool.