I can duplicate the effect, but I believe it is mechanical.
What I've noticed is that when I get the unexpected upper-case characters in the sequence, an examination of the X events shows that the KeyPress event for that character comes between the caps_lock's KeyPress and KeyRelease events.
Normal character input is actioned on the KeyPress. The caps_lock's effect appears to turn on at its KeyPress event, but only turns off on its KeyRelease. While the caps_lock key remains depressed you always get an upper-case letter, no matter what state the lock was in before the KeyPress or will be in after the KeyRelease.
This may be the cause of the perceived latency as the key release event's timing will depend on the strength of the mechanical spring in the keyboard and how much travel is necessary to break the contact, not to mention you're probably slower at removing your finger than you were at applying it. For exceptionally fast typists, this may be long enough to cover the following KeyPress.
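To make the theory concrete, here is a rough sketch in Python that replays a list of key events under the behaviour I described. It is only a toy model of what I observed, not how the X server actually implements it; the `simulate` function and the `("press", key)` / `("release", key)` event tuples are names I made up for illustration.

```python
def simulate(events, caps_on=False):
    """Replay (kind, key) events under the observed caps_lock behaviour.

    Assumed model (from observation, not from the X source):
      - turning the lock ON takes effect at caps_lock's KeyPress
      - turning the lock OFF only takes effect at caps_lock's KeyRelease
      - ordinary characters are produced on their own KeyPress
    """
    pending_off = False  # True while caps_lock is held down to switch the lock off
    typed = []
    for kind, key in events:
        if key == "caps_lock":
            if kind == "press":
                if caps_on:
                    pending_off = True   # lock stays active until the release
                else:
                    caps_on = True       # lock activates immediately
            elif pending_off:            # the release that ends an active lock
                caps_on = False
                pending_off = False
        elif kind == "press":
            typed.append(key.upper() if caps_on else key.lower())
    return "".join(typed)


# Fast typist: "c" is pressed before caps_lock has been released.
fast = [("press", "caps_lock"), ("press", "c"), ("release", "caps_lock"),
        ("release", "c"), ("press", "d"), ("release", "d")]

# Slower typist: caps_lock is fully released before "c" is pressed.
slow = [("press", "caps_lock"), ("release", "caps_lock"), ("press", "c"),
        ("release", "c"), ("press", "d"), ("release", "d")]

print(simulate(fast, caps_on=True))  # Cd
print(simulate(slow, caps_on=True))  # cd
```

Both sequences start with the lock on and contain the same keys; the only difference is whether the "c" KeyPress lands before or after the caps_lock KeyRelease, which is exactly the overlap I see in the event dumps.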
Well, that's my theory for what it's worth.
Changing the caps_lock "off" to take effect on KeyPress rather than KeyRelease would probably fix this, but I don't know how one would go about doing that.
I can't type fast enough to see this problem in normal use, and I never use caps-lock anyway (well, never on purpose