If you restrict the zoom (as Google does) to certain factors (e.g. following the sequence 1, 2, 5, 10, 20, 50, 100, 200, 500, etc., or just 1, 2, 4, 8, 16, 32, etc.), you make the job simpler.
This means that zoom and pan are always aligned to whole pixels (though you have to skip pixels as you zoom out).
Google also have more than one set of imagery; if you zoom through the levels, you will notice the picture switch from one set to another at some of the levels.
Given that most of the storage is used by the deepest zoom level, it doesn't actually cost much more to generate and store the map at every zoom level (rather than have a single data structure). For roughly 30% extra storage (closer to 33% if you use the powers-of-two sequence), you can store all the zoom levels. This does assume that the map is not changing dynamically.
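To see where the storage figure comes from, here is a rough sketch of the arithmetic. It assumes each zoomed-out level is stored as a plain pixel map that keeps one pixel per step in each direction, so a level with factor f costs 1/f^2 of the deepest level. The function names are made up for this illustration.

```python
def zoom_factors_125(count):
    """Return the first `count` factors of the 1, 2, 5, 10, 20, 50, ... sequence."""
    factors = []
    decade = 1
    while len(factors) < count:
        for multiplier in (1, 2, 5):
            factors.append(multiplier * decade)
            if len(factors) == count:
                break
        decade *= 10
    return factors


def pyramid_overhead(factors):
    """Extra storage for all zoomed-out levels, as a fraction of the deepest level.

    Assumes a level with zoom factor f holds 1/f^2 as many pixels as the
    deepest level (factor 1), which is not counted as overhead.
    """
    return sum(1.0 / (f * f) for f in factors if f > 1)


if __name__ == "__main__":
    print(pyramid_overhead(zoom_factors_125(12)))               # 1-2-5 sequence
    print(pyramid_overhead([2 ** k for k in range(0, 12)]))     # powers of two
```

Under that assumption the 1-2-5 sequence comes out at a little over 0.30 and the powers-of-two sequence approaches one third, so either way the extra levels are cheap compared with the base map.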
So I'd suggest for a start that you just have a simple pixel map at the deepest zoom level, with an x and y offset and a zoom level. As you zoom out, the zoom level increases (1, 2, 5, 10, 20, 50, etc.). The pan rate is directly proportional to the zoom level: moving the view by one screen pixel moves the offset by that many map pixels. And when you are generating the zoomed-out content, the zoom level is also the step size for selecting pixels (or groups of pixels).
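A minimal sketch of that scheme, assuming the map is just a list of rows of pixel values and that the offset is the top-left corner of the view in map pixels. The names render_view and pan, and the fill value for areas outside the map, are invented for this example. It picks single pixels rather than averaging groups, which is the simplest option.

```python
def render_view(pixels, x_off, y_off, zoom, view_w, view_h, fill=0):
    """Build a view_w x view_h view by taking every `zoom`-th map pixel.

    pixels       -- map at the deepest zoom level, as a list of rows
    x_off, y_off -- top-left corner of the view, in map pixels
    zoom         -- zoom level, used directly as the sampling step
    fill         -- value used where the view falls outside the map
    """
    height = len(pixels)
    width = len(pixels[0]) if height else 0
    view = []
    for vy in range(view_h):
        sy = y_off + vy * zoom          # map row for this screen row
        row = []
        for vx in range(view_w):
            sx = x_off + vx * zoom      # map column for this screen column
            if 0 <= sx < width and 0 <= sy < height:
                row.append(pixels[sy][sx])
            else:
                row.append(fill)
        view.append(row)
    return view


def pan(x_off, y_off, dx, dy, zoom):
    """Move the view by (dx, dy) screen pixels; the map offset moves zoom times as far."""
    return x_off + dx * zoom, y_off + dy * zoom


if __name__ == "__main__":
    # Small test map where each pixel value encodes its position (row * 10 + column).
    world = [[y * 10 + x for x in range(10)] for y in range(10)]
    for line in render_view(world, 0, 0, 2, 3, 3):
        print(line)
```

Because the offset only ever changes in multiples of the zoom level, the view stays aligned to the same sampling grid while you pan at a given level.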
If you need something more sophisticated (arbitrary zoom/angles, 3D imaging), then it is worth reading an introductory book on 3D graphics, covering topics such as mipmaps.
Last edited by neonsignal; 08-15-2009 at 10:12 PM.