Well, I expected this question... Here I go:
If the offset is too small, for example only 2 bits, then each page would contain only 2^2 = 4 bytes, and since the remaining 30 bits form the page number, there would be 2^30 pages. Now the problem is that every program, even a small one of 40 bytes, needs a large number of pages (40/4 = 10), so the page table becomes huge and the MMU cannot keep up good speed in physical address translation for the CPU.

In the same way, if the offset is very large, for example 30 bits, then each page would contain 2^30 bytes, in which case every single program, even a 2-byte one, would take up 2^30 bytes of memory, because memory is always handed to a program in whole pages. So, to get around these two overheads, the engineers settled on a compromise.
That is 20 bits for the page number and 12 bits for the offset (2^20 pages of 2^12 = 4096 bytes each), which gives acceptable efficiency.
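
If you want to check the arithmetic yourself, here is a small sketch. It assumes a 32-bit virtual address as in the explanation above; the function name `page_split` and the sample program sizes are just made up for illustration, not anything a real OS exposes.

```python
ADDRESS_BITS = 32  # assumption: 32-bit virtual addresses, as discussed above


def page_split(offset_bits, program_bytes):
    """Hypothetical helper: return (page_size, num_pages, pages_needed, wasted_bytes)."""
    page_size = 1 << offset_bits                     # bytes per page = 2^offset_bits
    num_pages = 1 << (ADDRESS_BITS - offset_bits)    # pages addressable by the remaining bits
    pages_needed = -(-program_bytes // page_size)    # ceiling division: memory comes in whole pages
    wasted = pages_needed * page_size - program_bytes  # unused bytes in the last page
    return page_size, num_pages, pages_needed, wasted


# The three cases from the explanation: tiny pages, huge pages, and the 20/12 split.
for offset_bits, program_bytes in [(2, 40), (30, 2), (12, 40)]:
    size, pages, needed, wasted = page_split(offset_bits, program_bytes)
    print(f"offset={offset_bits:2d} bits: page size={size} B, "
          f"addressable pages={pages}, pages needed={needed}, wasted={wasted} B")
```

With a 2-bit offset the 40-byte program needs 10 pages out of 2^30 possible ones, with a 30-bit offset the 2-byte program wastes almost the whole 2^30-byte page, and the 12-bit offset sits in between.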
Convinced?