I agree with the above: bash and awk will not be enough to complete a project of this scale, and even if you somehow get them to work, the result will be inefficient and limited.
I think Perl is a better choice, but maybe you should instead join or fork one of the existing translation tools out there:
Starting one from scratch on your own is no easy task, although if you are a genius you could do it, and maybe even make something better than anyone else has.
As for Perl, just see the site:
The most important links are there.