I'm trying to fix an inherited web site whose previous maintainers named the HTML files without the .html extension.

The new webhost does not understand how to serve these files (but works fine when the file is renamed with the .html extension added).
I've figured out how to find the affected files using grep, so I can script the renaming to add the extension without trouble. The problem is that the files cross-reference each other with HREF URLs that use the "base" form rather than "base.html", so I need to edit the content of the files too.
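For reference, the renaming half could be scripted roughly as below. This is only a sketch in Python rather than the grep-based approach mentioned above; the function name rename_extensionless_html and the rule "no extension, and the first 4 KB contains <html or <!doctype html" are assumptions made for illustration, not how the site necessarily needs to be handled.

```python
import re
from pathlib import Path

# Assumed heuristic: a file "looks like HTML" if its first 4 KB contains
# an <html tag or an HTML doctype (case-insensitive).
HTML_MARKER = re.compile(rb"<html|<!doctype html", re.IGNORECASE)


def rename_extensionless_html(root):
    """Add .html to extension-less files under root that look like HTML.

    Returns the list of new paths. Existing targets are never overwritten.
    """
    renamed = []
    # sorted() materializes the listing before any file is renamed.
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix:
            continue  # skip directories and files that already have an extension
        with path.open("rb") as fh:
            head = fh.read(4096)
        if not HTML_MARKER.search(head):
            continue  # extension-less, but does not look like HTML
        target = path.with_name(path.name + ".html")
        if target.exists():
            continue  # do not clobber an existing file
        path.rename(target)
        renamed.append(target)
    return renamed
```

Running it on a copy of the site first would make it easy to check which files it picks up before touching the real tree.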
I was hoping to automate this task, but I haven't yet found a simple way to do it with a conventional regex-based search and replace (e.g. a Perl one-liner). The replacement has to account for the fact that some HREFs already have an extension and should be left alone, while any HREF without one needs to have it added. The only good news is that any "non-extension" HREF is definitely meant to be .html, so no further distinction is necessary.
Example:
Code:
href="polarityrealizationtherapy"
should be
href="polarityrealizationtherapy.html"
but
href="http://www.polaritytherapy.org/polarity/index.html"
and
href="polarity_brochure.pdf"
should be left unchanged
Any hints for techniques to use in this situation? Thanks!
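To make the requirement concrete, here is one possible shape for the HREF rewrite, sketched in Python instead of a Perl one-liner. Rather than packing every exception into one pattern, it matches each quoted href value with a regex and lets a small function decide whether to append .html. The names fix_target and add_html_extension are hypothetical, and the sketch assumes that href values are quoted, that anything with a scheme (http:, mailto:, and so on) is external and should be left alone, and that a dot in the last path segment means an extension is already present. It is a regex pass, not a real HTML parser.

```python
import re

# A quoted href attribute: group 1 = 'href=', group 2 = the quote, group 3 = the value.
HREF = re.compile(r'''(href\s*=\s*)(["'])(.*?)\2''', re.IGNORECASE)
# A URL scheme such as http:, https:, mailto:, ftp:
SCHEME = re.compile(r'^[a-zA-Z][a-zA-Z0-9+.-]*:')


def fix_target(url):
    """Return url with .html appended if it is a local link lacking an extension."""
    # Separate the path from any ?query or #fragment so the suffix goes before them.
    path, tail = re.match(r'([^?#]*)(.*)', url, re.DOTALL).groups()
    if not path or path.endswith('/'):
        return url  # empty, fragment-only, or directory link
    if SCHEME.match(url) or url.startswith('//'):
        return url  # assumed external: leave alone
    last_segment = path.rsplit('/', 1)[-1]
    if '.' in last_segment:
        return url  # already has an extension (.html, .pdf, ...)
    return path + '.html' + tail


def add_html_extension(html):
    """Rewrite every quoted href value in an HTML string using fix_target."""
    return HREF.sub(
        lambda m: m.group(1) + m.group(2) + fix_target(m.group(3)) + m.group(2),
        html,
    )
```

With the examples above, add_html_extension turns href="polarityrealizationtherapy" into href="polarityrealizationtherapy.html" and leaves both the http://www.polaritytherapy.org/polarity/index.html link and polarity_brochure.pdf untouched. Applying it to each file would just be a matter of reading the text, passing it through the function, and writing it back, ideally on a backup copy first.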