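Instead of keeping a longer table of substitutes for the higher character codes, you could also use numeric encoding, i.e. replace every character above code 127 with a decimal reference of the form &#NNN;.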
Something like this, perhaps:
Code:
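on Encode(str)
    -- Split the text into a list of single characters
    set lstChars to characters of str
    repeat with i from 1 to length of lstChars
        set lngCode to id of item i of lstChars
        -- Anything outside 7-bit ASCII becomes a decimal numeric reference
        if lngCode > 127 then set item i of lstChars to "&#" & (lngCode as string) & ";"
    end repeat
    -- Join with empty delimiters so the caller's delimiters don't end up in the result
    set tidSaved to AppleScript's text item delimiters
    set AppleScript's text item delimiters to ""
    set strResult to lstChars as Unicode text
    set AppleScript's text item delimiters to tidSaved
    return strResult
end Encode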
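
A quick way to try the handler, as a rough sketch only: it assumes AppleScript 2.0 or later (where "id of" and "character id" are available) and that the text item delimiters are at their default. The sample string is just an illustration; the accented character is built from its code point so the result doesn't depend on how the script file was saved.
Code:
-- 233 is the code point of e-acute, which is above 127 and so gets encoded
set strSample to "caf" & character id 233
Encode(strSample)
-- result: "caf&#233;"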