I have MovableType running on a Linux/Apache server at an ISP somewhere in the U.S.A. every possible parameter/property has been set to utf-8, yet sometimes when OmniWeb loads my templates or entries, erroneous characters get introduced: usually "&t", "&uo" or "&" just in front of the leading character, which is usually a < character.

currently, the problem does not occur in Opera.

a description of my problem as reported to the MovableType people follows:

========== BEGIN ==========

hard to explain but basically,

often when I open an entry or a template in the MT admin interface, the data is not loaded in utf-8 encoding. I know this for two reasons:

1. in such a case, the first character of the data is usually an ampersand & instead of a < character.

2. if I choose Encoding from the browser and manually switch to utf-8, the browser reloads the page. it only does this when the page is not already in the chosen encoding.

if I reload the page once, twice, three or four times, the data will eventually be loaded in its correct encoding.
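
one way to separate "the server sent the wrong charset" from "the browser guessed wrong" is to look at the Content-Type header and the raw bytes directly. below is a minimal Python sketch, assuming the header value and the response body have already been captured somehow (for example with a command-line HTTP client). the function names are hypothetical and are not part of MT or OmniWeb.

```python
from email.message import Message


def declared_charset(content_type):
    """Return the lowercased charset from a Content-Type value, or None."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()


def is_valid_utf8(body):
    """Return True if the raw bytes decode cleanly as utf-8."""
    try:
        body.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def diagnose(content_type, body):
    """Summarise what the server declared and what the bytes look like."""
    return {
        "declared": declared_charset(content_type),
        "valid_utf8": is_valid_utf8(body),
        # the report above expects templates and entries to start with '<'
        "starts_with_lt": body.lstrip().startswith(b"<"),
    }
```

if "declared" comes back as utf-8 and the bytes are valid utf-8 on every load, the server side is probably fine and the browser becomes the more likely suspect.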

real life example from one of my templates currently loaded in MT:

&t<?xml version="1.0" encoding="utf-8"?>
html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
<html xmlns="">

see the "&t" ? it shouldn't be there.

and skipping through my entries one at a time, the fourth one showed this:

&t<p>I came to L.A. to study singing, nothing else, and the studies are coming along wonderfully. I have one-on-one lessons with my teacher once a day from Monday to Friday.

again, the "&t" shouldn't be there.

the problem is both dangerous and inconvenient. inconvenient because each time I load an entry or a template, I have to manually choose utf-8 encoding to be sure that the data is not corrupt.

dangerous because if I forget to reload the data or change the encoding and some of the characters have been loaded incorrectly, the template/entry will be saved in a corrupt state; e.g., unbalanced <> characters, semi-colons where there should have been quotation marks, etc.
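
as a stopgap, the text could be checked for the stray leading characters before it is saved. this is only a sketch under the assumption that the corruption always looks like the examples above (an "&" plus a few characters in front of the first "<"). `stray_prefix` is a hypothetical helper, not something MT provides.

```python
def stray_prefix(text):
    """Return suspicious characters found before the first '<', else ''.

    A prefix counts as suspicious when it starts with '&' and contains no
    ';', so a real entity such as '&amp;' is not flagged.
    """
    stripped = text.lstrip()
    first_lt = stripped.find("<")
    if first_lt <= 0:
        return ""  # starts with '<' already, or contains no '<' at all
    prefix = stripped[:first_lt]
    if prefix.startswith("&") and ";" not in prefix:
        return prefix
    return ""
```

a non-empty return value would be the signal to reload the page instead of saving.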

my blog's MySQL database is set to use utf-8. mt's templates declare the utf-8 charset. so do mine.

I've seen a config field called ExportEncoding. is this or any other config field relevant, even though the manual says that it concerns the activity log?

note 1. most of my templates are linked, but that is probably not related to this problem because entries don't have linked files.

note 2. Chinese text is not visible as Chinese text when browsing the MySQL databases. Chinese only appears in some of my entries but there is no obvious correlation between the appearance of Chinese text and the encoding problem.

note 3. NoHTMLEntities is set to 1.

note 4. .htaccess has now had the "AddDefaultCharset utf-8" rule for 4 or 5 minutes and the extra & characters are still appearing in random templates and entries; usually at the beginning of the entry/template code.

========== END ==========

any ideas?

I particularly find OW's "Assume pages use text encoding" confusing. what exactly does "Assume" mean? the same problem occurs whether I have this set to utf-8 or latin 1.

the problem *seems* to be worse when I use the "Request web pages in my preferred language/s" option.

in the end though, I suspect that OW has encoding issues.

is there anything I can do to test OW or could you test it with my MT site?
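
one possible test that leaves the browser out entirely: fetch the same page several times, with and without an Accept-Language header, and count which charset the server declares each time. the sketch below uses only the Python standard library. it assumes the page is reachable without logging in (the MT admin interface would also need a session cookie, which is not handled here), and the function names are hypothetical.

```python
from collections import Counter
from urllib.request import Request, urlopen


def fetch_charset(url, accept_language=None):
    """Fetch url once; return the charset named in Content-Type, or None."""
    headers = {"Accept-Language": accept_language} if accept_language else {}
    with urlopen(Request(url, headers=headers), timeout=10) as response:
        return response.headers.get_content_charset()


def survey(url, attempts=10, accept_language=None, fetch=fetch_charset):
    """Fetch url repeatedly and count each declared charset that was seen."""
    return Counter(fetch(url, accept_language) for _ in range(attempts))
```

calling `survey(page_url)` and then `survey(page_url, accept_language="ja,en;q=0.8")` (both values are placeholders) and comparing the two counts would show whether the server's answer changes between requests. if every response declares utf-8, the server is consistent and OW is the more likely culprit.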

in the meantime, I'll probably have to use Opera instead of OW for my critical MT editing work. the encoding errors happen far less often on Opera; almost never.
