Gea-Suan Lin's BLOG

Sunday, April 16, 2006

XML Specification Issues

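The XML specification gives a fairly complete explanation of how to handle '<', '>', and '&' at http://www.w3.org/TR/2004/REC-xml-20040204/#syntax. I consulted it back when I wrote wretchblog-rss20.pl (which converts Wretch (無名) blogs to RSS 2.0):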

The ampersand character (&) and the left angle bracket (<) MUST NOT appear in their literal form, except when used as markup delimiters, or within a comment, a processing instruction, or a CDATA section. If they are needed elsewhere, they MUST be escaped using either numeric character references or the strings “&amp;” and “&lt;” respectively. The right angle bracket (>) MAY be represented using the string “&gt;”, and MUST, for compatibility, be escaped using either “&gt;” or a character reference when it appears in the string “]]>” in content, when that string is not marking the end of a CDATA section.

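However, at the time I did not handle '>', and on top of that I applied the replacements in the wrong order, so the output broke: 實戰演練:無名 blog 資料移轉 (Hands-on: migrating Wretch blog data)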
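
To illustrate why the replacement order matters, here is a minimal Python sketch. It is not the code from wretchblog-rss20.pl (a Perl script whose source is not shown here); the function names `escape_xml_text` and `escape_wrong_order` are hypothetical and exist only for this example. The point is that '&' has to be escaped first, otherwise the '&' inside a freshly produced "&lt;" or "&gt;" gets escaped a second time.

```python
def escape_xml_text(text: str) -> str:
    """Escape '&', '<' and '>' for use in XML character data.

    Hypothetical helper for illustration only. '&' is replaced first so
    that the entities produced by the later steps are left untouched.
    Escaping every '>' also covers the "]]>" case the spec requires.
    """
    return (
        text.replace("&", "&amp;")
            .replace("<", "&lt;")
            .replace(">", "&gt;")
    )


def escape_wrong_order(text: str) -> str:
    """Same replacements with '&' handled last: the result is double-escaped."""
    return (
        text.replace("<", "&lt;")
            .replace(">", "&gt;")
            .replace("&", "&amp;")
    )


if __name__ == "__main__":
    sample = "a < b && c ]]> d"
    print(escape_xml_text(sample))     # correct order
    print(escape_wrong_order(sample))  # '&' last, so entities get mangled
```

In real code it is probably simpler to rely on an existing routine such as `xml.sax.saxutils.escape` from the Python standard library, which escapes these same three characters.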