Posted by Andy Dingley on 01/29/08 12:46
On 28 Jan, 17:56, Jeff <jeff@spam_me_not.com> wrote:
> I had thought that since everything is CMS driven
_Inside_ a CMS, there are strong arguments for using XHTML (or at
least, some XML that shares the same XML schema).
On publishing from a CMS, it's usually easier to serve the document to
the web as HTML than it is as XHTML (usable XHTML, meeting Appendix C
and the requirements of good web practice). This is certainly the case
for XSLT-based output. It's hard to achieve Appendix C from XSLT -
XSLT wants to serve it as "XML standards compliant" content, which
isn't appropriate for IE. There's a simple switch to flip it into
"HTML output", but no similar switch for "Appendix C XHTML".
> that I can just create something like an RSS feed, that would have a bit of html in it
> (like "strong" or "i" or br) and serve that depending on accept type.
Have you read the infamous article on this, "Myth of RSS version
compatibility" from Dive Into Mark? You ought to.
> Now, it's not hard to generate well formed RSS
There's never anything simple about non-trivial RSS, because RSS 2.0
doesn't have a competent specification. There's no clear way to embed
HTML in it. Practical experience favours escaping through entity
encoding. This is different to using CDATA sections, but similar in
meaning. Both are a way to embed "<" safely, but both do it by
embedding "<" as a mere character, stripping away all implication that
it might be marking the start of a HTML tag.
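To illustrate the point about entity encoding and CDATA, here's a minimal sketch using Python's standard ElementTree parser. The two <description> strings are made-up samples, not taken from any real feed. Both give the parser the same plain text, and neither gives it a child element:

```python
import xml.etree.ElementTree as ET

# Two hypothetical <description> elements carrying the same HTML fragment:
# one entity-escaped, one wrapped in a CDATA section.
escaped = "<description>Line one&lt;br&gt;line two</description>"
cdata = "<description><![CDATA[Line one<br>line two]]></description>"

escaped_elem = ET.fromstring(escaped)
cdata_elem = ET.fromstring(cdata)

# The XML parser hands back identical character data for both forms.
escaped_text = escaped_elem.text
cdata_text = cdata_elem.text
print(escaped_text == cdata_text)

# Neither form creates a child element: the "<br>" is only characters.
print(len(escaped_elem), len(cdata_elem))
```

Whatever treats that "<br>" as markup has to be a second parsing step inside the RSS reader, after the XML parser has finished.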
Your RSS reader _might_ later decide to assume that any "words" that
are "wrapped" in angle brackets should thus be treated as HTML tags.
This works (and it's how it's done), but it's far from robust. It has
several problems:
* Such embedded HTML content can't be validated as being valid HTML,
outside of a final RSS tool that knows about this assumption.
* How do you publish a HTML tutorial that is marked up in plain text,
not HTML? What does this mean:
<title>HTML elements Introduction Course</title>
<description>Today we'll meet the <BR> element!</
* It's hardly rare to use this style of markup in plain text either:
<description>Set the value of the <customer-identifier>
In this case, that isn't a HTML tag at all.
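The ambiguity can be sketched with a deliberately naive sniffing rule. This is an assumption for illustration only: the regex and the looks_like_html name are hypothetical, and real RSS readers each use their own heuristics. The rule can't tell real markup from text that merely talks about markup:

```python
import re

# Hypothetical heuristic: anything shaped like <word> or </word> is a tag.
TAG_LIKE = re.compile(r"</?[A-Za-z][A-Za-z0-9-]*\s*/?>")

def looks_like_html(text):
    """Return True if the unescaped description contains something tag-shaped."""
    return TAG_LIKE.search(text) is not None

# Description text as it looks after the XML parser has undone the escaping.
samples = [
    "Some <strong>bold</strong> text",             # intended as HTML
    "Today we'll meet the <BR> element!",          # plain text about HTML
    "Set the value of the <customer-identifier>",  # not HTML at all
]

for text in samples:
    print(looks_like_html(text), text)
```

All three samples look the same to the heuristic, though only the first was meant to be rendered as HTML.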
> Will those bits then also have to be in
> correct xhtml? In other words: <br> or <br />?
Just about the only constant for embedding (X)HTML into RSS is that
it's not done through XML or XML namespacing. This is for two reasons:
* RSS (2.0 specs) doesn't grok namespacing, as it's not defined to be
XML (in a compliant sense).
* XML namespacing requires balanced tags and closed elements. This is
a restrictive thing to impose upon embedding HTML fragments. Consider
<html:p>A honking great list, which we've truncated for display.
Now that's a reasonable fragment to want to embed, but it's
impractical by namespacing.
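Here's a rough sketch of why, again with Python's ElementTree. The fragment, the list items and the parses helper are all invented for illustration. A truncated fragment embedded as namespaced XML makes the whole item ill-formed, while the same fragment escaped as text goes through untouched:

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Hypothetical truncated HTML fragment: the list is never closed.
fragment = ("<p>A honking great list, which we've truncated for display."
            "<ul><li>first<li>second")

# The same fragment embedded as namespaced XML elements.
namespaced_item = (
    '<description xmlns:html="http://www.w3.org/1999/xhtml">'
    "<html:p>A honking great list, which we've truncated for display."
    "<html:ul><html:li>first<html:li>second"
    "</description>"
)

# The same fragment embedded as entity-escaped character data.
escaped_item = "<description>" + escape(fragment) + "</description>"

def parses(xml_text):
    """Return True if xml_text is well-formed XML."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

print(parses(namespaced_item))  # unbalanced tags break the whole item
print(parses(escaped_item))     # escaped text doesn't care about balance
print(ET.fromstring(escaped_item).text == fragment)
```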