Reply to Re: cdata and javascript — HTML — IT news, forums, messages

Posted by Andy Dingley on 02/01/08 11:18

On 29 Jan, 18:33, Jeff <jeff@spam_me_not.com> wrote:

> All I'm doing is taking the CMS data and outputing it as html. That's
> pretty easy as the CMS is nothing but a collection of heading,paragraph,
> image, list, class... objects. One set of those after another. That's
> all html is anyways.

Are you familiar with the MVC pattern (Model View Controller)? It
sounds as if your system here is very far from it, which isn't a good
thing.

MVC has most to offer in an interactive context, and for simple
RESTful view-only apps you really don't gain much from separating
out the Controller. You do however still benefit from a separable
Model & View.

Separating Model & View begins by recognising that a good CMS has
quite different data models for "in" and "out" content. The way that
content editors work with content is quite different to how the
eventual pages operate. Editors care about "document
structure" (paras, headers) and may even use HTML as a suitable markup
language for it. They also care a great deal about metadata such as
indexing terms, authoring audit trail and editorial control. The
final page may embed the document content relatively unchanged, but it
will build page wrappers and navigation structures from different
routes, such as "site skins" and navigation lists built from queries
across the index metadata of a number of pages. To allow easy
editing you need a content structure that models what the editors
need. To allow flexible querying (you'll re-write this a _lot_ over a
site's lifetime) you need a clearly separate view that isn't
tightly limited by the underlying DB structure.
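
As a rough sketch of that split (Python, purely for illustration; the
Page class, its field names and the "/slug" URL scheme are my own
assumptions, not anything from your system): the model holds content
plus index metadata, and the view builds the wrapper and navigation
from a query at request time.

```python
from dataclasses import dataclass, field
from html import escape


@dataclass
class Page:
    """Model: what editors work with. No navigation or skin lives here."""
    slug: str
    title: str
    body_html: str  # document content, embedded largely unchanged
    index_terms: set[str] = field(default_factory=set)  # editorial metadata


def pages_about(pages: list[Page], term: str) -> list[Page]:
    """Query across the index metadata of many pages, sorted by title."""
    return sorted((p for p in pages if term in p.index_terms), key=lambda p: p.title)


def render_page(page: Page, pages: list[Page], nav_term: str) -> str:
    """View: wraps the content in a (hypothetical) skin and builds the
    navigation list at request time from the index query."""
    links = "".join(
        f'<li><a href="/{escape(p.slug)}">{escape(p.title)}</a></li>'
        for p in pages_about(pages, nav_term)
        if p.slug != page.slug
    )
    return (
        f"<html><head><title>{escape(page.title)}</title></head><body>"
        f"<nav><ul>{links}</ul></nav>"
        f"<main>{page.body_html}</main>"
        "</body></html>"
    )
```

Changing the skin or the navigation query means touching render_page
and pages_about only; the stored pages are left alone.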

There are two ways to build a non-MVC CMS. One treats both portions as
"Model", one treats both as "View". Both have merged things that
shouldn't be merged, which becomes a serious limitation long-term,
once you try and do the inevitable maintenance changes over the
project's lifetime.

A "pure View" architecture stores chunks of HTML in the CMS database
and spits these chunks back out on request. Its characteristics are
that page authoring is hard (content authors still have to work
in HTML) and it's not "smart" for manipulating the content DB as
_content_, rather than as its final presentation.

A "pure Model" architecture stores abstract content, then applies a
hard-coded view process through scripting. This is probably the more
common, especially for page-scripting languages like ASP or JSP with
Scriptlets. It may be quite a powerful system internally, with good
editing features and smart querying or index-generation. The downside
is that the "view" layer is unclear (probably hard-coded scripts) and
is inflexible to modify.

Classic ways to break Model-View separation are to allow "objects" or
"lists" to be embedded _inside_ pages. If you have any sort of
structure that links content objects together, make sure that they're
stored outside of these objects in another structure, then embed them
within objects only at view time.
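
A minimal sketch of what I mean, again in Python and with made-up
names (Article, SERIES and render_article are all hypothetical): the
"series" relation sits in its own structure, and the list of sibling
articles is only embedded while rendering.

```python
from dataclasses import dataclass
from html import escape


@dataclass(frozen=True)
class Article:
    """A content object. It carries no list of other objects."""
    article_id: str
    title: str
    body_html: str


# The linking structure lives beside the content objects, not inside them.
# Hypothetical "series" relation: series name -> ordered article ids.
SERIES: dict[str, list[str]] = {
    "cms-basics": ["intro", "models", "views"],
}


def render_article(article: Article, articles: dict[str, Article],
                   series: dict[str, list[str]]) -> str:
    """Embed the list of sibling articles only at view time."""
    siblings = [
        articles[other_id]
        for member_ids in series.values()
        if article.article_id in member_ids
        for other_id in member_ids
        if other_id != article.article_id
    ]
    items = "".join(f"<li>{escape(s.title)}</li>" for s in siblings)
    related = f"<ul>{items}</ul>" if items else ""
    return f"<h1>{escape(article.title)}</h1>{article.body_html}{related}"
```

Re-ordering or re-grouping a series is then an edit to SERIES alone,
and no stored article has to be rewritten.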

> I understand that many CMS's have an editor component that edits
> like a "word" doc. I've always thought that was wrong.

A CMS really should allow "content" to be edited in any damn way you
please. Requiring this to be HTML isn't so bad: HTML is an OK format
to use, and Word is a bit paper-centric (consider DocBook too for some
cases). Where it goes wrong is when you allow _content_ authors to
start hard-coding view-related issues that shouldn't be generated
until query-time on the CMS DB.

> There's two issues that arise and the first is what to do with
> linebreaks in paragraphs and headings. Generally the author expects to
> see those as newlines. So I convert those to <br>'s with an option not to.

Options are bad. Options give the content editor a way to start hard-
coding (or at least implying) the final format after the view
operation.

Give your content editors a "para" and a "linebreak" structure (even a
page or section break too). Most word processors support this,
although few typists are trained to appreciate the difference. Render
this at view-time appropriately, accurately and consistently.
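
Something along these lines, as a sketch only. The input convention
here (blank line starts a new para, single newline is a linebreak) is
my assumption, and Para, parse_editor_text and render_html are
invented names:

```python
from dataclasses import dataclass
from html import escape


@dataclass
class Para:
    """One paragraph; consecutive entries in `lines` are separated by an
    explicit line break."""
    lines: list[str]


def parse_editor_text(text: str) -> list[Para]:
    """Assumed convention: blank line = new para, single newline = linebreak."""
    paras = []
    for block in text.replace("\r\n", "\n").split("\n\n"):
        lines = [ln.strip() for ln in block.split("\n") if ln.strip()]
        if lines:  # runs of blank lines never produce empty paragraphs
            paras.append(Para(lines))
    return paras


def render_html(paras: list[Para]) -> str:
    """View: the one place that decides how a linebreak looks in HTML."""
    return "\n".join(
        "<p>" + "<br>".join(escape(ln) for ln in p.lines) + "</p>"
        for p in paras
    )
```

There is no per-document option here. If linebreaks should render
differently, that is a change to render_html, applied to every page at
once.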

> The other is what to do with extraneous markup the author adds.

Stop them doing it - it's extraneous, after all. The authors should be
given an adequate set of content markup for what they need to describe
and they should be strongly discouraged from using anything else,
particularly from slipping in little HTML fragments. A necessary
condition for this is giving them an adequate content markup to begin
with, and extending it as necessary.
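
If the content markup happens to be an HTML subset, one way to enforce
it is to audit each submission against an allowed list and bounce
anything else back to the author. A sketch using Python's standard
html.parser module; the ALLOWED_TAGS set is only an example and the
names are mine:

```python
from html.parser import HTMLParser

# Hypothetical content vocabulary; extend it when authors have a real need.
ALLOWED_TAGS = {"p", "h2", "h3", "ul", "ol", "li", "em", "strong", "a", "img"}


class MarkupAudit(HTMLParser):
    """Collects every tag that is outside the allowed content markup."""

    def __init__(self) -> None:
        super().__init__()
        self.rejected: list[str] = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag names in lower case.
        if tag not in ALLOWED_TAGS:
            self.rejected.append(tag)


def rejected_tags(fragment: str) -> list[str]:
    """Return the disallowed tags found in an authored fragment, in order."""
    audit = MarkupAudit()
    audit.feed(fragment)
    audit.close()
    return audit.rejected
```

For example, rejected_tags("<p>Hi<br><br><FONT size=2>x</FONT></p>")
returns ["br", "br", "font"]. This only reports; what the editing UI
does with the report is a separate decision.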

If editors can insert HTML they'll do so, and they'll do it with badly-
understood, invalid HTML 3.2 snippets that they've scraped up from
some cargo-cult website. Or maybe HTML 5.

You might know how to control whitespace, but your users will do it by
inserting repeated <br>.

Users aren't always smart, but they are persistent. If it's possible
to mis-use the system, they'll do so. Your only hope of avoiding
"wrong" use isn't trying to stamp it out, it's giving them a "right"
alternative, making it good enough to be useful, and training them so
that they use it.
