Posted by Jukka K. Korpela on 04/11/07 07:02
Scripsit Gérard Talbot:
> - KompoZer 0.77 markup cleaner will fix nested lists, remove trailing
> <br> that WYSIWYG HTML editors often leave, remove align attributes in
> empty table cells, remove empty blocks (like <p></p>). HTML Tidy will
> do all this, except maybe fix nested lists
Thanks for the heads-up.
Such tools should _not_ be used without great discretion.
Apart from fixing nested lists, which is a vague expression and could mean
just about anything, all of these operations change the document and cause
largely unpredictable effects on its visual appearance.
For example, authors and editors often insert consecutive <br> tags to
produce some vertical spacing. That's a wrong approach, but so is the
operation of blindly removing them. The author wanted to create some
spacing, so the author should decide what to do. Maybe the spacing _could_
be removed. Maybe some simple CSS code should be added while removing the
tags.
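
As a rough illustration of "let the author decide", here is a minimal Python sketch that only reports runs of consecutive <br> tags and changes nothing. The function name report_br_runs and the sample markup are made up for this example; a regular expression is a crude stand-in for a real HTML parser and will miss or misread unusual markup.

```python
import re

# Two or more <br> tags in a row, allowing whitespace between them
# and the XHTML form <br />.
BR_RUN = re.compile(r"(?:<br\s*/?>\s*){2,}", re.IGNORECASE)

def report_br_runs(html):
    """Return (line_number, matched_text) for each run of consecutive <br> tags.

    Nothing is modified; the caller decides what to do with each finding.
    """
    findings = []
    for match in BR_RUN.finditer(html):
        # Line number of the first <br> in the run (1-based).
        line_number = html.count("\n", 0, match.start()) + 1
        findings.append((line_number, match.group(0).strip()))
    return findings

# Hypothetical sample document.
sample = "<p>Intro</p>\n<br><br>\n<p>Next<br>line</p>\n"
for line_number, text in report_br_runs(sample):
    print(f"line {line_number}: {text!r} - keep, remove, or replace with CSS?")
```

A single <br> inside a paragraph is left alone; only runs of two or more are listed, since those are the ones likely to be spacing hacks.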
Even "cleaning" <td align="right"></td> to <td></td> is wrong if you don't
know what will happen, and a simple program surely cannot know that. Maybe
the attribute is there for no good reason, but it's possible that it's there
intentionally, e.g. because some client-side script will change the
element's content to nonempty and the author wanted that content to be
right-aligned.
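
The same idea applies to table cells. The sketch below, using Python's standard html.parser module, lists empty <td> elements that carry an align attribute so that a human can check whether a script fills them later. The class and function names are invented for this example, and "empty" is assumed here to mean "no child elements and only whitespace text".

```python
from html.parser import HTMLParser

class EmptyAlignedCellFinder(HTMLParser):
    """Collect empty <td> elements that carry an align attribute."""

    def __init__(self):
        super().__init__()
        self.findings = []    # (line, align value) for each flagged cell
        self._pending = None  # open <td align=...> that is still empty

    def handle_starttag(self, tag, attrs):
        align = dict(attrs).get("align")
        if tag == "td" and align is not None:
            self._pending = (self.getpos()[0], align)
        else:
            self._pending = None  # any other tag counts as content

    def handle_data(self, data):
        if data.strip():
            self._pending = None  # non-whitespace text: not empty

    def handle_endtag(self, tag):
        if tag == "td" and self._pending is not None:
            self.findings.append(self._pending)
        self._pending = None

def find_empty_aligned_cells(html):
    """Return (line_number, align_value) pairs; the input is not changed."""
    finder = EmptyAlignedCellFinder()
    finder.feed(html)
    finder.close()
    return finder.findings

# Hypothetical sample table.
sample_table = (
    "<table><tr>\n"
    '<td align="right"></td>\n'
    '<td align="left">42</td>\n'
    "<td></td>\n"
    "</tr></table>"
)
for line, align in find_empty_aligned_cells(sample_table):
    print(f"line {line}: empty cell with align={align!r}; is it filled by a script?")
```

Whether the attribute may go is still a judgment call; the script only narrows down where to look.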
> - HTML Tidy (April 2007 version) has to be your first tool because it
> is mighty powerful and amazing at fixing severely poorly coded
> webpages.
I didn't know there's a new version of Tidy; I thought the software was
effectively frozen. Now I'm afraid I need to take a look, and I'm afraid I
will be disappointed. When I last tested Tidy, it did _far too much_
"fixing", making wild assumptions and even changing simple presentational
HTML to awfully ugly and poorly structured tag soup in a CSS flavor as well
as changing my perfectly good ISO-8859-1 characters into messy "escapes".
> The nice thing about HTML Tidy is that you can use it on a batch of
> many webpages. It's highly configurable (with about 100 parameters
> possible: see http://tidy.sourceforge.net/docs/quickref.html
> )
> and very powerful.
That might be nice, but if the defaults for the parameters are poor, I
cannot really recommend it to most people. Few people will be capable of
setting, say, 50 parameters to reasonable values when the programmer was not
able to do that.
> HTML Tidy will also fix validation markup errors but not all of them.
> You'll still need to validate your webpages with a true SGML parser
> software.
That sounds odd. If it is mighty powerful etc. etc., how come it can't do
the fairly simple job of SGML validation - at least with the DTD fixed to
one of HTML DTDs?
--
Jukka K. Korpela ("Yucca")
http://www.cs.tut.fi/~jkorpela/