Posted by mbstevens on 11/30/06 22:12
On Thu, 30 Nov 2006 12:05:58 -0800, schmoozes wrote:
> http://www.ng2000.com/news.php?tp=html
>
> The XHTML specification requires all tag names to be lowercase. Your page
> will not validate otherwise and will therefore not be valid XHTML. If you
> write all your XHTML yourself, it shouldn't be an issue. You simply
> write all tags in lowercase. Now, imagine situations where you're not
> in control of the code being written. One situation is when you let
> visitors/users of the website
The C# code, after clicking through a couple of pages:
____________________________________________________
private static string LowerCaseHtml(string html)
{
    string[] tags = new string[] {
        "p", "a", "br", "span", "div", "i", "u", "b", "h1", "h2",
        "h3", "h4", "h5", "h6", "h7", "ul", "ol", "li", "img",
        "tr", "table", "th", "td", "tbody", "thead", "tfoot",
        "input", "select", "option", "textarea", "em", "strong"
    };
    // Lower-case each listed tag name where it follows "<" or sits between "/" and ">".
    foreach (string s in tags)
    {
        html = html.Replace("<" + s.ToUpper(), "<" + s)
                   .Replace("/" + s.ToUpper() + ">", "/" + s + ">");
    }
    return html;
}
_________________________________________________
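It's a nice try, but would you mind running it over the following
sentence, and letting us know what the results are:
<P>Colonel Altman said "Target the Border, boys!"</P>
Looking at the code without actually running it,
my guess is that this particular line comes through intact:
<p>Colonel Altman said "Target the Border, boys!"</p>
That is only because Replace matches right after "<" or between "/"
and ">". Tags that merely start with a listed name are not so lucky:
"i" comes before "img" in your list, so "<I" is replaced first and
<IMG ends up as <iMG.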
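Rather than guessing, the loop is easy to port and run. What follows is a
rough Python translation, not the original code: naive_lowercase is a
made-up name, the tag list is copied from the post, and Python's
str.replace is case-sensitive in the same way as the C# String.Replace
used above.

```python
# Tag list copied verbatim from the posted C# method (including "h7").
TAGS = [
    "p", "a", "br", "span", "div", "i", "u", "b", "h1", "h2",
    "h3", "h4", "h5", "h6", "h7", "ul", "ol", "li", "img",
    "tr", "table", "th", "td", "tbody", "thead", "tfoot",
    "input", "select", "option", "textarea", "em", "strong",
]


def naive_lowercase(html_text):
    """Hypothetical port of LowerCaseHtml: plain substring replacement."""
    # Same two replacements per tag, in the same order as the posted loop.
    for s in TAGS:
        html_text = html_text.replace("<" + s.upper(), "<" + s)
        html_text = html_text.replace("/" + s.upper() + ">", "/" + s + ">")
    return html_text


print(naive_lowercase('<P>Colonel Altman said "Target the Border, boys!"</P>'))
print(naive_lowercase('<UL><LI><IMG SRC="x.png"></LI></UL>'))
```

The first line comes back with lower-case <p> tags and the text untouched.
The second comes back as <uL><li><iMG SRC="x.png"></li></ul>: the short
names "u" and "i" match first and leave half-converted tags, and attribute
names are never touched at all.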
The underlying problem is that you have to separate strings that are
part of a tag from strings that are just text displayed on the page,
and plain string replacement cannot tell the two apart.
You would normally want an (X)HTML parser to do this.
Languages like Perl and Python have libraries and modules
that provide (X)HTML parsing capabilities. You pull them
in with a single import line. I haven't checked
C# lately, but I bet it has one, too.
HTML Tidy, I think, can also do this. You can find it
through the W3C website.
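As a rough illustration of the parser route, here is a minimal Python
sketch using the standard library's html.parser module. TagLowercaser and
lowercase_tags are names invented for this example. It rebuilds every tag
from the parsed pieces, so spacing and quoting inside tags are normalised
rather than preserved, and it makes no attempt to repair malformed markup.

```python
from html import escape
from html.parser import HTMLParser


class TagLowercaser(HTMLParser):
    """Re-emit a document with lower-case tag and attribute names."""

    def __init__(self):
        # convert_charrefs=False keeps entities such as &amp; as written.
        super().__init__(convert_charrefs=False)
        self.out = []

    def _attrs(self, attrs):
        # The parser has already lower-cased names and unescaped values,
        # so values are re-escaped before being written back out.
        parts = []
        for name, value in attrs:
            if value is None:
                parts.append(name)
            else:
                parts.append(f'{name}="{escape(value, quote=True)}"')
        return "".join(" " + p for p in parts)

    def handle_starttag(self, tag, attrs):
        self.out.append(f"<{tag}{self._attrs(attrs)}>")

    def handle_startendtag(self, tag, attrs):
        self.out.append(f"<{tag}{self._attrs(attrs)} />")

    def handle_endtag(self, tag):
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        # Displayed text passes through unchanged.
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append(f"&{name};")

    def handle_charref(self, name):
        self.out.append(f"&#{name};")

    def handle_comment(self, data):
        self.out.append(f"<!--{data}-->")

    def handle_decl(self, decl):
        self.out.append(f"<!{decl}>")

    def handle_pi(self, data):
        self.out.append(f"<?{data}>")


def lowercase_tags(html_text):
    parser = TagLowercaser()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.out)


print(lowercase_tags('<P>Colonel Altman said "Target the Border, boys!"</P>'))
print(lowercase_tags('<IMG SRC="a.png" ALT="Big Border">'))
```

Because the parser hands tag names and text to separate callbacks, <IMG>
becomes <img> with its attributes lower-cased, while "Big Border" and the
Colonel's sentence keep their capitals.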