Posted by mbstevens on 11/30/06 22:23
On Thu, 30 Nov 2006 22:12:21 +0000, mbstevens wrote:
> On Thu, 30 Nov 2006 12:05:58 -0800, schmoozes wrote:
>
>> http://www.ng2000.com/news.php?tp=html
>>
>> The XHTML specification requires all tag names to be in lower case. Your
>> page will not validate otherwise and will therefore not be valid XHTML. If
>> you write all your XHTML yourself, it shouldn't be an issue: you simply
>> write all tags in lower case. Now, imagine situations where you're not
>> in control of the code being written. One situation is when you let
>> visitors/users of the website
>
> The C# code after going through a couple of pages:
> ____________________________________________________
> private static string LowerCaseHtml(string html)
> {
>     string[] tags = new string[] {
>         "p", "a", "br", "span", "div", "i", "u", "b", "h1", "h2",
>         "h3", "h4", "h5", "h6", "h7", "ul", "ol", "li", "img",
>         "tr", "table", "th", "td", "tbody", "thead", "tfoot",
>         "input", "select", "option", "textarea", "em", "strong"
>     };
>
>     foreach (string s in tags)
>     {
>         html = html.Replace("<" + s.ToUpper(), "<" + s)
>                    .Replace("/" + s.ToUpper() + ">", "/" + s + ">");
>     }
>
>     return html;
> }
> _________________________________________________
>
>
> It's a nice try, but would you mind running it over the following
> sentence, and letting us know what the results are:
>
> <P>Colonel Altman said "Target the Border, boys!"</P>
>
> Looking at the code without actually running it,
> my guess is that you'll get:
>
> <P>colonel altman said "target the border, boys!"</P>
>
> The problem is that you have to
> separate out strings that are parts of tags from those that
> are just part of text that gets displayed on a web page.
>
> You would normally want an (X)HTML parser to do this.
>
> Languages like Perl and Python have libraries and modules
> that provide (X)HTML parsing capabilities. You link them
> in with a single line of code. I haven't checked
> C# lately, but I bet it does, too.
>
> HTML Tidy, I think, can also accomplish this. You can find it
> through the W3C website.
If it passes the test sentence, you might also try it on:

<img src="Alt/Target/Span.jpg" alt="Colonel Altman said 'Target the
Border, boys!'" HEIGHT=20 WIDTH=36 />

Begin to see why a fairly elaborate parser is needed?
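
For what it's worth, here is a rough sketch of the parser route in Python, using the standard library's html.parser module. This is not the code from the linked page, and the names TagLowercaser and lowercase_markup are made up for illustration. It assumes the input is reasonably well-formed markup: the parser hands back tag and attribute names already lower-cased, and the sketch re-emits them while passing text through untouched. It does not try to repair broken markup or add missing close tags the way Tidy does.

```python
from html.parser import HTMLParser


def _escape_attr(value):
    # HTMLParser un-escapes attribute values, so re-escape them before
    # writing them back inside double quotes.
    return (value.replace("&", "&amp;").replace("<", "&lt;")
                 .replace(">", "&gt;").replace('"', "&quot;"))


class TagLowercaser(HTMLParser):
    """Hypothetical helper: re-emit markup with lower-case tag/attribute names."""

    def __init__(self):
        # convert_charrefs=False keeps entities in text (&amp; etc.) as written.
        super().__init__(convert_charrefs=False)
        self.out = []

    def _attrs(self, attrs):
        parts = []
        for name, value in attrs:  # names arrive already lower-cased
            if value is None:
                value = name       # minimized attribute: checked -> checked="checked"
            parts.append(' %s="%s"' % (name, _escape_attr(value)))
        return "".join(parts)

    def handle_starttag(self, tag, attrs):
        self.out.append("<%s%s>" % (tag, self._attrs(attrs)))

    def handle_startendtag(self, tag, attrs):
        self.out.append("<%s%s />" % (tag, self._attrs(attrs)))

    def handle_endtag(self, tag):
        self.out.append("</%s>" % tag)

    def handle_data(self, data):
        self.out.append(data)      # displayed text is passed through unchanged

    def handle_entityref(self, name):
        self.out.append("&%s;" % name)

    def handle_charref(self, name):
        self.out.append("&#%s;" % name)

    def handle_comment(self, data):
        self.out.append("<!--%s-->" % data)

    def handle_decl(self, decl):
        self.out.append("<!%s>" % decl)


def lowercase_markup(html):
    parser = TagLowercaser()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)


print(lowercase_markup('<P>Colonel Altman said "Target the Border, boys!"</P>'))
print(lowercase_markup('<img src="Alt/Target/Span.jpg" HEIGHT=20 WIDTH=36 />'))
```

The first call should leave the sentence alone and only lower-case the P tags. The second should keep the mixed-case path in src as it is and turn HEIGHT=20 WIDTH=36 into height="20" width="36", since the attribute names come back lower-cased and the values are re-quoted on output.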