Posted by Jukka K. Korpela on 03/23/06 10:06
"Simon" <spambucket@example.com> wrote:
> The problem is many fold.
Indeed, and the problem you described isn't your real problem.
> First I run a blog host site, so the user can
> enter what they want,
Don't let them do that.
> (and I don't want to stop them).
Then accept the consequences (including spam that will be sent - I find it
odd that you take destructive measures - fake address in From field - against
suspected E-mail spam, which can be handled rather well these days, but don't
worry the least about blog spam).
> Sometimes they enter links that are 200 characters long
You are not inserting the URLs as text, are you? Decent blog software should
be able to let users enter real links, with link texts and URLs as separate
things and with URLs used in the internal code (HTML source) only.
> or things like "I
> am soooooooooo.[x200chars]..ooo bored"
Prevent it or accept the consequences.
> Then on the homepage I display the last 25 messages, (the first 45 words
> or 250 chars).
So you truncate messages but not words. What's the point?
> In fact at the moment it is fine, and I was a bit lucky to notice the
> problem.
This sounds like the explanation of what the common saying "No problem"
really means (in some cultures at least): there is a problem, but it has not
exploded yet.
> That is why I wanted to break the long words.
Huh? So why don't you do that? Why would you try to leave it to browsers,
which have even less of an idea of what is going on?
> I could check every entry for words that are more than 'xx' chars in a
> row but I was hoping to find a simpler way,
Surely. Check for the lengths of "words", with "word" defined as a maximal
sequence of non-whitespace characters, and truncate a "word" longer than a
reasonable limit, preferably indicating the truncation and making the
unabridged version available somehow, e.g. making
sooooooooooooooooooooooooooooooooooooooooooooooooooo
appear as
<span title="sooooooooooooooooooooooooooooooooooooooooooooooooooo">
soooo<span class="trunc">[…]</span>ooo</span>
with some suitable styling like
.trunc { color: #555; background: white; }
to indicate that the notation is not part of actual user input.
The reasonable limit depends on the language, of course.
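As a rough server-side sketch of the check described above, something like the following Python could be used. The function names, the limit of 30 characters, and the choice to keep the first 5 and last 3 characters are all assumptions made for illustration, not part of any particular blog software. It treats a "word" as a maximal run of non-whitespace characters and emits the same span markup as in the example.

```python
import html
import re

# Hypothetical values; a reasonable limit depends on the language.
MAX_WORD_LEN = 30
HEAD_CHARS = 5
TAIL_CHARS = 3
TRUNC_MARK = "[\u2026]"  # "[…]", shown in place of the omitted middle part


def truncate_word(word, max_len=MAX_WORD_LEN):
    """Return HTML for one word, abridging it if it exceeds max_len."""
    if len(word) <= max_len:
        return html.escape(word)
    # Keep the unabridged word available in the title attribute.
    return '<span title="%s">%s<span class="trunc">%s</span>%s</span>' % (
        html.escape(word, quote=True),
        html.escape(word[:HEAD_CHARS]),
        TRUNC_MARK,
        html.escape(word[-TAIL_CHARS:]),
    )


def truncate_long_words(text, max_len=MAX_WORD_LEN):
    """Escape user text for HTML and abridge overlong words."""
    # The capturing group makes re.split keep the whitespace runs,
    # so the original spacing is preserved in the output.
    parts = re.split(r"(\s+)", text)
    return "".join(
        part if part.isspace() or not part else truncate_word(part, max_len)
        for part in parts
    )
```

For instance, truncate_long_words("I am " + "s" + "o" * 50 + " bored") leaves "I am" and "bored" alone and wraps the long word in the span markup shown earlier. The sketch assumes the input is plain text; if the stored entry already contains HTML, the tags would have to be parsed rather than split on whitespace.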
> I think that kind of problem should be handled client side.
You are very wrong here.
--
Yucca, http://www.cs.tut.fi/~jkorpela/
Pages about Web authoring: http://www.cs.tut.fi/~jkorpela/www.html