character conversion from MS Word to HTML

Posted by saul.baizman on 02/19/07 18:55

Here's a brief description of the problem. My organization has a
client who cuts and pastes information from Microsoft Word documents
into web-based forms, whose contents are then displayed on a website. I
wish to convert the special characters, such as ellipses and trademark
symbols (and whatever else Word might throw at us), into a proper HTML
entity (&amp;trade;) or a numeric character reference (&amp;#174;) if the
entity does not exist.

Before you make any suggestions, let me share a brief overview of my
previous attempts at a solution so neither of us wastes his time.
Right now, I'm using a combination of the character map returned by
get_html_translation_table(HTML_ENTITIES) and some kludgy code which
manually maps the UTF-8 byte sequence of an MS Word special character
to its HTML equivalent. For example,

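// Bytes 226 128 152 (0xE2 0x80 0x98) are the UTF-8 encoding of U+2018, the left single quote.
$replace_array[chr(226).chr(128).chr(152)] = "&lsquo;";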

I'd like to be able to do the above operation automatically / across
the board for wacky Word characters. I suspect I may need to use the
mbstring functions. If you have any advice, I'm happy to send helpful
folks some chocolate for their troubles.
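
One possible way to do this across the board, sketched under the assumption that the pasted text reaches PHP as UTF-8 (which the byte sequence in the example above suggests): let htmlentities() handle every character that has a named HTML 4.01 entity, then let mb_encode_numericentity() turn whatever non-ASCII characters remain into decimal character references. The function name word_text_to_html is hypothetical, and the "\u{...}" escapes in the usage line need PHP 7 or later. Note that htmlentities() also escapes &, <, > and quotes, which may or may not be wanted if the form is supposed to accept markup.

```php
<?php
// Hypothetical helper; assumes $text is valid UTF-8.
function word_text_to_html(string $text): string
{
    // Step 1: replace every character that has a named HTML 4.01 entity
    // (this also escapes &, <, > and both kinds of quotes).
    $html = htmlentities($text, ENT_QUOTES | ENT_HTML401, 'UTF-8');

    // Step 2: anything still outside ASCII has no named entity, so convert
    // it to a decimal numeric character reference.
    // Map layout: [first code point, last code point, offset, mask].
    $map = [0x80, 0x10FFFF, 0, 0x1FFFFF];

    return mb_encode_numericentity($html, $map, 'UTF-8');
}

// Usage: curly apostrophe, ellipsis and trademark sign, as Word produces them.
echo word_text_to_html("It\u{2019}s\u{2026} Acme\u{2122}"), "\n";
// prints: It&rsquo;s&hellip; Acme&trade;
```

If the form page is served in Windows-1252 rather than UTF-8, the input would presumably need converting first, for instance with mb_convert_encoding($text, 'UTF-8', 'Windows-1252'), before it is passed to this function.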

 
