Re: Convert HTML to Text

Posted by Jim Higson on 03/10/06 16:48

cawoodm@gmail.com wrote:

> I have written a simple RegEx which strips all tags from an HTML file
> and replaces them with spaces.
>
> This was fine until I noticed that some tags should not be replaced
> with spaces. For example in the HTML:
> <b>H</b>ello World
> My program will generate "H ello World", effectively breaking a word
> apart.
>
> Where could I get an "authoritative" list of tags which should result
> in a space and which shouldn't? I presume these are mostly block
> elements like div, br, hr, table, etc.

How about using this?

http://www.mbayer.de/html2text/
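
If a separate tool is more than is needed, the idea from the question (a space for block-like tags, nothing for inline ones) can be sketched with Python's standard html.parser module rather than a regex. This is only a rough illustration: the BREAKING_TAGS set below is a hand-picked assumption, not an authoritative list, and the names TextExtractor and html_to_text are made up for this example.

```python
from html.parser import HTMLParser

# Assumption: a hand-picked, non-authoritative set of tags that
# interrupt the text flow and so should produce a space.
BREAKING_TAGS = {
    "address", "blockquote", "br", "dd", "div", "dl", "dt", "form",
    "h1", "h2", "h3", "h4", "h5", "h6", "hr", "li", "ol", "p", "pre",
    "table", "td", "th", "tr", "ul",
}


class TextExtractor(HTMLParser):
    """Collect text, emitting a space only where a breaking tag occurs."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def _maybe_space(self, tag):
        # Inline tags such as <b> or <span> add nothing, so words stay whole.
        if tag in BREAKING_TAGS:
            self.parts.append(" ")

    def handle_starttag(self, tag, attrs):
        self._maybe_space(tag)

    def handle_endtag(self, tag):
        self._maybe_space(tag)

    def handle_data(self, data):
        self.parts.append(data)


def html_to_text(html):
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    # Collapse runs of whitespace into single spaces.
    return " ".join("".join(parser.parts).split())


print(html_to_text("<b>H</b>ello World"))
print(html_to_text("<div>one</div><div>two</div>"))
```

With this approach "<b>H</b>ello World" stays "Hello World", while adjacent div elements are separated by a space. The sketch does not skip the contents of script or style elements, and it ignores CSS that can turn an inline element into a block one, so treat the tag set as a starting point.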

--
Jim

 
