Re: Efficient way to rip html

Posted by Ben C on 10/03/06 18:25

On 2006-10-03, Arthur Rhodes <rhodesr@no.spam.com> wrote:
> I'm building a web store and I have to create a large number of
> product descriptions. The distributors do not provide spec sheets
> or marketing materials to me in html format. Instead, they advise
> me to simply copy the descriptions from their web sites.
>
> The problem is that the descriptions I need to copy are embedded
> in complex pages, with nested tables, etc. Simply copying the
> page source doesn't seem to be that useful. I end up having to
> cut out lots of table code, etc., and usually make mistakes that
> are time consuming to figure out and fix.
>
> The other alternative is to copy the text and then recreate the html
> formatting from scratch.
>
> Is there an easier way?

Python, and Beautiful Soup.

http://www.crummy.com/software/BeautifulSoup/
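
As a rough illustration of that suggestion, here is a minimal sketch that pulls one description block out of a page full of nested tables. It assumes the current `bs4` release of Beautiful Soup (`pip install beautifulsoup4`), and the sample page, the `description` id, and the `extract_description` helper are all made up for the example. A real distributor page will need its own selector.

```python
from bs4 import BeautifulSoup

# Hypothetical distributor page: the description is buried in nested tables.
PAGE = """
<html><body>
<table><tr><td>Site menu</td><td>
  <table><tr><td>
    <div id="description">
      <h2>Acme Widget</h2>
      <p>Runs on <b>two AA batteries</b>.</p>
      <ul><li>Weight: 120 g</li><li>Colour: blue</li></ul>
    </div>
  </td></tr></table>
</td></tr></table>
</body></html>
"""


def extract_description(html, element_id="description"):
    """Return the inner HTML of the element with the given id, or None."""
    # "html.parser" is the parser bundled with Python, so no extra install is needed.
    soup = BeautifulSoup(html, "html.parser")
    node = soup.find(id=element_id)
    if node is None:
        return None
    # decode_contents() returns the markup inside the element,
    # without the wrapper tag or any of the surrounding tables.
    return node.decode_contents().strip()


snippet = extract_description(PAGE)
print(snippet)
```

The result keeps the headings, bold text and lists of the description but none of the layout tables around it, so it can be pasted into a store template. If the pages mark the description with a class or a particular table cell rather than an id, the `soup.find(...)` call is the only line that should need to change.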

 
