Reply to Re: Analyze and read in html file — PHP Programming Language

Posted by Malcolm Dew-Jones on 06/09/06 20:04

Radium (uh5d@rz.uni-karlsruhe.de) wrote:
: Hi,

: What I want is something similar to the simple-xml extension of php, but for
: html.

: I have to analyze and read in certain tags from a html file in a comfortable
: manner.
: Is there a php extension/library which makes this possible?

In php, not that I know of, though I would like to be wrong.

If you know any perl then use the excellent HTML::Parser. It handles just
about anything that a web site might throw at it. You could use the perl
script to build a PHP script.

Assume text input something like

<html><head><title>example page</title> (etc)

So write a perl script with handlers something like this (a rough sketch, untested)

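use HTML::Parser;

# Output file name is a placeholder; the generated file needs a PHP open tag.
open(TMP_PHP_SCRIPT, '>', 'tmp_script.php') or die "cannot write: $!";
print TMP_PHP_SCRIPT "<?php\n";

# Each handler receives the values named in its argspec (see new() below).
sub do_start_tag
{
my ($tag_name) = @_;
print TMP_PHP_SCRIPT "handle_tag('$tag_name');\n";
}

sub do_text
{
my ($raw_text) = @_;
# Escape only \ and ', the characters special in a PHP single-quoted string.
(my $safe_text = $raw_text) =~ s/([\\'])/\\$1/g;
print TMP_PHP_SCRIPT "handle_text('$safe_text');\n";
}

sub do_end_tag
{
my ($tag_name) = @_;
print TMP_PHP_SCRIPT "handle_end_tag('$tag_name');\n";
}

my $parser = HTML::Parser->new(
api_version => 3,
start_h => [\&do_start_tag, 'tagname'],
text_h => [\&do_text, 'dtext'],
end_h => [\&do_end_tag, 'tagname'],
);
$parser->parse_file($ARGV[0]) or die "cannot parse: $!";
close(TMP_PHP_SCRIPT);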

From that you would get a temporary file with lines like

handle_tag('html');
handle_tag('head');
handle_tag('title');
handle_text('example page');
handle_end_tag('title');
handle_end_tag('head');

Your main php script would run the perl script, and then run the temporary
php script (example shown just above), and your php functions like
handle_tag etc would be called just as if you had been able to parse the
data directly from within php.
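
On the php side, the handlers could look roughly like the sketch below. It is only an illustration of the idea above, not a tested implementation: the globals $open_tags and $title and the file name tmp_script.php are made up for the example, and the handler calls are written inline here where the real script would include the generated file instead.

```php
<?php
// Hypothetical state shared by the handlers.
$open_tags = array();   // stack of currently open tag names
$title     = '';        // text collected inside <title>

function handle_tag($name)
{
    global $open_tags;
    $open_tags[] = $name;
}

function handle_text($text)
{
    global $open_tags, $title;
    // Keep only the text that appears directly inside <title>.
    if (end($open_tags) === 'title') {
        $title .= $text;
    }
}

function handle_end_tag($name)
{
    global $open_tags;
    // Pop only when the closing tag matches the innermost open tag.
    // This keeps the sketch short but ignores badly nested HTML.
    if (end($open_tags) === $name) {
        array_pop($open_tags);
    }
}

// In the real setup these calls would come from the generated file,
// e.g. include 'tmp_script.php'; (the file name is an assumption).
handle_tag('html');
handle_tag('head');
handle_tag('title');
handle_text('example page');
handle_end_tag('title');
handle_end_tag('head');

echo $title, "\n";   // prints "example page"
```

Anything more involved (attributes, nested structures) would need the perl side to emit more arguments, but the handlers stay ordinary php functions.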

$0.10
