Posted by klenwell on 05/07/07 23:46
Hi Gilles,
I'm not a regex guru, but I can spot a couple of problem areas in
your expression:
1. The core syntax could probably be simplified using something like
this (no ^ or $ anchors, since the title sits in the middle of the page):
|<title>([^<]+)</title>|i
I hope I got that right -- I usually have to test my expression a few
times before I get all the nuances right. :)
2. smiU - That's modifier overkill. The U modifier inverts greediness,
so the ? in your .+? makes the quantifier greedy again instead of lazy.
With no ^ or $ in your pattern, the m has no effect either. If
you don't know about this page, it can help:
http://www.php.net/manual/en/reference.pcre.pattern.modifiers.php
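To see how U and ? interact, here is a small sketch. The sample string is made up for illustration (it is not from the CNN page), and the variable names are my own:

```php
<?php
// Hypothetical sample input: two title-like elements on one line.
$html = '<title>First</title><title>Second</title>';

// Lazy quantifier, no U modifier: the match stops at the first closing tag.
preg_match('|<title>(.+?)</title>|i', $html, $lazy);

// Same pattern with U added: U inverts greediness, so .+? is greedy again
// and runs through to the last closing tag.
preg_match('|<title>(.+?)</title>|iU', $html, $inverted);

echo $lazy[1], "\n";      // First
echo $inverted[1], "\n";  // First</title><title>Second
```

So it is probably safest to use either U or the ?, but not both.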
I have a prefab function I've used for this very thing, but
unfortunately I don't have access to it at the moment. Hopefully,
someone will be along shortly with the proper syntax. In the
meantime, I hope this helps in a more general sense.
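For what it's worth, here is a rough sketch of how such a helper might look. This is not the prefab function mentioned above; extract_title() is a name I made up, and the sample page is invented. One thing to keep in mind: preg_replace() returns the whole subject string with only the matched part replaced, so for pulling out a single value preg_match() is usually the better fit.

```php
<?php
// Hypothetical helper, written for illustration only.
// Returns the text of the first <title> element, or null if none is found.
function extract_title(string $html): ?string
{
    // s lets . match newlines, i ignores tag case,
    // and .*? stops at the first closing </title>.
    if (preg_match('|<title>(.*?)</title>|si', $html, $m)) {
        return trim($m[1]);
    }
    return null;
}

// Made-up sample page with an upper-case, multi-line title.
$page = "<html><head>\n<TITLE>\n  Example page\n</TITLE></head><body></body></html>";
echo "TITLE=", extract_title($page), "\n";  // TITLE=Example page
```

With a real page you would pass in the result of file_get_contents($url) instead of $page.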
Regards,
Tom
On May 7, 4:30 pm, Gilles Ganault <nos...@nospam.com> wrote:
> Hello
>
> I went through some examples, tried a bunch of things... but still
> can't figure out why I can't extract the TITLE section of a web page
> using preg_replace():
>
> -----------
> <?php
>
> $url = "http://www.cnn.com";
>
> $response = file_get_contents($url);
>
> $output=preg_replace("|<title>(.+?)</title>|smiU",
> "TITLE=$1",
> $response);
>
> $fp = fopen ("output.html", "w");
> fputs ($fp,$output);
> fclose($fp);
> -----------
>
> Any idea?
>
> Thanks!