Re: preg_match_all Maximum execution time of 60 seconds error — PHP Programming Language


Posted by Rik on 08/17/06 22:12

Shuan wrote:
> I am trying to grab sites like craigslist, parse with regular
> expression and put some content into database.
>
> $request -> fetch( $region_link );
>
> if( !$request -> error ){
> $pageContent = $request -> results;
>
> $regionpattern =
> "/<a[^>]*href=\"(\/s\/SL\/sg_maY.*)\".*>.*<img.*alt=\"(.*)\".*id=\"btn.*\">/siU";
>
> if(preg_match_all( $regionpattern, $pageContent, $categorylinks ))

I was almost tempted to say it was a greediness issue, before I spotted the /U
modifier, which makes the quantifiers ungreedy by default. Dodged a bullet there :-).

If I interpret your regex correctly, try this rewrite. I tend to use dots very
sparingly; I'm more a fan of negated character classes such as [^"]* and [^>]*,
where plain greediness is more useful because the match cannot run past the
closing quote or bracket. I'm not really sure it will gain much on resource
consumption, but we can try:

'|<a[^>]*?href="(/s/SL/sg_maY[^"]*)"[^>]*>.*?<img[^>]*?alt="([^"]*)"[^>]*?id="btn[^"]*"[^>]*>|si'
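To see what the rewritten pattern captures, here is a minimal sketch run against made-up markup. The sample HTML (class names, image files, the "books" and "cars" paths) is purely an assumption for illustration; the real page may be structured differently.

```php
<?php
// Hypothetical sample markup; the real page structure is an assumption.
$pageContent =
    '<a class="x" href="/s/SL/sg_maY/books"><img src="b.gif" alt="Books" id="btn_books"></a>' . "\n" .
    '<a href="/s/SL/sg_maY/cars"><img src="c.gif" alt="Cars" id="btn_cars"></a>';

// Negated character classes ([^"]*, [^>]*) keep each piece inside its own
// attribute or tag, so the engine has far fewer ways to backtrack than with ".*".
$regionpattern = '|<a[^>]*?href="(/s/SL/sg_maY[^"]*)"[^>]*>.*?<img[^>]*?alt="([^"]*)"[^>]*?id="btn[^"]*"[^>]*>|si';

// Default order (PREG_PATTERN_ORDER): [1] holds all hrefs, [2] all alt texts.
$count = preg_match_all($regionpattern, $pageContent, $categorylinks);

print_r($categorylinks[1]); // the captured link paths
print_r($categorylinks[2]); // the captured alt texts
```

Using | as the delimiter also means the slashes in the path no longer need escaping.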

I'd also suggest a foreach loop instead of your for loop:

foreach ($categorylinks[1] as $link) {
    $category_link = "http://www.mysite.com" . $link;
    include("pagecrawler.php"); // I'm still curious what this does....
}

Or if you do use capture 2:
if (preg_match_all($regionpattern, $pageContent, $categorylinks, PREG_SET_ORDER)) {
    foreach ($categorylinks as $link) {
        $category_link = "http://www.mysite.com" . $link[1];
        include("pagecrawler.php"); // I'm still curious what this does....
    }
}
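With PREG_SET_ORDER each element of the result is one complete match, so both captures for a link sit side by side. A minimal sketch, again on made-up sample markup, with the include replaced by a plain array so it runs on its own ($categories is a hypothetical name, not something from your script):

```php
<?php
// Hypothetical sample markup; the real page structure is an assumption.
$pageContent =
    '<a href="/s/SL/sg_maY/books"><img src="b.gif" alt="Books" id="btn_books"></a>' . "\n" .
    '<a href="/s/SL/sg_maY/cars"><img src="c.gif" alt="Cars" id="btn_cars"></a>';

$regionpattern = '|<a[^>]*?href="(/s/SL/sg_maY[^"]*)"[^>]*>.*?<img[^>]*?alt="([^"]*)"[^>]*?id="btn[^"]*"[^>]*>|si';

$categories = array(); // hypothetical collector: alt text => full URL

if (preg_match_all($regionpattern, $pageContent, $categorylinks, PREG_SET_ORDER)) {
    foreach ($categorylinks as $link) {
        // $link[0] is the whole match, $link[1] the href path, $link[2] the alt text.
        $categories[$link[2]] = "http://www.mysite.com" . $link[1];
    }
}

print_r($categories);
```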

If you still have issues, I'd like to see the actual site you're leeching
right now :-). (If you're trying to process a whole site in one run, be sure to
unset() variables you no longer need.) I don't know what your actual
pagecrawler.php does, but if it doesn't use capture 2 you might as well not
capture it.

Grtz,
--
Rik Wasmus
