Posted by "Richard Lynch" on 10/31/05 19:50
On Mon, October 31, 2005 9:27 am, Yannick Mortier wrote:
> <tr><td><img src="http://www.runescape.com/img/hiscores/attack.gif"
> valign="bottom" width=16 height=16 /></td><td> </td><td><a
> href="hiscoreuser.cgi?username=zezima&category=1"
> class=c>Attack</a></td><td align="right">4</td><td
> align="right">99</td><td align="right">53,156,556</td></tr>
>
>
> and I apply preg_match_all:
>
> preg_match_all("/(<tr><td><img
> src=\"http:\/\/www.runescape.com\/img\/hiscores\/attack.gif\"
> valign=\"bottom\" width=16 height=16 \/><\/td><td> <\/td><td><a
> href=\"hiscoreuser.cgi\?username=)([\w])+(&category=1\"
> class=c>Attack<\/a><\/td><td align=\"right\">)([1-9])+(<\/td><td
> align=\"right\">)([1-9])+(<\/td><td
> align=\"right\">)([1-9,])+(<\/td><\/tr>)/",$seite,$attack);
>
> ($seite is the string)
When scraping data like this, I would recommend keying on the things
that are NOT likely to change, rather than the HTML bits that probably
will.
When you HAVE to use the HTML, match on the smallest pieces of it that
still identify what you want, so that a change to the page is less
likely to break your pattern.
I would try this:
'/username=(.*)\\&.*"right">([0-9]*).*"right">([0-9]*).*"right">([0-9,]*)/'
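Here is a rough sketch of that idea run against the sample row from the quoted message. It is not the exact pattern above: as my own assumptions, it uses non-greedy .*? and a [^&"]+ class for the username so that a page with many rows is less likely to be swallowed by one match, and it adds the s modifier in case a row is split across lines. The variable names $pattern, $matches and $row are made up for the example.

```php
<?php
// Sample row from the quoted message, joined onto a single line.
$seite = '<tr><td><img src="http://www.runescape.com/img/hiscores/attack.gif" '
       . 'valign="bottom" width=16 height=16 /></td><td> </td><td><a '
       . 'href="hiscoreuser.cgi?username=zezima&category=1" class=c>Attack</a></td>'
       . '<td align="right">4</td><td align="right">99</td>'
       . '<td align="right">53,156,556</td></tr>';

// Key only on "username=" and the three right-aligned cells.
// Assumptions (not in the original pattern): non-greedy .*?, [^&"]+ for the
// username, and the s modifier so that . also matches newlines.
$pattern = '/username=([^&"]+)&.*?"right">([0-9]+).*?"right">([0-9]+).*?"right">([0-9,]+)/s';

// PREG_SET_ORDER gives one array per matched row: [full, user, cell1, cell2, cell3].
preg_match_all($pattern, $seite, $matches, PREG_SET_ORDER);

foreach ($matches as $row) {
    echo $row[1] . ': ' . $row[2] . ' / ' . $row[3] . ' / ' . $row[4] . "\n";
}
// prints "zezima: 4 / 99 / 53,156,556"
```

Putting the + inside the parentheses means each group captures the whole value and not just its last character.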
PS FOR SURE, you need 0-9 and not 1-9 for your numbers, or any value
with a zero in it will fail to match:
Rank: 10
Score: 45,067,13
etc
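A quick way to see the difference, using the rank of 10 from above. The table cell in $cell is a made-up test string, not something taken from the real page.

```php
<?php
// Hypothetical cell holding the value 10.
$cell = '<td align="right">10</td>';

// [1-9] cannot match the "0", so the closing </td> is never reached.
$narrowHits = preg_match('/"right">([1-9]+)<\/td>/', $cell, $narrow);

// [0-9] accepts every digit, so the whole number is captured.
$fullHits = preg_match('/"right">([0-9]+)<\/td>/', $cell, $full);

var_dump($narrowHits);  // int(0)
var_dump($fullHits);    // int(1)
var_dump($full[1]);     // string(2) "10"
```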
> Can you explain me how I can get those values?
< and > are probably being interpreted as special characters or
something.
--
Like Music?
http://l-i-e.com/artists.htm