| 
 Posted by Philip Ronan on 08/26/05 14:33 
"Basta" wrote: 
 
> I'm trying to retrieve information from a website using PHP and cURL. 
> This is the code I use: 
>  
(snip) 
>  
> This results in a 403 Forbidden page. However, if I type the URL 
> http://teletekst.nos.nl/ in my browser then it works fine (also with 
> cookies disabled). 
 
That's probably because the owners of teletekst.nos.nl are fed up with 
having idiot robots crawling all over their site and stealing its content. 
 
If you had bothered to visit <http://teletekst.nos.nl/robots.txt> you might 
have noticed that robots are not permitted to access this website. You're 
getting a 403 response because their website has identified that you're 
accessing it improperly. 
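
For anyone who wants their script to check this before fetching anything, here is a minimal sketch. It is in Python rather than PHP because the standard library ships a robots.txt parser. The robots.txt body below is an assumption (a blanket ban, which is how I read the file), not a copy of the real one, and the `my-script` user agent and the `is_allowed` helper are made-up names for illustration.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt body: a blanket ban on all robots.
# This is NOT copied from teletekst.nos.nl; it only illustrates the rule.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network access is needed here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # "my-script" is a hypothetical user-agent name.
    print(is_allowed(ROBOTS_TXT, "my-script", "http://teletekst.nos.nl/"))  # False
```

In a real script you would load the site's own file with `set_url()` and `read()` instead of a hard-coded string, and skip the request whenever the check returns False.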
 
There are probably some things you could do to bypass the blocks on this 
website, but I'm not going to tell you what they are. Create your own 
content. Don't steal it from other websites. 
 
--  
phil [dot] ronan @ virgin [dot] net 
http://vzone.virgin.net/phil.ronan/
 
 |