Posted by J.O. Aho on 01/03/06 15:37
kinh@yahoo.com wrote:
> In my current PHP project, I have to read pages from a website and
> parse the data. If using the Internet browser, I would have to do the
> following steps:
>
> Step 1: specify the state to display by using:
> www.foo.com/selectState.asp?state=ca
> and it will redirect to
> www.foo.com/list.asp?city=sanjose&page=1
>
> Step 2. www.foo.com/list.asp?city=sanjose&page=2
> Step 3. www.foo.com/list.asp?city=sanjose&page=3
> ...
> and go on...
>
> I can use fopen($url, 'r') and fgets() to get the contents and save
> all the data to a file. That's the easy part. But the data returned
> at Step 1 is not correct: I get something like the site's custom
> page-not-found error. I guess the redirection is causing that problem.
> If I go on to read the content for Step 2, I get an empty page; it seems
> like the 'state' session variable was never set in Step 1.
>
> Could someone help me with this problem? Thanks,
You could use an external tool to fetch the pages in question. wget has
been around for many years and has proven to be an excellent tool for this.
It also lets you fake headers, in case the site you are getting the pages
from requires a "reference page" (the Referer header) before it will serve
some of them.
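As a rough sketch of what that could look like (not tested against the real site): the script below keeps one cookie file across all requests, so the session started by selectState.asp should still be there when the list pages are fetched, and it sends a Referer header with each request. wget follows the redirect from Step 1 on its own. The http:// scheme, the file names cookies.txt and pageN.html, and the choice of the site root as Referer are assumptions for illustration; the host and query strings are the ones from the question.

```shell
#!/bin/sh
# Sketch only. Assumed: http:// scheme, cookie-jar name, output file names.
BASE="http://www.foo.com"
JAR="cookies.txt"

# Build the URL of one list page: page_url CITY PAGE
page_url() {
    printf '%s/list.asp?city=%s&page=%s\n' "$BASE" "$1" "$2"
}

# Fetch URL ($1) into FILE ($2), reusing and updating the shared cookie jar
# so that session cookies set by an earlier request are sent again.
fetch() {
    wget --quiet \
         --load-cookies "$JAR" \
         --save-cookies "$JAR" --keep-session-cookies \
         --referer="$BASE/" \
         --output-document="$2" \
         "$1"
}

# Only touch the network when called as: sh fetch.sh run
if [ "${1:-}" = "run" ]; then
    touch "$JAR"    # make sure the jar exists before the first --load-cookies
    # Step 1: select the state; wget follows the redirect to page 1.
    fetch "$BASE/selectState.asp?state=ca" page1.html
    # Steps 2 and 3: later pages, sent with the same session cookie.
    for n in 2 3; do
        fetch "$(page_url sanjose "$n")" "page$n.html"
    done
fi
```

From PHP you could start such a script with exec() and then parse the saved pageN.html files as before. The --keep-session-cookies option matters here, because without it wget does not write session cookies to the cookie file.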
//Aho