Posted by Christoph Burschka on 06/13/07 12:12
Gilles Ganault wrote:
> Hello
>
> There's a site I like that doesn't support RSS, so I'd like to
> write a PHP script that my RSS client can call, that will connect to
> the site, authenticate through a POST message, suck the page that
> shows the new articles, open each article and grab contents through
> regexes, and then disconnect.
>
> Problem is, the site uses a PHPSESSID cookie to keep track of who the
> reader is from page to page.
>
> Is there a tool in PHP that can grab web pages and store cookies to
> keep the remote site happy?
>
> Thank you.
If libcurl isn't installed in your version of PHP, you can still send
cookies and even POST parameters with fsockopen(). Of course, fsockopen()
isn't HTTP-specific, so you'd have to format the request and parse the
response yourself.
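
To give an idea of what "format the request and parse the response
yourself" involves, here is a minimal sketch. It assumes plain HTTP on
port 80 and uses HTTP/1.0 so the server won't send a chunked body. The
function names (build_request, parse_response, http_fetch) and the
host, paths and form fields in the usage comment are made up for
illustration; it does no redirect handling or cookie expiry, so treat it
as a starting point rather than a finished client.

```php
<?php
// Build a raw HTTP/1.0 request. $cookies and $post are name => value arrays.
function build_request($method, $host, $path, array $cookies = array(), array $post = array()) {
    $body = http_build_query($post);
    $lines = array("$method $path HTTP/1.0", "Host: $host", "Connection: close");
    if ($cookies) {
        $pairs = array();
        foreach ($cookies as $name => $value) {
            $pairs[] = "$name=$value";
        }
        $lines[] = 'Cookie: ' . implode('; ', $pairs);
    }
    if ($method === 'POST') {
        $lines[] = 'Content-Type: application/x-www-form-urlencoded';
        $lines[] = 'Content-Length: ' . strlen($body);
    }
    // A blank line ends the headers; the body follows only for POST.
    return implode("\r\n", $lines) . "\r\n\r\n" . ($method === 'POST' ? $body : '');
}

// Split a raw response into headers and body, collecting Set-Cookie values.
function parse_response($raw) {
    $parts = explode("\r\n\r\n", $raw, 2);
    $body = isset($parts[1]) ? $parts[1] : '';
    $cookies = array();
    foreach (explode("\r\n", $parts[0]) as $line) {
        // Keep only name=value; attributes such as path or expires are ignored.
        if (preg_match('/^Set-Cookie:\s*([^=;\s]+)=([^;]*)/i', $line, $m)) {
            $cookies[$m[1]] = $m[2];
        }
    }
    return array('cookies' => $cookies, 'body' => $body);
}

// Send one request over a plain socket and return the parsed response,
// or false if the connection could not be opened.
function http_fetch($method, $host, $path, array $cookies = array(), array $post = array()) {
    $fp = fsockopen($host, 80, $errno, $errstr, 10);
    if (!$fp) {
        return false;
    }
    fwrite($fp, build_request($method, $host, $path, $cookies, $post));
    $raw = stream_get_contents($fp);
    fclose($fp);
    return parse_response($raw);
}

// Hypothetical usage: log in, then reuse the session cookie on the next page.
// $login = http_fetch('POST', 'www.example.com', '/login.php', array(),
//                     array('user' => 'me', 'pass' => 'secret'));
// $news  = http_fetch('GET', 'www.example.com', '/new.php', $login['cookies']);
```

The point is simply that whatever PHPSESSID the login response sets gets
passed back in the Cookie header of every later request.
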
Personally, I love the HTTP client implemented by the Drupal CMS. It's a
single function, short, powerful, and handles HTTP and HTTPS equally well.
Since it uses only PHP library functions (as opposed to other functions
defined by Drupal), it can easily be copied into a separate script file
and included by your application:
http://api.drupal.org/api/HEAD/function/drupal_http_request
(Drupal is under the GPL, and given that you're spidering the site for
your personal use anyway, I doubt there are any issues with that.)
--
cb