Posted by Al on 09/26/83 11:37
As far as I know there is no fixed limit; a string can keep growing until it
exhausts whatever memory your server/PHP has available, so in practice the
ceiling is somewhere in the gigabyte range.
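If you want to check the ceiling on your own setup, a minimal sketch (it just
reads the standard memory_limit directive and current usage, nothing specific
to this problem) would be:
<?php
// Report how much memory PHP will let a script use.
// memory_limit is the standard ini directive; -1 means "no limit".
$limit = ini_get('memory_limit');
echo "memory_limit is currently: " . $limit . "\n";

// memory_get_usage() shows how much of that budget is already spent.
echo "current usage: " . memory_get_usage() . " bytes\n";
?>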
As for getting the HTML into a string, there are two main ways: using PHP's
native filesystem functions (fopen(), file_get_contents() and so on), or using
an extension library such as cURL.
Using the first method is slightly easier, but you'll need to have
allow_url_fopen enabled. See
http://uk.php.net/manual/en/function.file-get-contents.php (and related
functions) and
http://uk.php.net/manual/en/ref.filesystem.php#ini.allow-url-fopen for
more information.
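For example, with allow_url_fopen enabled, fetching a page is essentially a
one-liner. A minimal sketch (the URL is just a placeholder, and you'd want to
decide yourself what to do on failure):
<?php
// First method: file_get_contents() with a URL.
// Requires allow_url_fopen = On in php.ini.
$siteString = file_get_contents("http://www.example.com/");

if ($siteString === false) {
    // The fetch failed (bad URL, network problem, allow_url_fopen off, etc.)
    echo "Could not retrieve the page.\n";
} else {
    echo "Retrieved " . strlen($siteString) . " bytes of HTML.\n";
}
?>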
Using the second method requires a small wrapper function, but after that
it's relatively easy to code. Unfortunately you'll also need the cURL
extension to be installed, although it is included with most server setups as
far as I know.
A basic cURL-based script, which I took from a Firefox downloads ticker script
and modified slightly, is presented below:
<?php
function getResponse($url, $port, $timeout) {
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_HEADER, 0);           // don't include headers in the output
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);   // follow redirects
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);   // return the page as a string
    curl_setopt($ch, CURLOPT_FRESH_CONNECT, 1);    // don't reuse a cached connection
    curl_setopt($ch, CURLOPT_TIMEOUT, $timeout);
    curl_setopt($ch, CURLOPT_PORT, $port);
    $data = curl_exec($ch);
    curl_close($ch);
    // with CURLOPT_RETURNTRANSFER set, curl_exec() returns false on failure
    if ($data === false)
        return "Error"; // no data was retrieved; handle this however suits your script
    return $data;
}

// standard wrapper for getResponse (so you don't
// have to do the port/timeout business every time)
function extract_html($url) {
    return getResponse($url, 80, 25);
}

$siteString = extract_html("http://www.example.com/");
?>
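Once you have the string you can do whatever you like with it. A quick usage
sketch for the wrapper above (it assumes the functions from the previous
snippet are already defined, and the strpos() check is just an example):
<?php
$siteString = extract_html("http://www.example.com/");

if ($siteString === "Error") {
    // getResponse() returned its failure sentinel
    echo "The page could not be fetched.\n";
} else {
    // e.g. look for a string in the downloaded HTML
    if (strpos($siteString, "<title>") !== false) {
        echo "Found a <title> tag in " . strlen($siteString) . " bytes of HTML.\n";
    }
}
?>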