Posted by R. Rajesh Jeba Anbiah on 06/24/09 11:55
joe t. wrote:
<snip>
> There is a website that requires me to log in using a web form.
> Obviously, POST vars are sent and verified, and on success I'm given a
> session and/or cookie. Within this logged-in area, there are links
> leading to data query result pages. "Click here for your recent
> transactions" kind of thing.
>
> Those result pages are what I want to get to, but through some kind of
> script that parses the results that get served out, not by user
> interaction. I want to send a request for a link within that logged-in
> area and have the results served to my script, then parse specific
> data out of those results and in turn serve it to a user in my own
> page.
<snip>
Such "web scraping" can be done with cURL <http://in.php.net/curl>
(need to set cookie support). Not all sites would allow web scraping
and will try to block automation with "CAPTCHA" (google it). Some sites
will even use Ajax based rendering which will then make the cURL
process a big tough (though I heard that cURL can work with Mozilla
JavaScript engine). In that case, it will be better to go for Delphi or
VB 6 as we can use WebBrowser component and can automate clicks, etc
with DOM object.
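
For example, here is a minimal sketch of the cookie-enabled cURL flow.
Every URL, form field name, and the parsing regex below is an
assumption for illustration; substitute whatever the target site
actually uses:

<?php
// Hypothetical login-then-fetch flow. URLs and field names are
// placeholders; adjust them to match the real login form.
$cookieFile = tempnam(sys_get_temp_dir(), 'cookies');

$ch = curl_init('http://www.example.com/login.php');
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_POSTFIELDS, http_build_query(array(
    'username' => 'joe',
    'password' => 'secret',
)));
curl_setopt($ch, CURLOPT_COOKIEJAR,  $cookieFile); // save the session cookie
curl_setopt($ch, CURLOPT_COOKIEFILE, $cookieFile); // send it back on later requests
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);    // follow the post-login redirect
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);    // return the page instead of printing it
curl_exec($ch); // log in; the cookie jar now holds the session

// Reuse the same handle so the stored cookie rides along automatically.
curl_setopt($ch, CURLOPT_URL, 'http://www.example.com/transactions.php');
curl_setopt($ch, CURLOPT_HTTPGET, true);
$html = curl_exec($ch);
curl_close($ch);

// Parse out the bits you want; a regex is the quick-and-dirty way,
// DOMDocument the sturdier one. The "amount" class here is made up.
if (preg_match_all('#<td class="amount">(.*?)</td>#', $html, $m)) {
    print_r($m[1]);
}
?>

Reusing one handle (or at least one cookie jar file) is the important
part; without CURLOPT_COOKIEJAR/CURLOPT_COOKIEFILE every request starts
a fresh session and the site will bounce you back to the login page.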
--
<?php echo 'Just another PHP saint'; ?>
Email: rrjanbiah-at-Y!com Blog: http://rajeshanbiah.blogspot.com/