Posted by Toby A Inkster on 06/24/07 08:21
Ming wrote:
> It works, but it is simply too slow to get the real (destination) URLs
> for tens of thousands of redirecting URLs.
Well, one slow server will delay the whole thing. You might want to speed
it up by using concurrency: keep a queue of the tens of thousands of URLs
which need "handling", and run several "handlers", each of which loops:
take a URL from the queue, resolve it, and hand the result on. You'll also
need one thread to be a "queue manager" and one to be a "result storer".
Overall, as DNS and HTTP can be quite a slow business, I'd recommend about
12 handlers, one queue manager and one storer. The queue manager and
storer can be a SQL database server if you like!
Now, technically PHP is capable of doing this, but some other languages,
such as Perl and C, are a bit better suited to writing multi-threaded
applications.
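To make that concrete, here is a rough Perl sketch of the queue/handler/storer
design above, using threads, Thread::Queue and LWP. Treat it as an outline
rather than a finished program: it assumes a threads-enabled perl, and the
file name urls.txt, the 10-second timeout and the 12-handler count are just
placeholder choices.

#!/usr/bin/perl
# Sketch only -- assumes a threads-enabled perl and the LWP module installed.
use strict;
use warnings;
use threads;
use Thread::Queue;
use LWP::UserAgent;

# Hypothetical input: one redirecting URL per line in urls.txt.
open my $fh, '<', 'urls.txt' or die "urls.txt: $!";
chomp(my @urls = <$fh>);
close $fh;

my $work     = Thread::Queue->new(@urls);   # the work queue
my $results  = Thread::Queue->new();        # feeds the "result storer"
my $HANDLERS = 12;                          # roughly as suggested above

sub handler {
    # Each handler gets its own user agent; LWP follows redirects itself.
    my $ua = LWP::UserAgent->new(timeout => 10, max_redirect => 10);
    while (defined(my $url = $work->dequeue_nb())) {
        my $res   = $ua->head($url);
        my $final = $res->request->uri->as_string;  # URL of the last request made
        $results->enqueue("$url\t$final");
    }
}

my @workers = map { threads->create(\&handler) } 1 .. $HANDLERS;
$_->join() for @workers;

# The "result storer" -- just printed here, but it could do SQL INSERTs instead.
while (defined(my $row = $results->dequeue_nb())) {
    print "$row\n";
}

HEAD is usually enough to discover the destination URL, but some servers
mishandle it, so you may need to fall back to GET; and for a real run you'd
replace the final print loop with inserts into your database.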
--
Toby A Inkster BSc (Hons) ARCS
[Geek of HTML/SQL/Perl/PHP/Python/Apache/Linux]
[OS: Linux 2.6.12-12mdksmp, up 3 days, 11:56.]
A New Look for TobyInkster.co.uk
http://tobyinkster.co.uk/blog/2007/06/22/new-look/