Posted by Andy Hassall on 09/29/07 20:04
On Sat, 29 Sep 2007 12:32:24 -0700, qwertycat@googlemail.com wrote:
>On Sep 29, 7:51 pm, Jerry Stuckle <jstuck...@attglobal.net> wrote:
>> Just wondering - why do you need to fork processes, anyway? There's a
>> lot of overhead in doing it, and if they're all CPU bound anyway you
>> aren't going to gain anything (unless you have a potload of CPUs).
>>
>> Forking is good if you have different processes using different
>> resources. But when they have to contend for the same resource,
>> performance often goes down.
>
>Instead of writing a PHP script that downloads 2 million headers from
>a newsgroup in a single connection (which will cause PHP to crash
>anyway as it'll reach 500MB+ memory usage),
Well, presumably you're doing something with this data, like saving it to a
file or database? In which case you can stream it from the network into the
database as it arrives, rather than reading it *all* into memory and only
*then* starting to save it.
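
To make the streaming idea concrete, here is a minimal sketch. It is in Python rather than PHP purely for brevity, and the details are assumptions of mine rather than anything from your script: the headers are taken to arrive as "article-number<TAB>subject" lines, and the `headers` table and the `stream_headers` name are made up for illustration. The point is only that memory use is bounded by the batch size, not by the 2 million headers.

```python
import sqlite3


def stream_headers(lines, conn, batch_size=1000):
    """Insert header lines into the database in small batches.

    `lines` can be any iterable (a socket file object, a generator, ...),
    so at most `batch_size` rows are held in memory at a time.
    """
    # Hypothetical schema, for illustration only.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS headers (article INTEGER, subject TEXT)"
    )
    batch = []
    total = 0
    for line in lines:
        # Assumed line format: "<article number>\t<subject>"
        number, subject = line.rstrip("\r\n").split("\t", 1)
        batch.append((int(number), subject))
        if len(batch) >= batch_size:
            conn.executemany("INSERT INTO headers VALUES (?, ?)", batch)
            conn.commit()
            total += len(batch)
            batch.clear()
    if batch:
        # Flush whatever is left over after the last full batch.
        conn.executemany("INSERT INTO headers VALUES (?, ?)", batch)
        conn.commit()
        total += len(batch)
    return total
```

In real use `lines` would be whatever yields lines from the NNTP connection; the function never needs the whole result set at once.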
>I thought it would be
>better to launch 4 processes to download it in chunks of 50,000
>headers - with 4 connections to the same NNTP server.
Yes, it may well be worth doing this to get better throughput (depending on
where the bottleneck is), but I wouldn't have thought that the memory limit is
the issue, so long as you're streaming the data through.
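
For the chunking itself, something along these lines would do. Again this is a Python sketch under my own assumptions: the function names are invented, article numbers are treated as one contiguous inclusive range, and the chunks are simply dealt out round-robin to the 4 workers, which is one possible scheme and not necessarily the best one.

```python
def chunk_ranges(first, last, size=50_000):
    """Split the inclusive range first..last into (start, end) chunks."""
    return [
        (start, min(start + size - 1, last))
        for start in range(first, last + 1, size)
    ]


def assign_round_robin(chunks, workers=4):
    """Deal the chunks out to the workers in turn, one list per worker."""
    return [chunks[i::workers] for i in range(workers)]
```

With 2 million headers and chunks of 50,000 that gives 40 chunks, so 10 per connection. Each worker then streams its own chunks one after another.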
I'm still not quite sure about the second level of forking you have in there
though; so there's 1 initial parent, 4 children reading from the server, but
then each has multiple children processing this data? Unless you have masses of
CPUs, you're unlikely to gain anything at that level; the 4 reader processes
may as well do the processing themselves as they stream the data in from the
network. (As always, It Depends).
Back to the general question though, when you start forking, you've got child
process management to work out. One child process is relatively easy; more than
one means you have to do a bit more work to send (and receive) signals and
other IPC stuff (since you have to work out *which* child process you're
talking to), and work out what happens if either a child or a parent process
terminates unexpectedly, or hangs. More than two processes and more than one
level of parent/child doesn't really get any more complicated as such, but
there are more processes to go wrong :-)
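
As a rough illustration of the "which child" bookkeeping, here is a minimal Unix-only sketch, in Python rather than PHP. The `run_children` name and the `work` callback are hypothetical, and `work` is assumed to return a small integer exit code. The parent remembers which PID got which task, so when a child exits it knows whose exit status it has just collected. It does not deal with hung children or with the parent itself dying; those need timeouts and signal handling on top.

```python
import os


def run_children(tasks, work):
    """Fork one child per task and return {task: exit code}."""
    pid_to_task = {}
    for task in tasks:
        pid = os.fork()
        if pid == 0:
            # Child: do the work, then leave via os._exit() so we never
            # fall back into the parent's loop.
            try:
                code = work(task)
            except Exception:
                code = 1  # treat any uncaught error as a failure
            os._exit(code)
        # Parent: remember which child is handling which task.
        pid_to_task[pid] = task

    results = {}
    while pid_to_task:
        pid, status = os.wait()  # blocks until some child exits
        if pid not in pid_to_task:
            continue  # a child we did not start here; ignore it
        task = pid_to_task.pop(pid)
        # Negative values mean the child was killed by that signal
        # (os.waitstatus_to_exitcode needs Python 3.9 or later).
        results[task] = os.waitstatus_to_exitcode(status)
    return results
```

A non-zero or negative entry in the result tells the parent which task needs retrying or reporting.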
--
Andy Hassall :: andy@andyh.co.uk :: http://www.andyh.co.uk
http://www.andyhsoftware.co.uk/space :: disk and FTP usage analysis tool