How can I safely fork?

esrun

New member
Sep 24, 2007
I'm using PHP's exec command to fork off many command-line tools at once.

Right now I've set a limit of 150 processes to be running at any one time (across the whole operating system).

I'm running ps ax | wc -l to find out how many processes are currently running. Then I'm subtracting that amount from 150.

If there's 80 processes running then I can launch another 70.

I'm doing this on a loop.

The strange thing is that when I watch the number of running processes with Activity Monitor, it spikes to 200, 300 and so on until the system finally crashes.

I just can't work out why it's doing this. Does anyone have any ideas?
 


Or if someone knows of a better way to fork many processes while staying within a certain limit (to avoid overloading the system) then I'd love to hear that too!

Right now I'm trying to avoid the use of pcntl_fork
 
Is your process control set up similar to this?:


Code:
if(pcntl_fork() == 0) // pid == 0 means this is the child process
{
	//do whatever the child should do
}
else
{
	if($number_of_processes >= 150)
	{
		//kill some off
	}
}
 
Essentially this:

Code:
$process_count = trim(`ps ax | wc -l`);

while(1){
	if($process_count < 150){
		//fork another process
	} else {
		//Do nothing, wait for process_count to lower
	}
}

Cheers
 
Bah, I have to admit something. I made a mistake somewhere else in the script which was causing more processes to be spawned than were meant to be.

The code above is actually okay. Oh well, at least anyone else looking for a simple fork can read this thread.

Thanks jryan21
 
That's probably not how you want to do that. A decent approach is to create an array of about 150 proc_open resource handles and iterate through them, replacing each one with a new process until you're out of processes to run. Also, $process_count will never change inside your while loop, because it's a scalar that was evaluated once before the loop, not a closure that re-runs ps on each iteration.
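A minimal sketch of that pool idea, assuming a made-up job list and a small slot count (the thread's real limit is 150): keep a fixed number of proc_open() handles, reap whichever have finished, and refill the freed slots.

```php
<?php
// Sketch of the pool approach: a fixed number of proc_open() slots,
// refilled as children finish. The commands and MAX_SLOTS are invented
// placeholders for illustration.

const MAX_SLOTS = 3; // the thread uses 150; kept small here

$commands = array();
for ($i = 0; $i < 10; $i++) {
    $commands[] = "echo job$i";
}

$descriptors = array(
    1 => array('pipe', 'w'), // read the child's stdout
);

$slots   = array(); // each entry: array(process resource, stdout pipe)
$results = array();

while ($commands || $slots) {
    // Fill free slots while work remains.
    while ($commands && count($slots) < MAX_SLOTS) {
        $cmd   = array_shift($commands);
        $pipes = array();
        $proc  = proc_open($cmd, $descriptors, $pipes);
        if (is_resource($proc)) {
            $slots[] = array($proc, $pipes[1]);
        }
    }

    // Reap finished processes; their slots become free again.
    foreach ($slots as $i => $slot) {
        $status = proc_get_status($slot[0]);
        if (!$status['running']) {
            $results[] = trim(stream_get_contents($slot[1]));
            fclose($slot[1]);
            proc_close($slot[0]);
            unset($slots[$i]);
        }
    }
    usleep(10000); // avoid a busy spin
}

sort($results);
echo implode("\n", $results), "\n";
```

Because the pool size is fixed, the process count can never exceed the limit no matter what `ps` says.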
 
Sorry Jcash, you're right about the sample I provided above. That's not actually the code I'm using; I just threw it together quickly for that post.

It should be:

Code:
while(1){
	$process_count = trim(`ps ax | wc -l`);

	if($process_count < 150){
		//fork another process
	} else {
		//Do nothing, wait for process_count to lower
	}
}

I'll look into the proc_open idea you mentioned. Do you think it would be quicker checking those processes than checking the number of processes running using ps ax ?
 
There can't be more than 150 running if you slot up 150 resource handles, and you'll never have to spin. If your processes all run for a similar length of time, you can generally iterate through all your resource handles and get back to the first at about the time it has output. If the run time varies, you need to block on stream_select and proc_open a new resource handle for whatever turns up in the read array.
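A rough sketch of that stream_select variant, with invented placeholder commands of varying run time: block until at least one child's stdout is readable, then reap that child (a real pool would start a replacement in its slot).

```php
<?php
// Sketch of blocking on stream_select() over a pool of proc_open()
// stdout pipes instead of polling. The three commands are invented
// placeholders with different run times.

$descriptors = array(1 => array('pipe', 'w'));
$pending = array('echo fast', 'sleep 1; echo slow', 'echo quick');

$procs = array();  // stdout pipe id => process resource
$pipes = array();  // stdout pipe id => stdout stream
$seen  = array();

// Start everything up front (a real pool would cap this at 150 slots).
foreach ($pending as $cmd) {
    $p = array();
    $proc = proc_open($cmd, $descriptors, $p);
    $id = (int) $p[1]; // resource id keys the bookkeeping arrays
    $procs[$id] = $proc;
    $pipes[$id] = $p[1];
}

while ($pipes) {
    $read = array_values($pipes);
    $write = $except = null;
    // Block until some child has output, rather than spinning.
    if (stream_select($read, $write, $except, 5) === false) {
        break;
    }
    foreach ($read as $stream) {
        $id = (int) $stream;
        $seen[] = trim(stream_get_contents($stream)); // drain to EOF
        fclose($stream);
        proc_close($procs[$id]);
        unset($pipes[$id], $procs[$id]);
    }
}

sort($seen);
echo implode("\n", $seen), "\n";
```

The select call parks the parent until a child produces output, so no CPU is wasted waiting for slow jobs.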
 
That's true, and I think that's a much safer way to do it if I want to ensure I stay within a limit without relying on third-party data (ps ax). What I meant was: do you think it would be quicker to check those 150 resources to see which are active vs. my current method of running ps ax to see how many processes are running in total?

Right now, a modification of the code I posted above is working quite well. I'm checking the number of processes running, working out how many free spaces I have, and then launching that number of processes, on a loop. But I'm really looking to save every millisecond, so I'll compare your idea against this code to see which is quicker :)
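For reference, the "free slots" calculation described above can be sketched like this; the limit and the launched tool are placeholders, and the actual exec() line is left commented out since the real command isn't given in the thread.

```php
<?php
// Sketch of the free-slots approach: count running processes, subtract
// from the limit, launch that many. "/usr/bin/some_tool" is a
// hypothetical placeholder command.

function free_slots($running, $limit = 150)
{
    // Never report a negative number of slots when over the limit.
    return max(0, $limit - $running);
}

$running = (int) trim(`ps ax | wc -l`);
$free = free_slots($running);

for ($i = 0; $i < $free; $i++) {
    // Backgrounding with "&" and redirecting output lets exec()
    // return immediately instead of waiting for the tool to finish.
    // exec('/usr/bin/some_tool > /dev/null 2>&1 &');
}

echo free_slots(80), "\n";  // 80 running leaves 70 slots
echo free_slots(200), "\n"; // over the limit: 0 slots, never negative
```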

Thanks for your advice.
 
What you are doing probably works, but generally it's best to use some form of IPC (Inter-Process Communication) to establish bi-directional communication with your child processes and then re-use them (see the Wikipedia article on inter-process communication). Processes are not like threads; there is a high cost to creating and destroying them.

The basic model is something like this: start 150 child processes and have them all ask the parent process for work. The parent hands out work; when a child is done, it reports back and asks for more, and the cycle repeats. You just need to make sure the children go into an efficient wait/sleep state when they aren't working.

You could create something like this with SysV IPC (see the semaphore functions in the PHP manual), or you could use sockets, which would involve the various read/write functions and usage of select, which is called socket_select in PHP.
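A small sketch of that reusable-worker model using pcntl_fork and a socket pair per child (this requires the pcntl extension, and the worker count, job list, and strtoupper "work" are all invented for illustration):

```php
<?php
// Sketch of the persistent-worker model: fork N children once, talk to
// each over a stream_socket_pair(), and feed them jobs until the queue
// is empty. Requires the pcntl extension. Jobs and the "work"
// (strtoupper) are placeholders.

const WORKERS = 2;

$jobs = array('alpha', 'bravo', 'charlie', 'delta');
$sockets = array();

for ($i = 0; $i < WORKERS; $i++) {
    $pair = stream_socket_pair(STREAM_PF_UNIX, STREAM_SOCK_STREAM, STREAM_IPPROTO_IP);
    $pid = pcntl_fork();
    if ($pid === 0) {
        // Child: read jobs line by line, write results back, exit on "quit".
        // fgets() blocks, so an idle worker sleeps efficiently.
        fclose($pair[0]);
        while (($line = fgets($pair[1])) !== false) {
            $line = trim($line);
            if ($line === 'quit') {
                break;
            }
            fwrite($pair[1], strtoupper($line) . "\n");
        }
        exit(0);
    }
    fclose($pair[1]);
    $sockets[] = $pair[0];
}

// Parent: hand out jobs round-robin and collect one reply per job.
$results = array();
foreach ($jobs as $n => $job) {
    $s = $sockets[$n % WORKERS];
    fwrite($s, $job . "\n");
    $results[] = trim(fgets($s));
}

// Tell the workers to exit, then reap them.
foreach ($sockets as $s) {
    fwrite($s, "quit\n");
    fclose($s);
}
$status = 0;
while (pcntl_wait($status) > 0) {
}

sort($results);
echo implode("\n", $results), "\n";
```

The children are created once and re-used for every job, which avoids the per-job fork/exec cost the paragraph above warns about.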