This queue is for tickets about the Parallel-ForkManager CPAN distribution.

Report information
Id: 91357
Status: new
Priority: Low/Low
Owner: Nobody in particular
Requestors: bjb [...] fcoo.dk

Subject: ForkManager SIGNAL issue
Date: Thu, 12 Dec 2013 10:58:53 +0000
To: "bug-Parallel-ForkManager@rt.cpan.org" <bug-Parallel-ForkManager@rt.cpan.org>
From: Bjarne Büchmann <bjb@fcoo.dk>

Dear Wizards,

I have encountered a problem in Parallel::ForkManager that, under certain conditions, halts the program.

I have narrowed the problem down to the following line in sub wait_on (around line 600 in the current code):

local $SIG{CHLD} = sub { } if ! defined $SIG{CHLD};

The problem occurs not when this signal handler is set, but when Perl restores the original value (which must have been undef) on leaving the sub.

If the (parent) process is hit by more SIGCHLD signals at just that moment, perl writes (on STDERR):

Use of uninitialized value in block exit at /.../perl5/Parallel/ForkManager.pm line 601.

Unable to create sub named "" at /.../perl5/Parallel/ForkManager.pm line 601.
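
The restore step described above can be observed in isolation. The following is a minimal sketch, not the ForkManager code: wait_like and handler_state are hypothetical names chosen for illustration, and the script only shows that the guarded local() leaves $SIG{CHLD} unset again once the sub returns. It does not fork and makes no attempt to trigger the race itself.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Start from the unset state that the report assumes for the parent process.
delete $SIG{CHLD};

# Hypothetical helper: report whether a CHLD handler is currently set.
sub handler_state {
    return defined $SIG{CHLD} ? 'defined' : 'undef';
}

# Hypothetical stand-in for wait_on, using the same guarded local() line.
sub wait_like {
    local $SIG{CHLD} = sub { } if !defined $SIG{CHLD};
    return handler_state();    # the temporary handler is active here
}   # on leaving this scope, Perl restores the previous (unset) value

my $before = handler_state();
my $inside = wait_like();
my $after  = handler_state();

print "before=$before inside=$inside after=$after\n";
```

The transition from "inside" back to "after" is the window in which, according to this report, a SIGCHLD arriving at the wrong moment produces the messages quoted above.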

I have tested this in a small script without ForkManager, and it appears to be a general Perl problem rather than one specific to the module. I see the following solutions:

1. Remove the “local” on $SIG{CHLD} in sub wait_on. This has the side effect of replacing an unset handler (which likely means IGNORE on most systems) for the entire process, which could affect some scripts that use ForkManager.

2. The calling script (outside ForkManager) can set $SIG{CHLD} itself, so that the guarded “local” is never executed; i.e. I can set “$SIG{CHLD} = sub { };” before using the ForkManager object.
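
As an illustration of solution 2, here is a minimal sketch that uses plain fork/waitpid rather than ForkManager itself. The sub wait_one is a hypothetical stand-in for wait_on, and the child count of 20 is arbitrary. In a real script the first assignment would simply go before the ForkManager object is created.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use POSIX ();

# Solution 2: install a no-op handler once, globally, before any forking.
$SIG{CHLD} = sub { };

# Hypothetical stand-in for wait_on with the same guarded local().
# $SIG{CHLD} is already defined, so the guard is false, nothing is
# localized, and there is nothing to restore when the sub returns.
sub wait_one {
    local $SIG{CHLD} = sub { } if !defined $SIG{CHLD};
    return waitpid(-1, 0);    # block until a child exits; -1 when none are left
}

my $children = 20;            # arbitrary count for this sketch
for (1 .. $children) {
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;
    POSIX::_exit(0) if $pid == 0;    # child: exit at once, skipping END blocks
}

my $reaped = 0;
$reaped++ while wait_one() > 0;

print "reaped $reaped of $children children\n";
```

The cost is the same as in solution 1: the handler is changed for the whole process, but here the calling script makes that choice explicitly.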

There may be other solutions, but I think that localizing %SIG is the culprit. Try e.g. the code snippet here:

http://perl-is-fun.blogspot.dk/2007/04/multi-core-and-signal-handling.html

The behaviour described there still occurs under my perl 5.10.1. On that page, Luke writes:

<citation>

The fix? Don't local()ize $SIG within the inside loop or scope. Chances are when Perl resets the $SIG{CHLD} variable when leaving scope, it momentarily leaves it un-set before returning it to the original, pre-local() global value.

</citation>

I should say that it is not easy to get “just the right” circumstances to trigger the error with Parallel::ForkManager. But some of our scripts start a few hundred child processes that all finish within one or two seconds, and that sometimes (but only sometimes) triggers the error. If the exiting child processes are staggered just a little in time, the issue does not appear.

I hope to hear from you,

Best,

Bjarne Büchmann, PhD.


