Is there some concurrency limitation with trap handlers in Ruby?

When I run the following code, one or more of the CLD traps gets lost, leaving a defunct (zombie) process whose exit status has not been collected by Process.wait.

require 'pp'

children = []
trap("CLD") do
  cpid = Process.wait
  puts "CLD from pid #{cpid} at #{Time.now}"
  6.times {|i| puts "  ... Waiting[#{i}] in CLD trap for pid #{cpid}"; sleep 0.5}
  puts "OK, finished slow trap at #{Time.now} for pid #{cpid}"
  children.delete cpid
end

4.times {|i|
  if child_pid = fork # parent
    puts "In parent, child_pid[#{i}] = #{child_pid}"
    children.push child_pid
  else
    puts "In child[#{i}], PID=#{$$}"
    sleep 0.2
    puts "In child[#{i}], PID=#{$$} ... exiting"
    exit!
  end
}

while true
  sleep 2
  exit if children.length == 0
  puts "[#{Time.now}] ... Parent still waiting for \n"
  pp children
  sleep 8
end

Here is a sample run output:

[admin@jcmsa pe2]# ruby multitrap.rb
In parent, child_pid[0] = 9285
In child[0], PID=9285
In parent, child_pid[1] = 9289
In child[1], PID=9289
In parent, child_pid[2] = 9293
In child[2], PID=9293
In parent, child_pid[3] = 9297
In child[3], PID=9297
In child[0], PID=9285 ... exiting
In child[3], PID=9297 ... exiting
In child[1], PID=9289 ... exiting
In child[2], PID=9293 ... exiting
CLD from pid 9285 at 2011-02-03 13:31:20 -0800
  ... Waiting[0] in CLD trap for pid 9285
CLD from pid 9289 at 2011-02-03 13:31:20 -0800
  ... Waiting[0] in CLD trap for pid 9289
CLD from pid 9293 at 2011-02-03 13:31:20 -0800
  ... Waiting[0] in CLD trap for pid 9293
  ... Waiting[1] in CLD trap for pid 9293
  ... Waiting[2] in CLD trap for pid 9293
  ... Waiting[3] in CLD trap for pid 9293
  ... Waiting[4] in CLD trap for pid 9293
  ... Waiting[5] in CLD trap for pid 9293
OK, finished slow trap at 2011-02-03 13:31:23 -0800 for pid 9293
  ... Waiting[1] in CLD trap for pid 9289
  ... Waiting[2] in CLD trap for pid 9289
  ... Waiting[3] in CLD trap for pid 9289
  ... Waiting[4] in CLD trap for pid 9289
  ... Waiting[5] in CLD trap for pid 9289
OK, finished slow trap at 2011-02-03 13:31:25 -0800 for pid 9289
  ... Waiting[1] in CLD trap for pid 9285
  ... Waiting[2] in CLD trap for pid 9285
  ... Waiting[3] in CLD trap for pid 9285
  ... Waiting[4] in CLD trap for pid 9285
  ... Waiting[5] in CLD trap for pid 9285
OK, finished slow trap at 2011-02-03 13:31:28 -0800 for pid 9285
[2011-02-03 13:31:28 -0800] ... Parent still waiting for
[9297]
[2011-02-03 13:31:38 -0800] ... Parent still waiting for
[9297]
[2011-02-03 13:31:48 -0800] ... Parent still waiting for
[9297]
[2011-02-03 13:31:58 -0800] ... Parent still waiting for
[9297]

... and so on ...

Then 'ps axf' shows

 9283 pts/2    Sl+    0:00  |   |       \_ ruby multitrap.rb
 9297 pts/2    Z+     0:00  |   |           \_ [ruby] <defunct>

In my experiments

  3 children .... sometimes gets a zombie
  4 children .... more often gets a zombie
  5 children .... always gets a zombie, sometimes more than one

What is the limitation here?

How can I set up a CLD trap handler to handle as many concurrent child exits as I need?

The ruby version is ruby 1.9.1p243 (2009-07-16 revision 24175) [x86_64-linux]

Thanks ...

I have addressed a similar issue in Python. Depending on how Ruby implements its signal handling, the trap handler will likely not be called again if a child terminates while a previous invocation of the handler is still running.

So a safe design is to loop and reap every child that has already exited within a single run of the trap handler. This assumes you already have the list of children, which you do.

-- trap(chld)
     for each child_pid in children
       rc = waitpid(child_pid, WNOHANG)
       if reaped then remove entry from list
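
The pseudocode above could be sketched in Ruby roughly as follows. This is a minimal sketch rather than a tested fix for ruby 1.9.1: the helper name reap_exited is made up for illustration, and installing the trap only after all the forks (followed by one manual sweep) is an assumption made here so that every pid is already in the list before any reaping starts.

```ruby
# Hypothetical helper (not from the original post): poll every known child
# once, without blocking, and drop the ones that have exited.
def reap_exited(children)
  children.dup.each do |pid|          # iterate over a copy; we delete as we go
    begin
      # With WNOHANG, waitpid returns nil if the child is still running
      # and the pid once the child has been reaped.
      children.delete(pid) if Process.waitpid(pid, Process::WNOHANG)
    rescue Errno::ECHILD
      children.delete(pid)            # already reaped by another handler run
    end
  end
end

children = []
4.times do
  if (child_pid = fork)               # parent
    children.push(child_pid)
  else                                # child
    sleep 0.2
    exit!
  end
end

# Install the trap once every pid is in the list, then sweep once by hand
# for any child that exited before the trap existed.
trap("CLD") { reap_exited(children) }
reap_exited(children)

sleep 0.1 until children.empty?
puts "all children reaped"
```

Because each handler run checks every remaining child, it no longer matters whether several child exits produce one CLD signal or several. The handler also avoids the long sleep from the question, which keeps the time spent inside the trap short.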

Idiomatic Ruby hints:

  1. 4.times {|i| is usually 4.times do |i| when multiline.
  2. pp children can be replaced with "waiting for #{children.inspect}" unless you meant the two things to be on separate lines.
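
As a small illustration of the two hints, assuming the same children array as in the question (the pid is just the sample value from the run output, and the loop body is a placeholder):

```ruby
children = [9297] # sample pid taken from the run output in the question

# Hint 1: a multi-line block written with do ... end instead of { ... }
2.times do |i|
  puts "iteration #{i}"
end

# Hint 2: one status line built with inspect instead of a separate pp call
message = "Parent still waiting for #{children.inspect}"
puts message
```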

