
Is Ruby Thread-Safe by default?

I ran the following code in the console more than 10 times, and it produced the same output every time: 100000.

i = 0
1000.times do
  Thread.start { 100.times { i += 1 } }
end
i

Shouldn't it give me different output, since I'm reading and updating i from multiple threads? It made me wonder: is Ruby actually thread-safe by default? If not, why do I always see the same output?

P.S. If you say it is not thread-safe by default, can you share a simple example that gives a different result each time I run it in the Rails console?

Edit:

In other words, is the code above running 1000 threads concurrently? If yes, the result shouldn't ALWAYS be 100000. If no, how do I run multiple threads concurrently?

If I add puts, the order in which the values of i are printed changes. That implies the threads are interleaving, but are they running concurrently?

I'm not asking how to make this thread-safe. I understand the concepts of mutexes/locking and synchronous/asynchronous processing. It is precisely because I understand them that I can't explain the output of this code.

No code is thread-safe automatically; you have to work to make it thread-safe.

In particular, the += operation is actually three operations: read, increment, write. If these get interleaved with the same steps on other threads, you can get wildly unpredictable behaviour.

Consider the following series of events on two threads:

  A       B
-------------
READ
         READ
INCR
         INCR
WRITE
         WRITE

This is the simplest case: two increment operations run, yet because both start from the same original value, one of them is lost.
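The schedule in the table can be replayed by hand without any real threads, which makes the lost update deterministic. This is only an illustration of the interleaving; simulate_lost_update and the local names a and b are hypothetical, standing in for the private copies that threads A and B would hold.

```ruby
# Replays the A/B schedule from the table step by step, with no real
# threads, so the outcome is the same on every run.
# simulate_lost_update is a hypothetical helper, not part of the question.
def simulate_lost_update(start)
  shared = start

  a = shared   # A: READ
  b = shared   # B: READ (sees the same original value as A)
  a += 1       # A: INCR
  b += 1       # B: INCR
  shared = a   # A: WRITE
  shared = b   # B: WRITE (overwrites A's result)

  shared
end

puts simulate_lost_update(0)   # two increments ran, but this prints 1
```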

In my testing this is less likely to occur on a dual-core system but practically a constant problem on four-core machines, simply because many behave like two loosely connected dual-core systems, each with its own cache. It's even more pronounced on JRuby, where threading support is much better. The example code you have yields random answers for me, anywhere from 98200 to 99500.

To make this thread-safe you must employ a Mutex or use an atomic increment operation from a library like Concurrent Ruby, which gives you the tools to do this safely.

The alternative is to avoid sharing data between threads, or to use a structure like Queue to manage communication. No two threads should ever be manipulating the same object without a Mutex.
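A minimal sketch of the Mutex route, using only what ships with Ruby. The helper name safe_count and the split into 100 threads of 1000 increments are my own choices for illustration; the atomic-counter route needs an extra gem, so it is not shown here.

```ruby
# Guards the whole read-increment-write sequence with a Mutex so that
# only one thread at a time can execute `counter += 1`.
# safe_count is a hypothetical helper name.
def safe_count(thread_count, increments)
  counter = 0
  lock = Mutex.new

  threads = Array.new(thread_count) do
    Thread.new do
      increments.times do
        lock.synchronize { counter += 1 }
      end
    end
  end

  threads.each(&:join)   # wait for every thread before reading counter
  counter
end

puts safe_count(100, 1000)   # prints 100000 on every run
```

The lock is taken around each single increment to mirror the original loop; taking it once per thread around the whole loop would also be correct and much cheaper.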
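One way to read "use a Queue" is sketched below: each thread counts in a variable only it can see and pushes its subtotal onto a Queue, and only the main thread adds the subtotals up. The helper name count_with_queue is hypothetical.

```ruby
# Workers never touch a shared counter. Each one counts locally and
# hands its subtotal to the main thread through a thread-safe Queue.
# count_with_queue is a hypothetical helper name.
def count_with_queue(thread_count, increments)
  results = Queue.new

  threads = Array.new(thread_count) do
    Thread.new do
      local = 0                        # visible to this thread only
      increments.times { local += 1 }
      results << local                 # Queue#<< is safe across threads
    end
  end

  threads.each(&:join)

  # All workers are done, so the queue can be drained from one thread.
  total = 0
  total += results.pop until results.empty?
  total
end

puts count_with_queue(100, 1000)   # prints 100000
```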

Huh! I finally found a way to show that the result is not always 100000 in irb.

Running the following code gave me the idea:

100.times do
  i = 0
  1000.times do
    Thread.start { 100.times { i += 1 } }
  end
  puts i
end

I see different values most of the time, mostly ranging from 91k to 100000.
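One caveat about that snippet: it prints i without waiting for the threads, so a low number may partly reflect threads that had not finished yet rather than lost updates. A sketch that joins every thread before reading the counter separates the two effects. The helper name unsynchronized_count and the split into 100 threads of 1000 increments are assumptions made for illustration, and what it prints depends on the Ruby implementation and the machine.

```ruby
# Same unsynchronized counter as above, but every thread is joined
# before the counter is read, so any shortfall is a lost update and
# not simply a thread that was still running.
# unsynchronized_count is a hypothetical helper name.
def unsynchronized_count(thread_count, increments)
  i = 0
  threads = Array.new(thread_count) do
    Thread.new { increments.times { i += 1 } }
  end
  threads.each(&:join)   # wait until every thread has finished
  i
end

5.times { puts unsynchronized_count(100, 1000) }
```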

" In computer science, a thread of execution is the smallest sequence of programmed instructions that can be managed independently by an operating system scheduler. A thread is a light-weight process. "

irb(main):001:0> def calculate_sum(arr)
irb(main):002:1>   sleep(2)
irb(main):003:1>   sum = 0
irb(main):004:1>   arr.each do |item|
irb(main):005:2*     sum += item
irb(main):006:2>   end
irb(main):007:1>   sum
irb(main):008:1> end
=> :calculate_sum
irb(main):009:0>
irb(main):010:0* @items1 = [12, 34, 55]
=> [12, 34, 55]
irb(main):011:0> @items2 = [45, 90, 2]
=> [45, 90, 2]
irb(main):012:0> @items3 = [99, 22, 31]
=> [99, 22, 31]
irb(main):013:0>
irb(main):014:0* threads = (1..3).map do |i|
irb(main):015:1*   Thread.new(i) do |i|
irb(main):016:2*     items = instance_variable_get("@items#{i}")
irb(main):017:2>     puts "items#{i} = #{calculate_sum(items)}"
irb(main):018:2>   end
irb(main):019:1> end
=> [#<Thread:0x2158ab8@(irb):15 run>, #<Thread:0x2158860@(irb):15 run>, #<Thread:0x2158488@(irb):15 run>]
irb(main):020:0> threads.each {|t| t.join}
items3 = 152
items2 = 137
items1 = 101
=> [#<Thread:0x2158ab8@(irb):15 dead>, #<Thread:0x2158860@(irb):15 dead>, #<Thread:0x2158488@(irb):15 dead>]
irb(main):021:0>

This is a basic example of threading in Ruby. The main method, calculate_sum, takes an array as its argument; here that is one of @items1, @items2, @items3. From there you create three threads: (1..3) supplies the numbers, .map do |i| hands each number to the block, and Thread.new(i) starts a new Thread instance with the number it was mapped to.

Inside each thread you set items to the matching instance variable, items = instance_variable_get("@items#{i}"), and print the result of the calculation with puts "items#{i} = #{calculate_sum(items)}".

As you can see, the threads start running as soon as they are created: => [#<Thread:0x2158ab8@(irb):15 run>, #<Thread:0x2158860@(irb):15 run>, #<Thread:0x2158488@(irb):15 run>] . Calling join on each one, threads.each {|t| t.join} , makes the caller wait until every thread has finished.
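The same idea can be written as a plain script. The sketch below drops the sleep and the instance variables and passes each array straight to its thread; parallel_sums is a hypothetical helper, and Thread#value is used because it joins the thread and returns the value of its block.

```ruby
# Sums one array, as in the irb session (minus the sleep).
def calculate_sum(arr)
  arr.inject(0) { |sum, item| sum + item }
end

# Starts one thread per array. Each thread begins running as soon as
# Thread.new returns; Thread#value then waits for it and yields the
# block's result. parallel_sums is a hypothetical helper name.
def parallel_sums(lists)
  threads = lists.map do |items|
    Thread.new(items) { |list| calculate_sum(list) }
  end
  threads.map(&:value)
end

p parallel_sums([[12, 34, 55], [45, 90, 2], [99, 22, 31]])
# prints [101, 137, 152]
```

Results come back in the order the threads were created, regardless of which one finished first.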

The last part is the most important. Here the threads all run and finish at about the same time; a thread doing long-running work, however, stays alive until that work is done. Example:

irb(main):023:0> Thread.new do
irb(main):024:1*   puts t
irb(main):025:1>   Thread.new do
irb(main):026:2*     sleep(5)
irb(main):027:2>     puts h
irb(main):028:2>   end
irb(main):029:1> end
=> #<Thread:0x2d070f8@(irb):23 run>
irb(main):030:0> hello
goodbye

The inner thread outlives the outer one: it is still sleeping when the outer thread finishes, and it only ends once its block returns or you cut the execution.

In the main example the final output is => [#<Thread:0x2158ab8@(irb):15 dead>, #<Thread:0x2158860@(irb):15 dead>, #<Thread:0x2158488@(irb):15 dead>] because every thread has finished its block and exited. A thread whose block never returns needs an explicit exit condition before it can finish.

I hope this answers your questions.

Unfortunately, Ruby does not have an officially specified memory model, as Java has had since Java 5 and C++ since C++11.

In fact, Ruby really does not have an official specification at all. There have been multiple attempts at one, but they all share the same problem: the designers of Ruby aren't actually using them. So the only specification Ruby has is basically "whatever YARV does". (And the ISO Ruby Language Specification, for example, simply doesn't specify the Thread class, thus side-stepping the problem altogether.)

BUT!!! For concurrency, this is basically unusable, because YARV is incapable of running threads in parallel, so a lot of concurrency issues simply do not arise in YARV and so the core library doesn't protect against those issues! However, if we were to say that the concurrency semantics of Ruby are whatever YARV does, the question now becomes: is the fact that we can't have parallelism part of the semantics? Is the fact that the core libraries aren't protected part of the semantics?

That's a struggle that implementations like JRuby, Rubinius, IronRuby, MacRuby, etc., which have threads that can run in parallel, are facing. And they are still working on figuring out the answers.

So, the tl;dr answer to your question is: We don't know whether Ruby is thread-safe because we don't know what the threading semantics of Ruby are.

It's quite common for multi-threaded programs that work fine on YARV to break on JRuby, for example, but again, is it the program's fault or JRuby's? We can't tell, because we don't have a specification that tells us what the multi-threaded behavior of a Ruby implementation should be. We could take the easy way out and say: well, Ruby is whatever YARV does, so when a program works on YARV, we must change JRuby so that the program also works on JRuby. However, parallelism is actually one of the main reasons why people choose JRuby in the first place, so this is simply not feasible.

The technical posts on this site are licensed under CC BY-SA 4.0. If you reprint them, please cite the site URL or the original address. For any questions, please contact: yoyou2525@163.com.

 