
Difference between synchronized method and synchronized block

I'm trying to understand the difference between synchronized blocks and synchronized methods through examples. Consider the following simple class:

public class Main {
    private static final Object lock = new Object();
    private static long l;
    public static void main(String[] args) { 

    }

    public static void action(){
        synchronized(lock){
            l = (l + 1) * 2;
            System.out.println(l);
        }
    }
}

The compiled Main::action() will look as follows:

public static void action();
  Code:
     0: getstatic     #2                  // Field lock:Ljava/lang/Object;
     3: dup
     4: astore_0
     5: monitorenter                      // <---- ENTERING
     6: getstatic     #3                  // Field l:J
     9: lconst_1
     10: ladd
     11: ldc2_w        #4                  // long 2l
     14: lmul
     15: putstatic     #3                  // Field l:J
     18: getstatic     #6                  // Field java/lang/System.out:Ljava/io/PrintStream;
     21: getstatic     #3                  // Field l:J
     24: invokevirtual #7                  // Method java/io/PrintStream.println:(J)V
     27: aload_0
     28: monitorexit                       // <---- EXITING
     29: goto          37
     32: astore_1
     33: aload_0
     34: monitorexit                       // <---- EXITING TWICE????
     35: aload_1
     36: athrow
     37: return

I thought we were better off using synchronized blocks than synchronized methods because a block provides more encapsulation, preventing clients from affecting the synchronization policy (with a synchronized method, any client can acquire the lock and interfere with that policy). From a performance standpoint, though, the two seemed pretty much the same to me. Now consider the synchronized-method version:
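
The encapsulation concern can be demonstrated concretely: a static synchronized method locks the publicly reachable Class object, so any client can grab that monitor and stall the method. Here is a minimal sketch (the class and method names are mine, and the sleep times are arbitrary):

```java
import java.util.concurrent.CountDownLatch;

public class LockLeak {
    private static long counter;

    // The monitor here is LockLeak.class, which any caller can reach.
    public static synchronized void increment() { counter++; }

    // Measures (in ms) how long increment() stalls while another thread holds LockLeak.class.
    public static long measureStall() {
        CountDownLatch acquired = new CountDownLatch(1);
        Thread hog = new Thread(() -> {
            synchronized (LockLeak.class) {   // a client grabbing the method's own lock
                acquired.countDown();
                try { Thread.sleep(200); } catch (InterruptedException ignored) { }
            }
        });
        try {
            hog.start();
            acquired.await();                 // wait until the hog owns the monitor
            long before = System.currentTimeMillis();
            increment();                      // blocks until the hog releases LockLeak.class
            hog.join();
            return System.currentTimeMillis() - before;
        } catch (InterruptedException e) {
            throw new AssertionError(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("increment() stalled for ~" + measureStall() + " ms");
    }
}
```

With a private lock object and a synchronized block, no outside code could interfere this way.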

public static synchronized void action(){
    l = (l + 1) * 2;
    System.out.println(l);
}

public static synchronized void action();
Code:
   0: getstatic     #2                  // Field l:J
   3: lconst_1
   4: ladd
   5: ldc2_w        #3                  // long 2l
   8: lmul
   9: putstatic     #2                  // Field l:J
  12: getstatic     #5                  // Field java/lang/System.out:Ljava/io/PrintStream;
  15: getstatic     #2                  // Field l:J
  18: invokevirtual #6                  // Method java/io/PrintStream.println:(J)V
  21: return

So the synchronized-method version has far fewer instructions to execute, which made me think it should be faster.

QUESTION: Is a synchronized method faster than a synchronized block?

A quick test using the Java code posted at the bottom of this answer showed the synchronized method coming out faster. Running the code on a Windows JVM on an i7 gave the following averages:

synchronized block: 0.004254 s

synchronized method: 0.001056 s

This would imply that the synchronized method is indeed faster, in line with your byte-code assessment.

What confused me, however, was the stark difference between the two times. I would have presumed that the JVM would still take a lock around the underlying synchronized method and that the difference would be negligible, but that was not the result. Since the Oracle JVM is closed source, I took a look at the OpenJDK HotSpot source and dug into the byte-code interpreter that handles synchronized methods and blocks. To reiterate, the following JVM code is from OpenJDK, but I would presume the official JVM handles the situation similarly.

When a .class file is built, if a method is synchronized, the compiler sets an access flag (ACC_SYNCHRONIZED) in the method's metadata that alerts the JVM that the method is synchronized (much as it records flags for static, public, final, varargs, etc.), and the underlying JVM code sets a flag on its method structure to the same effect.
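
This flag is even observable from Java via reflection, which is a quick way to confirm that a synchronized method carries metadata rather than monitorenter/monitorexit instructions. A small check, assuming nothing beyond the standard reflection API (the class and method names are mine):

```java
import java.lang.reflect.Modifier;

public class FlagCheck {
    public static synchronized void action() { }

    // Reads the method's modifier bits; ACC_SYNCHRONIZED surfaces through Modifier.isSynchronized.
    public static boolean isActionSynchronized() {
        try {
            return Modifier.isSynchronized(
                FlagCheck.class.getDeclaredMethod("action").getModifiers());
        } catch (NoSuchMethodException e) {
            throw new AssertionError(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isActionSynchronized()); // true
    }
}
```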

When the byte-code interpreter hits a method-invocation byte code, the following code runs before the method is invoked to check whether it needs to be locked:

case method_entry: {
  /* CODE_EDIT: irrelevant code removed for brevity's sake */

  // lock method if synchronized
  if (METHOD->is_synchronized()) {
      // oop rcvr = locals[0].j.r;
      oop rcvr;
      if (METHOD->is_static()) {
        rcvr = METHOD->constants()->pool_holder()->java_mirror();
      } else {
        rcvr = LOCALS_OBJECT(0);
        VERIFY_OOP(rcvr);
      }
      // The initial monitor is ours for the taking
      BasicObjectLock* mon = &istate->monitor_base()[-1];
      oop monobj = mon->obj();
      assert(mon->obj() == rcvr, "method monitor mis-initialized");

      bool success = UseBiasedLocking;
      if (UseBiasedLocking) {
        /* CODE_EDIT: this code is only run if you have biased locking enabled as a JVM option */
      }
      if (!success) {
        markOop displaced = rcvr->mark()->set_unlocked();
        mon->lock()->set_displaced_header(displaced);
        if (Atomic::cmpxchg_ptr(mon, rcvr->mark_addr(), displaced) != displaced) {
          // Is it simple recursive case?
          if (THREAD->is_lock_owned((address) displaced->clear_lock_bits())) {
            mon->lock()->set_displaced_header(NULL);
          } else {
            CALL_VM(InterpreterRuntime::monitorenter(THREAD, mon), handle_exception);
          }
        }
      }
  }

  /* CODE_EDIT: irrelevant code removed for brevity's sake */

  goto run;
}

Then, when the method completes and control returns to the JVM's handler, the following code runs to unlock the method (note that the boolean method_unlock_needed is set to METHOD->is_synchronized() before the method is invoked):

if (method_unlock_needed) {
    if (base->obj() == NULL) {
      /* CODE_EDIT: irrelevant code removed for brevity's sake */
    } else {
      oop rcvr = base->obj();
      if (rcvr == NULL) {
        if (!suppress_error) {
          VM_JAVA_ERROR_NO_JUMP(vmSymbols::java_lang_NullPointerException(), "");
          illegal_state_oop = THREAD->pending_exception();
          THREAD->clear_pending_exception();
        }
      } else {
        BasicLock* lock = base->lock();
        markOop header = lock->displaced_header();
        base->set_obj(NULL);
        // If it isn't recursive we either must swap old header or call the runtime
        if (header != NULL) {
          if (Atomic::cmpxchg_ptr(header, rcvr->mark_addr(), lock) != lock) {
            // restore object for the slow case
            base->set_obj(rcvr);
            {
              // Prevent any HandleMarkCleaner from freeing our live handles
              HandleMark __hm(THREAD);
              CALL_VM_NOCHECK(InterpreterRuntime::monitorexit(THREAD, base));
            }
            if (THREAD->has_pending_exception()) {
              if (!suppress_error) illegal_state_oop = THREAD->pending_exception();
              THREAD->clear_pending_exception();
            }
          }
        }
      }
    }
}

The statements CALL_VM(InterpreterRuntime::monitorenter(THREAD, mon), handle_exception); and CALL_VM_NOCHECK(InterpreterRuntime::monitorexit(THREAD, base));, and more specifically the functions InterpreterRuntime::monitorenter and InterpreterRuntime::monitorexit, are what the JVM calls for both synchronized methods and blocks to lock and unlock the underlying objects. The run label in the code is the massive byte-code-interpreter switch statement that handles the different byte codes being parsed.

From here, if a synchronized block's opcodes (the monitorenter and monitorexit byte codes) are encountered, the following case statements run (for monitorenter and monitorexit respectively):

CASE(_monitorenter): {
    oop lockee = STACK_OBJECT(-1);
    // derefing's lockee ought to provoke implicit null check
    CHECK_NULL(lockee);
    // find a free monitor or one already allocated for this object
    // if we find a matching object then we need a new monitor
    // since this is recursive enter
    BasicObjectLock* limit = istate->monitor_base();
    BasicObjectLock* most_recent = (BasicObjectLock*) istate->stack_base();
    BasicObjectLock* entry = NULL;
    while (most_recent != limit ) {
      if (most_recent->obj() == NULL) entry = most_recent;
      else if (most_recent->obj() == lockee) break;
      most_recent++;
    }
    if (entry != NULL) {
      entry->set_obj(lockee);
      markOop displaced = lockee->mark()->set_unlocked();
      entry->lock()->set_displaced_header(displaced);
      if (Atomic::cmpxchg_ptr(entry, lockee->mark_addr(), displaced) != displaced) {
        // Is it simple recursive case?
        if (THREAD->is_lock_owned((address) displaced->clear_lock_bits())) {
          entry->lock()->set_displaced_header(NULL);
        } else {
          CALL_VM(InterpreterRuntime::monitorenter(THREAD, entry), handle_exception);
        }
      }
      UPDATE_PC_AND_TOS_AND_CONTINUE(1, -1);
    } else {
      istate->set_msg(more_monitors);
      UPDATE_PC_AND_RETURN(0); // Re-execute
    }
}

CASE(_monitorexit): {
    oop lockee = STACK_OBJECT(-1);
    CHECK_NULL(lockee);
    // derefing's lockee ought to provoke implicit null check
    // find our monitor slot
    BasicObjectLock* limit = istate->monitor_base();
    BasicObjectLock* most_recent = (BasicObjectLock*) istate->stack_base();
    while (most_recent != limit ) {
      if ((most_recent)->obj() == lockee) {
        BasicLock* lock = most_recent->lock();
        markOop header = lock->displaced_header();
        most_recent->set_obj(NULL);
        // If it isn't recursive we either must swap old header or call the runtime
        if (header != NULL) {
          if (Atomic::cmpxchg_ptr(header, lockee->mark_addr(), lock) != lock) {
            // restore object for the slow case
            most_recent->set_obj(lockee);
            CALL_VM(InterpreterRuntime::monitorexit(THREAD, most_recent), handle_exception);
          }
        }
        UPDATE_PC_AND_TOS_AND_CONTINUE(1, -1);
      }
      most_recent++;
    }
    // Need to throw illegal monitor state exception
    CALL_VM(InterpreterRuntime::throw_illegal_monitor_state_exception(THREAD), handle_exception);
    ShouldNotReachHere();
}

Again, the same InterpreterRuntime::monitorenter and InterpreterRuntime::monitorexit functions are called to lock and unlock the underlying objects, but with much more overhead along the way, which explains the difference in times between a synchronized method and a synchronized block.

Obviously both the synchronized method and the synchronized block have pros and cons to weigh, but the question asked which is faster, and based on the preliminary test and the OpenJDK source it would appear that a synchronized method (alone) is indeed faster than a synchronized block (alone). Your results may vary, though, especially as the code grows more complex, so if performance is an issue it's best to run your own tests and gauge from there what makes sense for your case.
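
One caveat about the test code below: SyncLock() locks a private Object while the static synchronized SyncFunction() locks Main.class, so the two paths contend on different monitors. If you want both forms to use exactly the same monitor, the block can lock the Class object directly. A minimal sketch (the class and method names are mine):

```java
public class SameMonitor {
    private static long l = 0;

    // A static synchronized method implicitly locks SameMonitor.class.
    public static synchronized void syncMethod() { ++l; }

    // Locking the Class object explicitly takes the very same monitor.
    public static void syncBlock() {
        synchronized (SameMonitor.class) {
            ++l;
        }
    }

    public static long value() { return l; }

    public static void main(String[] args) {
        syncMethod();
        syncBlock();
        System.out.println(value()); // both increments went through the same lock
    }
}
```

With the monitors unified like this, any remaining timing gap between the two forms is down to the locking mechanics alone rather than to contention on different objects.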

And here's the relevant Java test code:

Java Test Code

public class Main
{
    public static final Object lock = new Object();
    private static long l = 0;

    public static void SyncLock()
    {
        synchronized (lock) {
            ++l;
        }
    }

    public static synchronized void SyncFunction()
    {
        ++l;
    }

    public static class ThreadSyncLock implements Runnable
    {
        @Override
        public void run()
        {
            for (int i = 0; i < 10000; ++i) {
                SyncLock();
            }
        }
    }

    public static class ThreadSyncFn implements Runnable
    {
        @Override
        public void run()
        {
            for (int i = 0; i < 10000; ++i) {
                SyncFunction();
            }
        }
    }

    public static void main(String[] args)
    {
        l = 0;
        try {
            java.util.ArrayList<Thread> threads = new java.util.ArrayList<Thread>();
            long start, end;
            double avg1 = 0, avg2 = 0;
            for (int x = 0; x < 1000; ++x) {
                threads.clear();
                for (int i = 0; i < 8; ++i) { threads.add(new Thread(new ThreadSyncLock())); }
                start = System.currentTimeMillis();
                for (int i = 0; i < 8; ++i) { threads.get(i).start(); }
                for (int i = 0; i < 8; ++i) { threads.get(i).join(); }
                end = System.currentTimeMillis();
                avg1 += ((end - start) / 1000f);
                l = 0;
                threads.clear();
                for (int i = 0; i < 8; ++i) { threads.add(new Thread(new ThreadSyncFn())); }
                start = System.currentTimeMillis();
                for (int i = 0; i < 8; ++i) { threads.get(i).start(); }
                for (int i = 0; i < 8; ++i) { threads.get(i).join(); }
                end = System.currentTimeMillis();
                avg2 += ((end - start) / 1000f);
                l = 0;
            }
            System.out.format("avg1: %f s\navg2: %f s\n", (avg1/1000), (avg2/1000));
            l = 0;
        } catch (Throwable t) {
            System.out.println(t.toString());
        }
    }
}

Hope that can help add some clarity.

The number of instructions is really not all that different, considering that your synchronized block's goto skips the six or so exception-handler instructions after it on the normal path.
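
Those skipped instructions are the compiler-generated exception handler: the second monitorexit (offset 34) runs only when the block body throws, so the monitor is released exactly once whether the block completes normally or not. In source terms the emitted shape is roughly the following sketch (the explicit try/catch and the class name are my illustration, not literal decompiler output):

```java
public class SyncBlockShape {
    private static final Object lock = new Object();
    private static long l;

    public static long action() {
        Object monitor = lock;   // getstatic + astore_0
        // conceptually: monitorenter on monitor (offset 5)
        try {
            l = (l + 1) * 2;
            // normal path: monitorexit (offset 28), then goto the return (offset 37)
            return l;
        } catch (Throwable t) {
            // exception handler (offsets 32-36): monitorexit, then athrow
            throw t;
        }
    }

    public static void main(String[] args) {
        System.out.println(action()); // first call: (0 + 1) * 2 = 2
        System.out.println(action()); // second call: (2 + 1) * 2 = 6
    }
}
```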

It really boils down to how best to expose an object across multiple threads of access.

On the contrary, in practice a synchronized method can be slower than a synchronized block, because synchronizing the whole method tends to serialize more code.

However, if both guard the same amount of code, there shouldn't be much difference in performance, which the test below supports.

Supporting classes

public interface TestMethod {
    public void test(double[] array);
    public String getName();
}

public class TestSynchronizedBlock implements TestMethod {
    private static final Object lock = new Object();

    public void test(double[] arr) { // no synchronized modifier here: the block alone takes the lock
        synchronized (lock) {
            double sum = 0;
            for(double d : arr) {
                for(double d1 : arr) {
                    sum += d*d1;
                }
            }
            //System.out.print(sum + " ");
        }

    }

    @Override
    public String getName() {
        return getClass().getName();
    }
}

public class TestSynchronizedMethod implements TestMethod {
    public synchronized void test(double[] arr) {
        double sum = 0;
        for(double d : arr) {
            for(double d1 : arr) {
                sum += d*d1;
            }
        }
        //System.out.print(sum + " ");
    }

    @Override
    public String getName() {
        return getClass().getName();
    }
}

Main Class

import java.util.Random;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

public class TestSynchronizedMain {
    public static void main(String[] args) {
        TestSynchronizedMain main = new TestSynchronizedMain();
        TestMethod testMethod = null;

        Random rand = new Random();
        double[] arr = new double[10000];
        for(int j = 0; j < arr.length; j++) {
            arr[j] = rand.nextDouble() * 10000;
        }

        /*testMethod = new TestSynchronizedBlock();
        main.testSynchronized(testMethod, arr);*/

        testMethod = new TestSynchronizedMethod();
        main.testSynchronized(testMethod, arr);

    }

    public void testSynchronized(final TestMethod testMethod, double[] arr) {
        System.out.println("Testing " + testMethod.getName());

        ExecutorService executor = Executors.newCachedThreadPool();
        AtomicLong time = new AtomicLong();
        AtomicLong startCounter = new AtomicLong();
        AtomicLong endCounter = new AtomicLong();

        for (int i = 0; i < 100; i++) {
            executor.submit(new Runnable() {
                @Override
                public void run() {
                    // System.out.println("Started");
                    startCounter.incrementAndGet();
                    long startTime = System.currentTimeMillis();

                    testMethod.test(arr);

                    long endTime = System.currentTimeMillis();
                    long delta = endTime - startTime;
                    //System.out.print(delta + " ");
                    time.addAndGet(delta);
                    endCounter.incrementAndGet();
                }
            });
        }

        executor.shutdown();
        try {
            executor.awaitTermination(Long.MAX_VALUE, TimeUnit.SECONDS);
            System.out.println("time taken = " + (time.get() / 1000.0) + " : starts = " + startCounter.get() + " : ends = " + endCounter);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
    }
}

Main Output in multiple runs

1.  Testing TestSynchronizedBlock
    time taken = 537.974 : starts = 100 : ends = 100

    Testing TestSynchronizedMethod
    time taken = 537.052 : starts = 100 : ends = 100

2.  Testing TestSynchronizedBlock
    time taken = 535.983 : starts = 100 : ends = 100

    Testing TestSynchronizedMethod
    time taken = 537.534 : starts = 100 : ends = 100

3.  Testing TestSynchronizedBlock
    time taken = 553.964 : starts = 100 : ends = 100

    Testing TestSynchronizedMethod
    time taken = 552.352 : starts = 100 : ends = 100

Note: the test was done on a Windows 8, 64-bit, i7 machine. The absolute times are not important; the relative values are.
