Why is my own AtomicLong slower than the one provided in the JDK?

I was writing my own AtomicLong class and found that my implementation is much slower than the function provided in the Unsafe class. I am wondering why.

Below is the code I have:

public interface Counter {
    void increment();
    long get();
}


import java.lang.reflect.Field;
import sun.misc.Unsafe;

public class PrimitiveUnsafeSupportCounter implements Counter {

    private volatile long count = 0;
    private Unsafe unsafe;
    private long offset;

    public PrimitiveUnsafeSupportCounter() throws IllegalAccessException, NoSuchFieldException {
        Field f = Unsafe.class.getDeclaredField("theUnsafe");
        f.setAccessible(true);
        this.unsafe = (Unsafe) f.get(null);
        this.offset = this.unsafe.objectFieldOffset(PrimitiveUnsafeSupportCounter.class.getDeclaredField("count"));
    }

    @Override
    public void increment() {

        this.unsafe.getAndAddLong(this, this.offset, 1);
    }

    @Override
    public long get() {
        return this.count;
    }
}

public class CounterThread implements Runnable {

    private Counter counter;

    public CounterThread(Counter counter){
        this.counter = counter;
    }
    @Override
    public void run() {

        for (int i = 0; i < 100000; i ++){
            this.counter.increment();
        }
    }
}

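import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class Test {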

    public static void test(Counter counter) throws NoSuchFieldException, IllegalAccessException, InterruptedException {

        ExecutorService executor = Executors.newFixedThreadPool(1000);

        long start = System.currentTimeMillis();
        for (int i = 0 ; i < 1000; i++){
            executor.submit(new CounterThread(counter));
        }

        executor.shutdown();
        executor.awaitTermination(1, TimeUnit.MINUTES);
        long stop = System.currentTimeMillis();

        System.out.println(counter.get());
        System.out.println(stop - start);
    }

}

public class Main {
    public static void main(String[] args) throws NoSuchFieldException, IllegalAccessException, InterruptedException {

        Counter primitiveUnsafeSupportCounter = new PrimitiveUnsafeSupportCounter();
        Test.test(primitiveUnsafeSupportCounter);

    }

}

The code above takes about 3000 ms to finish. However, it takes about 7000 ms if I replace this.unsafe.getAndAddLong(this, this.offset, 1); with the following code:

long before;
do {
     before = this.unsafe.getLongVolatile(this, this.offset);
} while (!this.unsafe.compareAndSwapLong(this, this.offset, before, before + 1));

I went through the source code of getAndAddLong and found that it does nearly the same thing as the code above, so what am I missing?

getAndAddLong is a JVM intrinsic, and the hand-written loop compiles to far less efficient code for this purpose. On x86 you can get an atomic version of such read-modify-write operations via the lock prefix. See Intel Manual 8.1.2.2, Software Controlled Bus Locking:

To explicitly force the LOCK semantics, software can use the LOCK prefix with the following instructions when they are used to modify a memory location.

In particular, you can have something like lock add op1, op2. In your example, you test the result of cmpxchg and jump back on failure, which is obviously slower: under contention the compare-and-swap can fail and must be retried, whereas the lock-prefixed add always completes in a single instruction. Also, as far as I remember, volatile stores on x86 require some sort of mfence or lock-prefixed instruction to ensure memory ordering.
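To make the retry cost visible, here is a minimal sketch that uses the public java.util.concurrent.atomic.AtomicLong API instead of Unsafe, so it also runs on recent JDKs. The class name CasRetryDemo and its helper methods are hypothetical, made up for illustration. It assumes that getAndAdd is backed by the same intrinsic as getAndAddLong, which is the case on HotSpot as far as I know. The CAS-loop variant counts how many compareAndSet attempts failed and had to be repeated.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical demo class; names are illustrative and not from the original post.
class CasRetryDemo {

    // Starts `threads` threads running the same task and waits for all of them.
    static void runAll(int threads, Runnable task) throws InterruptedException {
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(task);
            workers[i].start();
        }
        for (Thread t : workers) {
            t.join();
        }
    }

    // Increments through the single-call API; returns the final count.
    static long runGetAndAdd(int threads, int perThread) throws InterruptedException {
        AtomicLong counter = new AtomicLong();
        runAll(threads, () -> {
            for (int i = 0; i < perThread; i++) {
                counter.getAndAdd(1);
            }
        });
        return counter.get();
    }

    // Increments through a hand-written CAS loop.
    // Returns {final count, number of failed CAS attempts}.
    static long[] runCasLoop(int threads, int perThread) throws InterruptedException {
        AtomicLong counter = new AtomicLong();
        AtomicLong failures = new AtomicLong();
        runAll(threads, () -> {
            long localFailures = 0;
            for (int i = 0; i < perThread; i++) {
                long before = counter.get();
                // Each failed compareAndSet means another thread won the race,
                // so the value must be re-read and the update retried.
                while (!counter.compareAndSet(before, before + 1)) {
                    localFailures++;
                    before = counter.get();
                }
            }
            failures.addAndGet(localFailures);
        });
        return new long[] { counter.get(), failures.get() };
    }

    public static void main(String[] args) throws InterruptedException {
        long t0 = System.nanoTime();
        long count = runGetAndAdd(8, 1_000_000);
        long t1 = System.nanoTime();
        long[] cas = runCasLoop(8, 1_000_000);
        long t2 = System.nanoTime();

        System.out.println("getAndAdd: count=" + count + ", ms=" + (t1 - t0) / 1_000_000);
        System.out.println("CAS loop:  count=" + cas[0] + ", failed CAS=" + cas[1]
                + ", ms=" + (t2 - t1) / 1_000_000);
    }
}
```

Both variants end with the same count. The number of failed attempts and the timings depend on the machine, the JVM and the thread count, and a single unwarmed run like this is only a rough indication rather than a proper benchmark.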
