Slow multi-threaded Java app: is this due to access to a static object?

FIRST, AN INTRO...

I have a set of classes that all inherit from the same class, Feature, but differ in that each one uses different information for its computation. Here is one example:

public class FeatureA extends Feature 
{
    private MyTableA table = null;

    public FeatureA(final String fName, final MyTableA table) {
        super(fName);
        this.table = table;
    }

    public Double compute(String input) 
    {
        // Pass the argument itself; a type name is not allowed in a call expression.
        return table.computeProduct(input);
    }
}

import java.util.HashMap;

public class MyTableA {

    // Shared by every instance (and every thread) because they are static.
    private static HashMap<String,Integer> map1 = null;
    private static HashMap<String,Double> map2 = null;

    public MyTableA(final String serFilePath)
    {
        // map1 and map2 are deserialized here from serFilePath
    }

    public Double computeProduct(String input)
    {
        Integer val1 = map1.get(input);
        Double val2 = map2.get(input);

        // Unboxing throws a NullPointerException if input is missing from either map.
        Double res = val1 * val2;

        return res;
    }
}

So FeatureA depends on a MyTableA object, and that object is deserialized when my app is loaded.

This app is multithreaded, with separate threads processing different input data tokens in the same way. For each token, I compute feature values, including FeatureA. MyTableA is the same for all data tokens (as a trivial example, a feature could be someone's phone number looked up by their name in a single global phone book).

NOW, THE PROBLEM...

I have been adding more and more features to my system, and each new one added a tiny bit of extra processing time. Then I added one more and all of a sudden my program runs a lot slower, about 10 times slower than the version without it. I've been trying to pinpoint the source of the problem without much luck.

QUESTION: Is it possible that the slowdown is caused by multiple threads calling map1.get(input) and map2.get(input) on the same static objects? If so, how could I fix this?

UPDATE: The app has server and client components, with the multithreaded server getting data tokens from multiple copies of the client. In fact, 3 RedHat machines are being used: 1 copy of the server and about 100 copies of the client run on each of the machines. Clients from all 3 machines send data to all 3 servers (so clients on machine1 do not just send data to server1). I have used the profiler like this: java -agentlib:hprof=cpu=samples ... Is this the right way to profile in this case?

Below are the top 5 traces for the "Fast" case, i.e. with fewer features, and the "Slow" case, i.e. with more features. java.io.FileInputStream.read0 is part of the deserialization of a bunch of maps at startup.

Server1_Fast:rank   self  accum   count trace method
Server1_Fast-   1 73.21% 73.21%  405018 300953 java.net.PlainSocketImpl.socketAccept
Server1_Fast-   2  5.63% 78.84%   31145 300698 java.io.FileInputStream.read0
Server1_Fast-   3  4.33% 83.16%   23928 301032 java.lang.UNIXProcess.waitForProcessExit
Server1_Fast-   4  4.28% 87.45%   23685 301031 java.io.FileInputStream.readBytes
Server1_Fast-   5  4.08% 91.53%   22570 301001 java.net.SocketInputStream.socketRead0
--
Server1_Slow:rank   self  accum   count trace method
Server1_Slow-   1 43.66% 43.66%  374607 301136 java.lang.UNIXProcess.forkAndExec
Server1_Slow-   2 23.38% 67.04%  200653 301130 java.io.FileInputStream.readBytes
Server1_Slow-   3  9.74% 76.78%   83571 301058 java.net.PlainSocketImpl.socketAccept
Server1_Slow-   4  7.44% 84.23%   63876 301131 java.lang.UNIXProcess.waitForProcessExit
Server1_Slow-   5  3.70% 87.92%   31711 300690 java.io.FileInputStream.read0
--
Server2_Fast:rank   self  accum   count trace method
Server2_Fast-   1 76.35% 76.35%  427397 300917 java.net.PlainSocketImpl.socketAccept
Server2_Fast-   2  5.21% 81.57%   29183 300690 java.io.FileInputStream.read0
Server2_Fast-   3  4.23% 85.80%   23689 300965 java.net.SocketInputStream.socketRead0
Server2_Fast-   4  4.12% 89.92%   23083 300691 java.io.FileInputStream.readBytes
Server2_Fast-   5  3.09% 93.02%   17320 300993 java.lang.UNIXProcess.waitForProcessExit
--
Server2_Slow:rank   self  accum   count trace method
Server2_Slow-   1 50.19% 50.19%  173210 301024 java.net.PlainSocketImpl.socketAccept
Server2_Slow-   2  9.10% 59.28%   31391 300686 java.io.FileInputStream.read0
Server2_Slow-   3  6.81% 66.09%   23507 300687 java.io.FileInputStream.readBytes
Server2_Slow-   4  5.44% 71.54%   18789 301094 java.lang.UNIXProcess.waitForProcessExit
Server2_Slow-   5  5.38% 76.92%   18571 301093 java.io.FileInputStream.readBytes
--
Server3_Fast:rank   self  accum   count trace method
Server3_Fast-   1 73.38% 73.38%  410860 300954 java.net.PlainSocketImpl.socketAccept
Server3_Fast-   2  6.81% 80.20%   38134 300692 java.io.FileInputStream.read0
Server3_Fast-   3  4.95% 85.14%   27700 300693 java.io.FileInputStream.readBytes
Server3_Fast-   4  3.76% 88.91%   21071 301038 java.lang.UNIXProcess.waitForProcessExit
Server3_Fast-   5  3.75% 92.65%   20974 301037 java.io.FileInputStream.readBytes
--
Server3_Slow:rank   self  accum   count trace method
Server3_Slow-   1 48.32% 48.32%  166867 301048 java.net.PlainSocketImpl.socketAccept
Server3_Slow-   2 10.62% 58.94%   36686 300693 java.io.FileInputStream.read0
Server3_Slow-   3  8.19% 67.13%   28280 300690 java.io.FileInputStream.readBytes
Server3_Slow-   4  5.22% 72.35%   18022 301119 java.lang.UNIXProcess.waitForProcessExit
Server3_Slow-   5  5.06% 77.41%   17464 301118 java.io.FileInputStream.readBytes

Does this info shed any light on the problem?

In general, if you have lots of threads using (i.e. reading and updating) a shared Map object, then:

  • If you don't synchronize properly, your code will not be thread-safe. This is liable to give you incorrect behavior of various kinds that may be hard to reproduce, and may be platform dependent.

  • If you do synchronize properly, then the shared Map will be a potential performance bottleneck. However, whether it is an actual bottleneck will depend on the Map implementation and/or how synchronization is implemented, as well as on the workload.
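
As a minimal sketch of the second bullet (not the asker's actual code), the following shows two standard ways to make updates to a shared Map thread-safe. The class name SharedMapUpdate, the method countWith and the key "token" are hypothetical names chosen for illustration. Both maps give correct counts; how much each one costs under load depends on the workload.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class SharedMapUpdate {

    /** Each of nThreads threads increments the same key perThread times. */
    static int countWith(Map<String, Integer> shared, int nThreads, int perThread) {
        Thread[] workers = new Thread[nThreads];
        for (int t = 0; t < nThreads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    // merge() is atomic on ConcurrentHashMap and runs under the
                    // wrapper's lock on a Collections.synchronizedMap.
                    shared.merge("token", 1, Integer::sum);
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return shared.get("token");
    }

    public static void main(String[] args) {
        // One global lock around every call.
        Map<String, Integer> locked = Collections.synchronizedMap(new HashMap<>());
        // Finer-grained internal synchronization.
        Map<String, Integer> concurrent = new ConcurrentHashMap<>();
        System.out.println(countWith(locked, 4, 10_000));
        System.out.println(countWith(concurrent, 4, 10_000));
    }
}
```

Passing a plain HashMap to countWith would be an instance of the first bullet: the result could be wrong or the map could be corrupted, and the failure might not reproduce reliably.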


Is it possible that the slowdown is caused by multiple threads calling map1.get(input) and map2.get(input) on the same static objects?

Yes, it is possible. (The fact that the maps are static is not relevant. The key point is that the objects are shared by multiple threads.) It is also possible that the real problem is something completely different, e.g. something about the functionality you have added, or something else that you have not revealed to us.

If so, how could I fix this?

It's not possible to say given the level of detail you have provided. (The code you provided is clearly a "mock-up". There is nothing we can learn from it.)

If the assumptions above are correct, then choosing a different Map class or a different synchronization / locking scheme might help.
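
One possible scheme, sketched here under the assumption that the maps are filled once at load time and never modified afterwards (the question suggests this but does not confirm it): copy them into unmodifiable maps held in final fields. Concurrent get() calls on a HashMap that no thread modifies need no locking. The class name ReadOnlyTable is hypothetical, and the constructor takes already-loaded maps because the deserialization code was not shown.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

final class ReadOnlyTable {

    // final fields guarantee safe publication once the constructor completes.
    private final Map<String, Integer> map1;
    private final Map<String, Double> map2;

    ReadOnlyTable(Map<String, Integer> loaded1, Map<String, Double> loaded2) {
        // Defensive copies wrapped as unmodifiable: nothing can change them later.
        this.map1 = Collections.unmodifiableMap(new HashMap<>(loaded1));
        this.map2 = Collections.unmodifiableMap(new HashMap<>(loaded2));
    }

    /** Returns val1 * val2, or null if input is missing from either map. */
    Double computeProduct(String input) {
        Integer val1 = map1.get(input);
        Double val2 = map2.get(input);
        if (val1 == null || val2 == null) {
            return null;
        }
        return val1 * val2;
    }

    public static void main(String[] args) {
        Map<String, Integer> m1 = new HashMap<>();
        Map<String, Double> m2 = new HashMap<>();
        m1.put("alice", 3);
        m2.put("alice", 2.0);
        ReadOnlyTable table = new ReadOnlyTable(m1, m2);
        System.out.println(table.computeProduct("alice"));
        System.out.println(table.computeProduct("unknown"));
    }
}
```

If the maps do need updates while the server is running, this sketch does not apply, and a concurrent Map or an explicit locking scheme would be needed instead.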

However, that is just guesswork. You should not rely on guesswork when trying to fix performance problems. It is better to use a profiler to identify the actual performance bottlenecks, and then figure out what is causing them, e.g. whether it is contention or something else entirely that is the real problem.
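
Alongside a profiler, one rough way to check for lock contention specifically is to read per-thread blocked counts from the standard java.lang.management API. The sketch below is illustrative only: ContentionProbe and its run method are hypothetical names, and the deliberately contended lock stands in for whatever shared object is under suspicion. A blocked count that stays near zero under load would suggest that lock contention is not the bottleneck.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

class ContentionProbe {

    /**
     * Runs nThreads workers that each take one shared lock `iterations` times.
     * Returns {final counter value, total times the workers blocked on a monitor}.
     */
    static long[] run(int nThreads, int iterations) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        final Object lock = new Object();
        final long[] counter = {0};
        final long[] blocked = new long[nThreads];
        Thread[] workers = new Thread[nThreads];
        for (int t = 0; t < nThreads; t++) {
            final int idx = t;
            workers[t] = new Thread(() -> {
                for (int i = 0; i < iterations; i++) {
                    synchronized (lock) {
                        counter[0]++;
                    }
                }
                // Read this thread's own statistics before it terminates;
                // getThreadInfo returns null for threads that are no longer alive.
                ThreadInfo info = mx.getThreadInfo(Thread.currentThread().getId());
                blocked[idx] = (info == null) ? 0 : info.getBlockedCount();
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        long totalBlocked = 0;
        for (long b : blocked) {
            totalBlocked += b;
        }
        return new long[] {counter[0], totalBlocked};
    }

    public static void main(String[] args) {
        long[] result = run(8, 100_000);
        System.out.println("counter = " + result[0]);
        System.out.println("total blocked count = " + result[1]);
    }
}
```

The blocked count varies from run to run and between machines, so it is best read as a rough signal rather than a precise measurement.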

