Java process high CPU

Below is my code and a thread dump. I have no clue why the CPU goes to 100%. Can anybody help here? The method below is the suspect: I'm just trying to parse a result set and write it to CSV with Apache Commons CSV (commons-csv-1.5.jar). When I comment out the line invoking this method, CPU stays at 3%.

public static void writeResultSetToFile(ResultSet resultSet, String fileName) {

    BufferedWriter writer = null;
    CSVPrinter csvPrinter = null;

    //If the file with the same filename already exist, a date stamp is appended to the end of the file.
    if(checkIfFileExist(fileName)) {
        LOGGER.info("FILE EXIST:"+fileName);
        String fileNamePostFix = new SimpleDateFormat(Constants.FORMAT_yyyyMMddHHmm).format(new Date());
        fileName=fileName.concat(Constants.UNDERSCORE).concat(fileNamePostFix);
        LOGGER.info("WRITING TO FILE: "+fileName);
    }

    try {
        ResultSetMetaData metadata = resultSet.getMetaData();
        writer = new BufferedWriter(new OutputStreamWriter(new FileOutputStream(fileName),StandardCharsets.UTF_8));

        //Fetch the column header from the metadata and inserts to an arrayList
        columnCount = metadata.getColumnCount();
        List<String> headerList = new ArrayList<String>(0);
        for (int i = 1; i <= columnCount; i++) {
            headerList.add(metadata.getColumnName(i));
        }
        String[] headerArray = new String[headerList.size()];
        headerArray = headerList.toArray(headerArray);

        //Creates a csv printer with the column names fetched from the database 
        csvPrinter = new CSVPrinter(writer, CSVFormat.DEFAULT.withHeader(headerArray).withDelimiter(Constants.C_DELIMITER));

        recordCount = 0;
        List<String> valueList = new ArrayList<String>(0);
        while (resultSet.next()) {
            recordCount++;
            for (int i = 1; i <= columnCount; i++) {
                valueList.add(resultSet.getString(i));
            }
            csvPrinter.printRecord(valueList);
            valueList = new ArrayList<String>(0);
        }

    } catch (SQLException e) {
        e.printStackTrace();
    } catch (IOException e) {
        e.printStackTrace();
    }

    finally {
        if (writer != null) {
            try {
                writer.close();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
        if (csvPrinter != null) {
            try {
                csvPrinter.close();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }
}

THREAD DUMP

Full thread dump OpenJDK 64-Bit Server VM (24.171-b01 mixed mode):

"pool-2-thread-1" prio=10 tid=0x00007f2eb839d000 nid=0x264e runnable [0x00007f2ea6dc4000]
   java.lang.Thread.State: RUNNABLE
        at java.net.SocketInputStream.socketRead0(Native Method)
        at java.net.SocketInputStream.read(SocketInputStream.java:153)
        at java.net.SocketInputStream.read(SocketInputStream.java:122)
        at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
        at java.io.BufferedInputStream.read1(BufferedInputStream.java:275)
        at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
        - locked <0x00000000e01bc220> (a java.io.BufferedInputStream)
        at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:127)
        at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
        at org.apache.thrift.transport.TSaslTransport.readLength(TSaslTransport.java:346)
        at org.apache.thrift.transport.TSaslTransport.readFrame(TSaslTransport.java:423)
        at org.apache.thrift.transport.TSaslTransport.read(TSaslTransport.java:405)
        at org.apache.thrift.transport.TSaslClientTransport.read(TSaslClientTransport.java:37)
        at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
        at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:378)
        at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:297)
        at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:204)
        at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69)
        at org.apache.hive.service.cli.thrift.TCLIService$Client.recv_FetchResults(TCLIService.java:515)
        at org.apache.hive.service.cli.thrift.TCLIService$Client.FetchResults(TCLIService.java:502)
        at com.cloudera.hive.hivecommon.api.HS2Client.fetchNRows(HS2Client.java:321)
        at com.cloudera.hive.hive.api.ExtendedHS2Client.fetchNRows(ExtendedHS2Client.java:499)
        at com.cloudera.hive.hivecommon.api.HS2Client.fetchRows(HS2Client.java:301)
        at com.cloudera.hive.hivecommon.dataengine.BackgroundFetcher.run(BackgroundFetcher.java:138)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:473)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1152)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:622)
        at java.lang.Thread.run(Thread.java:748)

"Service Thread" daemon prio=10 tid=0x00007f2eb80b4800 nid=0x2562 runnable [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"C2 CompilerThread1" daemon prio=10 tid=0x00007f2eb80b2000 nid=0x2561 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"C2 CompilerThread0" daemon prio=10 tid=0x00007f2eb80af000 nid=0x2560 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"Signal Dispatcher" daemon prio=10 tid=0x00007f2eb80ad000 nid=0x255f waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"Finalizer" daemon prio=10 tid=0x00007f2eb807f800 nid=0x255e in Object.wait() [0x00007f2eb45f4000]
   java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        - waiting on <0x00000000e00247f0> (a java.lang.ref.ReferenceQueue$Lock)
        at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:135)
        - locked <0x00000000e00247f0> (a java.lang.ref.ReferenceQueue$Lock)
        at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:151)
        at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:209)

"Reference Handler" daemon prio=10 tid=0x00007f2eb807d800 nid=0x255d in Object.wait() [0x00007f2eb46f5000]
   java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        - waiting on <0x00000000e000e1c8> (a java.lang.ref.Reference$Lock)
        at java.lang.Object.wait(Object.java:503)
        at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:133)
        - locked <0x00000000e000e1c8> (a java.lang.ref.Reference$Lock)

"main" prio=10 tid=0x00007f2eb800b800 nid=0x2557 runnable [0x00007f2ec05fb000]
   java.lang.Thread.State: RUNNABLE
        at java.lang.String.split(String.java:2289)
        at java.lang.String.split(String.java:2355)
        at com.cloudera.hive.hivecommon.dataengine.HiveJDBCQueryAnalyserUtils.queryAnalysis(HiveJDBCQueryAnalyserUtils.java:49)
        at com.cloudera.hive.hivecommon.api.HS2Buffer.getData(HS2Buffer.java:181)
        at com.cloudera.hive.hivecommon.api.HS2Client.getData(HS2Client.java:705)
        at com.cloudera.hive.hivecommon.dataengine.HiveJDBCResultSet.getData(HiveJDBCResultSet.java:265)
        at com.cloudera.hive.jdbc.common.SForwardResultSet.getData(SForwardResultSet.java:4590)
        at com.cloudera.hive.jdbc.common.SForwardResultSet.getString(SForwardResultSet.java:2138)
        at xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx.utils.FileUtils.writeResultSetToFile(FileUtils.java:153)
        at xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx.utils.DatabaseUtils.executeQueryAndWriteToFile(DatabaseUtils.java:135)
        at xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx.processor.JDBCProcessor.processCustomQueries(JDBCProcessor.java:84)
        at xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx.processor.JDBCProcessor.process(JDBCProcessor.java:47)
        at xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx.main.App.main(App.java:49)

"VM Thread" prio=10 tid=0x00007f2eb8077000 nid=0x255c runnable

"GC task thread#0 (ParallelGC)" prio=10 tid=0x00007f2eb8021000 nid=0x2558 runnable

"GC task thread#1 (ParallelGC)" prio=10 tid=0x00007f2eb8023000 nid=0x2559 runnable

"GC task thread#2 (ParallelGC)" prio=10 tid=0x00007f2eb8025000 nid=0x255a runnable

"GC task thread#3 (ParallelGC)" prio=10 tid=0x00007f2eb8027000 nid=0x255b runnable

"VM Periodic Task Thread" prio=10 tid=0x00007f2eb80bf800 nid=0x2563 waiting on condition

JNI global references: 234

Heap
 PSYoungGen      total 160768K, used 126191K [0x00000000f5500000, 0x00000000ffd80000, 0x0000000100000000)
  eden space 149504K, 84% used [0x00000000f5500000,0x00000000fd033ef8,0x00000000fe700000)
  from space 11264K, 0% used [0x00000000fe700000,0x00000000fe708000,0x00000000ff200000)
  to   space 10240K, 0% used [0x00000000ff380000,0x00000000ff380000,0x00000000ffd80000)
 ParOldGen       total 349184K, used 192583K [0x00000000e0000000, 0x00000000f5500000, 0x00000000f5500000)
  object space 349184K, 55% used [0x00000000e0000000,0x00000000ebc11df8,0x00000000f5500000)
 PSPermGen       total 21504K, used 12023K [0x00000000d5a00000, 0x00000000d6f00000, 0x00000000e0000000)
  object space 21504K, 55% used [0x00000000d5a00000,0x00000000d65bde38,0x00000000d6f00000)

Well, not an answer, but some remarks that need more space than a comment:

1) You're using an ArrayList that starts with zero capacity, so it has to resize its backing array quite often while it fills, because the capacity grows in small steps, for example 0 -> 1 -> 2 -> 4 -> 7 -> 11 -> 17 (the exact steps depend on the JDK version). So instead of new ArrayList<String>(0) use new ArrayList<String>(columnCount) for both headerList and valueList. And inside the loop, instead of valueList = new ArrayList<String>(0); use valueList.clear();

2) You should use try-with-resources; it is a lot easier to handle than the manual finally block.

3) The code, or rather what happens behind the curtains, does a lot of parsing: parsing the SQL result, then 'parsing' the data (which has to be escaped in a CSV-compatible way). Do not underestimate this! Even more so if this CSVPrinter does some additional formatting, like extra spacing to keep the text file looking like a table.

4) In addition, Apache libraries are not known for their speed, nor for their resource efficiency!

5) Manually flushing (as mentioned in the comments) is not good for performance!

6) From what it looks like, your method writeResultSetToFile() is called from a loop. I do not know for sure, but when you tell us about '3% usage', that sounds like a sustained task, i.e. a loop. So, SUPPOSING there is a loop, and that loop is directly responsible for the 3% CPU usage, it seems to iterate some thousand times per second. If you now, on each call, retrieve and store TWO MILLION lines of text, this will slow down the whole program CONSIDERABLY. Maybe you should not write that file so often? Maybe once a minute suffices? However often you write it, you might think about using a decoupled thread that runs in parallel to the main loop.
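
For point 1, a minimal sketch of the presized, reused list. It assumes Java 8 or later (for String.join) and uses a hypothetical String[][] in place of the ResultSet; the class and method names are made up for illustration. In the original method, calling valueList.clear() after csvPrinter.printRecord(valueList) should be safe on the assumption that the printer writes the values out before returning.

```java
import java.util.ArrayList;
import java.util.List;

class RowBufferDemo {

    /** Copies each source row through a single reusable, presized buffer. */
    static List<String> joinRows(String[][] rows, int columnCount) {
        List<String> joined = new ArrayList<String>(rows.length);
        // Allocate once with the known column count so the backing array never has to grow.
        List<String> valueList = new ArrayList<String>(columnCount);
        for (String[] row : rows) {
            valueList.clear(); // keeps the backing array, only resets the size
            for (int i = 0; i < columnCount; i++) {
                valueList.add(row[i]);
            }
            // Stand-in for csvPrinter.printRecord(valueList)
            joined.add(String.join(",", valueList));
        }
        return joined;
    }

    public static void main(String[] args) {
        String[][] rows = { {"1", "alice"}, {"2", "bob"} };
        for (String line : joinRows(rows, 2)) {
            System.out.println(line);
        }
    }
}
```

This removes one list allocation per row and every intermediate array resize, although it is unlikely to be the whole story behind 100% CPU.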
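
For point 2, a sketch of try-with-resources using only a BufferedWriter, so that it runs without extra libraries. The class and method names are hypothetical, and roundTrip exists only to make the demo checkable (it uses UncheckedIOException, so Java 8 or later). In the original method the CSVPrinter could be declared as a second resource after the writer, assuming it implements Closeable as it does in Commons CSV; resources are closed in reverse order of declaration, so the printer would be closed before the writer.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

class TryWithResourcesDemo {

    /** Writes each line to the file; the writer is closed automatically, even on failure. */
    static void writeLines(Path file, List<String> lines) throws IOException {
        try (BufferedWriter writer = Files.newBufferedWriter(file, StandardCharsets.UTF_8)) {
            for (String line : lines) {
                writer.write(line);
                writer.newLine();
            }
        } // writer.close() runs here, which also flushes the buffer
    }

    /** Writes the lines to a temporary file and reads them back. */
    static List<String> roundTrip(List<String> lines) {
        try {
            Path file = Files.createTempFile("demo", ".csv");
            writeLines(file, lines);
            return Files.readAllLines(file, StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(Arrays.asList("id,name", "1,alice")));
    }
}
```

No explicit finally block and no null checks are needed, and closing the writer flushes it, which also covers point 5.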
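
For point 3, to give a feeling for the per-value work that "escaping in a CSV-compatible way" implies, here is a rough sketch of minimal quoting. This is NOT how Commons CSV is implemented; it is a hypothetical helper that quotes a value only when it contains the delimiter, a double quote, or a line break, and doubles embedded quotes.

```java
class CsvEscapeDemo {

    /** Quotes a value only if it contains the delimiter, a quote, or a line break. */
    static String escape(String value, char delimiter) {
        if (value == null) {
            return "";
        }
        boolean needsQuotes = false;
        // Every character of every value has to be inspected at least once.
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            if (c == delimiter || c == '"' || c == '\n' || c == '\r') {
                needsQuotes = true;
                break;
            }
        }
        if (!needsQuotes) {
            return value;
        }
        // Embedded quotes are doubled, then the whole value is wrapped in quotes.
        return "\"" + value.replace("\"", "\"\"") + "\"";
    }

    public static void main(String[] args) {
        System.out.println(escape("plain", ','));
        System.out.println(escape("a,b", ','));
        System.out.println(escape("say \"hi\"", ','));
    }
}
```

Multiplied by columnCount values per row and millions of rows, even a simple scan like this adds up.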
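
For point 6, a sketch of the decoupled-thread idea using a single-thread ExecutorService (Java 8 or later because of the lambda). exportRows is a hypothetical stand-in for the real query and file write, and the demo blocks on the result only so that it can print something; a real main loop would keep working and check the Future later.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class BackgroundExportDemo {

    /** Hypothetical stand-in for the expensive export; it only counts non-empty rows. */
    static int exportRows(String[] rows) {
        int written = 0;
        for (String row : rows) {
            if (!row.isEmpty()) {
                written++;
            }
        }
        return written;
    }

    /** Runs the export on a separate thread and returns the number of rows it handled. */
    static int exportInBackground(final String[] rows) {
        ExecutorService exporter = Executors.newSingleThreadExecutor();
        try {
            Callable<Integer> task = () -> exportRows(rows);
            Future<Integer> pending = exporter.submit(task);
            // The main loop could keep doing its own work here instead of blocking.
            return pending.get(); // wait for the export to finish
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for export", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("export failed", e.getCause());
        } finally {
            exporter.shutdown(); // lets the worker thread exit once the task is done
        }
    }

    public static void main(String[] args) {
        System.out.println(exportInBackground(new String[] {"1,alice", "", "2,bob"}));
    }
}
```

A single worker thread also guarantees that two exports never write to the same file at the same time.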
