
Thread stuck in Thread.join(), even though other thread has terminated

Our Java application starts a worker thread (using Thread.start()). Shortly thereafter it calls Thread.join() on the worker thread. The worker thread does some stuff and terminates. The first thread exits the call to join() and goes on its merry way. Standard stuff:

Thread t = new WorkerThread();
t.start();

// Blah blah

t.join();

class WorkerThread extends Thread {
    public void run() {
        // Do some stuff
    }
}

At least that's how it's supposed to work, and how it does work in every case we can reproduce. We have one customer, however, who is persistently running into trouble.

Looking at the threads using PsiProbe, they see that the worker thread is created. It runs for a while, but after some time it disappears from the list of threads. This happens at an unexpected time (based on the timing of other events related to the worker thread). The main thread never gets out of the join() call.

This would seem to break join()'s contract, and implies to me some sort of JVM-level error. Has anybody witnessed behavior like this, or have any idea what could cause it?
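For reference, here is a minimal sketch of the contract being relied on: join() should return once the worker has terminated, even when the worker dies from an uncaught exception rather than returning normally from run(). The class name `JoinAfterCrashDemo`, the helper `runAndJoin()`, and the simulated failure are all hypothetical and are not part of our application.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical demo class; not taken from the application described above.
class JoinAfterCrashDemo {

    /**
     * Starts a worker that dies from an uncaught exception, joins it,
     * and returns the exception seen by the uncaught-exception handler.
     */
    static Throwable runAndJoin() {
        AtomicReference<Throwable> failure = new AtomicReference<>();

        Thread worker = new Thread(() -> {
            // Simulated abnormal termination of the worker.
            throw new IllegalStateException("simulated worker failure");
        }, "worker");

        // Record the failure so it is not only printed to stderr.
        worker.setUncaughtExceptionHandler((thread, error) -> failure.set(error));
        worker.start();

        try {
            // Returns once the worker has terminated, normally or not.
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
            throw new RuntimeException("interrupted while joining", e);
        }
        return failure.get();
    }

    public static void main(String[] args) {
        Throwable cause = runAndJoin();
        System.out.println("join() returned; worker died with: " + cause);
    }
}
```

If a worker can vanish from the thread list like this without join() ever returning in the caller, either the caller is not actually inside join(), or something much stranger is going on.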

EDIT 3-3-11:

I'm still waiting for conclusive data from the customer, but it seems likely that I didn't really know what I thought I knew: the main thread is likely not blocking in join() at all, but at a point prior to it.
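One way to check where a thread is really parked is to print its state and stack from another thread. The sketch below does that with Thread.getState() and Thread.getStackTrace(); the class `BlockedThreadProbe` and its `describe()` helper are hypothetical names used only for illustration. A thread that is genuinely blocked in join() should report a waiting state with a java.lang.Thread.join frame near the top of its stack; a thread blocked earlier would show some other frame.

```java
// Hypothetical helper; a thread dump (e.g. from jstack) gives similar information.
class BlockedThreadProbe {

    /** Returns the target thread's name, state, and current stack as text. */
    static String describe(Thread target) {
        StringBuilder sb = new StringBuilder();
        sb.append(target.getName())
          .append(" state=").append(target.getState()).append('\n');
        for (StackTraceElement frame : target.getStackTrace()) {
            sb.append("    at ").append(frame).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) throws InterruptedException {
        Thread mainThread = Thread.currentThread();

        // Stand-in for the real worker: it just sleeps until interrupted.
        Thread worker = new Thread(() -> {
            try {
                Thread.sleep(60_000);
            } catch (InterruptedException e) {
                // Interrupted by the probe below: just finish.
            }
        }, "worker");
        worker.start();

        Thread probe = new Thread(() -> {
            try {
                Thread.sleep(500); // give the main thread time to reach join()
            } catch (InterruptedException e) {
                return;
            }
            System.out.println(describe(mainThread));
            worker.interrupt(); // let the demo end
        }, "probe");
        probe.start();

        worker.join();
    }
}
```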

Thanks to all for the ideas, anyway.

A quick search of the Java Bugs Database didn't throw up anything that matches your symptoms. But it would be worth your while to do a more extensive search.


However, it is worth noting that a JVM bug is only one of many possible theories (see the comments), and there is very little evidence for any theory at this stage. If I were you, I would:

  1. figure out what the customer's platform is
  2. do a search of the Java bugs database for any known bugs that seem to match the observed symptoms (such as they are).
  3. if you find a bug that looks like a "hit", evaluate it further, try to confirm that it is really the customer's problem, and see if there are any ways to mitigate it.
  4. in addition ...
    • look at the other possible theories; e.g. see above
    • try to figure out what is different about the customer's installation compared with others who don't have this problem
    • add more monitoring, and try to get the customer to use that version.

(On the last bullet point, it may be necessary to point out the obvious to the customer. If they want the bug fixed, they may need to do extra stuff to help you track down the bug, like installing and running an "experimental" version of the application for a bit.)
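As a rough idea of what "more monitoring" could look like, here is a sketch that replaces a bare join() with a loop of timed joins and logs the worker's state on each pass, so the logs show whether the caller ever reached the join and what the worker was doing at the time. The class `MonitoredJoin`, the method `joinWithLogging()`, its parameters, and the use of System.err for logging are all assumptions for illustration; a real build would use the application's own logger.

```java
// Hypothetical monitoring helper; names and logging destination are assumptions.
class MonitoredJoin {

    /**
     * Waits for an already-started worker to finish, logging its state
     * roughly every pollMillis. Returns true if the worker terminated
     * within maxMillis, false on timeout or if the caller is interrupted.
     */
    static boolean joinWithLogging(Thread worker, long pollMillis, long maxMillis) {
        if (pollMillis <= 0) {
            // join(0) would wait forever, which defeats the purpose.
            throw new IllegalArgumentException("pollMillis must be positive");
        }
        long deadline = System.nanoTime() + maxMillis * 1_000_000L;
        try {
            while (worker.isAlive()) {
                long remaining = (deadline - System.nanoTime()) / 1_000_000L;
                if (remaining <= 0) {
                    System.err.println("gave up waiting for " + worker.getName()
                            + ", state=" + worker.getState());
                    return false;
                }
                worker.join(Math.min(pollMillis, remaining));
                System.err.println("waiting for " + worker.getName()
                        + ", state=" + worker.getState());
            }
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
            return false;
        }
    }

    public static void main(String[] args) {
        Thread worker = new Thread(() -> {
            try {
                Thread.sleep(300); // stand-in for "do some stuff"
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "worker");
        worker.start();
        boolean finished = joinWithLogging(worker, 100, 5_000);
        System.err.println("worker finished: " + finished);
    }
}
```

The timeout also gives the application a chance to report the problem instead of hanging silently, which may make it easier to persuade the customer to run the instrumented version.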

