Java process stuck in SIGTERM handler deleting a file

I have a Java process that is failing to respond to SIGTERM. This is a sporadic issue, but it occurred on several servers at the same time. I know that I can just kill it with SIGKILL, but I want to understand why it is not responding to the SIGTERM signals that have been sent to it. Looking at the output of jstack, the following threads appear to be stuck:

"SIGTERM handler" daemon prio=10 tid=0x00007fcb30006000 nid=0x259c waiting for monitor entry [0x00007fc98cd2c000]
   java.lang.Thread.State: BLOCKED (on object monitor)
        at java.lang.Shutdown.exit(Shutdown.java:212)
        - waiting to lock <0x00000005fc7cdc60> (a java.lang.Class for java.lang.Shutdown)
        at java.lang.Terminator$1.handle(Terminator.java:52)
        at sun.misc.Signal$1.run(Signal.java:212)
        at java.lang.Thread.run(Thread.java:745)

"SIGTERM handler" daemon prio=10 tid=0x00007fcb30005000 nid=0x30cb waiting for monitor entry [0x00007fc982282000]
   java.lang.Thread.State: BLOCKED (on object monitor)
        at java.lang.Shutdown.exit(Shutdown.java:212)
        - waiting to lock <0x00000005fc7cdc60> (a java.lang.Class for java.lang.Shutdown)
        at java.lang.Terminator$1.handle(Terminator.java:52)
        at sun.misc.Signal$1.run(Signal.java:212)
        at java.lang.Thread.run(Thread.java:745)

"SIGTERM handler" daemon prio=10 tid=0x00007fcb30004000 nid=0x1305 waiting for monitor entry [0x00007fc982a8a000]
   java.lang.Thread.State: BLOCKED (on object monitor)
        at java.lang.Shutdown.exit(Shutdown.java:212)
        - waiting to lock <0x00000005fc7cdc60> (a java.lang.Class for java.lang.Shutdown)
        at java.lang.Terminator$1.handle(Terminator.java:52)
        at sun.misc.Signal$1.run(Signal.java:212)
        at java.lang.Thread.run(Thread.java:745)

"SIGTERM handler" daemon prio=10 tid=0x00007fcb30003000 nid=0xa3e runnable [0x00007fc982585000]
   java.lang.Thread.State: RUNNABLE
        at java.io.UnixFileSystem.delete0(Native Method)
        at java.io.UnixFileSystem.delete(UnixFileSystem.java:265)
        at java.io.File.delete(File.java:1035)
        at java.io.DeleteOnExitHook.runHooks(DeleteOnExitHook.java:80)
        at java.io.DeleteOnExitHook$1.run(DeleteOnExitHook.java:49)
        at java.lang.Shutdown.runHooks(Shutdown.java:123)
        at java.lang.Shutdown.sequence(Shutdown.java:167)
        at java.lang.Shutdown.exit(Shutdown.java:212)
        - locked <0x00000005fc7cdc60> (a java.lang.Class for java.lang.Shutdown)
        at java.lang.Terminator$1.handle(Terminator.java:52)
        at sun.misc.Signal$1.run(Signal.java:212)
        at java.lang.Thread.run(Thread.java:745)

There are many other threads, but nothing in them looks out of the ordinary. The thread holding the lock appears to be deleting a file, and the other SIGTERM handler threads are blocked waiting for that lock. Does anyone know what this file may be and why its deletion may be stuck?

It is deleting a file that was marked for deletion on JVM termination with File.deleteOnExit. This is probably a temporary file created by your code or by a library you use. It is not clear to me why the deletion would hang. Maybe the file is on a very slow file system (a remote file system), or some I/O device is dead?
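For illustration, here is a minimal sketch of how a file ends up on the delete-on-exit list. The class name DeleteOnExitDemo, the helper createScratchFile and the "scratch-" prefix are made up for this example and are not taken from the question. The JDK deletes such files from a shutdown hook, which corresponds to the DeleteOnExitHook.runHooks frame in the stack trace above.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical demo class; the names here are illustrative only.
final class DeleteOnExitDemo {

    // Creates a temp file in java.io.tmpdir and registers it for deletion
    // when the JVM shuts down (normal exit or SIGTERM, but not SIGKILL).
    static File createScratchFile() {
        try {
            File f = File.createTempFile("scratch-", ".tmp");
            f.deleteOnExit();
            return f;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        File f = createScratchFile();
        System.out.println("Registered for delete-on-exit: " + f.getAbsolutePath());
        // The file still exists here; it is only removed during shutdown.
        System.out.println("Exists while the JVM runs: " + f.exists());
    }
}
```

If the directory behind java.io.tmpdir (or wherever such a file lives) sits on a hung mount, the delete performed at shutdown could block in the same way as in the question.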

Finding out which file this is could be complicated if you don't find a likely candidate when looking for callers of deleteOnExit in your code.
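If searching the code does not help, one option is to log the registered paths from inside the running JVM before shutdown. The sketch below is an assumption-heavy diagnostic, not a supported API: it reads the private static field files of the internal class java.io.DeleteOnExitHook via reflection, which is an OpenJDK implementation detail. On newer JDKs this access is likely to be refused unless the JVM is started with --add-opens java.base/java.io=ALL-UNNAMED, in which case the method simply returns an empty list. The class name DeleteOnExitInspector is hypothetical.

```java
import java.io.File;
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.List;

// Hypothetical diagnostic helper; relies on JDK internals that may change.
final class DeleteOnExitInspector {

    // Returns a snapshot of the paths registered via File.deleteOnExit,
    // or an empty list if the internal field cannot be read.
    static List<String> registeredPaths() {
        try {
            // Assumption: OpenJDK keeps the paths in DeleteOnExitHook.files.
            Class<?> hook = Class.forName("java.io.DeleteOnExitHook");
            Field files = hook.getDeclaredField("files");
            files.setAccessible(true);
            // The JDK guards the set with the class lock, so copy under it.
            synchronized (hook) {
                Object value = files.get(null);
                if (!(value instanceof Collection)) {
                    return Collections.emptyList(); // null once shutdown has begun
                }
                List<String> out = new ArrayList<>();
                for (Object o : (Collection<?>) value) {
                    out.add(String.valueOf(o));
                }
                return out;
            }
        } catch (ReflectiveOperationException | RuntimeException e) {
            // Field missing, or access denied by the module system.
            return Collections.emptyList();
        }
    }

    public static void main(String[] args) throws Exception {
        File f = File.createTempFile("inspect-", ".tmp");
        f.deleteOnExit();
        for (String path : registeredPaths()) {
            System.out.println("delete-on-exit: " + path);
        }
    }
}
```

Calling registeredPaths() from a periodic log statement or a debugging endpoint would show which paths the shutdown hook is going to try to delete, so a path on a slow or dead mount can be spotted before SIGTERM is sent.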

You could check for likely candidates in the list of open files of your JVM (with lsof), but the file might not be open at that moment.

Probably the best bet is to attach strace to your JVM with strace -e unlink,unlinkat -p <PID> right before sending SIGTERM and check which files the JVM attempts to delete. Because the deletion runs on a signal handler thread rather than on the thread whose ID equals the PID, adding -f (which makes strace attach to every thread of the process) is likely needed to see the call.

