
Ambiguities in IBM JRE gencon policy and verbose:gc output

I'm helping a project tune its application server environments, and I'm seeing some confusing output in the verbose:gc logs from the IBM JRE, which I know much less well than HotSpot. In many cases the output appears to contradict what I've seen in the vendor's documentation. The application runs on WebSphere, on IBM's 1.7_64 JRE. This particular run used the gencon GC policy with a 4 GB heap. Here is a sample of the verbose:gc output for a global GC, which is the source of several ambiguities detailed below:

<exclusive-start id="8574" timestamp="2015-09-08T22:48:45.819" intervalms="2919.893">
    <response-info timems="0.229" idlems="0.069" threads="43" lastid="0000000005FF6800" lastname="WebContainer : 37" />
</exclusive-start>
<sys-start id="8575" timestamp="2015-09-08T22:48:45.820" intervalms="3028473.245" />
<cycle-start id="8576" type="global" contextid="0" timestamp="2015-09-08T22:48:45.820" intervalms="3028473.281" />
<gc-start id="8577" type="global" contextid="8576" timestamp="2015-09-08T22:48:45.820">
    <mem-info id="8578" free="462493104" total="4155310080" percent="11">
        <mem type="nursery" free="80236904" total="934084608" percent="8" />
        <mem type="tenure" free="382256200" total="3221225472" percent="11">
            <mem type="soa" free="221195336" total="3060164608" percent="7" />
            <mem type="loa" free="161060864" total="161060864" percent="100" />
        </mem>
        <remembered-set count="16780" />
    </mem-info>
</gc-start>
<allocation-stats totalBytes="745771152" >
    <allocated-bytes non-tlh="193377184" tlh="552393968" />
    <largest-consumer threadName="WebContainer : 49" threadId="000000000659E600" bytes="156075608" />
</allocation-stats>
<gc-op id="8579" type="mark" timems="609.025" contextid="8576" timestamp="2015-09-08T22:48:46.429">
    <trace-info objectcount="9486509" scancount="7257956" scanbytes="246533016" />
    <finalization candidates="1528" enqueued="411" />
    <ownableSynchronizers candidates="32" cleared="0" />
    <references type="soft" candidates="63260" cleared="0" enqueued="0" dynamicThreshold="30" maxThreshold="32" />
    <references type="weak" candidates="27120" cleared="11" enqueued="0" />
    <references type="phantom" candidates="17361" cleared="7885" enqueued="7885" />
    <stringconstants candidates="153217" cleared="105"    />
</gc-op>
<gc-op id="8580" type="classunload" timems="1.114" contextid="8576" timestamp="2015-09-08T22:48:46.430">
    <classunload-info classloadercandidates="4306" classloadersunloaded="26" classesunloaded="26" quiescems="0.000" setupms="0.955" scanms="0.108" postms="0.051" />
</gc-op>
<gc-op id="8581" type="sweep" timems="18.520" contextid="8576" timestamp="2015-09-08T22:48:46.449" />
<gc-op id="8582" type="compact" timems="2209.475" contextid="8576" timestamp="2015-09-08T22:48:48.658">
    <compact-info movecount="9482641" movebytes="539414400" reason="compact on aggressive collection" />
</gc-op>
<gc-end id="8583" type="global" contextid="8576" durationms="2839.085" timestamp="2015-09-08T22:48:48.659">
    <mem-info id="8584" free="3615704600" total="4155310080" percent="87">
        <mem type="nursery" free="834618336" total="934084608" percent="89" />
        <mem type="tenure" free="2781086264" total="3221225472" percent="86">
            <mem type="soa" free="2620025400" total="3060164608" percent="85" />
            <mem type="loa" free="161060864" total="161060864" percent="100" />
        </mem>
        <pending-finalizers system="367" default="44" reference="7885" classloader="0" />
        <remembered-set count="15785" />
    </mem-info>
</gc-end>
<cycle-end id="8585" type="global" contextid="8576" timestamp="2015-09-08T22:48:48.659" />
<sys-end id="8586" timestamp="2015-09-08T22:48:48.659" />
<exclusive-end id="8587" timestamp="2015-09-08T22:48:48.659" durationms="2839.704" />

The issues include:

1) The timestamps for GC operations (the gc-op elements) seem to record when the operation finishes, not when it starts. IBM's documentation implies that the timestamp is when the operation occurred ( http://www-01.ibm.com/support/knowledgecenter/SSYKE2_7.0.0/com.ibm.java.zos.71.doc/diag/tools/gcpd_verbose_operation.html ). However, the timestamps and the durations (the timems attributes) don't add up under that reading. They do add up if the timestamp is interpreted as when the operation finished. To illustrate, these are the operations performed in the example output above, with their reported durations and timestamps (I've rounded the numbers):

operation - duration - timestamp
    mark - 609ms - 48:46.429
    classunload - 1ms - 48:46.430
    sweep - 19ms - 48:46.449
    compact - 2209ms - 48:48.658

As one can see, the timestamps don't make sense if they're interpreted as start times, but they do if interpreted as end times. For instance, the classunload timestamp equals the mark timestamp plus the classunload duration (not plus the mark duration, as it would if the timestamps indicated when the operations started). Similarly, the sweep timestamp equals the classunload timestamp plus the sweep duration.

Can someone confirm whether this is expected behavior, or am I simply reading this incorrectly?
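The arithmetic in issue 1 can be checked mechanically. The following is a minimal sketch, assuming the timestamp attribute marks the end of each gc-op; it subtracts timems to derive a start time and prints both. The `derive_starts` helper and the `<log>` wrapper element are illustrative names, not part of the IBM log format, and the gc-op child elements are omitted for brevity.

```python
from datetime import datetime, timedelta
import xml.etree.ElementTree as ET

# gc-op elements copied from the log above (child elements omitted).
LOG = """
<gc-op id="8579" type="mark" timems="609.025" contextid="8576" timestamp="2015-09-08T22:48:46.429" />
<gc-op id="8580" type="classunload" timems="1.114" contextid="8576" timestamp="2015-09-08T22:48:46.430" />
<gc-op id="8581" type="sweep" timems="18.520" contextid="8576" timestamp="2015-09-08T22:48:46.449" />
<gc-op id="8582" type="compact" timems="2209.475" contextid="8576" timestamp="2015-09-08T22:48:48.658" />
"""

TS_FORMAT = "%Y-%m-%dT%H:%M:%S.%f"


def derive_starts(fragment):
    """Return (type, start, end) per gc-op, treating timestamp as the END time."""
    # The fragment has no single root element, so wrap it in a hypothetical one.
    root = ET.fromstring("<log>" + fragment + "</log>")
    rows = []
    for op in root.iter("gc-op"):
        end = datetime.strptime(op.get("timestamp"), TS_FORMAT)
        start = end - timedelta(milliseconds=float(op.get("timems")))
        rows.append((op.get("type"), start, end))
    return rows


rows = derive_starts(LOG)
for name, start, end in rows:
    print(f"{name:12s} start={start.time()} end={end.time()}")
```

Under this reading, each derived start lands within a millisecond of the previous operation's timestamp, and the derived mark start lands within a millisecond of the gc-start timestamp (22:48:45.820).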

2) Another issue is that the entirety of every operation in the log appears to be stop-the-world from start to finish. I expected the scavenge GCs to be fully stop-the-world; however, all the documentation I've seen from IBM indicates that at least part of the mark phase of the global GC should be done concurrently. That is not what is occurring in the GC log entry above. In this example, everything appears to be done inside the pause, and no part of the mark or the sweep is done concurrently. Per (older) documentation, at least part of the mark should be done concurrently rather than entirely within the stop-the-world pause ( http://www.ibm.com/developerworks/java/library/j-ibmjava2/ ). Is there a setting to make global GCs of the tenured generation run part of the cycle concurrently? I know this is done in the IBM JRE's optavgpause collector as well as the HotSpot concurrent collector, neither of which is particularly new technology, so it's not clear why gencon forces a full stop-the-world pause every time it has to do a global collection. This matters because the 2.8-second pause shown above is going to make us run afoul of our SLAs with the client.
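As a rough check on where the 2.8 seconds go, the per-phase timems values can be compared against the durationms on exclusive-end. This is only a sketch over the numbers in the log above; `phase_shares` is an illustrative helper name.

```python
# Phase durations (timems, in ms) and the total pause (exclusive-end durationms),
# copied from the log above.
PHASES_MS = {
    "mark": 609.025,
    "classunload": 1.114,
    "sweep": 18.520,
    "compact": 2209.475,
}
PAUSE_MS = 2839.704


def phase_shares(phases, pause_ms):
    """Return each phase's fraction of the total stop-the-world pause."""
    return {name: ms / pause_ms for name, ms in phases.items()}


shares = phase_shares(PHASES_MS, PAUSE_MS)
for name, share in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:12s} {share:6.1%}")
```

On these numbers the compact phase, which the log attributes to "compact on aggressive collection", accounts for most of the pause, with mark a distant second.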

3) In the global GC above, the nursery occupancy numbers show that the nursery is also being collected during this operation (it is only 8% free before the GC and 89% free after). However, I don't see any information in the output about a GC occurring in the nursery. I would expect to see a scavenge operation logged here. Does anyone know why this is missing? Is it because, during a global GC, the nursery is collected via the same heap-walking (mark-sweep-compact) approach as the tenured space?

You're right that you're seeing stop-the-world collections, as indicated by the exclusive-start and exclusive-end elements covering the whole cycle. What I find interesting is that you're seeing a global GC at a time when the heap is not even fully allocated.

The reason, I think, is the sys-start and sys-end entries, which seem to indicate that something in your application (or a remote GC) is triggering System.gc(). You should get this resolved first and then collect a new set of GC logs.
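To see how often explicit collections occur across a whole log, the sys-start elements can be listed. The sketch below assumes that every sys-start corresponds to a System.gc() request and that intervalms is the time since the previous such event; both are readings of the log, not confirmed semantics. `explicit_gc_events` and the `<log>` wrapper are illustrative names.

```python
import xml.etree.ElementTree as ET

# Elements copied from the log above; only the ones relevant to the check.
SAMPLE = """
<sys-start id="8575" timestamp="2015-09-08T22:48:45.820" intervalms="3028473.245" />
<cycle-start id="8576" type="global" contextid="0" timestamp="2015-09-08T22:48:45.820" intervalms="3028473.281" />
<cycle-end id="8585" type="global" contextid="8576" timestamp="2015-09-08T22:48:48.659" />
<sys-end id="8586" timestamp="2015-09-08T22:48:48.659" />
"""


def explicit_gc_events(fragment):
    """Return (id, timestamp, intervalms) for every sys-start element."""
    # Wrap the fragment in a hypothetical root so it parses as one document.
    root = ET.fromstring("<log>" + fragment + "</log>")
    return [
        (e.get("id"), e.get("timestamp"), float(e.get("intervalms")))
        for e in root.iter("sys-start")
    ]


events = explicit_gc_events(SAMPLE)
for event_id, timestamp, interval_ms in events:
    print(f"sys-start id={event_id} at {timestamp}, "
          f"{interval_ms / 60000:.1f} min since the previous one")
```

If the explicit calls turn out to be unwanted, IBM's J9 documentation describes a -Xdisableexplicitgc option; whether disabling them is appropriate for this application is a separate judgment.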

I noticed the question is old, but I'm adding answers to Q1 and Q3 for reference.

1. You are correct about the timestamp. The gc-op tag reports what has already been performed in that GC operation phase, which is why the timestamp marks the end of the operation.

3. During a global GC, both tenure and scavenge collections take place. These are not recorded individually in the GC log during a global collection.
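The mem-info values before and after the cycle are consistent with both areas being reclaimed in the same global collection. A small sketch using the free and total byte counts from the gc-start and gc-end elements in the question (`reclaimed` is an illustrative helper name):

```python
# (free bytes, total bytes) per area, copied from the mem-info elements
# under gc-start (BEFORE) and gc-end (AFTER) in the question's log.
BEFORE = {"nursery": (80236904, 934084608), "tenure": (382256200, 3221225472)}
AFTER = {"nursery": (834618336, 934084608), "tenure": (2781086264, 3221225472)}


def reclaimed(before, after):
    """Return (bytes freed, percent free before, percent free after) per area."""
    result = {}
    for area, (free_before, total) in before.items():
        free_after, _ = after[area]
        result[area] = (
            free_after - free_before,
            100 * free_before // total,  # truncated, matching the log's percent values
            100 * free_after // total,
        )
    return result


summary = reclaimed(BEFORE, AFTER)
for area, (freed, pct_before, pct_after) in summary.items():
    print(f"{area:8s} freed {freed / 2**20:8.1f} MiB "
          f"({pct_before}% free -> {pct_after}% free)")
```

Both the nursery and the tenure area gain free space between gc-start and gc-end, even though only mark, classunload, sweep and compact operations are logged.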
