
Baseline Profiles - Metrics are not consistent

I'm implementing a Baseline Profile in an app, and when I run my macrobenchmark tests the results are not consistent. The results with baselineprofile.txt are not always better than the results without it.

I'm running the macrobenchmark tests on an emulator with this configuration:

  • Pixel 6
  • Android 11.0 (API 30) x86_64 (Google APIs)
  • 6144MB Internal Storage
  • 1536MB RAM

My baselineprofile.txt file has a size of 7MB.

My results are, for example:

HomeStartupBenchmark_startupNoCompilation[mode=COLD]
timeToInitialDisplayMs   min 344.4,   median 424.3,   max 466.6
Traces: Iteration 0 1 2 3 4
HomeStartupBenchmark_startupFullCompilation[mode=COLD]
timeToInitialDisplayMs   min 355.0,   median 379.1,   max 690.1
Traces: Iteration 0 1 2 3 4
HomeStartupBenchmark_startupPartialBaseline[mode=COLD]
timeToInitialDisplayMs   min 211.4,   median 319.0,   max 363.3
Traces: Iteration 0 1 2 3 4
HomeStartupBenchmark_startupPartialNoBaseline[mode=COLD]
timeToInitialDisplayMs   min 301.1,   median 396.0,   max 479.3
Traces: Iteration 0 1 2 3 4

HomeStartupBenchmark_startupNoCompilation[mode=WARM]
timeToInitialDisplayMs   min 285.1,   median 385.5,   max 435.4
Traces: Iteration 0 1 2 3 4
HomeStartupBenchmark_startupFullCompilation[mode=WARM]
timeToInitialDisplayMs   min 238.1,   median 331.8,   max 360.9
Traces: Iteration 0 1 2 3 4
HomeStartupBenchmark_startupPartialBaseline[mode=WARM]
timeToInitialDisplayMs   min 263.4,   median 338.9,   max 393.4
Traces: Iteration 0 1 2 3 4
HomeStartupBenchmark_startupPartialNoBaseline[mode=WARM]
timeToInitialDisplayMs   min 281.2,   median 373.1,   max 434.0
Traces: Iteration 0 1 2 3 4

HomeStartupBenchmark_startupNoCompilation[mode=HOT]
timeToInitialDisplayMs   min 243.8,   median 252.2,   max 280.8
Traces: Iteration 0 1 2 3 4
HomeFeedStartupBenchmark_startupFullCompilation[mode=HOT]
timeToInitialDisplayMs   min 219.2,   median 245.9,   max 272.7
Traces: Iteration 0 1 2 3 4
HomeStartupBenchmark_startupPartialBaseline[mode=HOT]
timeToInitialDisplayMs   min 175.4,   median 284.6,   max 611.3
Traces: Iteration 0 1 2 3 4
HomeStartupBenchmark_startupPartialNoBaseline[mode=HOT]
timeToInitialDisplayMs   min 253.0,   median 305.9,   max 310.7
Traces: Iteration 0 1 2 3 4

In this run, the startupPartialBaseline result is better than startupPartialNoBaseline in all startup modes (COLD, WARM and HOT), which is fine.

But in a second run I got:

HomeStartupBenchmark_startupNoCompilation[mode=COLD]
timeToInitialDisplayMs   min 510.6,   median 702.0,   max 1,456.0
Traces: Iteration 0 1 2 3 4
HomeStartupBenchmark_startupFullCompilation[mode=COLD]
timeToInitialDisplayMs   min 323.5,   median 425.0,   max 650.1
Traces: Iteration 0 1 2 3 4
HomeStartupBenchmark_startupPartialBaseline[mode=COLD]
timeToInitialDisplayMs   min 277.8,   median 385.7,   max 1,416.7
Traces: Iteration 0 1 2 3 4
HomeStartupBenchmark_startupPartialNoBaseline[mode=COLD]
timeToInitialDisplayMs   min 249.4,   median 319.4,   max 437.9
Traces: Iteration 0 1 2 3 4

HomeStartupBenchmark_startupNoCompilation[mode=WARM]
timeToInitialDisplayMs   min 217.9,   median 301.1,   max 340.5
Traces: Iteration 0 1 2 3 4
HomeStartupBenchmark_startupFullCompilation[mode=WARM]
timeToInitialDisplayMs   min 197.0,   median 269.0,   max 337.6
Traces: Iteration 0 1 2 3 4
HomeStartupBenchmark_startupPartialBaseline[mode=WARM]
timeToInitialDisplayMs   min 214.6,   median 281.5,   max 323.3
Traces: Iteration 0 1 2 3 4
HomeFeedStartupBenchmark_startupPartialNoBaseline[mode=WARM]
timeToInitialDisplayMs   min 246.4,   median 358.8,   max 418.0
Traces: Iteration 0 1 2 3 4

HomeStartupBenchmark_startupNoCompilation[mode=HOT]
timeToInitialDisplayMs   min 254.1,   median 449.9,   max 473.9
Traces: Iteration 0 1 2 3 4
HomeStartupBenchmark_startupFullCompilation[mode=HOT]
timeToInitialDisplayMs   min 391.3,   median 823.7,   max 1,448.4
Traces: Iteration 0 1 2 3 4
HomeStartupBenchmark_startupPartialBaseline[mode=HOT]
timeToInitialDisplayMs   min 206.1,   median 235.2,   max 335.4
Traces: Iteration 0 1 2 3 4
HomeStartupBenchmark_startupPartialNoBaseline[mode=HOT]
timeToInitialDisplayMs   min 237.8,   median 246.8,   max 275.0
Traces: Iteration 0 1 2 3 4

But here the COLD startup with the baseline profile performs worse than without it (median 385.7 ms vs. 319.4 ms).

First, make sure you're using the latest AGP (7.3.0-rc01), macrobenchmark (1.2.0-alpha03), and profileinstaller (1.2.0).
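
As a rough sketch, assuming a Gradle Kotlin DSL build with a separate macrobenchmark module, those versions could be declared as below. The split into three files and the module layout are assumptions about your project; only the artifact versions come from the recommendation above.

```kotlin
// Root build.gradle.kts (assumed layout): pin the Android Gradle Plugin version.
plugins {
    id("com.android.application") version "7.3.0-rc01" apply false
    id("com.android.test") version "7.3.0-rc01" apply false
}

// App module build.gradle.kts (assumed): profileinstaller installs the
// baseline profile that is packaged with the app.
dependencies {
    implementation("androidx.profileinstaller:profileinstaller:1.2.0")
}

// Macrobenchmark module build.gradle.kts (assumed): the benchmark library itself.
dependencies {
    implementation("androidx.benchmark:benchmark-macro-junit4:1.2.0-alpha03")
}
```

The three fragments belong in separate build files; they are shown together here only for brevity.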

The device you're using for benchmarking can be under different load at different times.
I'd also recommend increasing the number of iterations to get more consistent results.
For the samples we chose 5 iterations as a trade-off between speed and reasonably accurate results.
As the mileage for your app might vary, experiment with different numbers to see what works for you. 10 runs can provide more accurate results but will take more time to execute.
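
To make the iteration count concrete, here is a minimal sketch of a cold-startup benchmark that runs 10 iterations instead of 5. The package name `com.example.app` and the `ITERATIONS` constant are placeholders, and the class is only an assumption about how the tests in the question are structured. It needs a connected device or emulator to run.

```kotlin
import androidx.benchmark.macro.BaselineProfileMode
import androidx.benchmark.macro.CompilationMode
import androidx.benchmark.macro.StartupMode
import androidx.benchmark.macro.StartupTimingMetric
import androidx.benchmark.macro.junit4.MacrobenchmarkRule
import androidx.test.ext.junit.runners.AndroidJUnit4
import org.junit.Rule
import org.junit.Test
import org.junit.runner.RunWith

// Placeholder value: raise or lower it to trade run time against stability.
private const val ITERATIONS = 10

@RunWith(AndroidJUnit4::class)
class HomeStartupBenchmark {

    @get:Rule
    val benchmarkRule = MacrobenchmarkRule()

    // Partial compilation that requires the baseline profile to be present.
    @Test
    fun startupPartialBaseline() = startup(
        CompilationMode.Partial(baselineProfileMode = BaselineProfileMode.Require)
    )

    // Partial compilation with the baseline profile disabled, for comparison.
    @Test
    fun startupPartialNoBaseline() = startup(
        CompilationMode.Partial(baselineProfileMode = BaselineProfileMode.Disable)
    )

    private fun startup(compilationMode: CompilationMode) =
        benchmarkRule.measureRepeated(
            packageName = "com.example.app", // placeholder: use your app's package name
            metrics = listOf(StartupTimingMetric()),
            compilationMode = compilationMode,
            startupMode = StartupMode.COLD,
            iterations = ITERATIONS,
            setupBlock = { pressHome() }
        ) {
            // Launches the default activity and waits for the first frame.
            startActivityAndWait()
        }
}
```

With only 5 iterations a single slow launch (such as the 1,416.7 ms maximum in the second run) has a large effect on the reported spread; more iterations make the median less sensitive to such outliers.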
