I'm implementing a Baseline Profile in an app, and when I run my macrobenchmark tests the results are not consistent: the runs that use baselineprofile.txt are not always the faster ones.
I'm running the macrobenchmark test in an emulator with this configuration:
My baselineprofile.txt file has a size of 7MB.
And my results are, for example:
HomeStartupBenchmark_startupNoCompilation[mode=COLD]
timeToInitialDisplayMs min 344.4, median 424.3, max 466.6
Traces: Iteration 0 1 2 3 4
HomeStartupBenchmark_startupFullCompilation[mode=COLD]
timeToInitialDisplayMs min 355.0, median 379.1, max 690.1
Traces: Iteration 0 1 2 3 4
HomeStartupBenchmark_startupPartialBaseline[mode=COLD]
timeToInitialDisplayMs min 211.4, median 319.0, max 363.3
Traces: Iteration 0 1 2 3 4
HomeStartupBenchmark_startupPartialNoBaseline[mode=COLD]
timeToInitialDisplayMs min 301.1, median 396.0, max 479.3
Traces: Iteration 0 1 2 3 4
HomeStartupBenchmark_startupNoCompilation[mode=WARM]
timeToInitialDisplayMs min 285.1, median 385.5, max 435.4
Traces: Iteration 0 1 2 3 4
HomeStartupBenchmark_startupFullCompilation[mode=WARM]
timeToInitialDisplayMs min 238.1, median 331.8, max 360.9
Traces: Iteration 0 1 2 3 4
HomeStartupBenchmark_startupPartialBaseline[mode=WARM]
timeToInitialDisplayMs min 263.4, median 338.9, max 393.4
Traces: Iteration 0 1 2 3 4
HomeStartupBenchmark_startupPartialNoBaseline[mode=WARM]
timeToInitialDisplayMs min 281.2, median 373.1, max 434.0
Traces: Iteration 0 1 2 3 4
HomeStartupBenchmark_startupNoCompilation[mode=HOT]
timeToInitialDisplayMs min 243.8, median 252.2, max 280.8
Traces: Iteration 0 1 2 3 4
HomeFeedStartupBenchmark_startupFullCompilation[mode=HOT]
timeToInitialDisplayMs min 219.2, median 245.9, max 272.7
Traces: Iteration 0 1 2 3 4
HomeStartupBenchmark_startupPartialBaseline[mode=HOT]
timeToInitialDisplayMs min 175.4, median 284.6, max 611.3
Traces: Iteration 0 1 2 3 4
HomeStartupBenchmark_startupPartialNoBaseline[mode=HOT]
timeToInitialDisplayMs min 253.0, median 305.9, max 310.7
Traces: Iteration 0 1 2 3 4
In this run, startupPartialBaseline has a better median than startupPartialNoBaseline in every startup mode (COLD, WARM and HOT), which is what I expect.
But in a second run I got:
HomeStartupBenchmark_startupNoCompilation[mode=COLD]
timeToInitialDisplayMs min 510.6, median 702.0, max 1,456.0
Traces: Iteration 0 1 2 3 4
HomeStartupBenchmark_startupFullCompilation[mode=COLD]
timeToInitialDisplayMs min 323.5, median 425.0, max 650.1
Traces: Iteration 0 1 2 3 4
HomeStartupBenchmark_startupPartialBaseline[mode=COLD]
timeToInitialDisplayMs min 277.8, median 385.7, max 1,416.7
Traces: Iteration 0 1 2 3 4
HomeStartupBenchmark_startupPartialNoBaseline[mode=COLD]
timeToInitialDisplayMs min 249.4, median 319.4, max 437.9
Traces: Iteration 0 1 2 3 4
HomeStartupBenchmark_startupNoCompilation[mode=WARM]
timeToInitialDisplayMs min 217.9, median 301.1, max 340.5
Traces: Iteration 0 1 2 3 4
HomeStartupBenchmark_startupFullCompilation[mode=WARM]
timeToInitialDisplayMs min 197.0, median 269.0, max 337.6
Traces: Iteration 0 1 2 3 4
HomeStartupBenchmark_startupPartialBaseline[mode=WARM]
timeToInitialDisplayMs min 214.6, median 281.5, max 323.3
Traces: Iteration 0 1 2 3 4
HomeFeedStartupBenchmark_startupPartialNoBaseline[mode=WARM]
timeToInitialDisplayMs min 246.4, median 358.8, max 418.0
Traces: Iteration 0 1 2 3 4
HomeStartupBenchmark_startupNoCompilation[mode=HOT]
timeToInitialDisplayMs min 254.1, median 449.9, max 473.9
Traces: Iteration 0 1 2 3 4
HomeStartupBenchmark_startupFullCompilation[mode=HOT]
timeToInitialDisplayMs min 391.3, median 823.7, max 1,448.4
Traces: Iteration 0 1 2 3 4
HomeStartupBenchmark_startupPartialBaseline[mode=HOT]
timeToInitialDisplayMs min 206.1, median 235.2, max 335.4
Traces: Iteration 0 1 2 3 4
HomeStartupBenchmark_startupPartialNoBaseline[mode=HOT]
timeToInitialDisplayMs min 237.8, median 246.8, max 275.0
Traces: Iteration 0 1 2 3 4
But here, in COLD startup mode, the run that uses the baseline profile performs worse than the run that does not (median 385.7 ms vs. 319.4 ms).
First, make sure you use the latest AGP (7.3.0-rc01), macrobenchmark (1.2.0-alpha03), and profileinstaller (1.2.0).
The device you're using for benchmarking can be under different load at different times.
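One rough way to check whether that kind of device noise could explain the flip between the two runs is to compare the min/max ranges of the two variants. The sketch below uses the COLD numbers copied from the question; the `Result` class and the helper names are made up for illustration, and a range-overlap check is only a quick sanity test, not a statistical test.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Result:
    """min / median / max of timeToInitialDisplayMs for one benchmark."""
    min_ms: float
    median_ms: float
    max_ms: float


def ranges_overlap(a: Result, b: Result) -> bool:
    """True if the [min, max] intervals of the two results intersect."""
    return a.min_ms <= b.max_ms and b.min_ms <= a.max_ms


def median_delta(a: Result, b: Result) -> float:
    """Median of a minus median of b (negative means a is faster)."""
    return a.median_ms - b.median_ms


# COLD results from the question: (PartialBaseline, PartialNoBaseline)
runs = {
    "run 1": (Result(211.4, 319.0, 363.3), Result(301.1, 396.0, 479.3)),
    "run 2": (Result(277.8, 385.7, 1416.7), Result(249.4, 319.4, 437.9)),
}

for name, (with_profile, without_profile) in runs.items():
    delta = median_delta(with_profile, without_profile)
    overlap = ranges_overlap(with_profile, without_profile)
    print(f"{name}: median delta {delta:+.1f} ms, ranges overlap: {overlap}")
```

In both runs the ranges overlap, so with only 5 iterations the median difference in either direction is within the spread of the individual measurements.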
I'd also recommend increasing the number of iterations to get more consistent results.
For the samples, we chose 5 iterations as a trade-off between speed and reasonably accurate results.
As the mileage for your app might vary, experiment with different numbers to see what works for you. 10 iterations can give more accurate results but will take longer to execute.
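To get a feel for why more iterations help, here is a small simulation. It does not use measured data: the startup-time model (roughly 350 ms with Gaussian noise and an occasional slow outlier) and all of its parameters are assumptions chosen only for illustration. It repeats a "benchmark" many times and reports how much the reported median moves from run to run for a given iteration count.

```python
import random
import statistics


def simulated_startup_ms(rng: random.Random) -> float:
    """One synthetic startup time. Assumed model, not real measurements."""
    value = rng.gauss(350.0, 40.0)
    if rng.random() < 0.1:  # occasional slow outlier, e.g. a busy device
        value += rng.uniform(300.0, 1000.0)
    return value


def spread_of_medians(iterations: int, trials: int = 2000, seed: int = 0) -> float:
    """Standard deviation of the median across many simulated benchmark runs."""
    rng = random.Random(seed)
    medians = []
    for _ in range(trials):
        samples = [simulated_startup_ms(rng) for _ in range(iterations)]
        medians.append(statistics.median(samples))
    return statistics.stdev(medians)


for n in (5, 10, 20):
    print(f"{n:2d} iterations: median varies by about {spread_of_medians(n):.1f} ms")
```

Under this assumed model the run-to-run variation of the median shrinks as the iteration count grows, which is the trade-off against execution time described above.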