Compare local and weekly benchmarks using Hatchet #1317

chapman39 · 2025-01-23T02:21:22Z

Create a script to compare benchmarks from a local build against the weekly shared benchmarks on LC
Improve handling of the CMake build type in config-build.py
Create an optional, manual CI pipeline (ruby-gcc, ruby-clang, lassen-clang) to test the current PR
Add documentation on how to use this script and run the manual CI pipeline

tmp todo:

run benchmarks in a separate pipeline and have them all point at one compare call (since lassen cannot run hatchet)

How the script works

The script matches two Caliper files (one from the weekly shared location /usr/workspace/smithdev/califiles/serac, one from a specified build location) and creates a Hatchet "graph frame" from the difference between them. If the maximum difference in any section of the graph is greater than X seconds, that benchmark "fails." The script does this for each benchmark.
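To make the pass/fail rule concrete, here is a minimal sketch of the threshold check in plain Python. It is not the code in compare_benchmarks.py: the real script takes its differences from Hatchet graph frames, while this sketch stands in plain dictionaries mapping a region name to seconds. The function name `compare_benchmark`, the `threshold_seconds` parameter, and the sample timings are all hypothetical, and it assumes only slowdowns (current minus baseline) count toward a failure.

```python
def compare_benchmark(baseline, current, threshold_seconds):
    """Return (passed, diffs) for one benchmark.

    baseline and current map a region name to a time in seconds.
    Only regions present in both inputs are compared.
    """
    diffs = {}
    for region in baseline.keys() & current.keys():
        # Positive means the current build is slower than the baseline.
        diffs[region] = current[region] - baseline[region]

    # The benchmark fails if the worst slowdown exceeds the threshold.
    worst = max(diffs.values(), default=0.0)
    return worst <= threshold_seconds, diffs


# Hypothetical timings for a single benchmark.
baseline = {"main": 10.0, "solve": 6.0}
current = {"main": 10.4, "solve": 7.5}

passed, diffs = compare_benchmark(baseline, current, threshold_seconds=1.0)
print("PASS" if passed else "FAIL", diffs)
```

Using `abs()` on each difference instead would also flag unexpected speedups; which of the two the script should do is a design choice this sketch does not settle.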

Example

../scripts/llnl/compare_benchmarks.py --current-cali-dir . --verbose --depth 2 --metric-columns "Max time/rank (inc)"

(not all graphs are shown)

You can now see the baseline and current benchmark times, as well as the difference between the two. You can also choose which "metric column" to display (defaults to average time per rank) and set how many levels of the tree to show. At the moment, only the difference trees are displayed.
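As a rough illustration of what a depth limit does to the displayed tree, the sketch below renders a small call tree down to a chosen number of levels. It is plain Python rather than Hatchet's own tree printer, and the `Node` class, the `render` function, and the timings are hypothetical names and values made up for this example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """One region of a call tree with baseline and current times in seconds."""
    name: str
    baseline: float
    current: float
    children: List["Node"] = field(default_factory=list)


def render(node, max_depth, depth=0):
    """Return indented lines for the tree, keeping at most max_depth levels."""
    if depth >= max_depth:
        return []
    diff = node.current - node.baseline
    lines = [
        f"{'  ' * depth}{node.name}: baseline={node.baseline:.3f} "
        f"current={node.current:.3f} diff={diff:+.3f}"
    ]
    for child in node.children:
        lines.extend(render(child, max_depth, depth + 1))
    return lines


# Hypothetical three-level tree; a depth of 2 hides the innermost region.
tree = Node("main", 1.0, 1.5, [
    Node("solve", 0.6, 0.9, [Node("assemble", 0.2, 0.3)]),
])
print("\n".join(render(tree, max_depth=2)))
```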

Some problems

LC system performance is inconsistent. You can run the same benchmark multiple times and get wildly different results. My understanding is that this depends on which node(s) you are allocated and how busy the machine is, among other things. So while this is a nice feature to have, I'm hesitant to make it a required CI check at this time.

Improving `config-build.py`

Before this PR, if you set -DCMAKE_BUILD_TYPE=Release when configuring Serac, the build directory would incorrectly have debug in its name, because the args.buildtype variable remained Debug. This PR updates args.buildtype based on -DCMAKE_BUILD_TYPE, if it is set and the --build-type option hasn't been set to anything else.
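The precedence rule described above can be sketched as follows. This is an illustration under assumptions, not the actual change to config-build.py: the helper names `resolve_build_type` and `parse`, the `Debug` default, and the use of `parse_known_args` to collect extra CMake flags are all choices made for this example.

```python
import argparse

DEFAULT_BUILD_TYPE = "Debug"  # assumed default for --build-type
CMAKE_FLAG = "-DCMAKE_BUILD_TYPE="


def resolve_build_type(buildtype, cmake_args, default=DEFAULT_BUILD_TYPE):
    """Pick the build type that should name the build directory.

    A -DCMAKE_BUILD_TYPE value is used only when --build-type is still
    at its default. Limitation of this sketch: an explicit
    `--build-type Debug` cannot be told apart from the default.
    """
    if buildtype != default:
        return buildtype
    for arg in cmake_args:
        if arg.startswith(CMAKE_FLAG):
            return arg[len(CMAKE_FLAG):]
    return buildtype


def parse(argv):
    """Parse known options and pass everything else through as CMake args."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--build-type", dest="buildtype",
                        default=DEFAULT_BUILD_TYPE)
    args, cmake_args = parser.parse_known_args(argv)
    args.buildtype = resolve_build_type(args.buildtype, cmake_args)
    return args, cmake_args


args, extra = parse(["-DCMAKE_BUILD_TYPE=Release"])
print(args.buildtype, extra)
```

With this rule, an explicit `--build-type RelWithDebInfo` wins over a conflicting `-DCMAKE_BUILD_TYPE` flag, which matches the "hasn't been set to anything else" condition.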

Links

docs https://serac.readthedocs.io/en/feature-chapman39-hatchet/sphinx/dev_guide/profiling.html
comparison ci example https://lc.llnl.gov/gitlab/smith/serac/-/pipelines/503794

chapman39 · 2025-01-29T22:48:05Z

comparison ci https://lc.llnl.gov/gitlab/smith/serac/-/pipelines/503794

chapman39 added 13 commits December 11, 2024 10:29

starter script

b875eb6

get working with some of our cali files

504f355

Merge remote-tracking branch 'origin/develop' into feature/chapman39/hatchet

74e41cf

small comment

4a3cb2a

update uberenv

25e480c

update radiuss-spack-configs

4ba6df0

undo

90fa3c6

Merge remote-tracking branch 'origin/develop' into feature/chapman39/hatchet

48aa032

experimenting

b8cf7eb

add main tracking

f1986d5

Merge remote-tracking branch 'origin/develop' into feature/chapman39/hatchet

594b317

setting up graph frames

d1acac7

starting to generate diff graph frames (not working)

57d9038

chapman39 added the CI and testing labels Jan 23, 2025

chapman39 self-assigned this Jan 23, 2025

print benchmark comparison results nicely

85850aa

chapman39 mentioned this pull request Jan 23, 2025

Fix benchmarks running as Debug #1318

Closed

chapman39 added 4 commits January 23, 2025 15:54

a

cb4f2fa

comment

ffe4672

filter by compiler as well for ruby case

e48b41a

seconds

4dfae88

chapman39 mentioned this pull request Jan 27, 2025

Improving Performance Analysis #1226

Open

10 tasks

chapman39 added 3 commits January 27, 2025 16:49

Improve handling of cmake build type

ad7e181

manual ci pipeline to compare benchmarks

36fd003

revert uberenv

d0c709c

chapman39 marked this pull request as ready for review January 28, 2025 01:31

chapman39 added 3 commits January 28, 2025 17:52

fix name

c79f38d

fix path

8b9bf8d

test

51cc89d

chapman39 and others added 8 commits January 29, 2025 10:02

revert

a3346df

comparison docs

e16d256

add main to benchmark

dc887f6

Merge branch 'develop' into feature/chapman39/hatchet

a6ef640

show baseline and current times, add depth and metric column option, cleaner prints

326bfa7

Merge branch 'feature/chapman39/hatchet' of github.com:LLNL/serac into feature/chapman39/hatchet

6c7019c

fix ci?

295066d

verbose ci, better table, more docs

e6dbb67

chapman39 requested review from btalamini and white238 January 29, 2025 21:42

chapman39 added 3 commits January 29, 2025 15:14

cleanup

6cde9e1

wording

fa6f21b

update arg name

ac49930

chapman39 mentioned this pull request Jan 30, 2025

CZLassen failing to load SPOT files LLNL/hatchet#155

Closed

stop lassen comparison builds

ea0e462
