Optimize how we merge multiple operatorStats #24414

Open
wants to merge 1 commit into
base: master

shangm2
Contributor

@shangm2 shangm2 commented Jan 22, 2025

Description

  1. Thanks to @arhimondr, who observed that GC can take too much CPU to clean up memory under heavy load.
[Screenshot 2025-01-21 at 21 55 30]
  2. This PR is the first in a series of optimizations to improve how objects are created along the critical path.
  3. This PR optimizes how we merge multiple OperatorStats by avoiding temporary/intermediate objects.

Motivation and Context

  1. The original code creates a temporary object every time two OperatorStats with the same id are added together using v.add(operatorStats), and that intermediate object is discarded as soon as it is merged with the next OperatorStats object. This PR groups all OperatorStats by id and then merges each group in one go.
  2. Refactor the code by moving local variables into a dedicated class, so that a single loop within the create method can aggregate all necessary metrics.
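
To illustrate the two points above, here is a minimal sketch and not the actual change in this PR. The `Stats` class, its fields (`inputPositions`, `cpuNanos`), the `Accumulator` class, and both merge methods are hypothetical stand-ins; the real OperatorStats carries many more metrics. It only contrasts pairwise `add` (one intermediate object per merge) with grouping by id and aggregating each group in a single loop.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; names and fields are assumptions, not Presto's API.
final class OperatorStatsMergeSketch {

    // Immutable stand-in for OperatorStats with only two metrics.
    static final class Stats {
        final int operatorId;
        final long inputPositions;
        final long cpuNanos;

        Stats(int operatorId, long inputPositions, long cpuNanos) {
            this.operatorId = operatorId;
            this.inputPositions = inputPositions;
            this.cpuNanos = cpuNanos;
        }

        // Pairwise merge: allocates a new Stats on every call.
        Stats add(Stats other) {
            return new Stats(operatorId,
                    inputPositions + other.inputPositions,
                    cpuNanos + other.cpuNanos);
        }
    }

    // Dedicated holder for the running totals, so one loop can fill it.
    static final class Accumulator {
        long inputPositions;
        long cpuNanos;

        void add(Stats stats) {
            inputPositions += stats.inputPositions;
            cpuNanos += stats.cpuNanos;
        }

        Stats build(int operatorId) {
            return new Stats(operatorId, inputPositions, cpuNanos);
        }
    }

    // "Before" shape: each same-id merge creates an intermediate Stats
    // that is discarded by the next merge.
    static Map<Integer, Stats> mergePairwise(List<Stats> all) {
        Map<Integer, Stats> result = new LinkedHashMap<>();
        for (Stats stats : all) {
            result.merge(stats.operatorId, stats, Stats::add);
        }
        return result;
    }

    // "After" shape: group by id, then build one Stats per group.
    static Map<Integer, Stats> mergeGrouped(List<Stats> all) {
        Map<Integer, List<Stats>> groups = new LinkedHashMap<>();
        for (Stats stats : all) {
            groups.computeIfAbsent(stats.operatorId, id -> new ArrayList<>()).add(stats);
        }
        Map<Integer, Stats> result = new LinkedHashMap<>();
        for (Map.Entry<Integer, List<Stats>> entry : groups.entrySet()) {
            Accumulator accumulator = new Accumulator();
            for (Stats stats : entry.getValue()) {
                accumulator.add(stats);
            }
            result.put(entry.getKey(), accumulator.build(entry.getKey()));
        }
        return result;
    }
}
```

In this sketch, merging n entries that share an id allocates n - 1 `Stats` objects in `mergePairwise` but a single `Stats` (plus one list and one accumulator) in `mergeGrouped`. Whether that is a net win in practice depends on how large the real object is and how many entries share an id.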

Impact

Test Plan

  1. Local HiveQueryRunner works fine
  2. Internal verifier tests passed

Contributor checklist

  • Please make sure your submission complies with our contributing guide, in particular code style and commit standards.
  • PR description addresses the issue accurately and concisely. If the change is non-trivial, a GitHub Issue is referenced.
  • Documented new properties (with its default value), SQL syntax, functions, or other functionality.
  • If release notes are required, they follow the release notes guidelines.
  • Adequate tests were added if applicable.
  • CI passed.

Release Notes

Please follow release notes guidelines and fill in the release notes below.

== RELEASE NOTES ==

Optimizations
* Improve how we merge multiple operator stats together. :pr:`24414`
* Improve metrics creation by refactoring local variables to a dedicated class. :pr:`24414`


@shangm2 shangm2 requested a review from a team as a code owner January 22, 2025 04:59
@shangm2 shangm2 requested a review from presto-oss January 22, 2025 04:59
@prestodb-ci prestodb-ci added the from:Meta PR from Meta label Jan 22, 2025
@shangm2 shangm2 force-pushed the optimize_merging_operator_stats branch from dfcc4ca to a8ecf10 Compare January 22, 2025 05:04
@shangm2 shangm2 force-pushed the optimize_merging_operator_stats branch from a8ecf10 to 727cf06 Compare January 24, 2025 07:35
@shangm2 shangm2 force-pushed the optimize_merging_operator_stats branch from 727cf06 to 6808005 Compare January 24, 2025 07:38
@steveburnett
Contributor

Thanks for the release note! Rephrasing suggestions to follow the Order of changes in the Release Notes Guidelines:

== RELEASE NOTES ==

General Changes
* Improve how we merge multiple operator stats together. :pr:`24414`
* Improve metrics creation by refactoring local variables to a dedicated class. :pr:`24414`

@shangm2
Contributor Author

shangm2 commented Jan 24, 2025

@arhimondr feel free to take another look. Thank you so much for all the awesome suggestions.

Labels
from:Meta PR from Meta

4 participants