-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Speed up test initialization #14784
Speed up test initialization #14784
Conversation
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #14784 +/- ##
============================================
+ Coverage 61.75% 63.84% +2.09%
- Complexity 207 1610 +1403
============================================
Files 2436 2704 +268
Lines 133233 150989 +17756
Branches 20636 23321 +2685
============================================
+ Hits 82274 96404 +14130
- Misses 44911 47371 +2460
- Partials 6048 7214 +1166
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
try (MockedStatic<HelixPropertyFactory> mock = Mockito.mockStatic(HelixPropertyFactory.class)) { | ||
|
||
// mock helix method to disable slow, but useless, getCloudConfig() call | ||
Mockito.when(HelixPropertyFactory.getCloudConfig(Mockito.anyString(), Mockito.anyString())) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, this breaks the purpose of integration test, where it should mimic the real cluster scenario.
How long does this call take?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That call always returns an empty CloudConfig in tests.
Please have a look at the attached flame graph.
@@ -191,6 +192,7 @@ public void run() { | |||
LOGGER.warn("Failed to connect to zk server.", e); | |||
throw e; | |||
} | |||
Thread.sleep(50L); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need this sleep?
The CI machines running in Github Actions have very limited resource, so usually we prefer conditioned wait, instead of fixed time sleep
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably not . It's a replacement for the former up-front 1s sleep which might've served some purpose.
public static void closeAsync(ZkClient client) { | ||
if (client != null) { | ||
ZK_DISCONNECTOR.submit(() -> { | ||
client.close(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How long does this take if we run it in sync fashion?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Each close call isn't very slow but there are tens of such calls done during init.
Please have a look at the attached flame graph.
This PR is still a work in progress.
Profiling integration test startup revealed that a lot of time is actually spent waiting on ZK connections or sleeping.
This PR reduces test init time by about 5-6 seconds by:
Flame graph:
wall.html.tgz