Generic hooks for testing #6938

Merged
merged 16 commits into from
Jan 18, 2025

Member

@dnr dnr commented Dec 5, 2024

What changed?

  • Add generic hook interface for fine-grained control of behavior in tests
  • Use the hooks in the matching tests that vary behavior (forcing the load balancer to target specific partitions, and disabling sync match)
  • Use the hooks to force a race condition in an update-with-start test (by @stephanos)

Why?

To write integration/functional tests that require tweaking behavior of code under test, without affecting non-test builds.

Potential risks

Hooks are disabled by default, so there should be zero risk to production code, and zero overhead (assuming the Go compiler can do very basic inlining and dead code elimination).

The downside is that functional tests now have to be run with -tags test_dep everywhere.
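A minimal sketch of what such a generic hook lookup could look like in the test-enabled build. The names Key, Hooks, Set, and pickPartition are hypothetical and chosen for illustration; only the Get[int](hooks, key) call shape is taken from the diff in this PR. In a build without the test tag, Get would presumably be a stub that always returns the zero value and false, which is what lets the compiler inline it and drop the hooked branch.

```go
package main

import (
	"fmt"
	"sync"
)

// Key identifies a hook point. Hypothetical type for this sketch.
type Key int

const (
	ForceWritePartition Key = iota
	DisableSyncMatch
)

// Hooks is a concurrency-safe registry of hook values (hypothetical).
type Hooks struct {
	m sync.Map
}

// Get returns the value registered under key if one is set and has type T.
func Get[T any](h *Hooks, key Key) (T, bool) {
	var zero T
	if h == nil {
		return zero, false
	}
	v, ok := h.m.Load(key)
	if !ok {
		return zero, false
	}
	t, ok := v.(T)
	return t, ok
}

// Set registers val under key and returns a function that removes it again.
func Set[T any](h *Hooks, key Key, val T) func() {
	h.m.Store(key, val)
	return func() { h.m.Delete(key) }
}

// pickPartition stands in for code under test: it uses the hooked value
// when one is set and falls back to its normal choice otherwise.
func pickPartition(h *Hooks, normalChoice int) int {
	if n, ok := Get[int](h, ForceWritePartition); ok {
		return n
	}
	return normalChoice
}

func main() {
	h := &Hooks{}
	fmt.Println(pickPartition(h, 3)) // no hook set: 3

	cleanup := Set(h, ForceWritePartition, 0)
	fmt.Println(pickPartition(h, 3)) // hook forces partition 0

	cleanup()
	fmt.Println(pickPartition(h, 3)) // hook removed: 3 again
}
```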

@@ -0,0 +1,18 @@
//go:build !errorinjector
Contributor

@stephanos stephanos Dec 5, 2024

I'd like to consider flipping this, actually (i.e., the default being the real implementation).

A while ago I experimented with build tags and assertions, and I feel confident that we can add a step to the actual binary build that verifies there's no trace of ErrorInjector to be found because it was optimized away. That way we don't need to tell every single developer to run their tests with this flag (in their IDE, etc.).

Member Author

I like that, but I'm still a little concerned since we have so many different binary builds (docker images, goreleaser, internal ones), and then users that build their own binaries...

Contributor

FWIW, docker-builds seems to already invoke the server Makefile.

@stephanos stephanos mentioned this pull request Dec 5, 2024
4 tasks
@dnr dnr changed the title draft: Error injector for tests draft: Generic hooks for testing Dec 5, 2024
@dnr dnr changed the title draft: Generic hooks for testing Generic hooks for testing Dec 5, 2024
@dnr dnr marked this pull request as ready for review December 5, 2024 23:24
@dnr dnr requested a review from a team as a code owner December 5, 2024 23:24
}
return lb
}

func (lb *defaultLoadBalancer) PickWritePartition(
taskQueue *tqid.TaskQueue,
) *tqid.NormalPartition {
- if n := lb.forceWritePartition(); n >= 0 {
+ if n, ok := testhooks.Get[int](lb.testHooks, testhooks.MatchingLBForceWritePartition); ok {
Contributor

Did we discuss this?
I really dislike the idea of having this kind of dependency, and of having this kind of code in the main code path.
I would suggest extracting this functionality into something like a "PartitionPicker" and providing different implementations in functional tests.

Member Author

We're discussing it now...

  1. If you look at what it was doing before, it was abusing dynamic config to hook in here, so this is already a strict improvement (reduces runtime overhead, and makes it clearer that this is a hook for testing).

  2. We only want this hook in some tests, only some of the time. So even in tests, most of the time we want the standard behavior. So we'd need a test LoadBalancer that can be set/unset to a mode with fixed behavior, otherwise falls back to the default. I think that's worse:

    1. First it's just a lot more code.
    2. Second, that means the mechanism to poke the test implementation is specific to each object, and tests will have to do their own cleanup. This generic mechanism is simpler for test writers: you just call s.InjectHook and it's automatically cleaned up.
  3. How do we do that for the other two examples here, forcing async match, and injecting a racing call in the middle of an update-with-start sequence? The alternative implementation method doesn't work there, as far as I can see.

Member

Another good use is the case where a test needs to continue only after a specific line of code is reached somewhere deep inside the server (as in almost all Update tests, where I need to make sure that the Update actually reached the server and was added to the registry before moving forward). In this case, the hook can unblock a channel that the test code is waiting on.
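A rough sketch of that channel pattern, under assumptions: the hooks and updateRegistry types and the "updateAdded" key are hypothetical stand-ins, not names from this PR. The server-side code calls the hook right after the point of interest, and the test blocks on a channel that the hook closes.

```go
package main

import (
	"fmt"
	"sync"
)

// hooks maps a key to a callback (hypothetical registry for this sketch).
type hooks struct {
	mu  sync.RWMutex
	fns map[string]func()
}

func (h *hooks) set(key string, fn func()) {
	h.mu.Lock()
	defer h.mu.Unlock()
	if h.fns == nil {
		h.fns = map[string]func(){}
	}
	h.fns[key] = fn
}

// call runs the callback for key, if one is registered.
func (h *hooks) call(key string) {
	h.mu.RLock()
	fn := h.fns[key]
	h.mu.RUnlock()
	if fn != nil {
		fn()
	}
}

// updateRegistry stands in for the server-side update registry.
type updateRegistry struct {
	mu  sync.Mutex
	ids []string
}

// add is the "deep inside the server" code path with a hook point.
func (r *updateRegistry) add(h *hooks, id string) {
	r.mu.Lock()
	r.ids = append(r.ids, id)
	r.mu.Unlock()
	h.call("updateAdded") // hypothetical hook point
}

func (r *updateRegistry) count() int {
	r.mu.Lock()
	defer r.mu.Unlock()
	return len(r.ids)
}

// runUntilRegistered plays the role of the test: it starts the server-side
// work and proceeds only once the hook reports the update was added.
func runUntilRegistered(h *hooks, r *updateRegistry, id string) int {
	reached := make(chan struct{})
	var once sync.Once
	h.set("updateAdded", func() { once.Do(func() { close(reached) }) })

	go r.add(h, id)
	<-reached // blocks until the hook point has been reached
	return r.count()
}

func main() {
	fmt.Println(runUntilRegistered(&hooks{}, &updateRegistry{}, "update-1"))
}
```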

Contributor

@stephanos stephanos left a comment

LGTM.

Approval pending on team discussion.

Comment on lines 25 to 30
const (
MatchingDisableSyncMatch = "matching.disableSyncMatch"
MatchingLBForceReadPartition = "matching.lbForceReadPartition"
MatchingLBForceWritePartition = "matching.lbForceWritePartition"

UpdateWithStartInBetweenLockAndStart = "history.updateWithStartInBetweenLockAndStart"
Contributor

Could we use iota instead? The identifiers don't need to be stable, and it would save us some typing and the effort of keeping things consistent.

A counterpoint is that we lose the ability to print them, which we don't do right now but might in the future.

Member Author

ah, good idea!

Member

Printing them as ints should be OK too. Anyway, adding a Stringer implementation later is not a big deal.
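For illustration, the iota variant with a later-added Stringer might look like the sketch below. The constant names are the ones from the commented lines; the Key type, the keyNames table, and the String method are assumptions, and the merged code may differ.

```go
package main

import "fmt"

// Key is a hypothetical hook key type for this sketch.
type Key int

const (
	MatchingDisableSyncMatch Key = iota
	MatchingLBForceReadPartition
	MatchingLBForceWritePartition

	UpdateWithStartInBetweenLockAndStart
)

// keyNames must be kept in the same order as the constants above.
var keyNames = [...]string{
	"MatchingDisableSyncMatch",
	"MatchingLBForceReadPartition",
	"MatchingLBForceWritePartition",
	"UpdateWithStartInBetweenLockAndStart",
}

// String implements fmt.Stringer so keys print by name instead of number.
func (k Key) String() string {
	if k < 0 || int(k) >= len(keyNames) {
		return fmt.Sprintf("Key(%d)", int(k))
	}
	return keyNames[k]
}

func main() {
	fmt.Println(int(MatchingLBForceWritePartition)) // printed as an int
	fmt.Println(MatchingLBForceWritePartition)      // printed via String
}
```

A hand-written table like this has to be kept in sync with the constants; generating the method with the stringer tool is a common alternative.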

Member

@alexshtin alexshtin left a comment

I really like the idea! Yes, we should use it with caution when there is no better way, but it really opens new horizons in testing.

@@ -410,6 +414,7 @@ func (c *TemporalImpl) startFrontend() {
fx.Provide(func() persistenceClient.AbstractDataStoreFactory { return c.abstractDataStoreFactory }),
fx.Provide(func() visibility.VisibilityStoreFactory { return c.visibilityStoreFactory }),
fx.Provide(func() dynamicconfig.Client { return c.dcClient }),
fx.Decorate(func() testhooks.TestHooks { return c.testHooks }),
Member

Side note: most of these fx.Provide calls should be fx.Decorate.

Member

I would name this file key.go.

stephanos added a commit that referenced this pull request Jan 17, 2025
…7097)

## What changed?

Make WorkflowIdConflictPolicy TerminateExisting follow the
WorkflowIdReusePolicy after an _unsuccessful_ termination.

NOTE: The frontend change was made in
#7099

## Why?

When the termination from TerminateExisting fails, the user would expect
the WorkflowIdReusePolicy to be applied as the Workflow is not running.

## How did you test it?

Well ... there's no way to test this well right now. There are no unit
tests; and the functional test cannot be written without the use of
[testhooks](#6938) since
there is no other way to simulate the race condition here.

I manually added a `sync.Once` into the code that terminates the
workflow and can confirm the expected behavior.

@dnr dnr merged commit f0e5891 into main Jan 18, 2025
50 checks passed
@dnr dnr deleted the david/ei branch January 18, 2025 01:21
@alexshtin
Member

For GoLand users: GoLand supports configuration templates. You can add the build tag to a template, and then every run from the IDE will have it by default. Basically, it is a set-once-and-forget approach.
