[DPE-4375] Add cluster manual re-join handler #592

Open
wants to merge 5 commits into
base: main

Conversation

sinclert-canonical
Contributor

This PR adds logic to manually re-join an OFFLINE MySQL replica to its cluster once the automatic re-join attempts (MySQL 8.0.21+) have been exhausted.

Description

There are edge cases where a MySQL instance that has lost connection to the cluster it originally belonged to (i.e. it is OFFLINE) exhausts its automatic re-join attempts (starting with MySQL 8.0.21, 3 attempts by default, with 5 minutes between each). In those cases, a manual re-join should be performed.
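
To illustrate the condition described above, here is a minimal sketch of the decision, not the code in this PR. `MemberStatus`, its fields, and `needs_manual_rejoin` are hypothetical names introduced for illustration; the sketch assumes the charm can tell whether MySQL is still running its own auto-rejoin procedure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MemberStatus:
    """Hypothetical snapshot of one instance's Group Replication status."""

    state: str  # e.g. "ONLINE", "RECOVERING", "OFFLINE"
    autorejoin_running: bool  # True while MySQL is still retrying on its own


def needs_manual_rejoin(status: MemberStatus) -> bool:
    """Return True only when the instance is OFFLINE and MySQL has stopped retrying."""
    return status.state == "OFFLINE" and not status.autorejoin_running
```

Keeping the check separate from the re-join action means an instance that is still inside its auto-rejoin window is left alone.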

@sinclert-canonical sinclert-canonical force-pushed the sinclert/4375/cluster-manual-rejoin branch 2 times, most recently from 4f14207 to b1e5454 Compare January 22, 2025 09:03
Contributor

@paulomach paulomach left a comment

@sinclert-canonical What's the idea for an integration test?

@sinclert-canonical
Contributor Author

@sinclert-canonical What's the idea for an integration test?

I think we could do the following:

  • Spawn 3 MySQL units.
  • Set the group_replication_autorejoin_tries variable to 0, so that auto-rejoin is never attempted.
  • Make one of the MySQL replica instances go OFFLINE (not sure how).
  • Validate that the _execute_manual_rejoin function is called.
  • Validate that the disconnected replica comes back up.
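
The steps above could be sketched at unit level as follows. This is only an illustration with stand-in objects, not the integration test itself: `FakeUnit`, `on_status_check`, and `run_scenario` are hypothetical, and only the `_execute_manual_rejoin` name comes from this PR. The sketch also assumes that a successful manual re-join brings the instance back ONLINE.

```python
from unittest.mock import MagicMock


class FakeUnit:
    """Stand-in for a MySQL unit; not the charm's real class."""

    def __init__(self, name):
        self.name = name
        self.state = "ONLINE"
        self.autorejoin_tries = 3  # MySQL 8.0.21+ default

    def _execute_manual_rejoin(self):
        # Assumed effect of a successful manual re-join.
        self.state = "ONLINE"

    def on_status_check(self):
        # Hypothetical handler: re-join manually when the unit is OFFLINE
        # and auto-rejoin is disabled.
        if self.state == "OFFLINE" and self.autorejoin_tries == 0:
            self._execute_manual_rejoin()


def run_scenario():
    # Step 1: spawn 3 units.
    units = [FakeUnit(f"mysql/{i}") for i in range(3)]
    # Step 2: disable auto-rejoin everywhere.
    for unit in units:
        unit.autorejoin_tries = 0
    # Step 3: take one replica OFFLINE.
    replica = units[1]
    replica.state = "OFFLINE"
    # Wrap the method so calls are recorded but the real body still runs.
    replica._execute_manual_rejoin = MagicMock(wraps=replica._execute_manual_rejoin)
    replica.on_status_check()
    return units, replica


units, replica = run_scenario()
replica._execute_manual_rejoin.assert_called_once()  # Step 4: manual re-join was called
assert replica.state == "ONLINE"  # Step 5: the replica came back up
```

In a real integration test, the hard part remains step 3 (forcing a replica OFFLINE on a live deployment), which this sketch simply fakes by setting an attribute.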

Would you mind doing a pairing session with me? I would like to avoid hours of CI time just to validate the integration test.

@sinclert-canonical sinclert-canonical force-pushed the sinclert/4375/cluster-manual-rejoin branch from bd8292d to c909fbf Compare January 23, 2025 16:21
@sinclert-canonical sinclert-canonical force-pushed the sinclert/4375/cluster-manual-rejoin branch from c909fbf to 41173b0 Compare January 23, 2025 16:55
@sinclert-canonical
Contributor Author

Branch rebased from main.

Contributor

@paulomach paulomach left a comment

Good stuff


4 participants