Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Etcdutl migrate should read WAL from last snapshot #17227

Closed
serathius opened this issue Jan 11, 2024 · 11 comments · Fixed by #19128
Closed

Etcdutl migrate should read WAL from last snapshot #17227

serathius opened this issue Jan 11, 2024 · 11 comments · Fixed by #19128
Labels
help wanted priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. type/feature
Milestone

Comments

@serathius
Copy link
Member

What would you like to be added?

When reading etcdutl migrate code I noticed that I made a mistake (not a serious one :P).

In code

w, err := wal.OpenForRead(c.lg, walPath, walpb.Snapshot{})
if err != nil {
return nil, fmt.Errorf(`failed to open wal: %v`, err)
}

I pass walpb.Snapshot{} as a reference to snapshot from which to start reading WAL. Empty struct means that fields Index and Term are set to 0, which I think is interpreted as read from beginning.

Instead of reading from the beginning we should be reading from the last snapshot, as this is the part of WAL we expect to be read by etcd when started.

The important part is to update the tests to ensure that checking WAL entries work, we need:

  • Test etcdutl migrate with and without snapshot
  • Test etcdutl migrate detects a newer WAL entry and will abort
  • Test etcdutl migrate passes if there was a snapshot after the newer WAL entry.

Current tests: https://github.com/etcd-io/etcd/blob/b3bf59a3557ce81fb9d4615df667b4cacc6a4f6d/tests/e2e/utl_migrate_test.go

Why is this needed?

Improve correctness of etcdutl migrate command

@serathius
Copy link
Member Author

cc @siyuanfoundation

@fykaa
Copy link
Contributor

fykaa commented Jan 30, 2024

I would like to work on this issue, @serathius. Can you assign it to me?

@serathius
Copy link
Member Author

/assign @fykaa

@jmhbnz jmhbnz added the priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. label Feb 16, 2024
@ahrtr
Copy link
Member

ahrtr commented Apr 11, 2024

Overall it's a good idea so that we don't lose any data, which exist in WAL but not in bbolt db.

ping @fykaa are you still working on this?

@ArkaSaha30
Copy link
Contributor

@fykaa I can take this up if you do not have the bandwidth

@fykaa
Copy link
Contributor

fykaa commented May 8, 2024

Hi @ArkaSaha30, I've started working on this. Forgot updating in the comments

@serathius
Copy link
Member Author

Ping @fykaa, do you mind if we reassign the issue so other contributors can help.

@fykaa fykaa removed their assignment Nov 24, 2024
@fykaa
Copy link
Contributor

fykaa commented Nov 24, 2024

Apologies for the lack of updates on this issue. Over the past couple of months, I’ve had significant changes in my professional journey, including finding a job and transitioning to other focus areas, which made it difficult for me to contribute here.

I will unassign myself from this issue to make room for someone else to take it up. I truly appreciate the opportunity and look forward to exploring similar issues when my circumstances align better in the future.

@ahrtr
Copy link
Member

ahrtr commented Dec 5, 2024

cc @siyuanfoundation @joshuazh-x

I think we need to get this resolved in 3.6.

@serathius serathius added this to the etcd-v3.6 milestone Dec 6, 2024
@ahrtr
Copy link
Member

ahrtr commented Dec 6, 2024

It's similar to what the etcdserver bootstrap does.

walSnaps, err := wal.ValidSnapshotEntries(cfg.Logger, cfg.WALDir())
if err != nil {
return nil, be, err
}
// snapshot files can be orphaned if etcd crashes after writing them but before writing the corresponding
// bwal log entries
snapshot, err := ss.LoadNewestAvailable(walSnaps)

@joshuazh-x do you have bandwidth to take care of this ticket?

@ahrtr
Copy link
Member

ahrtr commented Dec 16, 2024

I will take over this if nobody work on it this week.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
help wanted priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. type/feature
Development

Successfully merging a pull request may close this issue.

5 participants