Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

backupccl: download pre restore data in cluster restore #124348

Merged
merged 2 commits into from
May 21, 2024

Conversation

msbutler
Copy link
Collaborator

This patch adds the pre restore data spans to the list of spans to download.
While these pre restore spans map to data in the temporary system table
database that are then rewwritten to the actual system table, the download job
ought to download all external data linked into the cluster out of principle.

Fixes #124330

Release note: none

@msbutler msbutler self-assigned this May 17, 2024
@msbutler msbutler requested review from a team as code owners May 17, 2024 16:04
@msbutler msbutler requested review from nameisbhaskar, vidit-bhat and dt and removed request for a team May 17, 2024 16:04
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@dt
Copy link
Member

dt commented May 17, 2024

These tables are already dropped and awaiting GC by the time we write the download job; why are we using download bandwidth and iops on something just to delete it?

@msbutler
Copy link
Collaborator Author

I don't really have strong opinions here, but i have 3 reasons:

  • it is nice to guarantee that after the download job completes, no external bytes are left on the cluster
  • It's also only a mbs of pre restore data, though perhaps it scales for larger clusters.
  • It would also be nice for our check to pass on cluster restore:
`SELECT
COALESCE(stats->>'external_file_bytes','0') FROM
crdb_internal.tenant_span_stats( ARRAY(SELECT(crdb_internal.tenant_span()[1], crdb_internal.tenant_span()[2])))`

I'm also happy to skip the check on cluster restores and move on though.

@dt
Copy link
Member

dt commented May 20, 2024

As long as the "pre" spans are after the main spans I guess it doesn't matter; in a non-trivial restore they're already GC'ed and the vSSTs are gone anyway so it is a noop.

This patch adds the pre restore data spans to the list of spans to download.
While these pre restore spans map to data in the temporary system table
database that are then rewwritten to the actual system table, the download job
ought to download all external data linked into the cluster out of principle.

Fixes cockroachdb#124330

Release note: none
@msbutler msbutler force-pushed the butler-better-error-external-bytes branch from 6e76f55 to be876f3 Compare May 20, 2024 21:52
@msbutler
Copy link
Collaborator Author

TFTR!

bors r=dt

@craig craig bot merged commit 7807ee2 into cockroachdb:master May 21, 2024
22 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

roachtest: backup-restore/online-restore failed
3 participants