roachtest: use first transient error when checking for flakes #124403

Merged

Conversation

renatolabs
Collaborator

@renatolabs renatolabs commented May 20, 2024

Previously, roachtest would only look at the outermost error in a chain that matched a TransientError (or ErrorWithOwnership) when checking for flakes. However, that is in most cases not what we want: if a transient error wraps another transient error, the actual reason for the failure is the original (wrapped) error.

Informs: #123887

Release note: None

@cockroach-teamcity
Member

This change is Reviewable

@renatolabs renatolabs force-pushed the rc/roachtest-multiple-transient-errors branch from 5efd8e3 to 84e0bb6 Compare May 20, 2024 05:34
@renatolabs renatolabs marked this pull request as ready for review May 20, 2024 09:03
@renatolabs renatolabs requested a review from a team as a code owner May 20, 2024 09:03
@renatolabs renatolabs requested review from nameisbhaskar and vidit-bhat and removed request for a team May 20, 2024 09:03
matched = true
err = errors.Unwrap(err)
if err == nil {
break
Member

Or keep going? What if the next occurrence can be unwrapped?

Collaborator Author

errors.Unwrap(err) returning nil means the err passed doesn't wrap any other error, so there's no "next occurrence". But maybe I misunderstand what you're trying to say.

Member

I was thinking of "multi-errors" (more in the comment for UnwrapOnce). Either way, Unwrap returns nil, in this case, so I suppose those errors aren't very likely.

Member

@srosenberg srosenberg left a comment

Thanks!

@renatolabs renatolabs force-pushed the rc/roachtest-multiple-transient-errors branch from 84e0bb6 to e24022b Compare May 21, 2024 05:21
@renatolabs
Collaborator Author

TFTR!

bors r=srosenberg

craig bot pushed a commit that referenced this pull request May 21, 2024
123120: ui: Highlight unavailable ranges in red on the summary bar with nonzero r=abarganier a=theloneexplorerquest

Modify the summary bar to change the color of unavailable ranges. When the unavailable range is greater than zero, it will be displayed in red; if it is zero, it will be green.

Fix: #122014

Release note (ui): Changed the color of unavailable ranges on the summary bar to red when nonzero; ranges are green when zero.

124301: logtestutils: generalize structured logging spy r=xinhaoz a=xinhaoz

This commit generalizes the structured logging spy previously used for datadriven telemetry tests so that it can be repurposed for other structured logging channels.

Epic: none

Release note: None

124403: roachtest: use first transient error when checking for flakes r=srosenberg a=renatolabs

Previously, roachtest would only look at the outermost error in a chain that matched a `TransientError` (or `ErrorWithOwnership`) when checking for flakes. However, that is in most cases *not* what we want: if a transient error wraps another transient error, the actual reason for the failure is the original (wrapped) error.

Informs: #123887

Release note: None

124425: pkg/server/structlogging: support hot ranges stats with diagnostic reporting disabled r=kyle-a-wong a=kyle-a-wong

Previously, enabling hot ranges stats also required enabling diagnostic reporting. Hot ranges stats doesn't need to depend on diagnostic reporting, and someone might want to enable hot ranges stats without enabling diagnostic reporting.

Now, server.telemetry.hot_ranges_stats.enabled can be set to true without setting diagnostics.reporting.enabled.

Epic: none
Fixes: #122977
Part of: https://cockroachlabs.atlassian.net/browse/CRDB-38152

Release note: None

124459: builtins: fix st_geojson when max_decimal_digits is specified r=yuzefovich a=yuzefovich

This commit fixes a regression in `st_geojson` builtin when `max_decimal_digits` argument is specified which was introduced in 6009141 (during 24.1 cycle). In particular, this overload specifies the precision (rather than using the default number of digits), and that commit made it so that we ignore the precision argument. This is now fixed.

Fixes: #124368.

Release note (bug fix): CockroachDB previously would ignore the `max_decimal_digits` argument of the `st_geojson` builtin function and would use the default instead. The bug is only present in 24.1.0 releases.

124491: raft: remove RawNode.TickQuiesced r=pav-kv a=nvanbenschoten

This commit removes the `(*RawNode).TickQuiesced` method. The method was deprecated back in etcd-io/raft#62 and has not been in use since 2018.

Epic: None
Release note: None

Co-authored-by: theloneexplorerquest <theloneexplorerquest@gmail.com>
Co-authored-by: Xin Hao Zhang <xzhang@cockroachlabs.com>
Co-authored-by: Renato Costa <renato@cockroachlabs.com>
Co-authored-by: Kyle Wong <kyle.wong@cockroachlabs.com>
Co-authored-by: Yahor Yuzefovich <yahor@cockroachlabs.com>
Co-authored-by: Nathan VanBenschoten <nvanbenschoten@gmail.com>
@craig
Contributor

craig bot commented May 21, 2024

Build failed (retrying...):

@craig craig bot merged commit 7807ee2 into cockroachdb:master May 21, 2024
22 checks passed
@renatolabs renatolabs deleted the rc/roachtest-multiple-transient-errors branch May 23, 2024 06:51
@renatolabs
Collaborator Author

blathers backport 24.1 23.2

blathers-crl bot commented May 23, 2024

Encountered an error creating backports. Some common things that can go wrong:

  1. The backport branch might have already existed.
  2. There was a merge conflict.
  3. The backport branch contained merge commits.

You might need to create your backport manually using the backport tool.


error creating merge commit from e24022b to blathers/backport-release-23.2-124403: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict []

you may need to manually resolve merge conflicts with the backport tool.

Backport to branch 23.2 failed. See errors above.


🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.
