
lightning: when lightning exits while processing the task, it should return 1 #53381

Merged
merged 8 commits into pingcap:master from zeminzhou/not-return-zero
May 21, 2024

Conversation

zeminzhou
Contributor

@zeminzhou zeminzhou commented May 20, 2024

What problem does this PR solve?

Issue Number: close #53384

Problem Summary:

What changed and how does it work?

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
  1. Start a TiDB cluster.
  2. Start the Lightning server.
  3. Send an import task to the Lightning server.
  4. Send SIGINT to the Lightning server.
  5. The Lightning server exits with exit code 1.
  • No need to test
    • I checked and no code files have been changed.

Side effects

  • Performance regression: Consumes more CPU
  • Performance regression: Consumes more Memory
  • Breaking backward compatibility

Documentation

  • Affects user behaviors
  • Contains syntax changes
  • Contains variable changes
  • Contains experimental features
  • Changes MySQL compatibility

Release note

Please refer to Release Notes Language Style Guide to write a quality release note.

None

Signed-off-by: zeminzhou <zhouzemin@pingcap.com>
@ti-chi-bot ti-chi-bot bot added do-not-merge/needs-linked-issue do-not-merge/needs-tests-checked release-note-none size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels May 20, 2024

tiprow bot commented May 20, 2024

Hi @zeminzhou. Thanks for your PR.

PRs from untrusted users cannot be marked as trusted with /ok-to-test in this repo meaning untrusted PR authors can never trigger tests themselves. Collaborators can still trigger tests on the PR using /test all.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

codecov bot commented May 20, 2024

Codecov Report

Attention: Patch coverage is 88.88889%, with 1 line in your changes missing coverage. Please review.

Project coverage is 74.6346%. Comparing base (397a460) to head (b585e17).
Report is 10 commits behind head on master.

Additional details and impacted files
@@               Coverage Diff                @@
##             master     #53381        +/-   ##
================================================
+ Coverage   72.5545%   74.6346%   +2.0801%     
================================================
  Files          1505       1527        +22     
  Lines        429830     438838      +9008     
================================================
+ Hits         311861     327525     +15664     
+ Misses        98694      90740      -7954     
- Partials      19275      20573      +1298     
Flag          Coverage Δ
integration   50.7137% <88.8888%> (?)
unit          71.3545% <ø> (-0.0919%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Components   Coverage Δ
dumpling     53.9957% <ø> (ø)
parser       ∅ <ø> (∅)
br           50.3654% <ø> (+8.9614%) ⬆️

@lance6716 changed the title from "lightning: when lightning exits while processing the task, it shoulud return 1" to "lightning: when lightning exits while processing the task, it should return 1" on May 20, 2024
@@ -99,7 +99,9 @@ func main() {
	finished := true
	if common.IsContextCanceledError(err) {
		err = nil
		finished = false
		if app.TaskCanceled() {
Contributor

I see the Lightning CI failed. To reduce the behaviour change, you can check whether it's in ServerMode.

Contributor Author

It was fixed, thanks! PTAL~

Contributor

For non-server mode, it's a compatibility change. I need to ask the PM about it tomorrow. It may break users' scripts.

Contributor

@D3Hunter D3Hunter May 21, 2024

@zeminzhou Is anything affected by the current behavior of Lightning? Is there any workaround?

Contributor Author

@zeminzhou zeminzhou May 21, 2024

In the cloud, when the k8s job controller reschedules the Lightning pod to another node (because the current node is evicted), it sends SIGINT to the Lightning pod. Because the Lightning pod returns 0, the job controller thinks the job is done and does not reschedule it.

Contributor Author

I added compatibility with the previous non-server mode. /cc @lance6716

Signed-off-by: zzm <zhouzemin@pingcap.com>
Signed-off-by: zzm <zhouzemin@pingcap.com>
Signed-off-by: zzm <zhouzemin@pingcap.com>
Signed-off-by: zzm <zhouzemin@pingcap.com>
Signed-off-by: zzm <zhouzemin@pingcap.com>
Signed-off-by: zzm <zhouzemin@pingcap.com>
Signed-off-by: zzm <zhouzemin@pingcap.com>

ti-chi-bot bot commented May 21, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: D3Hunter, lance6716

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

ti-chi-bot bot commented May 21, 2024

[LGTM Timeline notifier]

Timeline:

  • 2024-05-21 05:49:23.890847269 +0000 UTC m=+2150717.647982842: ☑️ agreed by lance6716.
  • 2024-05-21 05:58:01.454702995 +0000 UTC m=+2151235.211838568: ☑️ agreed by D3Hunter.

@D3Hunter
Contributor

There's no test, so I unchecked "Unit test".

@zeminzhou
Contributor Author

/test check-dev

tiprow bot commented May 21, 2024

@zeminzhou: Cannot trigger testing until a trusted user reviews the PR and leaves an /ok-to-test message.

In response to this:

/test check-dev


@ti-chi-bot ti-chi-bot bot merged commit aeafb18 into pingcap:master May 21, 2024
24 checks passed
RidRisR pushed a commit to RidRisR/tidb that referenced this pull request May 23, 2024
@zeminzhou zeminzhou deleted the zeminzhou/not-return-zero branch May 23, 2024 03:06
Labels
approved lgtm release-note-none size/S Denotes a PR that changes 10-29 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

should not return zero when lightning exits while processing the task
3 participants