endpointmanager: Skip warning logs for endpoints being removed #32619

Merged

jrajahalme
Member

Skip the warning log for a failed policy update when the endpoint is being removed.

This avoids unnecessary warnings (and CI flakes) like:

    Found 1 k8s-app=cilium logs matching list of errors that must be investigated:
    2024-05-19T16:15:52.160976650Z time="2024-05-19T16:15:52Z" level=warning msg="Failed to apply policy map changes. These will be re-applied in future updates." ciliumEndpointName=kube-system/coredns-7db6d8ff4d-t6sjv containerID=7120688df8 containerInterface= datapathPolicyRevision=1 desiredPolicyRevision=1 endpointID=370 error="lock failed: endpoint is in the process of being removed" identity=42104 ipv4=10.0.0.46 ipv6="fd02::a8" k8sPodName=kube-system/coredns-7db6d8ff4d-t6sjv subsys=endpointmanager

Callers of endpoint functions should not need to care if the function
failed due to lockAlive() or rlockAlive() finding the endpoint in the
state of being deleted. Use ErrNotAlive for both cases. This simplifies
the call sites testing for this error condition.

Signed-off-by: Jarno Rajahalme <jarno@isovalent.com>
@jrajahalme jrajahalme added release-note/misc This PR makes changes that have no direct user impact. ci/flake This is a known failure that occurs in the tree. Please investigate me! sig/agent Cilium agent related. labels May 19, 2024
@jrajahalme jrajahalme requested a review from a team as a code owner May 19, 2024 19:12
@jrajahalme
Member Author

/test

@maintainer-s-little-helper maintainer-s-little-helper bot added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label May 20, 2024
@jrajahalme jrajahalme added this pull request to the merge queue May 20, 2024
Merged via the queue into cilium:main with commit f11d132 May 20, 2024
66 checks passed
@jrajahalme jrajahalme deleted the endpoint-being-removed-suppress-warnings branch May 20, 2024 08:00