no route to host from all nodes to all services except to kubernetes service #8838

Open
UriZafrir opened this issue May 19, 2024 · 1 comment

Comments

@UriZafrir

Hi everyone,
I'm running an RKE cluster.
I have a problem where I get "no route to host" when trying to query services from a node.

k get svc -A
NAMESPACE       NAME                                        TYPE           CLUSTER-IP      EXTERNAL-IP                     PORT(S)                      AGE
argocd          argo-cd-argocd-applicationset-controller   ClusterIP      10.43.71.196    <none>                          7000/TCP                     9d
argocd          argo-cd-argocd-dex-server                  ClusterIP      10.43.60.116    <none>                          5556/TCP,5557/TCP            9d
argocd          argo-cd-argocd-redis                       ClusterIP      10.43.37.182    <none>                          6379/TCP                     9d
argocd          argo-cd-argocd-repo-server                 ClusterIP      10.43.200.3     <none>                          8081/TCP                     9d
argocd          argo-cd-argocd-server                      ClusterIP      10.43.229.66    <none>                          80/TCP,443/TCP               9d
default         kubernetes                                 ClusterIP      10.43.0.1       <none>                          443/TCP                      9d
ingress-nginx   ingress-nginx-controller                   LoadBalancer   10.43.70.189    172.20.121.173,172.20.121.174   80:30996/TCP,443:32439/TCP   9d
ingress-nginx   ingress-nginx-controller-admission         ClusterIP      10.43.137.222   <none>                          443/TCP                      9d
kube-system     kube-dns                                   ClusterIP      10.43.0.10      <none>                          53/UDP,53/TCP,9153/TCP       9d
kube-system     metrics-server                             ClusterIP      10.43.183.119   <none>                          443/TCP                      7d12h
kubeshark       kubeshark-front                            ClusterIP      10.43.200.80    <none>                          80/TCP                       7d17h
kubeshark       kubeshark-hub                              ClusterIP      10.43.162.11    <none>                          80/TCP                       7d17h
kubeshark       kubeshark-worker-metrics                   ClusterIP      10.43.64.10     <none>                          49100/TCP                    7d17h
telnet  10.43.0.10 53
Trying 10.43.0.10...
telnet: connect to address 10.43.0.10: No route to host
telnet 10.43.229.66 443
Trying 10.43.229.66...
telnet: connect to address 10.43.229.66: No route to host
telnet 10.43.70.189 80
Trying 10.43.70.189...
telnet: connect to address 10.43.70.189: No route to host
telnet 10.43.137.222 443
Trying 10.43.137.222...
telnet: connect to address 10.43.137.222: No route to host
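
For reference, a rough sketch of node-side checks that go with those probes; the service CIDR 10.43.0.0/16 is the RKE default, and the firewalld commands assume firewalld is installed on CentOS Stream 9:

# did kube-proxy program any rules for the service CIDR?
sudo iptables-save | grep -c 10.43.
# node routing table: routes to the other nodes' pod CIDRs should be present
ip route
# on CentOS Stream 9, firewalld REJECT rules typically surface as "No route to host"
sudo firewall-cmd --state
sudo firewall-cmd --list-all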

This is the debugging flow I followed.
I got this error line when running k get pods:

E0519 05:23:36.925419 1110186 memcache.go:287] couldn't get resource list for metrics.k8s.io/v1beta1: the server is currently unable to handle the request

Checking the apiservices, I got a FailedDiscoveryCheck for the metrics server:

kubectl get apiservices
NAME                                   SERVICE                      AVAILABLE                      AGE
v1beta1.metrics.k8s.io                 kube-system/metrics-server   False (FailedDiscoveryCheck)   7d12h

When describing the apiservice, I got:

Message: failing or missing response from https://10.43.183.119:443/apis/metrics.k8s.io/v1beta1: Get "https://10.43.183.119:443/apis/metrics.k8s.io/v1beta1": net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)

kubectl describe apiservice v1beta1.metrics.k8s.io
E0519 05:30:38.505746 1113885 memcache.go:287] couldn't get resource list for metrics.k8s.io/v1beta1: the server is currently unable to handle the request
E0519 05:30:38.535446 1113885 memcache.go:121] couldn't get resource list for metrics.k8s.io/v1beta1: the server is currently unable to handle the request
E0519 05:30:38.538759 1113885 memcache.go:121] couldn't get resource list for metrics.k8s.io/v1beta1: the server is currently unable to handle the request
E0519 05:30:38.542372 1113885 memcache.go:121] couldn't get resource list for metrics.k8s.io/v1beta1: the server is currently unable to handle the request
Name:         v1beta1.metrics.k8s.io
Namespace:
Labels:       k8s-app=metrics-server
Annotations:  <none>
API Version:  apiregistration.k8s.io/v1
Kind:         APIService
Metadata:
  Creation Timestamp:  2024-05-11T13:38:43Z
  Resource Version:    1332438
  UID:                 ae69ae9d-f893-400b-b993-7be2e8af833b
Spec:
  Group:                     metrics.k8s.io
  Group Priority Minimum:    100
  Insecure Skip TLS Verify:  true
  Service:
    Name:            metrics-server
    Namespace:       kube-system
    Port:            443
  Version:           v1beta1
  Version Priority:  100
Status:
  Conditions:
    Last Transition Time:  2024-05-11T13:38:43Z
    Message:               failing or missing response from https://10.43.183.119:443/apis/metrics.k8s.io/v1beta1: Get "https://10.43.183.119:443/apis/metrics.k8s.io/v1beta1": net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
    Reason:                FailedDiscoveryCheck
    Status:                False
    Type:                  Available
Events:                    <none>
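
To separate Service routing from the metrics-server pod itself, the failing request can be repeated by hand. A rough sketch; the label selector is taken from the APIService labels above, and the pod IP and container port are placeholders to be read from the kubectl output:

# the exact request the API aggregator is failing on, tried from a node
curl -k -m 5 https://10.43.183.119:443/apis/metrics.k8s.io/v1beta1
# then the pod IP directly, bypassing the ClusterIP
kubectl -n kube-system get pods -l k8s-app=metrics-server -o wide
curl -k -m 5 https://<metrics-server-pod-ip>:<container-port>/apis/metrics.k8s.io/v1beta1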

After that, I tried to telnet to the other services (output at the top of this issue) and discovered that the problem is not limited to the metrics-server service.
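
A quick way to repeat that probe across every ClusterIP, as a rough sketch; it assumes bash and nc (e.g. from nmap-ncat) are available on the node:

kubectl get svc -A --no-headers | awk '$4 != "None" {print $4, $6}' |
while read -r ip ports; do
  # PORT(S) looks like "80/TCP,443/TCP"; strip the protocol and test each port
  for portproto in ${ports//,/ }; do
    port=${portproto%%/*}
    nc -z -w 2 "$ip" "$port" && echo "OK   $ip:$port" || echo "FAIL $ip:$port"
  done
done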

Would appreciate some assistance.

Your Environment

  • Calico version:
    v3.22.5
  • Flannel version:
    0.3.1
  • Orchestrator version:
    kubernetes v1.24.10
  • Operating System and version:
    CentOS Stream release 9
@UriZafrir changed the title from "no route to host from all nodes to all services except to kubernetes service #143" to "no route to host from all nodes to all services except to kubernetes service" on May 19, 2024
@tomastigera
Contributor

Does your networking between nodes work? Does your pod network work? Is routing on the nodes sane? Btw, you are using a very old and now unsupported Calico version. Would upgrading solve your problem?
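
Concretely, those checks might look something like this; a rough sketch, where the pod IP is a placeholder and calicoctl may or may not be installed on the nodes:

# pod network: ping a pod IP that lives on a different node
kubectl get pods -A -o wide          # note pod IPs and the nodes they run on
ping -c 3 <pod-ip-on-another-node>
# node routing: Calico should install routes to the other nodes' pod CIDR blocks
ip route | grep -E 'bird|cali|tunl'
# BGP peering / node status, if calicoctl is available
sudo calicoctl node status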
