Include default requests/limits for child pods #348

Open
ohthehugemanatee opened this issue Apr 9, 2023 · 2 comments
@ohthehugemanatee

Since a good deal of netdata's value is in keeping its footprint as small as possible, and instances are stateless, adding default limits for child pods seems to make sense. I would include parent and k8s-state too, but AFAIK those are relative to the size/complexity of the cluster and therefore hard to predict.

Setting resource requests is just good practice. Setting limits prevents runaway processes.

Based on the docs, I suggest defaulting the child to:

resources:
  requests:
    cpu: 150m
    memory: 200Mi
  limits:
    memory: 250Mi
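
For concreteness, a minimal sketch of where such defaults could live, assuming the chart exposes a child.resources block that is passed through to the child DaemonSet (the exact key path should be verified against the current values.yaml):

# values.yaml (sketch) -- the child.resources key path is an assumption
child:
  resources:
    requests:
      cpu: 150m
      memory: 200Mi
    limits:
      memory: 250Mi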
@ilyam8
Member

ilyam8 commented Apr 9, 2023

I would include parent and k8s-state too, but AFAIK those are relative to the size/complexity of the cluster and therefore hard to predict.

Hi, @ohthehugemanatee. I think we have the same situation with the child nodes: it is hard to predict, since it depends on the number of pods per node and the workload (e.g. many CronJobs, which means lots of short-lived pods/containers).

Setting resource requests is just good practice

You are 100% right. As it is now, it is up to the user to install, check resource usage, and set the limits. The default values (not only the limits) are not production-ready.

@ohthehugemanatee
Author

default values (not only the limits) are not production ready.

OK, so this is the first problem to solve. From my understanding, in a k8s situation the children should be very lightweight; it's the parent that really gets big. So a start would be to approach this only for the children.

What about using the calculation from the docs as a starting point? Add the ephemerality- and tiers-related config values to the Helm chart values, so we can use Helm arithmetic to arrive at a high-estimate default value. In the output after install, add a line saying "we set memory limits based on an estimate of your child node RAM requirements. There is almost certainly room to reduce these limits, see https://learn.netdata.cloud/... for more information." A rough sketch of what that Helm arithmetic could look like is below.
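
A minimal sketch, assuming hypothetical child.memoryEstimate.* values and an invented formula (the real numbers would come from the docs' RAM-requirements calculation):

# templates/_helpers.tpl (sketch) -- the child.memoryEstimate.* names and the
# formula are placeholders, not the actual docs calculation; this only shows
# how Helm arithmetic could turn values.yaml knobs into a default limit.
{{- define "netdata.child.memoryLimit" -}}
{{- /* both knobs are assumed to be defined, with defaults, in values.yaml */ -}}
{{- $metrics := int .Values.child.memoryEstimate.metricsPerNode -}}
{{- $tiers := int .Values.child.memoryEstimate.dbengineTiers -}}
{{- /* deliberately rough upper bound: fixed overhead plus a per-metric, per-tier term */ -}}
{{- $limitMi := add 150 (div (mul $metrics $tiers) 50) -}}
{{- printf "%dMi" $limitMi -}}
{{- end -}}

# templates/daemonset.yaml (excerpt, sketch) -- only used when the user has not
# set child.resources explicitly, so explicit settings always win
resources:
  requests:
    cpu: 150m
    memory: 200Mi
  limits:
    memory: {{ include "netdata.child.memoryLimit" . }}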

Even better would be to request feedback on a GH issue if the defaults suck, but you might get that regardless. 😃
