
Support time based retention #17413

Draft · stelfrag wants to merge 15 commits into master from limit_tier0_update_every

Conversation

@stelfrag (Collaborator) commented Apr 16, 2024

Summary

This PR introduces the following changes:

  • The update every for the high-resolution tier must be one of 1, 2, 3, 4, 5, 6, 10, 12, 15, 20 or 30 (the divisors of 60 below 60)
    • An invalid value falls back to the previous valid one in the list above (a sketch of this fallback appears after this list)
      • e.g. a value of 25 falls back to 20
  • Three tiers are enabled by default
    • Tier 0 - high resolution (stores values per update every)
    • Tier 1 - per minute
    • Tier 2 - per hour
    • Configure disk space with dbengine tier x disk space MB = nnnn
      • e.g. for tier 0, use dbengine tier 0 disk space MB = 1024
    • Configure time-based retention with dbengine tier x retention days = nnnn
      • e.g. for tier 0, use dbengine tier 0 retention days = 14
    • Configure the collection frequency in seconds with dbengine tier x frequency = nnnn
      • e.g. for tier 1, use dbengine tier 1 frequency = 60 to store one value every 60 seconds
  • The backfill option is now global for all tiers (none, full, new)
    • option dbengine tier backfill
  • The dbengine tier x disk space MB options can be set to 0 to use as much space as needed for the desired time-based retention
    • The remaining disk space for each tier (minus 5%) is then used for the dbengine retention chart and alert
    • A combined netdata.conf sketch appears after the defaults list below
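To make the fallback rule concrete, here is a minimal sketch in C; the function name clamp_update_every and its structure are illustrative, not taken from the PR's actual code:

```c
#include <stddef.h>

/* Valid high-resolution (tier 0) update-every values:
 * the divisors of 60 that are smaller than 60. */
static const int valid_update_every[] = { 1, 2, 3, 4, 5, 6, 10, 12, 15, 20, 30 };

/* Hypothetical helper: return the requested value if it is valid,
 * otherwise fall back to the previous valid value in the list,
 * e.g. 25 falls back to 20 and 7 falls back to 6. */
static int clamp_update_every(int requested) {
    int fallback = valid_update_every[0];
    for (size_t i = 0; i < sizeof(valid_update_every) / sizeof(valid_update_every[0]); i++) {
        if (valid_update_every[i] == requested)
            return requested;
        if (valid_update_every[i] < requested)
            fallback = valid_update_every[i];
    }
    return fallback;
}
```

In this sketch, any value above 30 falls back to 30, the largest valid entry.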

  • High-resolution tier (tier 0)
    • Default disk space 1 GiB or 14 days of data
    • Stores one value per update every (see above)
  • Tier 1
    • Default disk space 1 GiB or 90 days of data
    • Stores one value per minute
  • Tier 2
    • Default disk space 1 GiB or 2 years (2 x 365 days) of data
    • Stores one value per hour
  • Retention charts that track the percentage of space and time used (vs. the configured values)
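Putting the options and defaults together, here is a hedged netdata.conf sketch. The [db] section name and the mapping of the defaults above onto concrete values are assumptions based on this description rather than code verified against the branch; tier 2 is deliberately shown with disk space MB = 0 to illustrate pure time-based retention:

```
[db]
    # Tier 0 (high resolution): 1 GiB cap or 14 days, whichever limit is hit first
    dbengine tier 0 disk space MB = 1024
    dbengine tier 0 retention days = 14

    # Tier 1: one value per minute, 1 GiB cap or 90 days
    dbengine tier 1 frequency = 60
    dbengine tier 1 disk space MB = 1024
    dbengine tier 1 retention days = 90

    # Tier 2: one value per hour; disk space 0 means use as much space
    # as the 2-year time-based retention requires
    dbengine tier 2 frequency = 3600
    dbengine tier 2 disk space MB = 0
    dbengine tier 2 retention days = 730

    # Backfill is now a single global option for all tiers: none, full or new
    dbengine tier backfill = new
```

The retention charts mentioned above then report usage against these limits; for example, with dbengine tier 0 retention days = 14 and 7 days of data currently on disk, the tier 0 time-based retention chart would read 50%.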

@thiagoftsm (Contributor) commented May 16, 2024

@stelfrag I ran several tests with this PR and did not observe anything anomalous:

  • I compiled on a host without a previous Netdata installation.
  • I tested on a host running Netdata with default options, and the PR ran normally. After this I changed the default collection time to an invalid value and got the error message; Netdata continued running as expected.
  • I also tested using ram mode to make sure nothing changed with it.

Tests were done compiling with netdata-installer.

@ktsaou (Member) commented May 16, 2024

@stelfrag please rebase this to test it.

Commits

  • Rename options
  • Global backfill
  • Metadata calculation (percentage)
  • Retention timer
  • Adjust time
  • Calculate iterations
  • Update every is less than 60 and divisor of 60
  • dbengine tier x retention days
  • Switch to dbengine tier x disk space MB
  • Fix grouping iterations
  • Assume retention to be the one specified with time
  • Sane value for maximum datafile target size if max disk space is 0
  • Rework human readable retention and expected retention in nodes_instances api
  • If no time restriction is specified, use disk space calculated one
  • Remove duplicate, commented out code
  • Support seconds in human duration representation
  • Remove commented out code
  • Create tiers as needed
  • Compile with disable-ml properly
  • Do proper time retention check
  • Temporary additional info in node_instances api