-
Notifications
You must be signed in to change notification settings - Fork 3.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add extra metric relabelings to scrape classes #6492
Add extra metric relabelings to scrape classes #6492
Conversation
b10ca78
to
4e0d8f9
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The main issue for me is that the scrape class metrics relabeling happens after the scrape object's metrics relabeling. Other scrape class fields behave differently and can be overridden by scrape objects. I suppose that you have a specific requirement in mind but the inconsistency brings a UX problem.
4e0d8f9
to
bb983f6
Compare
bb983f6
to
622d7c7
Compare
622d7c7
to
01933cb
Compare
@simonpasquier Yes. This is described in the doc, but if this is not enough we can use two configurations: |
Can you share more details about why you need scrape class relabeling to happen after service/pod monitor relabelings rather than before? |
Yes. As described in the issue, we have multi-tenant clusters, and each tenant is defined by a list of namespaces or a list of namespace regexps. We want the metrics relabeling to happen after the namespace label enforcement, to avoid a tenant injecting metrics into another tenant. If scrape class metric relabelings happen before scrape resource metric relabelings, a malicious actor can easily change the |
To be clear: you're already setting |
Yes.
I forgot about those, I also need the tenant label for recording/alerting rules. Thanks 🙏 . Maybe a new |
🤔 I don't see an obvious solution for rules since relabeling can't be applied here...
I suppose that the latter is because you deploy generic rules generating namespace-aware metrics/alerts and you still want the metrics/alerts to have the right tenant? There might still be other use cases that need to skip "enforced" metric relabelings for some monitor/rule resources though. |
Nevertheless this PR (with scrape class' relabeling happening before monitor's metric relabeling) is worth the effort even if it's not addressing @sathieu requirements. |
01933cb
to
8ce6306
Compare
I've updated the PR to : // The Operator adds the scrape class metric relabelings defined here.
// Then the Operator adds the target-specific metric relabelings defined in the scrape object.
// Then the Operator adds namespace enforcement relabeling rule. I'll handle my usecase in another PR. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @sathieu, sorry for taking so long to give you an answer here, I saw the failing tests and assumed the PR was not ready. I took a deeper look today and I think the failing tests are flaky ones. I've re-triggered them to be sure... let's see
8ce6306
to
a914e2d
Compare
Thanks @ArthurSens I've added your suggestion. I also had to rebase to resolve conflicts. Back to you! |
a914e2d
to
850cfc2
Compare
850cfc2
to
fff1d0d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome, thanks for tackling this!
I've made a few nitpick comments, but if you want to focus on something I'd say improving test readability is what could receive some love right now.
Approving the PR because this seems to work as expected
// MetricRelabelings configures the relabeling rules to apply to all samples before ingestion. | ||
// | ||
// The Operator adds the scrape class metric relabelings defined here. | ||
// Then the Operator adds the target-specific metric relabelings defined in the scrape object. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// Then the Operator adds the target-specific metric relabelings defined in the scrape object. | |
// Then the Operator adds the target-specific metric relabelings defined in ServiceMonitors, PodMonitors, Probes and ScrapeConfigs. |
// | ||
// The Operator adds the scrape class metric relabelings defined here. | ||
// Then the Operator adds the target-specific metric relabelings defined in the scrape object. | ||
// Then the Operator adds namespace enforcement relabeling rule. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// Then the Operator adds namespace enforcement relabeling rule. | |
// Then the Operator adds namespace enforcement relabeling rule, specified in '.spec.enforcedNamespaceLabel'. |
pkg/prometheus/promcfg_test.go
Outdated
@@ -8783,3 +8783,205 @@ func TestPodMonitorWithNonDefaultScrapeClassRelabelings(t *testing.T) { | |||
require.NoError(t, err) | |||
golden.Assert(t, string(cfg), "podMonitorObjectWithNonDefaultScrapeClassWithRelabelings.golden") | |||
} | |||
|
|||
func TestServiceMonitorWithDefaultScrapeClassMetricRelabelings(t *testing.T) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome tests!
We could improve code readability here if we reduce code duplication using table tests
pkg/prometheus/promcfg_test.go
Outdated
prometheus.Spec.ScrapeClasses = scrapeClasses | ||
cg := mustNewConfigGenerator(t, prometheus) | ||
|
||
cfg, err := cg.GenerateServerConfiguration( | ||
context.Background(), | ||
prometheus.Spec.EvaluationInterval, | ||
prometheus.Spec.QueryLogFile, | ||
prometheus.Spec.RuleSelector, | ||
prometheus.Spec.Exemplars, | ||
prometheus.Spec.TSDB, | ||
prometheus.Spec.Alerting, | ||
prometheus.Spec.RemoteRead, | ||
map[string]*monitoringv1.ServiceMonitor{"monitor": serviceMonitor}, | ||
nil, | ||
nil, | ||
nil, | ||
&assets.StoreBuilder{}, | ||
nil, | ||
nil, | ||
nil, | ||
nil, | ||
) | ||
require.NoError(t, err) | ||
golden.Assert(t, string(cfg), "serviceMonitorObjectWithDefaultScrapeClassWithMetricRelabelings.golden") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What I mean with code duplication is that this part is basically identical in all tests. We could have a single function with different inputs instead
testCases := []struct{
scrapeClass monitoringv1.ScrapeClass
podMonitors map[string]monitoringv1.Podmonitor // can be nil
serviceMonitors map[string]monitoringv1.ServiceMonitor // can be nil
probes map[string]monitoringv1.Probe // can be nil
scrapeConfigs map[string]monitoringv1alpha1.ScrapeConfig // can be nil
goldenFile string
}
You can put the implementation of the test inside a loop and ensure that your input results are equal to the appropriate golden file
fff1d0d
to
ddc3b9e
Compare
@nicolastakashi I've addressed your comments, rebased (there was a conflict because of 642d9cd) and force-pushed. Back to you 🏀 ! |
Signed-off-by: Mathieu Parent <math.parent@gmail.com>
ddc3b9e
to
555fa80
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! I'll let @ArthurSens and/or @nicolastakashi merge if they are happy.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perfect, thanks @sathieu !
Description
See #5947 (comment)
Inspired by #6379.
Type of change
What type of changes does your code introduce to the Prometheus operator? Put an
x
in the box that apply.CHANGE
(fix or feature that would cause existing functionality to not work as expected)FEATURE
(non-breaking change which adds functionality)BUGFIX
(non-breaking change which fixes an issue)ENHANCEMENT
(non-breaking change which improves existing functionality)NONE
(if none of the other choices apply. Example, tooling, build system, CI, docs, etc.)Verification
Please check the Prometheus-Operator testing guidelines for recommendations about automated tests.
Changelog entry
Please put a one-line changelog entry below. This will be copied to the changelog file during the release process.