Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Spacetime] Fields metadata services #183806

Merged
merged 65 commits into from
Jun 5, 2024

Conversation

tonyghiani
Copy link
Contributor

@tonyghiani tonyghiani commented May 20, 2024

📓 Summary

Closes https://github.com/elastic/observability-dev/issues/3331

Given the needs described in the linked issue about having a centralized and async way to consume field metadata across Kibana, this work focuses on providing server/client services to consume field metadata on demand from static ECS definition and integration manifests, with the chance to extend further the possible resolution sources.

demo.mov

💡 Reviewers hints

This PR got quite long as it involves and touches different parts of the codebase, so I'll break down the interesting parts for an easier review.

More details, code examples and mechanics description can be found in the README file for the plugin.

@kbn/fields-metadata-plugin

To avoid bundling and consuming the whole ECS static definition client-side, a new plugin @kbn/fields-metadata-plugin is created to expose the server/client services which enable retrieving only the fields needed on a use-case basis.

FieldsMetadataService server side

A FieldsMetadataService is instantiated on the plugin setup/start server lifecycle, exposing a client to consume the fields and setup tools for registering external dependencies.

The start contract exposes a FieldsMetadataClient instance. With this, any application in Kibana can query for some fields using the available methods, currently:

  • FieldsMetadataClient.prototype.getByName(): retrieves a single FieldMetadata instance.
  • FieldsMetadataClient.prototype.find(): retrieves a record of matching FieldMetadata instances.

FieldsMetadataClient is instantiated with the source repositories dependencies. They act as encapsulated sources which are responsible for fetching fields from their related source. Currently, there are 2 field repository sources used in the resolution step, but we can use this concept to extend the resolution step in future with more sources (LLM, OTel, ...).
The currently used sources are:

  • EcsFieldsRepository: allows fetching static ECS field metadata.
  • IntegrationFieldsRepository: allows fetching fields from an integration package from EPR, where the fields metadata are stored. To correctly consume these fields, the fleet plugin must be enabled, otherwise, the service won't be able to access the registered fields extractor implemented with the fleet services.
    As this service performs a more expensive retrieval process than the EcsFieldsRepository constant complexity access, a caching layer is applied to the retrieved results from the external source to minimize latency.

Fields metadata API

To expose this service to the client, a first API endpoint is created to find field metadata and filter the results to minimize the served payload.

  • GET /internal/fields_metadata/find supports some initial query parameters to narrow the fields' search.

FieldsMetadataService client side

As we have a server-side FieldsMetadataService, we need a client counterpart to consume the exposed API safely and go through the validation steps.

The client FieldsMetadataService works similarly to the server-side one, exposing a client which is returned by the public start contract of the plugin, allowing any other to directly use fields metadata client-side.

This client would work well with existing state management solutions, as it's not decoupled from any library.

useFieldsMetadata

For simpler use cases where we need a quick and easy way to consume fields metadata client-side, the plugin start contract also exposes a useFieldsMetadata react custom hook, which is pre-created accessing the FieldsMetadataService client described above. It is important to retrieve the hook from the start contract of this plugin, as it already gets all the required dependencies injected minimizing the effort on the consumer side.

The UnifiedDocViewer plugin changes exemplify how we can use this hook to access and filter fields' metadata quickly.

registerIntegrationFieldsExtractor (@elastic/fleet)

Getting fields from an integration dataset is more complex than accessing a static dictionary of ECS fields, and to achieve that we need access to the PackageService implemented by the fleet team.

To get access to the package, maintain a proper separation of concerns and avoid a direct dependency on the fleet plugin, some actions were taken:

  • the PackageService.prototype.getPackageFieldsMetadata() method is implemented to keep the knowledge about retrieving package details on this service instead of mixing it on parallel services.
  • a fleet registerIntegrationFieldsExtractor service is created and used during the fleet plugin setup to register a callback that accesses the service as an internal user and retrieves the fields by the given parameters.
  • the fields metadata plugin returns a registerIntegrationFieldsExtractor function from its server setup so that we can use it to register the above-mentioned callback that retrieves fields from an integration.

This inverts the dependency between fields_metadata and fleet plugins so that the fields_metadata plugin keeps zero dependencies on external apps.

Adoption

We currently have places where the @elastic/ecs package is directly accessed and where we might be able to refactor the codebase to consume this service.

EcsFlat usages in Kibana

@apmmachine
Copy link
Contributor

🤖 GitHub comments

Expand to view the GitHub comments

Just comment with:

  • /oblt-deploy : Deploy a Kibana instance using the Observability test environments.
  • run docs-build : Re-trigger the docs validation. (use unformatted text in the comment!)

Marco Antonio Ghiani and others added 26 commits May 20, 2024 09:09
Copy link
Contributor

@mattkime mattkime left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left a number of comments which are more discussion items than requirements for approval. I'll approve once we've had a discussion.

Copy link
Contributor

@mattkime mattkime left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code looks good, we had a good conversation about choices made. This is a very nice step forward independent of whether follow up work occurs or not.

Copy link
Contributor

@juliaElastic juliaElastic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fleet changes LGTM, I don't see any impact on existing service methods.
It would be good to cover the changes with unit tests.

@tonyghiani
Copy link
Contributor Author

Fleet changes LGTM, I don't see any impact on existing service methods. It would be good to cover the changes with unit tests.

@juliaElastic a test for the getPackageFieldsMetadata was not so trivial as there are missing dependencies (tests for getPackage method) so I added a test for the logic that resolves the fields for specific data streams, which is the core piece of the whole getPackageFieldsMetadata given a resolved package.

The PackageService related method is tested here along the other methods.

@stratoula
Copy link
Contributor

What a beautiful drop 😍 👏

image

Copy link
Contributor

@flash1293 flash1293 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tonyghiani Something that probably should be considered - as the fields extracted from integrations are cached on the server, it's possible for them to go stale:

  • Kibana is started with package abc in version 1.0
  • User happily uses the field meta service
  • User upgrades the package to version 1.1 which adds a lot of new fields
  • User is confused about what the new fields mean, but they won't get help in the UI, because the Kibana server still has the fields from version 1.0 of the abc package cached and will return them
  • User needs to restart Kibana to be able to get the field help

Doesn't need to happen on this PR, I think we should get it in, but IMHO we should take care of this as especially after a package upgrade it's important to users to have accurate information.

Maybe I read the code wrong and that's already considered - if not, adding the version of the package to the cache key should resolve it.

@tonyghiani
Copy link
Contributor Author

Something that probably should be considered - as the fields extracted from integrations are cached on the server, it's possible for them to go stale:

@flash1293 this is an excellent point, thanks for raising it! It is not currently handling this scenario, and I'm 100% in this should be handled properly, clearing the cache as soon as a new package version is released and the schema might change.

I'll make a note of this and take car of this behaviour in a follow-up work to keep the scope of this PR the same, as there are plans to revisit the cache also for client-side requests.

Copy link
Contributor

@achyutjhunjhunwala achyutjhunjhunwala left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work on this PR mate, LGTM, left quite small nits

import pick from 'lodash/pick';
import { FieldAttribute, FieldMetadataPlain, PartialFieldMetadataPlain } from '../types';

// eslint-disable-next-line @typescript-eslint/no-empty-interface
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Can you please also add a comment above as to why the ES Lint exception is in place. This helps in future to navigate why a certain line was added

@@ -0,0 +1,8 @@
/*
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Liked the idea to call it the latest, you can then switch the versions in this file and others could still continue to use the latest. Hopefully it does not break for them as they will not be aware that latest has now changed. What about also exposing this V1 in addition to latest. This way consumers has choice to either import latest or specific version

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It makes sense, also the documentation suggests it!

The reason I avoided this is that we don't only export types but also runtime types, which would increase the size of the bundle. We can add this as soon as we need it and keep the bundle small otherwise :)

describe('useFieldsMetadata', () => {
let fieldsMetadataClient: jest.Mocked<IFieldsMetadataClient>;
beforeEach(async () => {
jest.clearAllMocks();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I always feel with this pattern, that we have a spillover somewhere. Can we add clearAllMocks in an afterEach instead here

```ts
const fields = await client.find({
fieldNames: ['@timestamp', 'onepassword.client.platform_version'],
integration: '1password'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit:missing a comma here

@kibana-ci
Copy link
Collaborator

💚 Build Succeeded

Metrics [docs]

Module Count

Fewer modules leads to a faster build time

id before after diff
aiAssistantManagementSelection 65 66 +1
apm 1800 1801 +1
dataQuality 72 73 +1
datasetQuality 194 195 +1
fieldsMetadata - 52 +52
infra 1569 1570 +1
logsExplorer 597 598 +1
logsShared 224 225 +1
observability 490 491 +1
observabilityAIAssistantApp 242 243 +1
observabilityAiAssistantManagement 135 136 +1
observabilityLogsExplorer 198 199 +1
observabilityOnboarding 200 201 +1
profiling 302 303 +1
securitySolution 5495 5496 +1
serverlessObservability 36 37 +1
slo 846 847 +1
triggersActionsUi 735 736 +1
unifiedDocViewer 246 244 -2
ux 198 199 +1
total +68

Public APIs missing comments

Total count of every public API that lacks a comment. Target amount is 0. Run node scripts/build_api_docs --plugin [yourplugin] --stats comments for more detailed information.

id before after diff
@kbn/io-ts-utils 59 60 +1
fieldsMetadata - 39 +39
fleet 1208 1213 +5
total +45

Async chunks

Total size of all lazy-loaded chunks that will be downloaded as the user navigates the app

id before after diff
fieldsMetadata - 63.8KB +63.8KB
unifiedDocViewer 849.8KB 165.4KB -684.4KB
total -620.6KB

Public APIs missing exports

Total count of every type that is part of your API that should be exported but is not. This will cause broken links in the API documentation system. Target amount is 0. Run node scripts/build_api_docs --plugin [yourplugin] --stats exports for more detailed information.

id before after diff
fieldsMetadata - 7 +7
fleet 69 71 +2
total +9

Page load bundle

Size of the bundles that are downloaded on every page load. Target size is below 100kb

id before after diff
fieldsMetadata - 4.6KB +4.6KB
unifiedDocViewer 11.4KB 11.5KB +34.0B
total +4.6KB
Unknown metric groups

API count

id before after diff
@kbn/io-ts-utils 59 60 +1
fieldsMetadata - 39 +39
fleet 1329 1334 +5
total +45

async chunk count

id before after diff
fieldsMetadata - 1 +1

ESLint disabled in files

id before after diff
fieldsMetadata - 2 +2

ESLint disabled line counts

id before after diff
fieldsMetadata - 8 +8

Total ESLint disabled count

id before after diff
fieldsMetadata - 10 +10

History

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

@tonyghiani tonyghiani merged commit 0a0853b into elastic:main Jun 5, 2024
36 checks passed
@tonyghiani tonyghiani deleted the spacetime/fields-metadata-plugin branch June 5, 2024 07:51
@kibanamachine kibanamachine added v8.15.0 backport:skip This commit does not require backporting labels Jun 5, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport:skip This commit does not require backporting release_note:skip Skip the PR/issue when compiling release notes Spacetime Team:Fleet Team label for Observability Data Collection Fleet team v8.15.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet