-
Notifications
You must be signed in to change notification settings - Fork 5.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Healthcheck service #24817
base: master
Are you sure you want to change the base?
Healthcheck service #24817
Conversation
Changed Packages
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice! 🎉
Couple of initial high level bits
const router = Router(); | ||
let handler = () => Promise.resolve({ status: 'ok' }); | ||
|
||
router.get('/healthcheck', async (_request: Request, response: Response) => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thinking we should have this separated from the main routes, probably something like /.backstage/health
, but potentially grouped a bit
export interface HttpRouterHealthCheckConfig { | ||
handler: () => Promise<any>; | ||
} | ||
|
||
/** | ||
* @public | ||
*/ | ||
export interface HttpRouterService { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you think about keeping health checks at the root level instead? Something like /.backstage/health/v1/readiness
for checking whether the entire instance is up, and /.backstage/health/v1/<plugin-id>/readiness
for individual plugins?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah nice!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
but the other way around, /.backstage/health/v1/readiness/<plugin-id>
// @public (undocumented) | ||
export interface HttpRouterService { | ||
// (undocumented) | ||
addAuthPolicy(policy: HttpRouterServiceAuthPolicy): void; | ||
// (undocumented) | ||
healthCheckConfig(healthCheckOptions: HttpRouterHealthCheckConfig): void; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure this should live on the http router tbh. Or framed a bit differently: if we can make this a separate service we probably should. I'm thinking this should probably be additive as well so that it's possible to add health checks from modules etc.
// @public (undocumented) | ||
export interface HttpRouterHealthCheckConfig { | ||
// (undocumented) | ||
handler: () => Promise<any>; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's important to consider a liveness + readiness split. It can help avoid situations where instances are shut down because they timed out at startup, while still being able to be pretty aggressive about timeouts when it comes to whether the service is functional at all.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do you have any suggestion for a possible root level readiness implementation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you 🎉, left minor comments. Could we also deprecate the legacy handler from the backend-common
package as well?
packages/backend-app-api/src/services/implementations/httpRouter/createHealthcheck.ts
Outdated
Show resolved
Hide resolved
packages/backend-app-api/src/services/implementations/httpRouter/httpRouterServiceFactory.ts
Outdated
Show resolved
Hide resolved
packages/backend-plugin-api/src/services/definitions/HttpRouterService.ts
Outdated
Show resolved
Hide resolved
3ede1f0
to
74a6836
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great, left minor comments 🙌🏻
@@ -58,6 +59,7 @@ export const defaultServiceFactories = [ | |||
userInfoServiceFactory(), | |||
urlReaderServiceFactory(), | |||
eventsServiceFactory(), | |||
healthServiceFactory(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe we have also to provide a mock implementation here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess this is not used anymore, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess this file can be deleted, right?
'@backstage/backend-app-api': patch | ||
--- | ||
|
||
Added a new health service which adds new endpoints for health checks. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be nice add this service to the core services docs:
https://backstage.io/docs/backend-system/core-services/index
Signed-off-by: Vincenzo Scamporlino <vincenzos@spotify.com>
Signed-off-by: Vincenzo Scamporlino <vincenzos@spotify.com>
cf48e42
to
8c5ca35
Compare
Signed-off-by: Vincenzo Scamporlino <vincenzos@spotify.com>
Signed-off-by: Vincenzo Scamporlino <vincenzos@spotify.com>
Signed-off-by: Vincenzo Scamporlino <vincenzos@spotify.com>
Signed-off-by: Vincenzo Scamporlino <vincenzos@spotify.com>
Signed-off-by: Vincenzo Scamporlino <vincenzos@spotify.com>
Signed-off-by: Vincenzo Scamporlino <vincenzos@spotify.com>
Signed-off-by: Vincenzo Scamporlino <vincenzos@spotify.com>
Signed-off-by: Vincenzo Scamporlino <vincenzos@spotify.com>
Signed-off-by: Vincenzo Scamporlino <vincenzos@spotify.com>
Signed-off-by: Vincenzo Scamporlino <vincenzos@spotify.com>
Signed-off-by: Vincenzo Scamporlino <vincenzos@spotify.com>
8c5ca35
to
926d6c9
Compare
Signed-off-by: Vincenzo Scamporlino <vincenzos@spotify.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice! 👍
Couple of nits
* The service reference for the plugin scoped {@link HealthService}. | ||
*/ | ||
export const health = createServiceRef< | ||
import('./HealthService').HealthService |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's prolly best to keep this as void
instead, since now you can pass anything instead. We can still add methods in the future even if it starts out as void
.
router.get('/v1/readiness', async (_request: Request, response: Response) => { | ||
if (!isRunning) { | ||
throw new Error('Backend has not started yet'); | ||
} | ||
response.json({ status: 'ok' }); | ||
}); | ||
|
||
router.get('/v1/liveness', async (_request: Request, response: Response) => { | ||
response.json({ status: 'ok' }); | ||
}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you add a quick description to each of these with what their purpose is?
|
||
router.get('/v1/readiness', async (_request: Request, response: Response) => { | ||
if (!isRunning) { | ||
throw new Error('Backend has not started yet'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thinking this should be a 503 ServiceUnavailableError
. Perhaps we shouldn't throw an error in this case either, so that we can have a more consistent response? Instead setting an explicit status code and returning whatever the opposite of status: 'ok'
is
4fd88dc
to
cdc62c9
Compare
* The service reference for the plugin scoped {@link HealthService}. | ||
*/ | ||
export const health = createServiceRef< | ||
import('./HealthService').HealthService | ||
>({ id: 'core.health', scope: 'root' }); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
* The service reference for the plugin scoped {@link HealthService}. | |
*/ | |
export const health = createServiceRef< | |
import('./HealthService').HealthService | |
>({ id: 'core.health', scope: 'root' }); | |
* The service reference for the root scoped {@link RootHealthService}. | |
*/ | |
export const rootHealth = createServiceRef< | |
import('./RootHealthService').RootHealthService | |
>({ id: 'core.rootHealth', scope: 'root' }); |
Every single other root scoped thing is named like this - we should probably follow suit here. Make sure to change to that naming pattern everywhere else too.
I'd also be interested in what the future might be in terms of per-plugin liveness/health. That's sort of where the most interesting developments might be in the future. This root scoped one could at most do simple lifecycle matching like it does today, or possibly offer some form of rollup of the states of all installed plugins (not even sure that's a good idea though). I mean, how often do we really desire to look at the liveness of effectively a machine that hosts many plugins? What does that even mean?
Hey, I just made a Pull Request!
This PR adds a new healthcheck service which adds a bunch of endpoints to the root HTTP Router
Closes #24548
✔️ Checklist
Signed-off-by
line in the message. (more info)