ACTool-Config-Worker may throw an exception because it executes very expensive queries, preventing startup #669

Open
thomasmueller opened this issue Jun 14, 2023 · 4 comments

@thomasmueller

The following two queries are executed at startup (thread name "Apache Sling Repository Startup Thread #1-ACTool-Config-Worker"). Depending on the content, they may try to read more than 100'000 nodes; in that case an exception is thrown and startup fails.

SELECT ace.* FROM [rep:ACE] AS ace 
WHERE ace.[rep:principalName] IS NOT NULL 
AND ISDESCENDANTNODE(ace, [/content])

SELECT ace.* FROM [rep:ACE] AS ace 
WHERE ace.[rep:principalName] IS NOT NULL 
AND ISDESCENDANTNODE(ace, [/apps])

The first query in particular may read too many entries. Both queries use the index /oak:index/acPrincipalName.

I think this is somewhat related to #219 - however switching to a Lucene index won't resolve the issue.

  • In general, queries should not read that many nodes at startup. If they do, startup can be delayed too much. I would recommend running expensive queries (if they are really needed) after startup, so that startup is not blocked. Until the query has run, a cache can be used.
  • Queries should not try to read more than about 5000 nodes. If they need to read more, then keyset pagination should be used, see also https://jackrabbit.apache.org/oak/docs/query/query-engine.html#keyset-pagination
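
To illustrate the keyset pagination suggestion above, here is a minimal sketch in Java. It is not AC Tool or Oak code: the `PageSource` interface, the `readAll` helper and the in-memory stand-in are hypothetical names introduced only for this example. In a real setup, each `fetch` call would be one bounded repository query with a condition and ordering on an indexed property; the sketch also assumes the keys are unique.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableSet;
import java.util.TreeSet;

/** Sketch of keyset pagination: read a large result in bounded pages. */
final class KeysetPaginationSketch {

    /** Hypothetical abstraction over "run one bounded query". */
    interface PageSource {
        // Returns at most `limit` keys strictly greater than `afterKey`
        // (or from the start when afterKey is null), in ascending order.
        List<String> fetch(String afterKey, int limit);
    }

    /** Reads all keys page by page; each fetch is asked for at most pageSize entries. */
    static List<String> readAll(PageSource source, int pageSize) {
        if (pageSize <= 0) {
            throw new IllegalArgumentException("pageSize must be positive");
        }
        List<String> all = new ArrayList<>();
        String lastKey = null;
        while (true) {
            List<String> page = source.fetch(lastKey, pageSize);
            all.addAll(page);
            if (page.size() < pageSize) {
                return all; // a short (or empty) page means nothing is left
            }
            // Continue after the last key seen, instead of using an offset.
            lastKey = page.get(page.size() - 1);
        }
    }

    /** In-memory stand-in for the repository, for illustration only. */
    static PageSource inMemory(TreeSet<String> keys) {
        return (afterKey, limit) -> {
            NavigableSet<String> tail =
                    afterKey == null ? keys : keys.tailSet(afterKey, false);
            List<String> page = new ArrayList<>();
            for (String key : tail) {
                if (page.size() == limit) {
                    break;
                }
                page.add(key);
            }
            return page;
        };
    }
}
```

The point of the pattern is that each page restarts from the last key seen rather than from an offset, so no single query has to traverse the entries that earlier pages already returned.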
@ghenzler ghenzler self-assigned this Jun 22, 2023
@ghenzler
Member

@thomasmueller Is the problem happening during the image build or during startup of the k8s pod?

Overall the following should be true:

  • During image build, the AC Tool will run the config against the whole repo. At that point, however, the repo should not contain more than 5000 ACE nodes (mostly OOTB ACEs in the /libs folder from OOTB AEM packages).
  • During pod startup, the AC Tool will run the config against the repo without the paths /apps and /libs. It normally does this async unless configured differently (maybe that is the problem in your case).

You can check the following code to see what is happening here:

and you can check for the respective log messages in your setup.

@thomasmueller
Author

Hi,

Thanks a lot! The problem I see is that it runs at AEM startup in the k8s pod (author), against the whole repository (including /content). This happens in the Repository Startup thread, so it blocks the startup.

It normally does this [async]

Great! So maybe the case I saw had a non-default configuration! How can this be configured? It then might just be a matter of explaining this; possibly improving the documentation.

@kwin
Member

kwin commented Feb 15, 2024

It then might just be a matter of explaining this; possibly improving the documentation.

This OSGi configuration is not documented at all, so hopefully no one deviates from the default without knowing exactly what to do here:

@kwin
Member

kwin commented Feb 16, 2024

The issue seems to be the evolution of the AEMaaCS build pipeline as outlined in https://adapt.to/2023/schedule/evolution-of-the-aemaacs-build-pipeline.

  1. The build image step no longer involves custom code.
  2. During the deploy step the composite node store seems to be used in seed mode (i.e. the method will never return true in AEMaaCS).
