Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add option dbfilesize to control LevelDB target ("max") file size #30059

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

luke-jr
Copy link
Member

@luke-jr luke-jr commented May 8, 2024

Debug option to control LevelDB file sizes. Since LevelDB seems to always overshoot, "max" didn't seem fitting.

Intended to be followed up with a change to the default in a rebased #30039

@DrahtBot
Copy link
Contributor

DrahtBot commented May 8, 2024

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Code Coverage

For detailed information about the code coverage, see the test coverage report.

Reviews

See the guideline for information on the review process.

Type Reviewers
Concept ACK tdb3

If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

Conflicts

Reviewers, this pull request conflicts with the following ones:

  • #28358 (Drop -dbcache limit by Sjors)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

@DrahtBot DrahtBot mentioned this pull request May 8, 2024
@laanwj
Copy link
Member

laanwj commented May 9, 2024

i'm broadly on board with making this an option (seperately from changing the default in #30039). Assuming there are legit reasons for varying this based on the kind of server, operating system, amount of RAM, kind of disk, etc, and there in't a single sweet spot.

Copy link
Contributor

@tdb3 tdb3 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concept ACK.

Thank you. Seems reasonable to include this as a debug option, and increases flexibility over the single value chosen for PR #30039. Left an inline comment/question.

Ran a quick sanity check:
With dbfilesize=64 set (along with txindex=1), performed an IBD on Signet from a node on the same LAN, then stopped and restarted bitcoind. Saw chainstate and indexes/txindex files that were close to 64MB (blocks/index for signet was less than 64MB, but still had a file over 2MB).

@@ -448,6 +449,11 @@ void SetupServerArgs(ArgsManager& argsman)
argsman.AddArg("-datadir=<dir>", "Specify data directory", ArgsManager::ALLOW_ANY, OptionsCategory::OPTIONS);
argsman.AddArg("-dbbatchsize", strprintf("Maximum database write batch size in bytes (default: %u)", nDefaultDbBatchSize), ArgsManager::ALLOW_ANY | ArgsManager::DEBUG_ONLY, OptionsCategory::OPTIONS);
argsman.AddArg("-dbcache=<n>", strprintf("Maximum database cache size <n> MiB (%d to %d, default: %d). In addition, unused mempool memory is shared for this cache (see -maxmempool).", nMinDbCache, nMaxDbCache, nDefaultDbCache), ArgsManager::ALLOW_ANY, OptionsCategory::OPTIONS);
argsman.AddArg("-dbfilesize",
strprintf("Target size of files within databases, in MiB (%u to %u, default: %u).",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The text states a range of 1MiB to 1024MiB, but these values didn't seem to be enforced (e.g. 0.5, -3, and 1025 were not rejected). Is the intent here to provide guidance for the user (i.e. tell the user to choose 1 to 1024 MiB) rather than enforce a specific range of values?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LevelDB itself enforces it. Would it make sense to check it redundantly on our end to control error behaviour? (But then we have to remember to keep it in sync with LevelDB... but maybe just having the docs necessitates that)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, redundant checking of range (beyond LevelDB) seems a bit much for a debug option (which is used by a more niche userbase). The range in the help message seems to be more of an example of a usable range rather than a recommendation of range or a strictly enforced range. The default value would be the defacto recommendation.

If we're set on continuing to have LevelDB handle the range rather than Core, it seems like it would make sense to inform the user in the help message that the range is an example. This could help prevent a developer from wasting time on a PR that misses the intent (e.g. could envision a PR starting with: ...dbfilesize specifies a range of allowable values, but doesn't enforce this range. This PR fixes this...).

It might also help to clarify to the user which files this option impacts.

This is a bit of a nit of a nit, so feel free to disregard it. Example:

diff --git a/src/init.cpp b/src/init.cpp
index 5cc513e18c9..f4be033f1c1 100644
--- a/src/init.cpp
+++ b/src/init.cpp
@@ -450,7 +450,7 @@ void SetupServerArgs(ArgsManager& argsman)
     argsman.AddArg("-dbbatchsize", strprintf("Maximum database write batch size in bytes (default: %u)", nDefaultDbBatchSize), ArgsManager::ALLOW_ANY | ArgsManager::DEBUG_ONLY, OptionsCategory::OPTIONS);
     argsman.AddArg("-dbcache=<n>", strprintf("Maximum database cache size <n> MiB (%d to %d, default: %d). In addition, unused mempool memory is shared for this cache (see -maxmempool).", nMinDbCache, nMaxDbCache, nDefaultDbCache), ArgsManager::ALLOW_ANY, OptionsCategory::OPTIONS);
     argsman.AddArg("-dbfilesize",
-                   strprintf("Target size of files within databases, in MiB (%u to %u, default: %u).",
+                   strprintf("Target size of files within databases (chainstate, indexes), in MiB (example: %u to %u, default: %u).",
                              1, 1024,
                              DEFAULT_DB_FILE_SIZE),
                    ArgsManager::ALLOW_ANY | ArgsManager::DEBUG_ONLY, OptionsCategory::OPTIONS);

@sipa
Copy link
Member

sipa commented May 17, 2024

I think it would be good to understand what the trade-offs are for this before considering making it configurable (as in: understand in what situations we'd want to recommend people use lower vs higher values). I assume lower values mean more open files/IO and less outdated data on disk, and higher values mean fewer compactions. Depending on whether those are the only effects, and to what extent these things are affected by the value, maybe we can get away with a default value that's a few % of the expected total db size.

FWIW, RocksDB uses a default of 64 MiB, and has an option (default off) to make higher database-level files become larger (e.g.: level L files would target 64 * (2^L) MiB).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants