indexes: Don't wipe indexes again when continuing a prior reindex #30132

TheCharlatan · 2024-05-17T20:08:17Z

When restarting bitcoind during an ongoing reindex without setting the -reindex flag again, the block and coins dbs are left intact, but any data from the optional indexes is discarded. While not a bug per se, wiping the data again is wasteful, both because it has to be written again and because it can lead to longer startup times. So keep the index data instead when continuing a prior reindex.

Also includes a bugfix and smaller code cleanups around the reindexing code. The bug was introduced in b47bd95: "kernel: De-globalize fReindex".

DrahtBot · 2024-05-17T20:08:20Z

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Code Coverage

For detailed information about the code coverage, see the test coverage report.

Reviews

See the guideline for information on the review process.

Type	Reviewers
ACK	stickies-v, fjahr, furszy, ryanofsky
Concept ACK	luke-jr
Stale ACK	theStack

If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

Conflicts

Reviewers, this pull request conflicts with the following ones:

#30214 (refactor: Improve assumeutxo state representation by ryanofsky)
#30111 (locks: introduce mutex for tx download, flush rejection filters on UpdatedBlockTip by glozow)
#30110 (refactor: TxDownloadManager by glozow)
#29678 (Bugfix: Correct first-run free space checks by luke-jr)
#29641 (scripted-diff: Use LogInfo/LogDebug over LogPrintf/LogPrint by maflcko)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

stickies-v · 2024-05-20T10:22:30Z

Concept ACK

src/init.cpp

TheCharlatan · 2024-05-20T14:05:58Z

Updated 991f50a -> 9de8b26 (preserveIndexOnRestart_0 -> preserveIndexOnRestart_1, compare)

Addressed @stickies-v's comment, fixing a bug introduced in b47bd95: "kernel: De-globalize fReindex".

ryanofsky · 2024-05-20T15:58:09Z

> fixing a bug introduced in b47bd95: "kernel: De-globalize fReindex".

~~Is this true? That commit, which was part of #29817, should not have changed any previous behavior~~

EDIT: Never mind, I see the problem now after reading f27290c commit description. The bug happens because the BlockManager is destroyed each loop iteration in AppInitMain so the value of the chainman.m_blockman.m_reindexing variable gets reset.

furszy

Looking good at first glance. It would be nice to add some coverage for it just so it doesn't happen again. Maybe assert that certain logs are not present during init? Like the "Wiping LevelDB in <index_path>" one.
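A minimal sketch of the kind of check suggested here, assuming the debug log is available as plain text. The helper name `assert_not_logged` and the sample log excerpts are hypothetical; only the "Wiping LevelDB in" message prefix is taken from the comment above.

```python
def assert_not_logged(log_text: str, unexpected: list[str]) -> None:
    """Raise AssertionError if any of the unexpected messages appears in the log."""
    found = [msg for msg in unexpected if msg in log_text]
    if found:
        raise AssertionError(f"unexpected log messages present: {found}")


# Made-up log excerpts, for illustration only (not real bitcoind output).
log_after_continue = "initload thread start\n"
log_after_wipe = "Wiping LevelDB in <index_path>\n"

# Continuing a reindex should not wipe the index again, so this passes silently.
assert_not_logged(log_after_continue, ["Wiping LevelDB in"])
```

Matching on a message prefix rather than the full line keeps the check independent of the concrete index path.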

stickies-v · 2024-05-21T14:36:39Z

Approach ACK. First 2 commits (0d04433) LGTM but the third one I'm going to need to spend a lot more time wrapping my head around the implications.

TheCharlatan · 2024-05-23T15:49:20Z

Updated 9de8b26 -> dd290b3 (preserveIndexOnRestart_1 -> preserveIndexOnRestart_2, compare)

Added a commit for testing that the indexes are still there when continuing a reindex.

test/functional/feature_reindex.py

luke-jr · 2024-05-23T16:50:45Z

Concept ACK

TheCharlatan · 2024-05-24T07:37:50Z

Updated dd290b3 -> 891784c (preserveIndexOnRestart_2 -> preserveIndexOnRestart_3, compare)

Addressed @furszy's comment, removed the timeout on the initload busy loop.
Addressed @maflcko's comment, removed the busyloop waiting for the block filter index. I initially thought it might be useful to wait for the index to load completely, but I don't think this is strictly required for this test.
Addressed @maflcko's comment, moved stop_node out of the busy loop.
Addressed @maflcko's comment, using named args for literal arguments now.

test/functional/test_framework/test_node.py

test/functional/feature_reindex.py

DrahtBot · 2024-05-24T08:36:55Z

🚧 At least one of the CI tasks failed. Make sure to run all tests locally, according to the documentation.

Possibly this is due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.

Leave a comment here, if you need help tracking down a confusing failure.

Debug: https://github.com/bitcoin/bitcoin/runs/25367317004

TheCharlatan · 2024-05-24T12:46:53Z

Thanks for the reviews @maflcko,

891784c -> eeea081 (preserveIndexOnRestart_3 -> preserveIndexOnRestart_4, compare)

Addressed @maflcko's comment, removed redundant initial syncing of the blockfilterindex.
Addressed @maflcko's comment, enforce using named argument.

furszy

Code review ACK eeea081

stickies-v · 2024-06-04T12:11:29Z

> The following change built on top of this PR could provide a simpler alternative: 9c643e7.

I like this suggestion. I think the naming is more clear in terms of the actual effect each variable has. I also think not having two reindex options is an improvement, the flow is more clear to me now.

I don't think wipe_block_tree_db (and a couple of other blockman options) should ultimately be a ChainstateLoadOptions member, but fixing that is probably beyond the scope of this PR (and I'm looking into doing that in a separate pull).

ryanofsky · 2024-06-04T14:39:36Z

> Removing the blockman option looks like a nice improvement, but I am not sure yet about the renaming. Going from do_reindex to wipe_block_tree_db back to m_reindexing seems a bit convoluted.

That's a good point. One possible way to address it could be to rename m_reindexing as well like 38cc045. If we move away from having multiple variables called "reindex" and instead name variables individually based on what they do, I think it would be an improvement overall.

The renames suggested in 9c643e7 and 38cc045 don't directly relate to this PR, so they might make more sense as followups to avoid complicating things. On the other hand, if you took the portion of 9c643e7 replacing BlockManagerOpts::reindex with do_reindex, it might make the third commit 9de8b26 here easier to understand. Just feel free to ignore these suggestions for now or use any parts that seem useful.

> Also the previous set of options did not allow rebuilding the block database without also rebuilding the chainstate database, when it should be possible to do those independently.

Looking at the code more, I think setting wipe_block_tree_db = true and wipe_chainstate_db = false should probably work, even though we never do it. But if we add these options to the kernel API, the documentation could warn this combination is currently untested / unsupported. One problem with this combination is that it might not handle some reorgs, because LoadExternalBlockFile only currently scans block files and populates CBlockIndex::nDataPos. It doesn't currently scan undo files and populate CBlockIndex::nUndoPos, though it could.

theStack

ACK eeea081

Nice catch. Took me a while to see where b47bd95 introduced the bug (the retry logic is confusing indeed!), but both the detailed commit messages and yesterday's PR review club notes/log were very helpful to grok it.

test/functional/feature_reindex.py

TheCharlatan · 2024-06-07T06:14:54Z

Thanks for the review and ACKs. I will address the leftover nits here shortly; I think re-ACKing should be easy enough.

Reverts a bug introduced in b47bd95 "kernel: De-globalize fReindex". The change leads to a GUI user being prompted to re-index on a chainstate loading failure more than once as well as the node actually not reindexing if the user chooses to. Fix this by setting the reindexing option instead of the atomic, which can be safely re-used to indicate that a reindex should be attempted.

The bug specifically is caused by the chainman, and thus the blockman and its m_reindexing atomic, being destroyed on every iteration of the for loop.

The reindex option for ChainstateLoadOptions is currently also set in a confusing way. By using the reindex atomic, it is not obvious in which scenario it is true or false. The atomic is controlled by the user passing the -reindex option, by the user choosing to reindex if something went wrong during chainstate loading when running the GUI, and by reading the reindexing flag from the block tree database in LoadBlockIndexDB. In practice this read is done through the chainstate module's CompleteChainstateInitialization's call to LoadBlockIndex. Since this is only done after the reindex option is set already, it does not have an effect on it. Make this clear by using the reindex option from the blockman opts, which is only controlled by the user.
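The failure mode described in this commit message can be illustrated with a small, self-contained model. This is only a sketch: `Options`, `BlockManager`, `load`, `init_buggy` and `init_fixed` are hypothetical stand-ins, not the real init code. It assumes only what the message states, namely that the block manager is rebuilt on every loop iteration while the options outlive the loop.

```python
from dataclasses import dataclass


@dataclass
class Options:
    """Stand-in for options that live outside the retry loop."""
    reindex: bool = False


class BlockManager:
    """Stand-in for an object that is rebuilt on every loop iteration."""

    def __init__(self) -> None:
        self.reindexing = False  # fresh state on every construction


def load(opts: Options, blockman: BlockManager) -> bool:
    """Pretend loading only succeeds once a reindex has been requested."""
    return opts.reindex or blockman.reindexing


def init_buggy(max_attempts: int = 3) -> int:
    """Record the retry decision on the short-lived object; it is lost each time."""
    opts = Options()
    for attempt in range(1, max_attempts + 1):
        blockman = BlockManager()
        if load(opts, blockman):
            return attempt
        blockman.reindexing = True  # discarded when the loop rebuilds blockman
    return -1  # the user was asked every time and loading never succeeded


def init_fixed(max_attempts: int = 3) -> int:
    """Record the retry decision on the options, which survive the loop."""
    opts = Options()
    for attempt in range(1, max_attempts + 1):
        blockman = BlockManager()
        if load(opts, blockman):
            return attempt
        opts.reindex = True  # still set on the next iteration
    return -1
```

In this model `init_buggy()` never succeeds because the flag is reset on every iteration, while `init_fixed()` succeeds on the second attempt.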

It does not control any actual logic and the log message as well as the comment are obsolete, since no database initialization takes place there anymore. Log messages indicating when indexes and chainstate databases are loaded exist in other places.

TheCharlatan · 2024-06-07T12:00:00Z

Thank you for your suggestions @ryanofsky! I applied both of your suggested patches.

eeea081 -> 682f1f1 (preserveIndexOnRestart_4 -> preserveIndexOnRestart_5, compare)

Addressed @theStack's comment, using chain_path to construct the db path in the functional tests.
Added @ryanofsky's suggestion in this patch, removing the blockmanager reindex option as well as renaming reindex to wipe_block_tree_db and reindex_chainstate to wipe_chainstate_db in the ChainstateLoadOptions.
Added @ryanofsky's suggestion in this patch, renaming m_reindexing to m_blockfiles_indexed.

stickies-v

Code LGTM 682f1f1 modulo one small bug in comments.

I couldn't find or think of any occasions where the new reduced wiping behaviour introduces problematic behaviour. The new variable names make things significantly clearer and are a very welcome change.

src/test/util/setup_common.cpp

src/node/blockstorage.h

src/bitcoin-chainstate.cpp

src/init.cpp

Drop confusing kernel options:

  BlockManagerOpts::reindex
  ChainstateLoadOptions::reindex
  ChainstateLoadOptions::reindex_chainstate

Replacing them with more straightforward options:

  ChainstateLoadOptions::wipe_block_tree_db
  ChainstateLoadOptions::wipe_chainstate_db

Having two options called "reindex" which did slightly different things was needlessly confusing (one option wiped the block tree database, and the other caused block files to be rescanned). Also the previous set of options did not allow rebuilding the block database without also rebuilding the chainstate database, when it should be possible to do those independently.
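As a rough illustration of how the two user-facing flags map onto the new options, here is a sketch that mirrors the setup_common.cpp lines quoted in this thread. The Python names `WipeOptions` and `wipe_options_from_flags` are hypothetical; only the two option names and the boolean relationship come from the PR.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WipeOptions:
    wipe_block_tree_db: bool
    wipe_chainstate_db: bool


def wipe_options_from_flags(reindex: bool, reindex_chainstate: bool) -> WipeOptions:
    """Derive the wipe options from the -reindex / -reindex-chainstate flags."""
    return WipeOptions(
        # -reindex wipes the block tree database ...
        wipe_block_tree_db=reindex,
        # ... and also the chainstate; -reindex-chainstate wipes only the latter.
        wipe_chainstate_db=reindex or reindex_chainstate,
    )
```

With this derivation the combination "wipe the block tree db but keep the chainstate db" is expressible through the options but is never produced from the flags.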

Before this change, continuing a reindex without the -reindex flag set would leave the block and coins db intact, but discard the data of the optional indexes. While not a bug per se, wiping the data again is wasteful, both in terms of having to write it again, and potentially leading to longer startup times.

When initially running a reindex, both the block index and any further activated indexes are wiped. On an index's Init(), both the best block stored by the index and the chain's tip are null. An index's m_synced member is therefore true. This means that it will process blocks through validation events while the reindex is running.

Currently, if the reindex is continued without the user re-specifying the reindex flag, the block index is preserved but further index data is wiped. This leads to the stored best block being null, but the chain tip existing. The m_synced member will be set to false. The index will not process blocks through the validation interface, but instead use the background sync once the reindex is completed.

If the index is preserved (this change) after a restart, its best block may potentially match the chain tip. The m_synced member will be set to true and the index can process validation events during the rest of the reindex.
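The three scenarios in this commit message reduce to one comparison. The sketch below is a simplification under stated assumptions: `index_starts_synced` is a hypothetical helper, block hashes are modelled as optional strings, and the real Init() logic may check more than plain equality.

```python
from typing import Optional


def index_starts_synced(index_best_block: Optional[str], chain_tip: Optional[str]) -> bool:
    """Simplified model: the index counts as synced when its best block equals the tip."""
    return index_best_block == chain_tip


# Fresh reindex: index and block index both wiped, so both values are null.
fresh_reindex = index_starts_synced(None, None)

# Continued reindex, old behaviour: index wiped again, but the chain tip exists.
continued_wiped = index_starts_synced(None, "tip")

# Continued reindex, this change: index preserved and possibly already at the tip.
continued_preserved = index_starts_synced("tip", "tip")
```

Only the second case falls back to the background sync. A preserved index that lags behind the tip would also start unsynced in this model, which matches the "may potentially match" wording above.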

Co-authored-by: furszy <matiasfurszyfer@protonmail.com>

This is just a mechanical change, renaming and inverting the meaning of the indexing variable.

"m_blockfiles_indexed" is a more straightforward name for this variable because this variable just indicates whether or not <datadir>/blocks/blk?????.dat files have been indexed in the <datadir>/blocks/index LevelDB database. The name "m_reindexing" was more confusing: it could be true even if -reindex was not specified, and false when it was specified. Also, the previous name unnecessarily required thinking about the whole reindexing process just to understand simple checks in validation code about whether blocks were indexed.

The motivation for this change is to follow up on previous commits, moving away from having multiple variables called "reindex" internally, and instead naming variables individually after what they do and represent.

TheCharlatan · 2024-06-07T17:38:26Z

Thanks for the review @stickies-v,

Updated 682f1f1 -> f68cba2 (preserveIndexOnRestart_5 -> preserveIndexOnRestart_6, compare)

Addressed @stickies-v's comment, slightly changing the way the wipe_chainstate_db is set in setup_common.
Addressed @stickies-v's comment, clarified m_blockfiles_indexed docstring.
Addressed @stickies-v's comment, changing bitcoin-chainstate log lines.
Addressed @stickies-v's comment, fixed missed renaming for the block storage check. The check should be done if a chainstate is present.

stickies-v

ACK f68cba2

fjahr

Code review ACK f68cba2

I also confirmed that the new test fails before the changes here are applied.

I think the test could use a bit more explanation on its reasoning but this shouldn't block a merge of this as-is.

fjahr · 2024-06-08T12:18:01Z

test/functional/feature_reindex.py

+        self.log.info("Restarting node while reindexing..")
+        node.stop_node()
+        with node.busy_wait_for_debug_log([b'initload thread start']):
+            node.start(['-blockfilterindex', '-reindex'])


nit: Why was the blockfilterindex chosen here? I am assuming because it's slow? Would be good to add a comment because it may be confusing for others in the future what this choice has to do with the test.

TheCharlatan · 2024-06-08

I just picked the one which furszy picked and I'm guessing he just picked one too. I'll check if there is any significant effect when picking a different one.

fjahr · 2024-06-08T12:27:36Z

test/functional/feature_reindex.py

@@ -73,13 +73,33 @@ def find_block(b, start):
        # All blocks should be accepted and processed.
        assert_equal(self.nodes[0].getblockcount(), 12)

+    def continue_reindex_after_shutdown(self):
+        node = self.nodes[0]
+        self.generate(node, 1500)


nit: I guess this is needed so the node can be stopped fast enough. It's still a race and could turn out to be flakey in the CI, right? I don't have a better idea to fix this right now but a comment might be good to make this explicit and make future debugging easier if this turns out to be the case.

furszy

Code review ACK f68cba2

Great last commit.

Question about 804f09d commit description:

> Also the previous set of options did not allow rebuilding the block database without also rebuilding the chainstate database, when it should be possible to do those independently.

is this tested anywhere?

furszy · 2024-06-09T14:10:21Z

src/test/util/setup_common.cpp

+    options.wipe_block_tree_db = m_args.GetBoolArg("-reindex", false);
+    options.wipe_chainstate_db = m_args.GetBoolArg("-reindex", false) || m_args.GetBoolArg("-reindex-chainstate", false);


tiny nit

Suggested change

-    options.wipe_block_tree_db = m_args.GetBoolArg("-reindex", false);
-    options.wipe_chainstate_db = m_args.GetBoolArg("-reindex", false) || m_args.GetBoolArg("-reindex-chainstate", false);
+    options.wipe_block_tree_db = m_args.GetBoolArg("-reindex", false);
+    options.wipe_chainstate_db = options.wipe_block_tree_db || m_args.GetBoolArg("-reindex-chainstate", false);

Side note:
I don't think this is used anywhere.

stickies-v · 2024-06-09

I suggested the current approach, see #30132 (comment)

> I don't think this is used anywhere.

What do you mean?

furszy · 2024-06-10

> I suggested the current approach, see #30132 (comment)

Hmm ok. We should probably go further and deduplicate the init.cpp / setup_common.cpp code somewhere in the future.

> I don't think this is used anywhere.
>
> What do you mean?

This is part of the unit test framework and no unit test, benchmark or fuzz test seems to make use of it. "-reindex" and "-reindex-chainstate" are always unset.

TheCharlatan · 2024-06-10T08:51:53Z

Re #30132 (review)

> is this tested anywhere?

No, and I'm not sure it should be given we don't support this. Maybe we can add a comment and assert that just wiping the block index db is not supported for now?

ryanofsky

Code review ACK f68cba2. Only changes since last review were cherry-picking suggested commits that rename variables, improving comments, and making some tweaks to test code.

ryanofsky · 2024-06-10T13:26:58Z

re: #30132 (comment)

> No, and I'm not sure it should be given we don't support this. Maybe we can add a comment and assert that just wiping the block index db is not supported for now?

FWIW, my original draft of 804f09d added this code to LoadChainstate:

// For now, don't allow wiping block tree db without also wiping chainstate
// db. There's no reason this could not work in theory, but in practice the
// code path is untested, and to be really robust, the
// LoadExternalBlockFile function should be updated to scan undo files,
// not just block files, and to populate CBlockIndex::nUndoPos, not just
// CBlockIndex::nDataPos.
assert(!options.wipe_block_tree_db || options.wipe_chainstate_db);

I decided to drop it to keep things simpler, since inevitably the kernel API will support combinations of options bitcoin core doesn't exercise or test, and it might be cumbersome to try to warn about all of them. But I could understand wanting to do it in some cases like this.

I think the PR is ready to merge, so you can let me know if you want to add an assert or just merge it in its current form.

TheCharlatan · 2024-06-10T14:00:07Z

#30132 (comment)

> I think the PR is ready to merge, so you can let me know if you want to add an assert or just merge it in its current form.

I think this is rfm.

DrahtBot added the "UTXO Db and Indexes" and "CI failed" labels and removed the "CI failed" label May 17, 2024

TheCharlatan mentioned this pull request May 18, 2024: kernel: De-globalize fReindex #29817 (Merged)

TheCharlatan force-pushed the preserveIndexOnRestart branch from 133bf46 to 991f50a May 18, 2024 09:19

stickies-v reviewed May 20, 2024 (src/init.cpp)

TheCharlatan force-pushed the preserveIndexOnRestart branch from 991f50a to 9de8b26 May 20, 2024 14:05

DrahtBot mentioned this pull request May 20, 2024: scripted-diff: Use LogInfo/LogDebug over LogPrintf/LogPrint #29641 (Draft)

furszy reviewed May 20, 2024

maflcko added this to the 28.0 milestone May 21, 2024

furszy reviewed May 23, 2024 (test/functional/feature_reindex.py)

maflcko reviewed May 23, 2024 (test/functional/feature_reindex.py)

TheCharlatan force-pushed the preserveIndexOnRestart branch from dd290b3 to 891784c May 24, 2024 07:37

maflcko reviewed May 24, 2024 (test/functional/test_framework/test_node.py, test/functional/feature_reindex.py)

DrahtBot added the CI failed label May 24, 2024

TheCharlatan force-pushed the preserveIndexOnRestart branch from 891784c to eeea081 May 24, 2024 12:46

DrahtBot removed the CI failed label May 24, 2024

furszy reviewed May 30, 2024

DrahtBot requested a review from stickies-v May 30, 2024 14:59

theStack approved these changes Jun 6, 2024 (test/functional/feature_reindex.py)

TheCharlatan added 2 commits June 7, 2024 13:06

TheCharlatan force-pushed the preserveIndexOnRestart branch from eeea081 to 682f1f1 June 7, 2024 11:59

stickies-v reviewed Jun 7, 2024 (src/test/util/setup_common.cpp, src/node/blockstorage.h, src/bitcoin-chainstate.cpp, src/init.cpp)

ryanofsky and others added 4 commits June 7, 2024 19:17

test: Add functional test for continuing a reindex (1b1c6dc)

Co-authored-by: furszy <matiasfurszyfer@protonmail.com>

TheCharlatan force-pushed the preserveIndexOnRestart branch from 682f1f1 to f68cba2 June 7, 2024 17:38

stickies-v approved these changes Jun 7, 2024

DrahtBot requested review from furszy, ryanofsky and theStack June 7, 2024 17:46

This was referenced Jun 7, 2024:

refactor: Improve assumeutxo state representation #30214 (Open)
locks: introduce mutex for tx download, flush rejection filters on UpdatedBlockTip #30111 (Open)
refactor: TxDownloadManager #30110 (Draft)
Bugfix: Correct first-run free space checks #29678 (Open)

fjahr reviewed Jun 8, 2024

furszy reviewed Jun 9, 2024

ryanofsky approved these changes Jun 10, 2024

ryanofsky merged commit b1ba1b1 into bitcoin:master Jun 10, 2024
16 checks passed