Stratum v2 Template Provider (take 3) #29432

Draft: wants to merge 33 commits into base: master

@Sjors Sjors commented Feb 14, 2024

Based on @Fi3's master...Fi3:bitcoin:PatchTemplates, which is based on @ccdle12's #27854. I rebased it and rewrote the commit history. Compared to #28983 it introduces EllSwift in the handshake and fixes various bugs. I used that opportunity to change the branch name, which makes testing against SRI slightly easier. There's no conceptual discussion on #28983, so it can be ignored by reviewers.

See docs/stratum-v2.md for a brief description of Stratum v2 and the role of Bitcoin Core in that system.

What to test and review?

I'll make separate pull requests for parts that are ready for detailed review.

See the testing guide for various ways to test this PR. This branch is actively used by (testnet) pools, so it should be ready for high level review.

Related useful PRs

Related useful issues

Implementation notes

There are roughly three layers:

  1. Noise encryption (Stratum v2 Noise Protocol #29346)
  2. Messages and transport layer
  3. The Template Provider
  • the ci: commits (Support self-hosted Cirrus workers on forks #29274) are there to facilitate PRs against this branch, but they are not blocking for Stratum v2
  • the commits that move transport.h and some other stuff from node to common are not blocking. But in the longer run I'd like to see process separation between the node and the template provider.
  • I will occasionally add commits to undo bug fixes, in order to stay compatible with the SRI main branch. Those will get dropped over time and can be ignored.

Contributing

If you want to help out with any of the issues below, please open a PR to my fork. I will then squash your commits into my own where needed.

Things left to do

Spec

  • modify spec to use ProvideMissingTransactions? (followup?)
  • pick a good default for default_coinbase_tx_additional_output_size (see getblocktemplate RPC)

Networking

  • add -sv2bind and -sv2allowip
  • optional -sv2cert
  • drop Sv2TemplateProvider::SendBuf, reuse p2p socket handling if possible
  • limit number of connected clients
  • maybe limit (number of) coinbase_output_max_additional_size
  • TMP / TODO comments at the top of sv2_messages.h

Testing

  • expand sv2_template_provider_tests
  • add transport fuzzer
  • add template provider fuzzer

Template generation and updating

  • group templates with the same coinbase_tx_additional_output_size
  • don't generate templates when no client is connected
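
As a rough illustration of the two items above, the sketch below groups connected clients by their requested coinbase_tx_additional_output_size, so that one template per distinct size would suffice and no template is needed when nobody is connected. The `Client` struct and `GroupClientsByOutputSize` are hypothetical names invented for this sketch. They are not taken from the branch.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical client record, for illustration only.
struct Client {
    int id;
    uint32_t coinbase_tx_additional_output_size;
};

// Map each distinct additional output size to the ids of the clients that
// requested it. The number of keys is the number of templates to build;
// an empty client list yields an empty map, i.e. no templates at all.
std::map<uint32_t, std::vector<int>> GroupClientsByOutputSize(const std::vector<Client>& clients)
{
    std::map<uint32_t, std::vector<int>> groups;
    for (const Client& client : clients) {
        groups[client.coinbase_tx_additional_output_size].push_back(client.id);
    }
    return groups;
}
```

With three clients asking for 100, 32 and 100 additional bytes, this yields two groups, so only two templates would have to be assembled per update.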

Misc

Potential followups

  • implement Noise protocol and mock client in Python, add functional tests (based on test/sv2_template_provider_tests.cpp)
  • use process separation, e.g. a bitcoin-tp binary, see multiprocess.md
  • make template updates push based, on top of Cluster Mempool, see docs/stratum-v2.md (for new blocks it's already push based)
  • push empty template for the next block (downstream can ignore or use it; see "Implement a clever way to create and manage future jobs", stratum-mining/stratum#715)
    • send prevhash for this template as soon as any new block arrives
  • push optimistic template for the next block
    • send prevhash if and only if our template won (i.e. we got a SubmitSolution message)

DrahtBot commented Feb 14, 2024

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Code Coverage

For detailed information about the code coverage, see the test coverage report.

Reviews

See the guideline for information on the review process.
A summary of reviews will appear here.

Conflicts

Reviewers, this pull request conflicts with the following ones:

  • #30203 (Enhance signet chain configuration in bitcoin.conf by BrandonOdiwuor)
  • #30200 (Introduce Mining interface by Sjors)
  • #30141 (kernel: De-globalize validation caches by TheCharlatan)
  • #30130 (contrib/signet/miner: increase miner search space by edilmedeiros)
  • #30051 (crypto, refactor: add new KeyPair class by josibake)
  • #29876 (build: add -Wundef by fanquake)
  • #29838 (Feature: Use different datadirs for different signets by BrandonOdiwuor)
  • #29775 (Testnet4 including PoW difficulty adjustment fix by fjahr)
  • #29686 (Update manpage descriptions by willcl-ark)
  • #29415 (Broadcast own transactions only via short-lived Tor or I2P connections by vasild)
  • #29015 (kernel: Streamline util library by ryanofsky)
  • #28843 ([refactor] Remove BlockAssembler m_mempool member by TheCharlatan)
  • #28710 (Remove the legacy wallet and BDB dependency by achow101)
  • #28417 (contrib/signet/miner updates by ajtowns)
  • #26697 (logging: use bitset for categories by LarryRuane)
  • #10102 (Multiprocess bitcoin by ryanofsky)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

@DrahtBot

🚧 At least one of the CI tasks failed. Make sure to run all tests locally, according to the
documentation.

Possibly this is due to a silent merge conflict (the changes in this pull request being
incompatible with the current code in the target branch). If so, make sure to rebase on the latest
commit of the target branch.

Leave a comment here, if you need help tracking down a confusing failure.

Debug: https://github.com/bitcoin/bitcoin/runs/21562393655

{
bool started = m_tp->Start(Sv2TemplateProviderOptions { .port = 18447 });
if (! started) return false;
// Avoid "Connection refused" on CI:

Member Author

The template provider tests are quite brittle because they use a real socket.

Member Author

For the time being I just added handling for MSG_MORE (on e.g. macOS sequential messages are sent separately while on Linux they're combined). I also made the timeouts a bit longer.

Hopefully that does the trick. This can be revisited closer to the time when the Template Provider is ready for its own PR.
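
To illustrate why a test must not depend on how the OS splits or combines sequential messages, here is a minimal sketch of a receiver that buffers incoming bytes and only hands out complete frames. It assumes a made-up framing (2-byte little-endian length prefix plus payload), which is not the Stratum v2 wire format. `FrameReader` and `MakeFrame` are hypothetical names and do not exist in the branch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical framing for illustration only: a 2-byte little-endian length
// prefix followed by the payload. This is NOT the Stratum v2 wire format.
class FrameReader
{
public:
    // Append the bytes from one read and return every message that is now
    // complete. Incomplete trailing bytes stay buffered for the next call.
    std::vector<std::vector<uint8_t>> Feed(const std::vector<uint8_t>& chunk)
    {
        m_buffer.insert(m_buffer.end(), chunk.begin(), chunk.end());
        std::vector<std::vector<uint8_t>> messages;
        size_t pos{0};
        while (m_buffer.size() - pos >= 2) {
            const size_t len = m_buffer[pos] | (static_cast<size_t>(m_buffer[pos + 1]) << 8);
            if (m_buffer.size() - pos - 2 < len) break; // payload not fully received yet
            messages.emplace_back(m_buffer.begin() + pos + 2, m_buffer.begin() + pos + 2 + len);
            pos += 2 + len;
        }
        // Drop the bytes that were consumed by complete frames.
        m_buffer.erase(m_buffer.begin(), m_buffer.begin() + pos);
        return messages;
    }

private:
    std::vector<uint8_t> m_buffer;
};

// Build one frame with the hypothetical length prefix.
// Payloads longer than 65535 bytes are not supported by this sketch.
inline std::vector<uint8_t> MakeFrame(const std::vector<uint8_t>& payload)
{
    std::vector<uint8_t> frame{static_cast<uint8_t>(payload.size() & 0xff),
                               static_cast<uint8_t>((payload.size() >> 8) & 0xff)};
    frame.insert(frame.end(), payload.begin(), payload.end());
    return frame;
}
```

Feeding two frames in one call or spreading the same bytes over several calls produces the same two messages, so a test written against such a reader behaves the same whether the platform combines the writes or not.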

Member Author

Working on a fix in Sjors#34

Member Author

I should probably look into using StaticContentsSock

Member Author

@vasild any thoughts on how to make mock Socks that can be used to play messages in two directions?

Contributor

Yes! See the first two commits in #26812:

bee6bdf test: put the generic parts from StaticContentsSock into a separate class
f42e4f3 test: add a mocked Sock that allows inspecting what has been Send() to it

and then how to use that in the last commit of the same PR:

8b10990 test: add unit tests exercising full call chain of CConnman and PeerManager

With those it is possible to send/receive raw bytes to/from the (mocked) socket, or NetMsgs, e.g.:

pipes->recv.PushNetMsg(NetMsgType::GETBLOCKS, block_locator, hash_stop);
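
The general idea of a two-directional mock can be sketched without any Bitcoin Core types. The classes below (`BytePipe`, `MockPipes`, `MockSock`) are hypothetical stand-ins written for this illustration. They are not the classes from #26812 and do not implement the real `Sock` interface. The `recv` and `send` member names only mirror the `pipes->recv` naming in the snippet above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <vector>

// One direction of traffic: a FIFO of raw bytes.
class BytePipe
{
public:
    void Push(const std::vector<uint8_t>& bytes)
    {
        m_data.insert(m_data.end(), bytes.begin(), bytes.end());
    }

    // Remove and return up to max_len bytes, similar to a non-blocking read.
    std::vector<uint8_t> Pop(size_t max_len)
    {
        const size_t n = std::min(max_len, m_data.size());
        std::vector<uint8_t> out(m_data.begin(), m_data.begin() + n);
        m_data.erase(m_data.begin(), m_data.begin() + n);
        return out;
    }

private:
    std::deque<uint8_t> m_data;
};

// The test holds both pipes: it fills `recv` with what the code under test
// should read, and inspects `send` to see what the code under test wrote.
struct MockPipes {
    BytePipe recv;
    BytePipe send;
};

// Hypothetical socket stand-in handed to the code under test.
class MockSock
{
public:
    explicit MockSock(MockPipes& pipes) : m_pipes(pipes) {}

    size_t Send(const std::vector<uint8_t>& bytes)
    {
        m_pipes.send.Push(bytes);
        return bytes.size();
    }

    std::vector<uint8_t> Recv(size_t max_len) { return m_pipes.recv.Pop(max_len); }

private:
    MockPipes& m_pipes;
};
```

A test would push a scripted client message into `pipes.recv`, let the code under test run against the `MockSock`, and then pop the reply from `pipes.send`, with no real port or timing involved.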

ss << TX_WITH_WITNESS(tx);
tx_size = ss.size();
}

Member Author

@Sjors Sjors Feb 14, 2024

TSAN is tripping up somewhere around here. The last thing it logs is - Connect 2 transactions:. It doesn't get to - Verify ... txins:. I wonder if this is related to mock time, which I'm testing in Sjors#34

Member Author

@maflcko shouldn't TSan on CI output something useful about why it crashed? It currently only says "error 2": https://cirrus-ci.com/task/5124733717446656?logs=ci#L3531

When running this locally on Ubuntu with clang 16.0.6 I get a WARNING: ThreadSanitizer: data race and significantly more details (still a bit cryptic, but hopefully enough to figure out what's happening).

Member

Ah, I guess the unit tests don't capture the tsan output?

Member

But they should. At least back when I tested #27667

Member

Maybe good to re-check this when/after the cmake migration is done?

Sjors commented Feb 15, 2024

Added m_tp_mutex to Sv2TemplateProvider.

Sjors commented Feb 15, 2024

Bumping macOS to 14 on the CI does not help (tried in Sjors#35). I also can't reproduce this failure on my own Intel macOS machines, not on 13.6.4 and not on 14.2.1. A Sock mock is probably the most robust solution, but it'd be nice to find another workaround.

This extra delay seems to do the trick for now: Sjors@c8d10af

Another option to consider is using the functional test framework instead, since these are not really unit tests. However that involves implementing the sv2 noise protocol in Python and a bunch of other work to export transport functions to the functional test framework. If anyone feels up to that challenge, let me know...

Sjors commented Jun 7, 2024

Updated to use the interface proposed in #30200. This also fixes a small bug: -blockmintxfee and -blockmaxweight are no longer ignored. When the latter argument is unset, by default blocks have a 4000 byte safety margin, just like with getblocktemplate.
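
As a back-of-the-envelope sketch of the arithmetic involved: the default leaves a margin of 4000 below the consensus maximum, and in Stratum v2 the space the pool reserves for its own coinbase outputs has to come out of that budget as well. The helper `TemplateMaxWeight` is a hypothetical name, and two things here are assumptions rather than something this thread confirms: that the margin is counted in weight units, and that each reserved byte is non-witness data costing 4 weight units.

```cpp
#include <cassert>
#include <cstdint>

// Consensus block weight limit and witness scale factor.
constexpr uint32_t MAX_BLOCK_WEIGHT{4'000'000};
constexpr uint32_t WITNESS_SCALE_FACTOR{4};
// Default when -blockmaxweight is unset: a margin of 4000 below the limit
// (assumed here to be weight units).
constexpr uint32_t DEFAULT_BLOCK_MAX_WEIGHT{MAX_BLOCK_WEIGHT - 4000};

// Hypothetical helper: weight left for mempool transactions after reserving
// room for the pool's additional coinbase outputs. Assumes the reserved bytes
// are non-witness data, so each byte costs WITNESS_SCALE_FACTOR weight units.
constexpr uint32_t TemplateMaxWeight(uint32_t block_max_weight, uint32_t coinbase_additional_bytes)
{
    const uint64_t reserved = uint64_t{coinbase_additional_bytes} * WITNESS_SCALE_FACTOR;
    // Clamp at zero instead of wrapping around on underflow.
    return reserved >= block_max_weight ? 0 : block_max_weight - static_cast<uint32_t>(reserved);
}
```

Under these assumptions, a pool that reserves 100 additional bytes on top of the default would leave 3,996,000 - 400 = 3,995,600 weight units for the rest of the block.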

Sjors and others added 13 commits June 7, 2024 17:06
This makes the options argument mandatory for BlockAssembler constructor,
dropping implicit handling of ArgsManager. The caller i.e. the Mining
interface implementation now handles this.

In Stratum v2 the pool communicates how many extra bytes it needs for
its own outputs (payouts, extra commitments, etc). This needs to be
subtracted from what the user set as -blockmaxweight.

To achieve that the caller would have to pass in an options object,
and not forget to also process -blockmintxfee.
Set tip at the start of the function and only update it for a long poll.
Co-Authored-By: Christopher Coverdale <chris.coverdale24@gmail.com>
The template provider will listen for a Job Declarator client.
It can establish a connection and detect various protocol errors.

Co-Authored-By: Christopher Coverdale <chris.coverdale24@gmail.com>
Co-Authored-By: Fi3
Co-Authored-By: Christopher Coverdale <chris.coverdale24@gmail.com>