[WIP] Cluster mempool implementation #28676

Draft
wants to merge 90 commits into
base: master

Commits on Jun 9, 2024

  1. util: add BitSet

    This adds a bitset module that implements a BitSet<N> class, a variant
    of std::bitset with a few additional features that cannot be implemented
    in a wrapper without performance loss (specifically, finding first and
    last bit set, or iterating over all set bits).
    sipa committed Jun 9, 2024
    ab98ce5
  2. 14ad38f
  3. clusterlin: introduce cluster_linearize.h with Cluster and DepGraph types

    This primarily adds the DepGraph class, which encapsulates precomputed
    ancestor/descendant information for a given transaction cluster, with a
    number of utility features (inspectors for set feerates, computing
    reduced parents/children, adding transactions, adding dependencies), which
    will become needed in future commits.
    sipa committed Jun 9, 2024
    ddd5514
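A minimal sketch of what "precomputed ancestor/descendant information" could look like, assuming clusters of at most 32 transactions stored as bitmasks. `ToyDepGraph` and its members are hypothetical names chosen for illustration; the real DepGraph class in cluster_linearize.h may be organized differently.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical toy structure, not the PR's DepGraph. Every transaction keeps
// a bitmask of all its ancestors and all its descendants (each including the
// transaction itself), kept up to date as dependencies are added.
struct ToyDepGraph {
    std::vector<uint32_t> ancestors;
    std::vector<uint32_t> descendants;

    // Add a transaction with no dependencies and return its index.
    unsigned AddTransaction()
    {
        assert(ancestors.size() < 32);
        const unsigned idx = static_cast<unsigned>(ancestors.size());
        ancestors.push_back(uint32_t{1} << idx);
        descendants.push_back(uint32_t{1} << idx);
        return idx;
    }

    // Record that `child` depends on `parent`, updating the transitive sets.
    void AddDependency(unsigned parent, unsigned child)
    {
        const uint32_t anc = ancestors[parent];   // everything above parent
        const uint32_t desc = descendants[child]; // everything below child
        for (std::size_t i = 0; i < ancestors.size(); ++i) {
            if ((desc >> i) & 1) ancestors[i] |= anc;
            if ((anc >> i) & 1) descendants[i] |= desc;
        }
    }
};
```

Keeping the transitive sets current on every insertion makes later ancestor and descendant queries a single lookup.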
  4. tests: Fuzzing framework for DepGraph class

    This introduces a bespoke fuzzing-focused serialization format for DepGraphs,
    tests that this format can represent any graph and roundtrips, and then
    uses it to test the correctness of DepGraph itself.
    
    This forms the basis for future fuzz tests that need to work with interesting
    graphs.
    sipa committed Jun 9, 2024
    648856f

Commits on Jun 10, 2024

  1. clusterlin: add AncestorCandidateFinder class

    This is a class that encapsulates precomputed ancestor set feerates, and
    presents an interface for getting the best remaining ancestor set.
    sipa committed Jun 10, 2024
    4bf08c0
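The idea of "getting the best remaining ancestor set" can be illustrated as follows. This sketch recomputes each ancestor set's fee and size on every call instead of caching them, so it is not how the class described above works internally; `Tx`, `Candidate` and `BestAncestorSet` are hypothetical names, and clusters are assumed to have at most 32 transactions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical toy types for illustration only.
struct Tx {
    int64_t fee;
    int32_t size;
    uint32_t ancestors; // bitmask of ancestors, including the transaction itself
};

struct Candidate {
    uint32_t set{0};
    int64_t fee{0};
    int32_t size{0};
};

// Among the transactions not yet in `done`, return the remaining ancestor
// set with the highest feerate. Returns an empty candidate if nothing is left.
Candidate BestAncestorSet(const std::vector<Tx>& txs, uint32_t done)
{
    assert(txs.size() <= 32);
    Candidate best;
    for (std::size_t i = 0; i < txs.size(); ++i) {
        if ((done >> i) & 1) continue;
        Candidate cur;
        cur.set = txs[i].ancestors & ~done;
        for (std::size_t j = 0; j < txs.size(); ++j) {
            if ((cur.set >> j) & 1) {
                cur.fee += txs[j].fee;
                cur.size += txs[j].size;
            }
        }
        // Compare feerates by cross-multiplication to avoid division.
        if (best.set == 0 || cur.fee * best.size > best.fee * cur.size) best = cur;
    }
    return best;
}
```

Calling this repeatedly, adding each returned set to `done`, yields an ancestor-set-based ordering of the cluster.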
  2. clusterlin: add SearchCandidateFinder class

    Similar to AncestorCandidateFinder, this encapsulates the state needed for
    finding good candidate sets using a search algorithm.
    sipa committed Jun 10, 2024
    5f747f3
  3. clusterlin: add Linearize function

    This adds a first version of the overall linearization interface, which given
    a DepGraph constructs a good linearization, by incrementally including good
    candidate sets (found using AncestorCandidateFinder and SearchCandidateFinder).
    sipa committed Jun 10, 2024
    2af60f8
  4. bench: Candidate finding and linearization benchmarks

    Add benchmarks for known bad graphs for the purpose of search (as
    an upper bound on work per search iteration) and ancestor sorting
    (as an upper bound on linearization work with no search iterations).
    sipa committed Jun 10, 2024
    02d4284
  5. clusterlin: add algorithms for connectedness/connected components

    Add utility functions to DepGraph for finding connected components.
    sipa committed Jun 10, 2024
    0229297
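Finding a connected component within the not-yet-linearized part of a cluster could look roughly like this. The sketch assumes a hypothetical `neighbors` table (direct parents and children of each transaction as a bitmask) and at most 32 transactions; `FindComponent` is an illustrative name, not a DepGraph member.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical sketch: neighbors[i] is a bitmask of the direct parents and
// children of transaction i. Returns the connected component that contains
// `start`, restricted to the transactions in `todo`.
uint32_t FindComponent(const std::vector<uint32_t>& neighbors, uint32_t todo, unsigned start)
{
    assert((todo >> start) & 1); // start must be part of todo
    uint32_t component = uint32_t{1} << start;
    uint32_t frontier = component;
    while (frontier != 0) {
        uint32_t next = 0;
        for (std::size_t i = 0; i < neighbors.size(); ++i) {
            if ((frontier >> i) & 1) next |= neighbors[i];
        }
        // Keep only newly reached transactions that are still in todo.
        frontier = next & todo & ~component;
        component |= frontier;
    }
    return component;
}
```

Restricting the walk to `todo` matters because removing already-linearized transactions can split what used to be one connected cluster into several components.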
  6. clusterlin: separate initial search entries per component (optimization)

    Before this commit, the worst case for linearization involves clusters which
    break apart into several smaller components after the first candidate is
    included in the output linearization.
    
    Address this by never considering work items that span multiple components
    of what remains of the cluster.
    sipa committed Jun 10, 2024
    db13f8e
  7. clusterlin: use bounded BFS exploration (optimization)

    Switch to BFS exploration of the search tree in SearchCandidateFinder
    instead of DFS exploration. This appears to behave better for real
    world clusters.
    
    As BFS has the downside of needing far larger search queues, switch
    back to DFS temporarily when the queue grows too large.
    sipa committed Jun 10, 2024
    1618822
  8. clusterlin: randomize the SearchCandidateFinder search order

    To make search non-deterministic, change the BFS logic from always picking
    the first queue item to randomly picking the first or second queue item.
    sipa committed Jun 10, 2024
    e0e7a67
  9. clusterlin: permit passing in existing linearization to Linearize

    This implements the LIMO algorithm for linearizing by improving an existing
    linearization. See
    https://delvingbitcoin.org/t/limo-combining-the-best-parts-of-linearization-search-and-merging
    for details.
    sipa committed Jun 10, 2024
    f25eef9
  10. clusterlin: use feerate-sorted depgraph in SearchCandidateFinder

    This is a requirement for a future commit, which will rely on quickly iterating
    over transaction sets in decreasing individual feerate order.
    sipa committed Jun 10, 2024
    4ee9156
  11. clusterlin: track upper bound potential set for work items (optimization)

    In each work item, keep track of a conservative overestimate of the best
    possible feerate that can be reached from it, and then use these to avoid
    exploring hopeless work items.
    sipa committed Jun 10, 2024
    02c3bd9
  12. clusterlin: reduce computation of unnecessary pot sets (optimization)

    Keep track of which transactions in the graph have an individual
    feerate that is better than the best included set so far. Others do not
    need to be added to the pot set, as they cannot possibly help beating
    best.
    sipa committed Jun 10, 2024
    829355d
  13. clusterlin: include topological pot subsets automatically (optimization)

    Automatically add topologically-valid subsets of the potential set pot
    to inc. It can be proven that these must be part of the best reachable
    topologically-valid set from that work item.
    sipa committed Jun 10, 2024
    c411a7a
  14. clusterlin: improve heuristic to decide split transaction (optimization)

    Empirically, this approach seems to be more efficient in common real-life
    clusters, and does not change the worst case.
    sipa committed Jun 10, 2024
    f74ba48
  15. clusterlin: avoid recomputing potential set on every split (optimization)

    Cache the potential set inside work items, and use it to skip part of
    the computation of split-off work items from it.
    sipa committed Jun 10, 2024
    cd84a26
  16. 694a103
  17. 0676101
  18. a7829ba
  19. Add txgraph module

    sdaftuar committed Jun 10, 2024
    3bffb34
  20. add fuzz test for txgraph

    sdaftuar committed Jun 10, 2024
    91c0b68
  21. 071d131
  22. 1896afb
  23. Limit mempool size based on chunk feerate

    Rather than evicting the transactions with the lowest descendant feerate,
    instead evict transactions that have the lowest chunk feerate.
    
    Once mining is implemented based on choosing transactions with highest chunk
    feerate (see next commit), mining and eviction will be opposites, so that we
    will evict the transactions that would be mined last.
    sdaftuar committed Jun 10, 2024
    2e07a66
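The commit message does not define chunks. As a rough illustration, a linearization is commonly split into chunks by merging each transaction into the preceding group whenever it has a higher feerate than that group, which leaves groups in non-increasing feerate order. The sketch below assumes that definition; `FeeSize`, `HigherFeerate` and `Chunkify` are hypothetical names and not functions from this PR.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical fee/size pair for a transaction or a chunk.
struct FeeSize {
    int64_t fee;
    int64_t size;
};

// True if a has a strictly higher feerate than b (cross-multiplied).
bool HigherFeerate(const FeeSize& a, const FeeSize& b)
{
    return a.fee * b.size > b.fee * a.size;
}

// Split a topologically valid linearization into chunks. The resulting
// chunk feerates are non-increasing from front to back.
std::vector<FeeSize> Chunkify(const std::vector<FeeSize>& linearization)
{
    std::vector<FeeSize> chunks;
    for (const FeeSize& tx : linearization) {
        assert(tx.size > 0);
        FeeSize cur = tx;
        // Absorb preceding chunks while the new group pays a higher feerate.
        while (!chunks.empty() && HigherFeerate(cur, chunks.back())) {
            cur.fee += chunks.back().fee;
            cur.size += chunks.back().size;
            chunks.pop_back();
        }
        chunks.push_back(cur);
    }
    return chunks;
}
```

Under this definition, mining would take chunks from the front and eviction would take them from the back, which matches the "opposites" relationship described in the commit message.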
  24. a74a05f
  25. a56472a
  26. 195fa4b
  27. policy: Remove CPFP carveout rule

    The addition of a cluster size limit makes the CPFP carveout rule useless,
    because carveout cannot be used to bypass the cluster size limit. Remove this
    policy rule and update tests to no longer rely on the behavior.
    sdaftuar committed Jun 10, 2024
    a862b0b
  28. Implement new RBF logic for cluster mempool

    With a total ordering on mempool transactions, we are now able to calculate a
    transaction's mining score at all times. Use this to improve the RBF logic:
    
    - we no longer enforce a "no new unconfirmed parents" rule
    
    - we now require that the mempool's feerate diagram must improve in order
      to accept a replacement
    
    TODO: update functional test feature_rbf.py to cover all our new scenarios.
    sdaftuar committed Jun 10, 2024
    9e0938c
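One way to read "the feerate diagram must improve" is sketched below, under several assumptions that are not taken from the commit: a diagram is the list of cumulative (size, fee) points obtained from chunks, a diagram stays flat past its last point, and "improve" means never lower and somewhere strictly higher. The sketch uses floating-point interpolation for brevity; `Point`, `FeeAt` and `DiagramImproves` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical cumulative point on a feerate diagram.
struct Point {
    int64_t size; // cumulative size up to and including this chunk
    int64_t fee;  // cumulative fee up to and including this chunk
};

// Fee of a diagram at cumulative size x, interpolating linearly between
// points and staying flat after the last point (an assumption).
double FeeAt(const std::vector<Point>& diagram, int64_t x)
{
    Point prev{0, 0};
    for (const Point& p : diagram) {
        assert(p.size > prev.size); // chunks must have positive size
        if (x <= p.size) {
            return prev.fee + double(p.fee - prev.fee) * double(x - prev.size) / double(p.size - prev.size);
        }
        prev = p;
    }
    return double(prev.fee);
}

// True if `after` is never below `before` and is strictly above it at some
// point. Both diagrams are piecewise linear, so checking the breakpoints of
// either diagram is sufficient.
bool DiagramImproves(const std::vector<Point>& before, const std::vector<Point>& after)
{
    bool strictly_better = false;
    for (const std::vector<Point>* diagram : {&before, &after}) {
        for (const Point& p : *diagram) {
            const double b = FeeAt(before, p.size);
            const double a = FeeAt(after, p.size);
            if (a < b) return false;
            if (a > b) strictly_better = true;
        }
    }
    return strictly_better;
}
```

The last case in a check such as "higher total fee but lower fee at the front" is rejected by this comparison, which is the property that a plain total-fee rule would miss.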
  29. a20f38a
  30. 0811c9a
  31. 6e8058a
  32. Use cluster linearization for transaction relay sort order

    Previously, transaction batches were first sorted by ancestor count and then
    feerate, to ensure transactions are announced in a topologically valid order,
    while prioritizing higher feerate transactions. Ancestor count is a crude
    topological sort criterion, so replace this with linearization order so that the
    highest feerate transactions (as would be observed by the mining algorithm) are
    relayed before lower feerate ones, in a topologically valid way.
    
    This also fixes a test that only worked due to the ancestor-count-based sort
    order.
    sdaftuar committed Jun 10, 2024
    15ec906
  33. Remove CTxMemPool::GetSortedDepthAndScore

    The mempool clusters and linearization permit sorting the mempool topologically
    without making use of ancestor counts.
    sdaftuar committed Jun 10, 2024
    8e047b1
  34. Reimplement GetTransactionAncestry() to not rely on cached data

    In preparation for removing ancestor data from CTxMemPoolEntry, recalculate the
    ancestor statistics on demand wherever needed.
    sdaftuar committed Jun 10, 2024
    82af314
  35. 986406b
  36. 657534f
  37. Stop enforcing ancestor size/count limits

    The cluster limits should be sufficient.
    sdaftuar committed Jun 10, 2024
    513dc9b
  38. 0c9120d
  39. Use mempool/txgraph to determine if a tx has descendants

    Remove a reference to GetCountWithDescendants() in preparation for removing
    this function and the associated cached state from the mempool.
    sdaftuar committed Jun 10, 2024
    aae45f4
  40. Calculate descendant information for mempool RPC output on-the-fly

    This is in preparation for removing the cached descendant state from the
    mempool.
    sdaftuar committed Jun 10, 2024
    21bfbb0
  41. test: fix rbf carveout test in mempool_limit.py

    Minimal fix to the test that the RBF carveout doesn't apply in certain package
    validation cases. Now that RBF carveout doesn't exist, we can just test that
    the cluster count limit is respected (in preparation for removing the
    descendant limit altogether).
    sdaftuar committed Jun 10, 2024
    308dec4
  42. Stop enforcing descendant size/count limits

    Cluster size limits should be enough.
    sdaftuar committed Jun 10, 2024
    67586ce
  43. wallet: Replace max descendantsize with clustersize

    With the descendant size limits removed, replace the concept of "max number of
    descendants of any ancestor of a given tx" with the cluster count of the cluster
    that the transaction belongs to.
    sdaftuar committed Jun 10, 2024
    e311049
  44. 751c373
  45. Eliminate RBF workaround for CPFP carveout transactions

    The new cluster mempool RBF rules take into account cluster sizes exactly, so
    with the removal of descendant count enforcement this idea is obsolete.
    sdaftuar committed Jun 10, 2024
    e6c76d2
  46. be678ac
  47. 3d4ce41
  48. 98e9b89
  49. 9e2ea2f
  50. 93c7d01
  51. 7a1249f
  52. Make removeConflicts private

    sdaftuar committed Jun 10, 2024
    26bc551
  53. eb72abd
  54. 1525d0d
  55. Rework removeForBlock so that clusters are only touched once

    Also remove extra linearization that was happening and some logging
    
    Update interface_zmq.py for new block connection behavior
    sdaftuar committed Jun 10, 2024
    5c68be9
  56. Simplify ancestor calculation functions

    Now that ancestor calculation never fails (due to ancestor/descendant limits
    being eliminated), we can eliminate the error handling from
    CalculateMemPoolAncestors.
    
    interface_zmq test is broken
    sdaftuar committed Jun 10, 2024
    9eda1ea
  57. d743d9c
  58. 2472b4d
  59. 96e3cc0
  60. 9a99439
  61. 2a4c468
  62. 9aea970
  63. 42a46bc
  64. afe8f8f
  65. Switch to using the faster CalculateDescendants

    The only place we still use the older interface is in policy/rbf.cpp, where
    it's helpful to incrementally calculate descendants to avoid calculating too
    many at once (or cluttering the CalculateDescendants interface with a
    calculation limit).
    sdaftuar committed Jun 10, 2024
    91d0758
  66. df2a8b8
  67. f2fdd37
  68. 2a443bb
  69. 6a430b7
  70. 57c69b3
  71. 9b23cc8
  72. 3daeb9c
  73. 76e4b33
  74. a8619b4
  75. Eliminate need for ancestors in PackageV3Checks

    TODO: Rewrite unit tests for PV3C to not lie about mempool parents, so that we
    can push down the parent calculation into v3_policy from validation.
    sdaftuar committed Jun 10, 2024
    cc463e9
  76. 6cb8d4a
  77. ==== END OPTIMIZATIONS ====

    sdaftuar committed Jun 10, 2024
    1e9f23c
  78. ==== BEGIN TESTS ====

    sdaftuar committed Jun 10, 2024
    654ae93
  79. bench: add more mempool benchmarks

    Add benchmarks for:
    
      - mempool update time when blocks are found
      - adding a transaction
      - performing the mempool's RBF calculation
      - calculating mempool ancestors/descendants
    sdaftuar committed Jun 10, 2024
    0edc5f4
  80. fuzz: try to add more code coverage for mempool fuzzing

    Including test coverage for mempool eviction and expiry
    sdaftuar committed Jun 10, 2024
    610945c
  81. bebcd5c
  82. f258137
  83. de4b948
  84. 435e2ec
  85. fuzz: remove comparison between mini_miner block construction and miner

    This is in preparation for eliminating the block template building happening in
    mini_miner, in favor of directly using the linearizations done in the mempool.
    sdaftuar committed Jun 10, 2024
    5de8783

Commits on Jun 11, 2024

  1. fixup! Add txgraph module

    sdaftuar committed Jun 11, 2024
    be7fb2a