
Render Graph Rewrite #13397

Open · wants to merge 132 commits into main
Conversation


@ecoskey ecoskey commented May 16, 2024

Objective

Bevy's current render graph is quite rigid: since it's a pure-ECS system, all the inputs and outputs are fixed. This makes it both difficult for third-party users to customize and difficult to maintain internally. ViewNode and similar abstractions relieve some of the pain of querying the World, but don't succeed in making the graph more dynamic.

Goals:

  • make custom render graphs possible, maintainable, and even easy
    • this will have the side effect of making it much easier to maintain separate bevy render graphs for mobile, pbr, etc.
  • total control for render graph authors
  • automatic resource management, without compromising control
  • modularity
  • reusability

Non-goals:

  • data-oriented or file-based config: while these can make sense for opinionated existing renderers like those that AAA studios develop in-house, they can't cover all or even most use cases without becoming very clumsy. Nothing about this PR would stop someone from building this on top, however.
  • beginner-friendly custom graphs: while this might even happen as a side effect of a good standard render graph library and the pass builders, users will always need to deal with wgpu if they need custom nodes.
  • "add custom pass from anywhere" API: the graph author must always have total control, so even though this style of API is useful for things such as post-processing, it should be in the form of a plugin and plain graph config functions that the graph author opts in to.

Solution

Credit to @JMS55 for guiding many of the design decisions for the new graph! I had more than a few iterations on the concept up to this point, most of them bad.

The main differences between the old and new graph are as follows:

  • while the old graph is static and created ahead of time, the new one is fully dynamic and rebuilt every frame.
  • The graph has no notion of a "view entity" and manages the rendering loop itself (THIS IS NOT FINAL AND IS UP FOR DEBATE, SEE THE SECTION BELOW)
  • while the old graph is configured indirectly from many different areas of code, the new system uses a single builder function, easily readable from a single file:
//run at startup
fn setup_render_graph(mut graph_setup: ResMut<RenderGraphSetup>) {
	graph_setup.set_render_graph(bevy_default_render_graph);
}

fn bevy_default_render_graph<'g>(graph: &mut RenderGraphBuilder<'_, 'g>) {
	let ui = default_ui_graph(graph);
	for view in graph.world_query::<Entity, With<ExtractedView>>() {
		bevy_default_render_graph_per_view(graph, view, ui);
	}
}

fn bevy_default_render_graph_per_view<'g>(graph: &mut RenderGraphBuilder<'_, 'g>, view: Entity, ui: RenderHandle<'g, Texture>) {
	//all graph operations live here, as plain functions
	let view_target = graph.new_resource(TextureDescriptor { ... });
	main_pass(view, view_target, ...);
	post_processing_pass(view, view_target, ...);
}

RenderHandle: graph resources as IDs

The render graph defers much of its resource creation until after graph configuration is done, so it provides an opaque handle to resources that may not yet exist in the form of RenderHandle. These are lifetimed to the current graph execution, so it's impossible to use them outside of their original context. See the migration guide for more detail about resource creation and the utilities the graph provides.

In addition, the graph stores metadata alongside each resource, which might be just their plain descriptors, but might also contain extra information such as which entries are writable in a bind group layout.
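As a rough illustration of why these handles can't outlive a graph execution, here is a minimal, self-contained sketch of lifetime branding in plain Rust. All names here are invented for illustration, not the PR's actual types: because the builder is only ever handed out through a closure that is generic over the lifetime 'g, a RenderHandle<'g> cannot appear in the closure's return type and so cannot be smuggled out.

```rust
use std::marker::PhantomData;

// Illustrative sketch only; not the PR's real types.
// The fn(&'g ()) -> &'g () marker makes 'g invariant, so handles from
// different graph executions can never be mixed up or coerced.
#[derive(Clone, Copy, PartialEq, Debug)]
pub struct RenderHandle<'g> {
    id: u32,
    _brand: PhantomData<fn(&'g ()) -> &'g ()>,
}

pub struct RenderGraphBuilder<'g> {
    next_id: u32,
    _brand: PhantomData<fn(&'g ()) -> &'g ()>,
}

impl<'g> RenderGraphBuilder<'g> {
    // Hands out opaque ids; the actual GPU resource is created later.
    pub fn new_resource(&mut self) -> RenderHandle<'g> {
        let id = self.next_id;
        self.next_id += 1;
        RenderHandle { id, _brand: PhantomData }
    }
}

// The builder is only reachable through a closure that must work for
// *every* 'g, so a RenderHandle<'g> cannot escape via the return type R.
pub fn run_graph<R>(f: impl for<'g> FnOnce(&mut RenderGraphBuilder<'g>) -> R) -> R {
    let mut builder = RenderGraphBuilder { next_id: 0, _brand: PhantomData };
    f(&mut builder)
}

fn main() {
    let ids = run_graph(|graph| {
        let a = graph.new_resource();
        let b = graph.new_resource();
        (a.id, b.id) // plain u32s may escape; the handles themselves may not
    });
    assert_eq!(ids, (0, 1));
}
```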

Render graph nodes

In the current render graph, a node is a simple unit of rendering work with defined inputs and outputs. The same applies here, except we also track which resources a node reads from and which it writes to.

In order to provide the simplest API, graph.add_node takes a plain closure (and a RenderDependencies, discussed in the migration guide) with some normal rendering necessities, as well as something called NodeContext. NodeContext::get() allows dereferencing a handle into a normal resource, and then you can do whatever you want!

let mut texture = graph.new_resource(TextureDescriptor {...});
graph.add_node(deps![&mut texture], |ctx, device, cmds| {
	let actual_tex: &Texture = ctx.get(texture);
	//then do anything you want!
});

From there, rendering features can mostly be reduced to plain functions that take an &mut RenderGraphBuilder as well as handles to whatever inputs they need!
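To make the node model concrete, here is a self-contained toy (all names invented, not the PR's API) of the core idea: nodes are recorded as closures alongside the resources they declared as reads and writes. The PR notes that automatic pass reordering is not implemented yet, so this sketch only shows why declared reads/writes are enough to determine a valid execution order, not what the PR actually does today.

```rust
use std::collections::HashSet;

// Toy model: resources are plain u32 ids, nodes are deferred closures
// plus the dependencies they declared, and "execution" appends to a log.
struct Node {
    reads: Vec<u32>,
    writes: Vec<u32>,
    run: Box<dyn FnOnce(&mut Vec<String>)>,
}

#[derive(Default)]
struct ToyGraph {
    nodes: Vec<Node>,
}

impl ToyGraph {
    fn add_node(
        &mut self,
        reads: Vec<u32>,
        writes: Vec<u32>,
        run: impl FnOnce(&mut Vec<String>) + 'static,
    ) {
        self.nodes.push(Node { reads, writes, run: Box::new(run) });
    }

    // Run nodes in dependency order: a node is ready once everything it
    // reads has already been written by some earlier node.
    fn execute(mut self, log: &mut Vec<String>) {
        let mut written: HashSet<u32> = HashSet::new();
        while !self.nodes.is_empty() {
            let idx = self
                .nodes
                .iter()
                .position(|n| n.reads.iter().all(|r| written.contains(r)))
                .expect("dependency cycle or missing writer");
            let node = self.nodes.remove(idx);
            written.extend(node.writes.iter().copied());
            (node.run)(log);
        }
    }
}

fn main() {
    let mut graph = ToyGraph::default();
    let tex = 0u32;
    // deliberately recorded in the "wrong" order
    graph.add_node(vec![tex], vec![], |log| log.push("read tex".into()));
    graph.add_node(vec![], vec![tex], |log| log.push("write tex".into()));
    let mut log = Vec::new();
    graph.execute(&mut log);
    assert_eq!(log, vec!["write tex".to_string(), "read tex".to_string()]);
}
```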

Debate: single entry-point/view-less vs. dynamic view queuing

Single entry point            | Multiple entry points
------------------------------|-------------------------------------
single builder callback       | one builder callback per view
simpler backend               | simpler end-user experience
how to effectively customize? | how to get outputs of other graphs?

Where the current single-entry-point system has no concept of "view entities" and manages the rendering loop itself, associating each graph with a view entity would allow better modularity and separation of concerns. It would also allow storing an EntityRef in the builder to avoid passing it around everywhere. The main issue is how to order these graphs and pass data between them (especially in the case of UI). The simplest possible API would look like let ui_tex = graph.depends_on::<RenderHandle<'g, Texture>>(entity), where entity is any entity with a graph attached (for example graph.world_query_filtered::<Entity, With<UiView>>().single()), and it would return a texture handle. However, this would likely require unsafe code and untyped pointers to manage.

In the interest of simplicity for this initial PR, Jasmine convinced me to stay with a single entry-point system, though I did want to show what the alternative would be if we put the extra effort in. If the maintainers/community decide it's worth it to have dynamic views immediately, I don't mind delaying the PR to add those features.

NOTE: this would not allow configuring a single graph from multiple places, merely configuring multiple separate graphs, each operating on its own "view." (We need a better word for this, since not all views are cameras or shadows.)

Debate: Bind Groups

Currently there are two ways we could proceed with handling bind groups. First, bind groups could be handled entirely behind the scenes by the graph, using caching to prevent duplication. This would look something like each node declaring a set of entries, the graph inferring their usage, and handing back a BindGroup as a callback parameter. This would be simpler, though it might make things more verbose, especially in the case of the view bind group.

The other way would involve making BindGroups a first-class RenderResource, which is what the crate currently implements. This does involve extra complexity when tracking node dependencies: when a user marks write access to a bind group handle, that write has to propagate to every bind group entry that could possibly be written to, such as storage textures and buffers. This might make declaring a bind group in the graph more confusing, which matters for a system whose goal is ease of use. However, I think abstracting over this with BindGroupBuilder, and perhaps a version of AsBindGroup for render graph handles, would solve most of these issues by inferring the right usage.
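The propagation problem can be illustrated with a tiny self-contained sketch (invented types, not the PR's code): marking write access on a bind group has to fan out to every writable entry it contains, so that later readers of those underlying resources are ordered correctly.

```rust
use std::collections::HashSet;

// Toy sketch: entries are u32 resource ids; `writable` records which
// entries (e.g. storage textures/buffers) a shader could write through.
struct ToyBindGroup {
    entries: Vec<u32>,
    writable: Vec<bool>,
}

// Declaring write access to the bind group must mark every writable
// entry as written; read-only entries (samplers, uniforms) are untouched.
fn mark_bind_group_write(bg: &ToyBindGroup, written: &mut HashSet<u32>) {
    for (id, writable) in bg.entries.iter().zip(&bg.writable) {
        if *writable {
            written.insert(*id);
        }
    }
}

fn main() {
    let bg = ToyBindGroup {
        entries: vec![10, 11, 12],          // sampled tex, storage tex, uniform
        writable: vec![false, true, false], // only the storage texture is writable
    };
    let mut written = HashSet::new();
    mark_bind_group_write(&bg, &mut written);
    assert!(written.contains(&11));
    assert!(!written.contains(&10) && !written.contains(&12));
}
```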

Current Limitations

  • storage is merely adequate for now (a collection of hashmaps). In a future PR I'd like to follow up with better data structures and ID allocation, since IDs are dense anyway.
  • no dynamic queuing of views (up for debate to delay for this feature)
  • no retained resources (graph.last_frame(texture); this requires more design work)
  • no texture/buffer reuse between frames
  • bind groups and texture views are only cached within a frame, not across frames
  • resource metadata storage needs a rework (DONE!)
  • graph resources aren't labelled yet, so debug messages aren't great
  • the graph currently panic!()s in a few places where it could gracefully stop rendering.
  • no diagnostics or timing info yet

Note: the items below all have the foundations already in place

  • no automatic pass reordering
  • no compute pass merging
  • no automatic pass culling
  • no parallel command encoding

To-do for this PR

  • Unit tests
  • Documentation
  • Render pass builders
  • Extract into new crate

I would greatly appreciate any help with docs and unit tests, it's a lot to cover! :)

Testing

This is intended to be merged as an experimental feature. The basic API should be in its final form, and essentially ready for production. Unit tests are in progress, as is better documentation, though large-scale testing will essentially have to happen as we proceed with the renderer refactor.

Migration Guide

Since actual migration will happen in the form of the big refactor™️ during the next milestone, this will consist of a usage/style guide. This might seem out of place for a PR, but for such a big new system I figure it would help maintainers figure out what they're looking at.

Lifetimes and types added for comprehension. These are generally inferred :)

How do I make a resource?

There are a few ways to create graph resources:

fn my_graph<'g>(graph: &mut RenderGraphBuilder<'_, 'g>, my_owned_buffer: Buffer) {

	//you can create resources from descriptors, or anything that implements IntoRenderResource
	let new_texture: RenderHandle<'g, Texture> = graph.new_resource(TextureDescriptor {...});

	//...or borrow them from the world
	let my_buffer: &'g Buffer = graph.world_resource::<Foo>().my_buffer();
	let imported_buffer: RenderHandle<'g, Buffer> = graph.as_resource(my_buffer);

	//...or move them into the graph (lesser used)
	let owned_buffer: RenderHandle<'g, Buffer> = graph.into_resource(my_owned_buffer);
}

Note: when borrowing or taking an already-made resource from the World, users must also supply metadata that matches that resource. This makes importing resources into the graph slightly more work, but it lets the graph infer much more about how each resource is used.

What is RenderDependencies, and why doesn't my node work?

RenderDependencies marks what resources your node reads from and writes to, in order to properly track ordering dependencies between nodes. This must be manually specified, since the only way to infer it would be to intercept all rendering calls (think Unity's SRP render graph) which I felt would be both too complicated and worse to use. If you try to get a resource from the node context which isn't declared in the dependencies, the graph will panic. It can't detect if you write to a resource declared as read-only or vice-versa, so that's up to you.

Note: the deps![] macro works like vec![] and infers usage based on using a mutable or immutable reference (see trait IntoRenderDependencies). This is the preferred way to create a RenderDependencies. A trait is used here to allow for wrappers around handles to be included in this list as well.

let my_bind_group = graph.new_resource(...);
let my_pipeline = graph.new_resource(...);
let mut my_color_attachment = graph.new_resource(...);

//wrong
graph.add_node(deps![&my_bind_group, &my_pipeline], |ctx, _, _, cmds| {
	let bind_group = ctx.get(my_bind_group);
	let pipeline = ctx.get(my_pipeline);
	let color_attachment = ctx.get(my_color_attachment); //panic! not declared in deps
	...
});

//right ("write", hehe)
graph.add_node(deps![&my_bind_group, &my_pipeline, &mut my_color_attachment], |ctx, _, _, cmds| {
	let bind_group = ctx.get(my_bind_group);
	let pipeline = ctx.get(my_pipeline);
	let color_attachment = ctx.get(my_color_attachment); //all good :)
	...
});

.add_usages()

Oh no! I have a function that creates a texture/buffer and gives it back to the user, but I don't know what usages to assign! Have no fear, citizen, for the render graph tracks this for you (sort of). For resources created from a descriptor, .add_usages() adds the specified usage flags to the descriptor, since the resource hasn't actually been created yet; you can trust that users later in the graph will call this based on their needs. If the resource is imported and already has an associated descriptor, the graph will instead panic if a needed usage isn't present.
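The two cases (still-deferred descriptor vs. already-imported resource) can be sketched with a self-contained toy. Names and flag values here are invented stand-ins, not the PR's or wgpu's actual types:

```rust
// Toy usage flags standing in for wgpu::TextureUsages.
const COPY_SRC: u32 = 1 << 0;
const RENDER_ATTACHMENT: u32 = 1 << 1;

// A graph texture is either still just a descriptor (usages can grow)
// or an imported, already-created texture (usages are fixed).
enum ToyTexture {
    Deferred { usages: u32 },
    Imported { usages: u32 },
}

fn add_usages(tex: &mut ToyTexture, needed: u32) {
    match tex {
        // not created yet: just OR the new flags into the descriptor
        ToyTexture::Deferred { usages } => *usages |= needed,
        // already created: can only check, not extend
        ToyTexture::Imported { usages } => {
            assert!(
                *usages & needed == needed,
                "imported texture is missing a required usage"
            );
        }
    }
}

fn main() {
    let mut tex = ToyTexture::Deferred { usages: COPY_SRC };
    add_usages(&mut tex, RENDER_ATTACHMENT);
    match tex {
        ToyTexture::Deferred { usages } => assert_eq!(usages, COPY_SRC | RENDER_ATTACHMENT),
        _ => unreachable!(),
    }

    // imported textures that already carry the usage pass the check
    let mut imported = ToyTexture::Imported { usages: RENDER_ATTACHMENT };
    add_usages(&mut imported, RENDER_ATTACHMENT);
}
```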

.is_fresh()

Use graph.is_fresh(resource_handle) to check whether a resource has not yet been written to in the current frame. This is most useful for deciding whether a render pass should clear its color attachment.
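A minimal sketch of the semantics (invented types, not the PR's code): a handle is "fresh" until some node declares write access to it, which is exactly the signal a pass can use to pick LoadOp::Clear over LoadOp::Load.

```rust
use std::collections::HashSet;

// Toy frame state: `written` holds every resource id that some node has
// declared write access to so far this frame.
#[derive(Default)]
struct ToyFrame {
    written: HashSet<u32>,
}

impl ToyFrame {
    fn mark_written(&mut self, id: u32) {
        self.written.insert(id);
    }
    // fresh == nothing has written to it yet this frame
    fn is_fresh(&self, id: u32) -> bool {
        !self.written.contains(&id)
    }
}

fn main() {
    let mut frame = ToyFrame::default();
    let target = 7u32;
    // the first pass sees a fresh target and would clear it
    assert!(frame.is_fresh(target));
    frame.mark_written(target);
    // later passes see it as non-fresh and would load instead
    assert!(!frame.is_fresh(target));
}
```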

.meta()

Use graph.meta() to get the metadata associated with a handle (for a bind group handle, this includes its layout). This is meant to reduce parameter bloat when, for example, an effect needs to produce a texture the same size as its input, or when creating a pipeline given only a handle to a bind group.

In-depth example: a full-screen render pass

See crate::std::fullscreen for the actual code

pub fn fullscreen_pass<'g>(
    graph: &mut RenderGraphBuilder<'_, 'g>,
    shader: Handle<Shader>,
    target: RenderHandle<'g, TextureView>,
    blend: Option<BlendState>,
    clear_color: Option<Color>, //used by the LoadOp logic below
    bind_groups: &[RenderHandle<'g, BindGroup>],
) {
    let format = texture_view_format(graph, target);
    let pipeline = graph.new_resource(RenderGraphRenderPipelineDescriptor {
        label: Some("fullscreen_pass_pipeline".into()),
        layout: bind_groups
            .iter()
            .map(|bind_group| graph.meta(*bind_group).descriptor.layout)
            .collect(),
        push_constant_ranges: Vec::new(),
        vertex: fullscreen_shader_vertex_state(graph),
        primitive: Default::default(),
        depth_stencil: Default::default(),
        multisample: Default::default(),
        fragment: Some(FragmentState {
            shader,
            shader_defs: Vec::new(),
            entry_point: "fullscreen_frag".into(),
            targets: vec![Some(ColorTargetState {
                format,
                blend,
                write_mask: ColorWrites::all(),
            })],
        }),
    });

    let should_clear = graph.is_fresh(target);
    let ops = Operations {
        load: if should_clear {
            if let Some(clear_color) = clear_color {
                LoadOp::Clear(clear_color.into())
            } else {
                LoadOp::Load
            }
        } else {
            LoadOp::Load
        },
        store: StoreOp::Store,
    };

    let mut dependencies = RenderDependencies::new();
    dependencies.write(target);
    for bind_group in bind_groups {
        if graph.meta(*bind_group).writes_any() {
            dependencies.write(*bind_group);
        } else {
            dependencies.read(*bind_group);
        }
    }

    graph.add_node(
        Some("fullscreen_pass".into()),
        dependencies,
        move |ctx, cmds, _| {
            let mut render_pass = cmds.begin_render_pass(&RenderPassDescriptor {
                label: Some("fullscreen_pass"),
                color_attachments: &[Some(RenderPassColorAttachment {
                    view: ctx.get(target).deref(),
                    resolve_target: None,
                    ops,
                })],
                depth_stencil_attachment: None,
                timestamp_writes: None,
                occlusion_query_set: None,
            });
            render_pass.set_pipeline(ctx.get(pipeline).deref());
            render_pass.draw(0..3, 0..1);
        },
    );
}

@BD103 BD103 added the D-Domain-Expert Requires deep knowledge in a given domain label May 29, 2024
@JMS55 JMS55 (Contributor) left a comment
Generally LGTM. We can sort out API tweaks, missing parts, and so on as we migrate the codebase. It's hard to judge ergonomics without playing around with it myself, which will happen as we start migration.

If this is v0, things I think are important for v1, maybe before we migrate:

  • Docs
  • RenderPass and ComputePass builders
  • Temporal resources and resource caching between frames
  • A way to mark a node for parallel encoding, where it gets its own command encoder and records commands in parallel, similar to the existing setup
  • Automatic CPU/GPU profiling/debug spans
  • (Maybe) A way to pass a ShaderType struct and have it automatically be uploaded to a uniform buffer (push constants can also be viable, and we have the existing UniformComponentPlugin to think about)

license = "MIT OR Apache-2.0"
keywords = ["bevy"]

# This isn't very polished ATM, I just copied from bevy_render's cargo.toml and took out all the unneeded dependencies and features
In addition to cleaning this up eventually, you'll want to add more boilerplate in the Cargo.toml's for bevy_internal and the repo root. Same with adding RenderGraphPlugin to DefaultPlugins, maybe, depending on how we organize things.

Proc derive macro for RenderGraphDebug to be added in a follow-up PR.
Labels: A-Rendering, C-Enhancement, C-Needs-Release-Note, D-Complex, D-Domain-Expert, S-Needs-Review, S-Needs-SME, X-Controversial
Projects: Status: Candidate