Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

nv2a: Handle framebuffer CPU blit and PVIDEO only rendering #1146

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

abaire
Copy link
Contributor

@abaire abaire commented Jul 1, 2022

Handles two edge cases:

  1. CPU blits to the framebuffer without using 3D rendering or after 3D rendering was previously done with a surface shape that does not match the framebuffer (e.g., a different pitch).
  2. Fullscreen PVIDEO rendering without any 3D rendering.

In both cases this change prevents the pgraph code from returning early,
bypassing the special case VGA handling in sdl2_gl_refresh and instead
using a special framebuffer texture to render the contents of VRAM.

Fixes #652
Fixes #1165

Tests
HW results

@abaire
Copy link
Contributor Author

abaire commented Jul 1, 2022

There appears to be a race condition going on that causes stale frames to be displayed (on my machine if you let it run long enough there will be periods where things sync up and look fine and periods where it jumps back to a stale frame for a bit and jitters)

Attached is a video of one such run. The test alternates frames between different colored checkerboards, performing a buffer swap between each blit.

WARNING: THIS CONTAINS RAPIDLY FLASHING VIDEO

CPUBlitJitterTest.mp4

The video is captured at 120hz so it's expected that each color should be displayed for ~2 frames of video, but significantly longer runs can be seen (e.g., by extracting frames: ffmpeg -i CPUBlitJitterTest.mp4 frame%04d.png -hide_banner).

Colors should alternate:
black & white, red & black, black & blue, green & black, black & magenta, yellow & black

However frames 122-130 (as well as others) show a sequence break, in this case green&black is skipped entirely as the black&blue frame is repeated.

Note: The issue is less pronounced on an M1

@abaire
Copy link
Contributor Author

abaire commented Jul 1, 2022

Some numbers illustrating the problem (running the same test program that just blits checkerboards and swaps frames). This shows that the framebuffer surface used by the xemu UI can get out of sync with the buffer being written to by the guest.

get_fb_surface: 0x3AA8A00
get_fb_surface: 0x3AA8A00
get_fb_surface: 0x3BD4A00
CPU write: 0x3D00000

get_fb_surface: 0x3BD4A00
get_fb_surface: 0x3D00A00  // Flip handled correctly
CPU write: 0x3AA8000  // This frame is skipped entirely

get_fb_surface: 0x3D00A00 
get_fb_surface: 0x3D00A00
get_fb_surface: 0x3D00A00
get_fb_surface: 0x3D00A00
CPU write: 0x3BD4000  // This frame is skipped entirely

get_fb_surface: 0x3D00A00
get_fb_surface: 0x3D00A00
get_fb_surface: 0x3D00A00
get_fb_surface: 0x3D00A00
CPU write: 0x3D00000  // This effectively writes to the frontbuffer and may tear

get_fb_surface: 0x3D00A00
get_fb_surface: 0x3D00A00
get_fb_surface: 0x3D00A00
CPU write: 0x3AA8000

get_fb_surface: 0x3D00A00
CPU write: 0x3BD4000

get_fb_surface: 0x3AA8A00  // Things start to get resync'd at this point
CPU write: 0x3D00000

get_fb_surface: 0x3BD4A00
CPU write: 0x3AA8000

A similar run tracking changes to PCRTC_START:

pgraph_surface_access_callback - 0x3BD4000
nv2a_get_framebuffer_surface - 0x3AA8A00
PCRTC_START - 0x3BD4000 cc cc cc ff
pgraph_surface_access_callback - 0x3D00000
nv2a_get_framebuffer_surface - 0x3BD4A00
PCRTC_START - 0x3D00000 0 0 0 ff
pgraph_surface_access_callback - 0x3AA8000
nv2a_get_framebuffer_surface - 0x3D00A00
nv2a_get_framebuffer_surface - 0x3D00A00
PCRTC_START - 0x3AA8000 44 cc cc ff
pgraph_surface_access_callback - 0x3BD4000
nv2a_get_framebuffer_surface - 0x3AA8A00
nv2a_get_framebuffer_surface - 0x3AA8A00
PCRTC_START - 0x3BD4000 cc 44 cc ff
pgraph_surface_access_callback - 0x3D00000  // This frame is never displayed, the flip is missing.
nv2a_get_framebuffer_surface - 0x3BD4A00
nv2a_get_framebuffer_surface - 0x3BD4A00
nv2a_get_framebuffer_surface - 0x3BD4A00
nv2a_get_framebuffer_surface - 0x3BD4A00
pgraph_surface_access_callback - 0x3AA8000
nv2a_get_framebuffer_surface - 0x3BD4A00
nv2a_get_framebuffer_surface - 0x3BD4A00
PCRTC_START - 0x3AA8000 0 0 0 ff
pgraph_surface_access_callback - 0x3BD4000
nv2a_get_framebuffer_surface - 0x3AA8A00
PCRTC_START - 0x3BD4000 cc cc cc ff
pgraph_surface_access_callback - 0x3D00000
nv2a_get_framebuffer_surface - 0x3BD4A00
nv2a_get_framebuffer_surface - 0x3BD4A00
pgraph_surface_access_callback - 0x3AA8000
nv2a_get_framebuffer_surface - 0x3BD4A00
PCRTC_START - 0x3AA8000 44 cc cc ff
pgraph_surface_access_callback - 0x3BD4000
nv2a_get_framebuffer_surface - 0x3AA8A00
PCRTC_START - 0x3BD4000 0 0 0 ff
pgraph_surface_access_callback - 0x3D00000
nv2a_get_framebuffer_surface - 0x3BD4A00
PCRTC_START - 0x3D00000 cc cc cc ff
pgraph_surface_access_callback - 0x3AA8000
nv2a_get_framebuffer_surface - 0x3D00A00
PCRTC_START - 0x3AA8000 0 0 0 ff
pgraph_surface_access_callback - 0x3BD4000
nv2a_get_framebuffer_surface - 0x3AA8A00
PCRTC_START - 0x3BD4000 cc cc cc ff

@abaire abaire force-pushed the fix/652/handle_mismatched_framebuffer_surface_format branch from 8b179d7 to 356da2a Compare July 5, 2022 21:00
@abaire abaire changed the title nv2a: Fall back to VGA path on framebuffer surface format mismatch. nv2a: Handle framebuffer CPU blit and PVIDEO only rendering Jul 5, 2022
@abaire abaire force-pushed the fix/652/handle_mismatched_framebuffer_surface_format branch from 356da2a to 373dc1d Compare July 5, 2022 21:02
@abaire
Copy link
Contributor Author

abaire commented Jul 5, 2022

Updated to take a different approach. Rather than fall back to the VGA handling in xemu.c, cases with a valid framebuffer but missing/non-matching SurfaceBinding are special cased to allow proper interaction with the PVIDEO overlay.

@abaire abaire force-pushed the fix/652/handle_mismatched_framebuffer_surface_format branch 2 times, most recently from e79faa4 to 031c46a Compare July 5, 2022 21:22
@@ -5146,6 +5155,144 @@ static void pgraph_render_display_pvideo_overlay(NV2AState *d)
out_x, out_y, out_width, out_height);
}

static void surface_copy_expand_row(uint8_t *out, uint8_t *in,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: Most of these are just moved so they'll be defined before the new first point of use.

@abaire abaire marked this pull request as ready for review July 6, 2022 00:20
@Triticum0
Copy link

@abaire you broke Broken-Sword-The-Sleeping-Dragon

2022-07-06.10-26-30.mp4

@Triticum0
Copy link

Tested all the games except
-Far Cry Instincts: Evolution
-Disney's The Haunted Mansion
It didn't fix a single game

@abaire abaire marked this pull request as draft July 6, 2022 14:31
@abaire
Copy link
Contributor Author

abaire commented Jul 6, 2022

Tested all the games except -Far Cry Instincts: Evolution -Disney's The Haunted Mansion It didn't fix a single game

Thanks, then it's probably safe to assume they're due to a different underlying cause.

@abaire
Copy link
Contributor Author

abaire commented Jul 6, 2022

@abaire you broke Broken-Sword-The-Sleeping-Dragon

Looks like the pitch ends up being incorrect for this case. I just ordered it, will look at it again in a week or two when it arrives.

@abaire abaire force-pushed the fix/652/handle_mismatched_framebuffer_surface_format branch from 031c46a to 00d3916 Compare July 6, 2022 15:03
@abaire
Copy link
Contributor Author

abaire commented Jul 6, 2022

Looks like it also breaks the FMV for Serious Sam, so hopefully fixing that will fix Broken Sword as well

@abaire abaire force-pushed the fix/652/handle_mismatched_framebuffer_surface_format branch from 00d3916 to 886537c Compare July 6, 2022 20:57
@abaire
Copy link
Contributor Author

abaire commented Jul 6, 2022

@Triticum0 can you check Broken Sword again with the latest commit? I'm not sure it'll actually work given the video you posted, but it now properly handles framebuffers other than 32bpp (like Serious Sam's intro video).

@Triticum0
Copy link

Triticum0 commented Jul 7, 2022

I checked the issue hasn't changed
xemu-2022-07-07-11-38-34
Note: this is the title screen doesn't change when in cutscenes

@Triticum0
Copy link

do you want me to use a debug command in the monitor?

@abaire
Copy link
Contributor Author

abaire commented Jul 7, 2022

do you want me to use a debug command in the monitor?

No, it seems like there's either something interesting about that specific game, or about this patch in general. @theboy181 has similar output for games that I know work with this patch (e.g., Pirates... which is the primary game I tested with on my machine) so I'm suspicious that there is something else I'm missing, possibly driver version related since I think all three of us are using nvidia hardware.

@abaire abaire force-pushed the fix/652/handle_mismatched_framebuffer_surface_format branch 4 times, most recently from 77afde6 to d27b5e8 Compare July 7, 2022 20:00
@abaire abaire force-pushed the fix/652/handle_mismatched_framebuffer_surface_format branch from d27b5e8 to 3e3de08 Compare July 15, 2022 14:35
@abaire
Copy link
Contributor Author

abaire commented Jul 15, 2022

@Triticum0 I was able to debug Broken Sword, it used a non-packed pitch for the VGA buffer but should be fixed now.

Can you re-test everything just to make sure there's no regressions? I've checked Pirates! and Stake, but want to make sure they're still fixed on your machine as well.

@abaire abaire force-pushed the fix/652/handle_mismatched_framebuffer_surface_format branch from 3e3de08 to 5866d11 Compare July 15, 2022 14:42
@abaire
Copy link
Contributor Author

abaire commented Jul 15, 2022

I was able to reproduce the issue reported by @theboy181 by enabling widescreen and 480p mode and running with -machine avpack=hdtv. The fix for the Broken Sword pitch issue also fixes that case (confirmed by reverting the pitch fix, which produces corrupt video).

Still a small problem when running with widescreen enabled in the dashboard but no avpack, the title screen of Pirates... wraps around.

@abaire abaire marked this pull request as ready for review July 15, 2022 15:03
@abaire abaire marked this pull request as draft July 15, 2022 15:06
@abaire abaire force-pushed the fix/652/handle_mismatched_framebuffer_surface_format branch from 5866d11 to 87a879c Compare July 15, 2022 15:20
@abaire
Copy link
Contributor Author

abaire commented Jul 15, 2022

When widescreen is enabled but no avpack is set, Pirates ends up using a vga resolution of 704x480 with a vga line offset of 2560 (640x4). Added handling for this case, anytime the vga width is set larger than the pitch it will calculate the correct width and ignore the excess (matching the behavior of the GL surface bypass case)

@abaire abaire marked this pull request as ready for review July 15, 2022 15:22
@abaire abaire force-pushed the fix/652/handle_mismatched_framebuffer_surface_format branch 2 times, most recently from 7f48bc5 to 2507ba0 Compare July 16, 2022 04:21
@abaire abaire force-pushed the fix/652/handle_mismatched_framebuffer_surface_format branch 4 times, most recently from 9c29b9e to f05c0a7 Compare June 1, 2023 00:53
@ko81e24wy
Copy link

Seems res scale takes no effert with this build.
6666

@abaire
Copy link
Contributor Author

abaire commented Jun 3, 2023

Seems res scale takes no effert with this build.

Thanks for testing @ko81e24wy - can you double check? I just did a build of this branch and it seems to work for me:

1x:
fix_652_1x

4x:
fix_652_4x

And for comparison purposes, here's a clean build:
1x:
master_1x

4x:
master_4x

To my eye, the textures on the rocks in the 4x case are substantially sharper than the 1x, both on this fix branch and on master

@ko81e24wy
Copy link

Seems res scale takes no effert with this build.

Thanks for testing @ko81e24wy - can you double check? I just did a build of this branch and it seems to work for me:

1x: fix_652_1x

4x: fix_652_4x

And for comparison purposes, here's a clean build: 1x: master_1x

4x: master_4x

To my eye, the textures on the rocks in the 4x case are substantially sharper than the 1x, both on this fix branch and on master

yes,its sharper,but still has aliasing,master build is smoother.

@abaire
Copy link
Contributor Author

abaire commented Jun 4, 2023

Ah, I see it now, along the character's arms. Hopefully just a simple mismerge, will look at it when I have a chance.

@abaire abaire force-pushed the fix/652/handle_mismatched_framebuffer_surface_format branch 2 times, most recently from f928340 to a6c1e5f Compare June 4, 2023 15:10
Handles two edge cases:
1) CPU blits to the framebuffer without using 3D rendering.
2) Fullscreen PVIDEO rendering without any 3D rendering.

In both cases this change prevents the pgraph code from returning early,
bypassing the special case VGA handling in `sdl2_gl_refresh` and instead
using a special framebuffer texture to render the contents of VRAM.

Fixes xemu-project#652
Fixes xemu-project#1165

[Tests](https://github.com/abaire/nxdk_pgraph_tests/blob/main/src/tests/antialiasing_tests.cpp)
[HW results](https://github.com/abaire/nxdk_pgraph_tests_golden_results/wiki/Results-Antialiasing_tests)
@abaire abaire force-pushed the fix/652/handle_mismatched_framebuffer_surface_format branch from a6c1e5f to 1bebdf6 Compare June 5, 2023 04:41
@abaire
Copy link
Contributor Author

abaire commented Jun 5, 2023

@ko81e24wy mind testing again with the latest version? There was an issue with the merge and the original patch that I think are resolved now:

new_fix_652_4x

@ko81e24wy
Copy link

Resolved.latest version of this branch.
2023-06-05 141043

@mborgerson mborgerson self-assigned this Oct 17, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants