Metal (Apple) GPU back-end for Tracy #793

Open
wants to merge 23 commits into
base: master


slomp
Contributor

@slomp slomp commented May 17, 2024

(I still need to update the manual, but I'm putting the code here for review to save some time).

The Metal back-end in Tracy operates differently from other GPU back-ends such as Vulkan, Direct3D and OpenGL. Specifically, TracyMetalZone() must be placed around the site where a command encoder is created.

This is because not all hardware supports timestamps at command granularity; some hardware can only provide timestamps around an entire command encoder. Placing zones at encoder granularity accommodates all tiers of hardware. In the future, variants of TracyMetalZone() will be added to support the usual command-level granularity of Tracy GPU back-ends.

Metal also imposes a few restrictions that make the process of requesting and collecting queries more complicated in Tracy:

  • timestamp query buffers are limited to 4096 queries (32 KB, at 8 bytes per query)
  • when a timestamp query buffer is created, Metal initializes all timestamps to zero, and there is no way to reset them to zero after they get resolved; the only way to clear the timestamps is to allocate a new timestamp query buffer
  • if a command encoder records no commands and its command buffer is committed to the command queue, Metal will "optimize away" the encoder along with any timestamp queries associated with it (those timestamps remain zero and never get resolved)

Because of the limitations above, two timestamp buffers are managed internally. Once one of the buffers fills up with requests, the second buffer can start serving new requests.

Once all requests in a buffer have been resolved and collected, the entire buffer is discarded and a new one is allocated for future requests. (Proper cycling through a ring buffer would require bookkeeping and completion handlers to collect only the queries known to be complete.)
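The two-buffer scheme described above can be sketched in plain C++ without any Metal dependency. This is only an illustration, not the code in this PR: `QueryBuffer`, `QueryPool`, `Slot`, and every member name are hypothetical, and a `std::vector` of zeroed 64-bit values stands in for the Metal timestamp query buffer. A zero timestamp is treated as "not yet resolved", matching the zero-initialization behavior noted in the list of restrictions.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <optional>
#include <vector>

// Hypothetical stand-in for a Metal timestamp query buffer: a fixed number of
// 8-byte slots that start at zero and cannot be reset in place.
struct QueryBuffer {
    static constexpr std::size_t kCapacity = 4096;  // limit quoted in the PR text
    std::vector<std::uint64_t> timestamps = std::vector<std::uint64_t>(kCapacity, 0);
    std::size_t requested = 0;  // slots handed out so far
    std::size_t collected = 0;  // slots whose timestamps have been read back

    bool full() const { return requested == kCapacity; }
    bool drained() const { return full() && collected == requested; }
};
static_assert(QueryBuffer::kCapacity * sizeof(std::uint64_t) == 32 * 1024,
              "4096 queries of 8 bytes each occupy 32 KB");

class QueryPool {
public:
    struct Slot { int buffer; std::size_t index; };

    QueryPool() { for (auto& b : buffers_) b = std::make_unique<QueryBuffer>(); }

    // Hands out the next free slot, switching to the other buffer once the
    // active one is full. Returns nullopt if both buffers are full.
    std::optional<Slot> request() {
        if (buffers_[active_]->full()) {
            const int other = 1 - active_;
            if (buffers_[other]->full()) return std::nullopt;
            active_ = other;
        }
        return Slot{active_, buffers_[active_]->requested++};
    }

    // Simulates the GPU writing a (non-zero) timestamp into a slot.
    void resolve(Slot s, std::uint64_t t) {
        assert(t != 0);  // zero is reserved to mean "unresolved"
        buffers_[s.buffer]->timestamps[s.index] = t;
    }

    // Reads back resolved timestamps in request order within each buffer and
    // replaces any buffer that is both full and completely collected.
    std::size_t collect() {
        std::size_t n = 0;
        for (auto& b : buffers_) {
            while (b->collected < b->requested && b->timestamps[b->collected] != 0) {
                ++b->collected;
                ++n;
            }
            if (b->drained()) {
                b = std::make_unique<QueryBuffer>();  // discard and reallocate
                ++reallocations_;
            }
        }
        return n;
    }

    std::size_t reallocations() const { return reallocations_; }

private:
    std::array<std::unique_ptr<QueryBuffer>, 2> buffers_;
    int active_ = 0;
    std::size_t reallocations_ = 0;
};
```

In this sketch, a `Slot` that is still held when its buffer gets replaced would write into the fresh buffer, which is a single-threaded analogue of the hazard around discarding and reallocating; collecting often keeps the window for that small.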

In the current implementation, there is potential for a race condition when a buffer is discarded and reallocated. In practice, the race condition should not materialize as long as TracyMetalCollect() is called frequently enough to keep the number of unresolved queries low.

Finally, there's a timeout mechanism during timestamp collection to detect "empty" command encoders and ensure progress.
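One way such a timeout could work is sketched below, again in plain C++ and not taken from this PR. The `Collector` type, its fields, and the choice to pass the current time in as a parameter are all assumptions made for illustration. The collector reads queries in order; when the oldest pending timestamp is still zero, it starts a timer, and if the timestamp is still zero once the timeout has elapsed, the query is assumed to belong to an "empty" encoder and is skipped so that later queries can be collected.

```cpp
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <vector>

using Clock = std::chrono::steady_clock;

// Hypothetical collector that skips queries whose timestamp stays at zero for
// longer than `timeout`, so that one dropped query cannot stall collection.
struct Collector {
    Clock::duration timeout;
    std::size_t next = 0;             // index of the oldest uncollected query
    bool stalled = false;             // true while waiting on a zero timestamp
    Clock::time_point stalledSince{}; // when the current wait started
    std::size_t collected = 0;
    std::size_t skipped = 0;

    // `timestamps` models the query buffer contents, `requested` is the number
    // of queries handed out, and `now` is supplied by the caller.
    void collect(const std::vector<std::uint64_t>& timestamps,
                 std::size_t requested, Clock::time_point now) {
        while (next < requested) {
            if (timestamps[next] != 0) {   // resolved: consume it
                ++collected;
                ++next;
                stalled = false;
                continue;
            }
            if (!stalled) {                // first time we see it unresolved
                stalled = true;
                stalledSince = now;
                return;
            }
            if (now - stalledSince < timeout) return;  // keep waiting
            ++skipped;                     // timed out: assume an empty encoder
            ++next;
            stalled = false;
        }
    }
};
```

Passing `now` explicitly keeps the sketch deterministic to test; a real integration would more likely read the clock inside the collection call.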

@slomp
Contributor Author

slomp commented May 17, 2024

@wolfpld I'd like to request reviews from @nosferalatu and @JamesMcCarthy44, but I don't seem to be able to add reviewers.

@wolfpld
Owner

wolfpld commented May 17, 2024

I don't know how assigning reviewers works on GitHub. Mentioning people should be enough to get their attention.

@slomp
Contributor Author

slomp commented May 18, 2024

Also pinging @theblackunknown for a code review.
