Performance comparison in sample alignedTypes not significant #249

Open
vipcxj opened this issue Jan 29, 2024 · 0 comments

vipcxj commented Jan 29, 2024

This is my result:

Starting...
GPU Device 0: "Turing" with compute capability 7.5

[NVIDIA GeForce RTX 2080 Ti] has 68 MP(s) x 64 (Cores/MP) = 4352 (Cores)
> Compute scaling value = 1.00
> Memory Size = 49999872
Allocating memory...
Generating host input data array...
Uploading input data to GPU memory...
Testing misaligned types...
uint8...
Avg. time: 5.870122 ms / Copy throughput: 7.932716 GB/s.
        TEST OK
uint16...
Avg. time: 0.653178 ms / Copy throughput: 71.291444 GB/s.
        TEST OK
RGBA8_misaligned...
Avg. time: 0.411919 ms / Copy throughput: 113.046586 GB/s.
        TEST OK
LA32_misaligned...
Avg. time: 0.221544 ms / Copy throughput: 210.188781 GB/s.
        TEST OK
RGB32_misaligned...
Avg. time: 0.203884 ms / Copy throughput: 228.394200 GB/s.
        TEST OK
RGBA32_misaligned...
Avg. time: 0.192203 ms / Copy throughput: 242.274994 GB/s.
        TEST OK
Testing aligned types...
RGBA8...
Avg. time: 0.372953 ms / Copy throughput: 124.857542 GB/s.
        TEST OK
I32...
Avg. time: 0.372953 ms / Copy throughput: 124.857542 GB/s.
        TEST OK
LA32...
Avg. time: 0.217597 ms / Copy throughput: 214.001280 GB/s.
        TEST OK
RGB32...
Avg. time: 0.193803 ms / Copy throughput: 240.274804 GB/s.
        TEST OK
RGBA32...
Avg. time: 0.194241 ms / Copy throughput: 239.733621 GB/s.
        TEST OK
RGBA32_2...
Avg. time: 0.191466 ms / Copy throughput: 243.208190 GB/s.
        TEST OK

[alignedTypes] -> Test Results: 0 Failures
Shutting down...
Test passed

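I found almost no difference between the aligned and misaligned versions of each type. For example, LA32 reaches 214.00 GB/s aligned versus 210.19 GB/s misaligned, and RGBA32 reaches 239.73 GB/s aligned versus 242.27 GB/s misaligned.
The uint8 and uint16 versions are very slow (7.93 GB/s and 71.29 GB/s), but there are no aligned versions to compare them against.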
