update studio_util neon #383

Open
wants to merge 3 commits into
base: master


MoeMod
Contributor

@MoeMod MoeMod commented Aug 2, 2023

No description provided.

cl_dll/StudioModelRenderer.cpp (outdated)
@@ -143,8 +143,10 @@ void CrossProduct( const float *v1, const float *v2, float *cross )
memcpy(&v1_reg, v1, sizeof(float) * 3);
memcpy(&v2_reg, v2, sizeof(float) * 3);

float32x4_t yzxy_a = vextq_f32(vextq_f32(v1_reg, v1_reg, 3), v1_reg, 2); // [aj, ak, ai, aj]
float32x4_t yzxy_b = vextq_f32(vextq_f32(v2_reg, v2_reg, 3), v2_reg, 2); // [bj, bk, bi, bj]
float32x2_t xy_a = vget_low_f32(v1_reg);
Member

I can't check this code without actually running it. But did you at least test whether it compiles? The last time, it spewed errors due to invalid data types on Android and Switch.

Contributor Author

It should compile with MSVC and Clang. Shuffling with qword is faster, so I've changed it here.
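
For anyone wanting to sanity-check a shuffle-based cross product without the target hardware, here is a minimal sketch in portable C++. It does not use NEON intrinsics and does not reproduce the code in this PR; it only emulates the lane ordering in scalar code and compares it against the textbook formula. The names CrossProductRef, CrossProductShuffled and CrossProductsAgree are hypothetical and not part of the codebase.

```cpp
#include <cassert>
#include <cmath>

// Scalar reference: the textbook 3D cross product.
inline void CrossProductRef(const float *a, const float *b, float *out)
{
    out[0] = a[1] * b[2] - a[2] * b[1];
    out[1] = a[2] * b[0] - a[0] * b[2];
    out[2] = a[0] * b[1] - a[1] * b[0];
}

// Scalar emulation of a shuffle-based formulation (hypothetical helper):
// cross = a.yzx * b.zxy - a.zxy * b.yzx, computed lane by lane.
inline void CrossProductShuffled(const float *a, const float *b, float *out)
{
    const float a_yzx[3] = { a[1], a[2], a[0] };
    const float b_yzx[3] = { b[1], b[2], b[0] };
    const float a_zxy[3] = { a[2], a[0], a[1] };
    const float b_zxy[3] = { b[2], b[0], b[1] };

    for (int i = 0; i < 3; ++i)
        out[i] = a_yzx[i] * b_zxy[i] - a_zxy[i] * b_yzx[i];
}

// Returns true when both formulations agree within eps on every component.
inline bool CrossProductsAgree(const float *a, const float *b, float eps)
{
    float ref[3], shuf[3];
    CrossProductRef(a, b, ref);
    CrossProductShuffled(a, b, shuf);

    for (int i = 0; i < 3; ++i)
        if (std::fabs(ref[i] - shuf[i]) > eps)
            return false;
    return true;
}
```

The same comparison could be pointed at the real NEON path on an ARM build by swapping CrossProductShuffled for the function under test, which would also cover the "does it compile on each toolchain" question.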

@@ -349,13 +348,28 @@ void CStudioModelRenderer::StudioSlerpBones( vec4_t q1[], float pos1[][3], vec4_

s1 = 1.0f - s;

switch (m_pStudioHeader->numbones % 4)
Member

What if we have more than 4 bones, but the count is not divisible by 4?

For example, 6? With this code, only the first 4 bones will be interpolated.
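
To illustrate the concern, here is a minimal sketch of the usual pattern for this kind of loop: process full groups of 4, then handle the leftover `count % 4` elements separately, so that a count like 6 is fully covered. This is an assumption about the general shape, not the code in this PR. Lerp and BlendAll are hypothetical names, and a plain float blend stands in for the per-bone quaternion slerp.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the per-bone interpolation work.
inline float Lerp(float a, float b, float s)
{
    return a * (1.0f - s) + b * s;
}

// Blends q2 into q1 for all `count` elements and returns how many were touched.
// The main loop handles full groups of 4 (where a SIMD body would go);
// the tail loop handles the remaining count % 4 elements.
inline std::size_t BlendAll(float *q1, const float *q2, std::size_t count, float s)
{
    std::size_t i = 0;
    std::size_t touched = 0;

    // Full groups of 4.
    for (; i + 4 <= count; i += 4)
    {
        for (std::size_t lane = 0; lane < 4; ++lane)
            q1[i + lane] = Lerp(q1[i + lane], q2[i + lane], s);
        touched += 4;
    }

    // Tail: 0 to 3 leftover elements.
    for (; i < count; ++i)
    {
        q1[i] = Lerp(q1[i], q2[i], s);
        ++touched;
    }

    return touched;
}
```

With 6 elements, the main loop covers indices 0 to 3 and the tail loop covers 4 and 5. A switch on `count % 4` can replace the tail loop, but it still has to run after the grouped loop rather than instead of it.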

@a1batross
Member

Also, the previous set of patches turned out to be glitchy on an AArch64 computer running Linux, with GCC 10.

Could you check it?

2 participants