[Performance 3/6] Disable nan check by default #15805
Conversation
Can the NaN check be enabled only for the VAE?
The NaN check is not great, but disabling it has a lot of implications; for example, the VAE fallback will no longer work.
In the long term it may be desirable to load the VAE as bfloat16 instead.
As it is now, this will break the automatic switch to a full-precision VAE.
Can we maybe get the needed performance improvement by checking a single element instead of the whole tensor? Since there are batch norms, a single value becoming NaN dooms the whole tensor to become all NaNs. I pushed 547778b to dev with this change.
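The single-element idea above could be sketched roughly like this (a hedged sketch, not the actual code from 547778b; the function name is hypothetical):

```python
import torch

def fast_nan_check(x: torch.Tensor) -> bool:
    # Hypothetical sketch of the idea discussed above: normalization layers
    # spread a single NaN across the whole tensor, so inspecting one element
    # is a cheap proxy for scanning every value with torch.isnan(x).any().
    return bool(torch.isnan(x.flatten()[0]).item())
```

Note the trade-off: this only catches NaNs that have already propagated to the sampled element, which is exactly why it relies on the "one NaN dooms the whole tensor" observation.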
Also, what tool is being used here for those performance visualizations? I'd like that too. Edit: it's torch's profiler visualized in Chrome: https://pytorch.org/tutorials/recipes/recipes/profiler_recipe.html
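For reference, a minimal example following the linked profiler recipe, exporting a trace that Chrome's `chrome://tracing` page can display (the workload below is made up purely for illustration):

```python
import torch
from torch.profiler import profile, ProfilerActivity

def workload():
    # Stand-in workload; in the real PR this would be a sampling step.
    x = torch.randn(128, 128)
    for _ in range(5):
        x = torch.tanh(x @ x)      # keep values bounded between iterations
        torch.isnan(x).any()       # the kind of per-step NaN check being profiled

with profile(activities=[ProfilerActivity.CPU]) as prof:
    workload()

# Text summary of where CPU time went
print(prof.key_averages().table(sort_by="cpu_time_total", row_limit=5))
# Chrome-viewable trace: open chrome://tracing and load trace.json
prof.export_chrome_trace("trace.json")
```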
I think checking only a single element is a better way to handle this. Thanks for doing that!
But I changed the NaN checking to only happen once after all steps are done in 6214aa7, so this is not an issue.
Description
According to lllyasviel/stable-diffusion-webui-forge#716 (comment), the NaN check has ~20 ms/it of overhead. The overhead is large enough that the option should only be used for debugging purposes.