🐛 Describe the bug
Hello guys,
I am stumped by this weird behavior and want to verify whether it is an FSDP bug. Basically, I want to use FSDP to train an MoE model, and training hangs after several steps without any error message. I have made a minimal reproducible example based on the official FSDP tutorial. The only modifications are the model definition (adding MoE) and the wrap policy (making each expert an FSDP unit). Could you help me out? Thanks!
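For readers without the reproduction script at hand, here is a minimal sketch of what "make each expert an FSDP unit" could look like. It is not the original reproduction code: the `Expert` and `MoELayer` classes, their sizes, and the name `expert_wrap_policy` are hypothetical, and the routing is a simplified top-1 gate. It only assumes `lambda_auto_wrap_policy` from `torch.distributed.fsdp.wrap`.

```python
import functools

import torch
import torch.nn as nn
from torch.distributed.fsdp.wrap import lambda_auto_wrap_policy


class Expert(nn.Module):
    """Hypothetical feed-forward expert (not the definition from the report)."""

    def __init__(self, dim: int, hidden: int) -> None:
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim, hidden), nn.ReLU(), nn.Linear(hidden, dim)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)


class MoELayer(nn.Module):
    """Hypothetical MoE layer with a simplified top-1 gate."""

    def __init__(self, dim: int = 16, hidden: int = 32, num_experts: int = 4) -> None:
        super().__init__()
        self.gate = nn.Linear(dim, num_experts)
        self.experts = nn.ModuleList(
            [Expert(dim, hidden) for _ in range(num_experts)]
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x has shape (tokens, dim); each token is assigned to its top-1 expert.
        probs = torch.softmax(self.gate(x), dim=-1)
        top_p, top_i = probs.max(dim=-1)
        out = torch.zeros_like(x)
        for i, expert in enumerate(self.experts):
            # Per-token weight: the gate probability if routed to expert i, else 0.
            weight = (top_p * (top_i == i)).unsqueeze(-1)
            # Every expert runs on every call; the weight masks out other tokens.
            out = out + weight * expert(x)
        return out


# Wrap policy: a module becomes its own FSDP unit only if it is an Expert.
expert_wrap_policy = functools.partial(
    lambda_auto_wrap_policy, lambda_fn=lambda m: isinstance(m, Expert)
)

# Usage (needs an initialized process group, so it is left as a comment):
#   from torch.distributed.fsdp import FullyShardedDataParallel as FSDP
#   model = FSDP(MoELayer().cuda(), auto_wrap_policy=expert_wrap_policy)
```

Passing `expert_wrap_policy` as `auto_wrap_policy` should give one nested FSDP unit per expert, with the gate left in the root unit.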
This is the output information:
Versions