Issues: Lightning-AI/lightning-thunder
Issues list
CUDA error: CUDA_ERROR_ILLEGAL_ADDRESS failed when training falcon-7b
#583
opened Jun 12, 2024 by
mpatel31415
NotImplementedError: requires_grad=True is not yet supported within thunder.compile
#582
opened Jun 12, 2024 by
mpatel31415
nvFuser executor doesn't support prims.sum with symbolic dimensions
executors
nvfuser
#581
opened Jun 12, 2024 by
IvanYashchuk
Find a way to properly sort the communication operators for zero2/zero3
bug
Something isn't working
distributed
#574
opened Jun 11, 2024 by
kiya00
Returning dtype or device from jitted function returns thunder's dtype or device (not torch.{dtype/device}).
bug
Something isn't working
#573
opened Jun 11, 2024 by
kshitij12345
reenable testing cudnn SDPA with PyTorch dev version / 2.4.0a0+
bug
Something isn't working
ci / tests
triage review
#567
opened Jun 10, 2024 by
t-vi
[distributed] Enable transformed modules to load state dicts of the originals
distributed
enhancement
New feature or request
tensor parallel
distributed - tensor parallel
transforms
#564
opened Jun 10, 2024 by
crcrpar
TransformerEngine: Communication and Computation is not overlapped in backward pass with FSDP Zero3
bug
Something isn't working
TransformerEngine
#557
opened Jun 7, 2024 by
kshitij12345
TypeError: Missing a required argument with thunder.jit in NeMo SD ResBlock
bug
Something isn't working
nemo
Issues needed to support NVIDIA NeMo models.
#548
opened Jun 6, 2024 by
athitten
grad transform forward_and_backward_from_trace is not handling NumberProxy properly in saved_for_backward
bug
Something isn't working
#541
opened Jun 6, 2024 by
jjsjann123
Applying Thunder on torch.fx.GraphModule from Dynamo fails
bug
Something isn't working
jit
#539
opened Jun 5, 2024 by
IvanYashchuk
math.xxx calls in function on NumberProxy is not being traced.
bug
Something isn't working
dynamic constraints
symbolic values
#526
opened Jun 5, 2024 by
jjsjann123
NVFuser error adding thunder.jit to UNet model of NeMo Stable Diffusion
bug
Something isn't working
nemo
Issues needed to support NVIDIA NeMo models.
#525
opened Jun 4, 2024 by
athitten
Thunder object's __repr__ should indicate what object they are (TensorProxy and others)
enhancement
New feature or request
tracing architecture
#510
opened Jun 3, 2024 by
t-vi
[RFC] Option to make a trace easier to interpret
enhancement
New feature or request
#507
opened Jun 2, 2024 by
crcrpar
Distill API for module transformations from distributed / quantization uses of ThunderModule attributes
enhancement
New feature or request
module
#497
opened May 31, 2024 by
t-vi
Support RN50 BatchNorm fusions with cudnn
cudnn
enhancement
New feature or request
#487
opened May 29, 2024 by
vedaanta
FP8 Linear and conv with cudnn
cudnn
enhancement
New feature or request
#486
opened May 29, 2024 by
vedaanta
load/save_state_dict hooks for early transforms
enhancement
New feature or request
module
#483
opened May 29, 2024 by
t-vi
fsdp(jit(...)) transform can use more memory compared to jit(fsdp(...))
bug
Something isn't working
distributed
#478
opened May 29, 2024 by
kshitij12345
OOM errors for Gemma-7, pythia-12b, Llama-2-13b-hf and Nous-Hermes-13b with FSDP zero3 and 2x8 H100
bug
Something isn't working
memory use
#474
opened May 29, 2024 by
mpatel31415
Dynamic shape needs to be modeled in trace
bug
Something isn't working
dynamic constraints
#471
opened May 29, 2024 by
jjsjann123
8 tasks
Implement GroupNorm to invoke APEX GroupNorm for NeMo Stable Diffusion AutoEncoder performance
bug
Something isn't working
nemo
Issues needed to support NVIDIA NeMo models.
performance
#468
opened May 29, 2024 by
athitten
dtype inconsistencies when dividing/rounding tensors
bug
Something isn't working
#467
opened May 29, 2024 by
k223kim
CI: Re-Enable torchrun call in Zero to Thunder notebook
bug
Something isn't working
ci / tests
#465
opened May 27, 2024 by
t-vi