ContextNet-Transducer: TypeError: forward() got an unexpected keyword argument 'input_lengths' #141

Open
yunigma opened this issue Feb 13, 2022 · 8 comments
Labels: BUG Something isn't working

yunigma commented Feb 13, 2022

Hello! I am trying to run contextnet_transducer with the following command:

nohup python ./openspeech_cli/hydra_train.py dataset=librispeech dataset.dataset_download=False dataset.dataset_path="../..//database/LibriSpeech/" dataset.manifest_file_path="../../../openspeech/datasets/librispeech/libri_subword_manifest.txt" tokenizer=libri_subword model=contextnet_transducer audio=fbank lr_scheduler=warmup trainer=gpu criterion=cross_entropy

However, training fails to start with the following error:

    self.advance(*args, **kwargs)
  File "/idiap/temp/inigmatulina/code/miniconda3/envs/ve-openspeech/lib/python3.9/site-packages/pytorch_lightning/loops/epoch/evaluation_epoch_loop.py", line 122, in advance
    output = self._evaluation_step(batch, batch_idx, dataloader_idx)
  File "/idiap/temp/inigmatulina/code/miniconda3/envs/ve-openspeech/lib/python3.9/site-packages/pytorch_lightning/loops/epoch/evaluation_epoch_loop.py", line 217, in _evaluation_step
    output = self.trainer.accelerator.validation_step(step_kwargs)
  File "/idiap/temp/inigmatulina/code/miniconda3/envs/ve-openspeech/lib/python3.9/site-packages/pytorch_lightning/accelerators/accelerator.py", line 239, in validation_step
    return self.training_type_plugin.validation_step(*step_kwargs.values())
  File "/idiap/temp/inigmatulina/code/miniconda3/envs/ve-openspeech/lib/python3.9/site-packages/pytorch_lightning/plugins/training_type/dp.py", line 104, in validation_step
    return self.model(*args, **kwargs)
  File "/idiap/temp/inigmatulina/code/miniconda3/envs/ve-openspeech/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/idiap/temp/inigmatulina/code/miniconda3/envs/ve-openspeech/lib/python3.9/site-packages/torch/nn/parallel/data_parallel.py", line 166, in forward
    return self.module(*inputs[0], **kwargs[0])
  File "/idiap/temp/inigmatulina/code/miniconda3/envs/ve-openspeech/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/idiap/temp/inigmatulina/code/miniconda3/envs/ve-openspeech/lib/python3.9/site-packages/pytorch_lightning/overrides/data_parallel.py", line 63, in forward
    output = super().forward(*inputs, **kwargs)
  File "/idiap/temp/inigmatulina/code/miniconda3/envs/ve-openspeech/lib/python3.9/site-packages/pytorch_lightning/overrides/base.py", line 92, in forward
    output = self.module.validation_step(*inputs, **kwargs)
  File "/remote/idiap.svm/temp.speech05/inigmatulina/work/experiments/contextnet/openspeech/openspeech/models/openspeech_transducer_model.py", line 258, in validation_step
    return self.collect_outputs(
  File "/remote/idiap.svm/temp.speech05/inigmatulina/work/experiments/contextnet/openspeech/openspeech/models/openspeech_transducer_model.py", line 94, in collect_outputs
    loss = self.criterion(
  File "/idiap/temp/inigmatulina/code/miniconda3/envs/ve-openspeech/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
TypeError: forward() got an unexpected keyword argument 'input_lengths' 

I have checked openspeech/decoders/rnn_transducer_decoder.py: in its forward function, input_lengths is accepted as an argument but never used afterwards, so I am not sure how this is supposed to work.
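For illustration, here is a minimal standalone snippet (not the OpenSpeech code itself) that reproduces the same TypeError: collect_outputs passes length tensors as keyword arguments to self.criterion, and nn.CrossEntropyLoss.forward simply does not accept them, whereas the transducer criterion evidently does (switching to it removes the error, see below).

    # Minimal sketch (not the actual OpenSpeech code): reproducing the TypeError
    # raised when length tensors are passed to a plain CrossEntropyLoss.
    import torch
    import torch.nn as nn

    criterion = nn.CrossEntropyLoss()
    logits = torch.randn(4, 10)              # (batch, num_classes)
    targets = torch.randint(0, 10, (4,))     # (batch,)
    lengths = torch.full((4,), 10)           # lengths a transducer criterion would consume

    try:
        # Mirrors the keyword call made in collect_outputs() of the transducer model.
        criterion(logits, targets, input_lengths=lengths)
    except TypeError as e:
        print(e)  # "forward() got an unexpected keyword argument 'input_lengths'"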

Thank you!
Yulia

sooftware added the BUG (Something isn't working) label on Feb 14, 2022

sooftware (Member) commented:

@upskyy

upskyy (Member) commented Feb 14, 2022

Hello @yunigma!
I think the error comes from using the cross_entropy criterion with a transducer model. Could you try setting criterion=transducer instead?
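For example, adapting the command from the first comment with only the criterion changed (everything else as before):

    python ./openspeech_cli/hydra_train.py dataset=librispeech dataset.dataset_download=False \
        dataset.dataset_path="../..//database/LibriSpeech/" \
        dataset.manifest_file_path="../../../openspeech/datasets/librispeech/libri_subword_manifest.txt" \
        tokenizer=libri_subword model=contextnet_transducer audio=fbank \
        lr_scheduler=warmup trainer=gpu criterion=transducer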

yunigma (Author) commented Feb 14, 2022

Hello @upskyy!
Thank you very much for your response. Changing cross_entropy to transducer fixed the issue reported above, and I was able to start training. However, the training loss is very high (>200) and keeps growing. I tried a different lr_scheduler, but the problem does not seem to be there.
I attach the logs.
logs_20220214.log

upskyy (Member) commented Feb 15, 2022

Would you like to experiment with the gradient accumulation parameter? [link]
Since your batch size is 16, it might be a good idea to set accumulate_grad_batches to 8.
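With the Hydra CLI this is a single extra override on the same training command, e.g.:

    python ./openspeech_cli/hydra_train.py dataset=librispeech tokenizer=libri_subword \
        model=contextnet_transducer audio=fbank lr_scheduler=warmup trainer=gpu \
        criterion=transducer trainer.accumulate_grad_batches=8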

yunigma (Author) commented Feb 15, 2022

Hello @upskyy! I have tried setting accumulate_grad_batches to 8, but it made the loss grow even faster...

upskyy (Member) commented Feb 16, 2022

@yunigma I'll have to do some more testing. I'm so sorry... 😭

yunigma (Author) commented Feb 16, 2022

Thank you @upskyy!! No worries, it is a very cool project anyway. I will keep trying to understand the issue on my side too.

virgile-blg commented:
[Screenshot: 2022-04-06 at 11:00]

Same issue here with these hparams:

    dataset=librispeech \
    tokenizer=libri_subword \
    model=contextnet_transducer \
    audio=fbank \
    lr_scheduler=warmup_reduce_lr_on_plateau \
    trainer=gpu \
    criterion=transducer \
    trainer.sampler=smart \
    trainer.batch_size=4 \
    trainer.accumulate_grad_batches=8
