
request for requests #104

Open
keunwoochoi opened this issue Oct 18, 2020 · 13 comments

Comments

@keunwoochoi
Owner

Please leave any feature request you'd like! It doesn't need to be realistic; literally anything is welcome.

@Path-A
Contributor

Path-A commented Oct 18, 2020

Unrealistic requests:

  1. A convenient data generator for large audio datasets. This could be a tf.keras.utils.Sequence similar to tf.keras.preprocessing.image.ImageDataGenerator, or something compatible with the newer tf.data.Dataset.from_generator. Perhaps this could have access to some basic augmentations and mixup training (a rough sketch follows after this list).

  2. More GPU-accelerated augmentations. Currently, I use audiomentations in a custom Keras Sequence (particularly the AddBackgroundNoise augmentation), but it has a decent CPU bottleneck. I know those authors are working on a PyTorch implementation, though.

  3. Basic TFLite compatibility. This might become unnecessary once TFLite gets the RFFT op. There is an implementation that gets around that here. Also, the audio_microfrontend function in the experimental section of the tensorflow repo seems to be working. In any case, it would be nice to be able to tell the model to compile in a TFLite-friendly manner if the model's ops are supported.
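For item 1, here's a minimal sketch of what such a generator could look like, assuming fixed-length WAV files; the class and its arguments are hypothetical, not an existing Kapre API:

```python
import numpy as np
import tensorflow as tf

# Hypothetical Sequence-based audio batch generator (not part of Kapre).
class AudioSequence(tf.keras.utils.Sequence):
    def __init__(self, filepaths, labels, batch_size=32, n_samples=16000):
        self.filepaths = filepaths
        self.labels = labels
        self.batch_size = batch_size
        self.n_samples = n_samples  # crop/pad every clip to this length

    def __len__(self):
        # number of batches per epoch
        return int(np.ceil(len(self.filepaths) / self.batch_size))

    def __getitem__(self, idx):
        paths = self.filepaths[idx * self.batch_size:(idx + 1) * self.batch_size]
        labels = self.labels[idx * self.batch_size:(idx + 1) * self.batch_size]
        batch = []
        for path in paths:
            # decode_wav crops/zero-pads to a fixed length via desired_samples
            audio, _ = tf.audio.decode_wav(
                tf.io.read_file(path), desired_channels=1,
                desired_samples=self.n_samples)
            batch.append(audio.numpy())
        # (batch, n_samples, 1): channels_last, the shape Kapre layers expect
        return np.stack(batch), np.asarray(labels)
```

Augmentations and mixup could then hook into `__getitem__` before the batch is returned.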

@Path-A
Copy link
Contributor

Path-A commented Nov 2, 2020

I think this one is already in progress, but here are some possible additional features:

  1. A SpecAugment layer. Allow masking the input with either a minimum value or the spectrogram mean. Allow a min/max number of masks to apply, and a min/max number of consecutive bins for each mask. Add TimeWarping (potential TensorFlow function here, PyTorch notebook example here). A rough sketch of the masking part follows below.
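Here's a sketch of the masking part (TimeWarping omitted), assuming channels_last spectrograms of shape (batch, time, freq, channels); the mask widths and counts are illustrative defaults, not from Kapre:

```python
import tensorflow as tf

class SpecAugment(tf.keras.layers.Layer):
    """Hypothetical layer: masks random time/frequency stripes with the
    spectrogram mean. Assumes time/freq dims exceed the mask widths."""

    def __init__(self, freq_mask_width=8, time_mask_width=16, n_masks=2, **kwargs):
        super().__init__(**kwargs)
        self.freq_mask_width = freq_mask_width
        self.time_mask_width = time_mask_width
        self.n_masks = n_masks

    def call(self, x, training=None):
        if not training:
            return x  # masking is a train-time augmentation only
        mean = tf.reduce_mean(x, axis=[1, 2], keepdims=True)
        n_time, n_freq = tf.shape(x)[1], tf.shape(x)[2]
        for _ in range(self.n_masks):
            # frequency mask: replace a band of freq bins with the mean
            f0 = tf.random.uniform([], 0, n_freq - self.freq_mask_width, dtype=tf.int32)
            freq_idx = tf.range(n_freq)[tf.newaxis, tf.newaxis, :, tf.newaxis]
            x = tf.where((freq_idx >= f0) & (freq_idx < f0 + self.freq_mask_width), mean, x)
            # time mask: same idea along the time axis
            t0 = tf.random.uniform([], 0, n_time - self.time_mask_width, dtype=tf.int32)
            time_idx = tf.range(n_time)[tf.newaxis, :, tf.newaxis, tf.newaxis]
            x = tf.where((time_idx >= t0) & (time_idx < t0 + self.time_mask_width), mean, x)
        return x
```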

@vincenzodentamaro

I would like to have a Continuous Wavelet Transform on GPU.
Any ideas?
Just found this repo, but it is for TensorFlow:
https://github.com/nicolasigor/cmorlet-tensorflow

@Yannik1337

I'd like to have mixed-precision compatibility. Currently this fails on the layers, as they need float32 inputs while mixed precision feeds them FP16 (at least that is my experience).
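For reference, a sketch of the pattern I mean; I'm assuming here that the standard Keras layer dtype argument can pin Kapre's DSP layers to float32 under a mixed_float16 policy, which would be a workaround rather than real compatibility:

```python
import tensorflow as tf
from tensorflow.keras import mixed_precision
from kapre import STFT, Magnitude

mixed_precision.set_global_policy('mixed_float16')

model = tf.keras.Sequential([
    tf.keras.Input(shape=(16000, 1)),
    # dtype='float32' keeps these layers out of the float16 path;
    # Keras autocasts their float16 inputs back up to float32.
    STFT(n_fft=512, hop_length=256, dtype='float32'),
    Magnitude(dtype='float32'),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(10, dtype='float32'),  # keep outputs float32
])
```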

@DankMinhKhoa

Please add a power exponent parameter for the log-mel spectrogram. librosa has this; it's basically just an exponent applied to the amplitude, so it would be nice to have it baked in already :D Thanks for the great work!
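Until it's baked in, a Lambda after Magnitude expresses the idea; this sketch skips the mel filterbank and just shows the exponent, with power=2.0 mirroring librosa's power argument:

```python
import tensorflow as tf
from kapre import STFT, Magnitude, MagnitudeToDecibel

power = 2.0  # librosa-style exponent: 1.0 = amplitude, 2.0 = power

model = tf.keras.Sequential([
    tf.keras.Input(shape=(16000, 1)),
    STFT(n_fft=512, hop_length=256),
    Magnitude(),
    # the requested parameter, emulated as an explicit exponent
    tf.keras.layers.Lambda(lambda m: tf.pow(m, power)),
    MagnitudeToDecibel(),
])
```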

@kenders2000
Contributor

I am currently working on STFT TFLite compatibility. I have a working branch in a fork; I am just tidying it up and adding unit tests, then I can share it.

@keunwoochoi
Owner Author

Thanks all, and please keep commenting.

But let me confess: I started to use PyTorch as the main DL library for my work, and there is less time I can invest in Kapre at the moment. That said, I'm still watching Kapre and would love to do code reviews or any quick fixes :)

@Yannik1337

> Thanks all, and please keep commenting.
>
> But let me confess: I started to use PyTorch as the main DL library for my work, and there is less time I can invest in Kapre at the moment. That said, I'm still watching Kapre and would love to do code reviews or any quick fixes :)

If you don't mind, could you elaborate on why you decided to use PyTorch? I've heard several reports from people who switched that PyTorch is more intuitive, but I'm looking forward to your answer.

@keunwoochoi
Owner Author

@Yannik1337 Sure. TF doesn't tell me exactly where the error is, especially when it comes to a real experiment that involves TFRecords, tf.data.Dataset, customized metrics and loss functions, data preprocessing, etc. As a result, quite often I have to go through a clueless debugging process that is not so different from the first programming class (in C) of my life. With PyTorch, when there's a problem, it tells me exactly what it is and I can go check it out and fix it accordingly, which is, like, working with Python. I didn't like the lack of a Keras-equivalent library, but these days pytorch-lightning is mature enough.

@Path-A
Contributor

Path-A commented Jan 16, 2021

@keunwoochoi Same story for me, although I'm still somewhat stuck with TensorFlow for production, since TFLite / TF Micro are still the best for edge devices.

I didn't know about PyTorch Lightning! Thanks for exposing me to that awesome wrapper!

@kenders2000
Contributor

kenders2000 commented Jan 16, 2021

I'm the same: I'm stuck with TF due to TFLite and edge platforms. I have not used PyTorch, but from what I have read and heard from my colleagues, it seems much more intuitive.

I agree about TensorFlow's debug information; it's not very helpful!

@Yannik1337

@kenders2000, do you have any available code for this compatible version? I am currently in the process of converting a model that uses Kapre layers to the TFLite format, but this fails due to unsupported operations.

@kenders2000
Contributor

@Yannik1337 I added these TFLite-compatible layers to Kapre a while back:

from kapre import STFTTflite, MagnitudeTflite

The PR #131 includes some additional documentation on how to use them; in summary:

You need to train a model with the normal layers first, then create a new model with the TFLite alternatives and load the trained weights into it. This is because the TFLite layers only support a batch size of 1, which is fine for on-device inference in most use cases but doesn't let you use them for training.
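A minimal sketch of that workflow; the head of the model and the STFT parameters here are illustrative, not from the PR:

```python
import tensorflow as tf
from kapre import STFT, Magnitude, STFTTflite, MagnitudeTflite

def build(stft_layer, magnitude_layer, batch_size=None):
    # identical architectures so the weight lists line up one-to-one
    return tf.keras.Sequential([
        tf.keras.Input(shape=(16000, 1), batch_size=batch_size),
        stft_layer,
        magnitude_layer,
        tf.keras.layers.GlobalAveragePooling2D(),
        tf.keras.layers.Dense(10, activation='softmax'),
    ])

# 1) train with the normal layers (any batch size)
train_model = build(STFT(n_fft=512), Magnitude())
# train_model.compile(...); train_model.fit(...)

# 2) rebuild with the TFLite variants at batch size 1 and copy the weights
lite_model = build(STFTTflite(n_fft=512), MagnitudeTflite(), batch_size=1)
lite_model.set_weights(train_model.get_weights())

# 3) convert for on-device inference
converter = tf.lite.TFLiteConverter.from_keras_model(lite_model)
tflite_bytes = converter.convert()
```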
