DoReFa quantizer with higher number of MACs/Ops, Grouped convs as custom ops on LCE 0.7.0 #745

Open
lluevano opened this issue Jul 20, 2022 · 3 comments

Comments

@lluevano

Hello, I have a couple of questions regarding quantizer options for Larq and LCE.

I am designing a BNN using the DoReFa quantizer; however, I noticed a very high number of estimated MACs and ops when converting the model for ARM64. Changing the quantizer to "ste_sign" dramatically lowered the MAC and op counts.

Is there a way to use the DoReFa quantizer for training without this serious operation overhead when converting and running the model for inference in LCE? Or is the "ste_sign" quantizer the only viable option for efficient inference?
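
For context, a minimal sketch of the two weight-quantizer setups I am comparing (the layer sizes here are made up, and k_bit=1 is assumed for binary weights):

import larq as lq

# Variant A: ste_sign weights -- converts to true binary convolutions in LCE.
conv_ste = lq.layers.QuantConv2D(
    64, 3,
    kernel_quantizer="ste_sign",
    kernel_constraint="weight_clip",
    use_bias=False,
)

# Variant B: DoReFa weights -- the configuration showing inflated op counts.
conv_dorefa = lq.layers.QuantConv2D(
    64, 3,
    kernel_quantizer=lq.quantizers.DoReFaQuantizer(k_bit=1, mode="weights"),
    kernel_constraint="weight_clip",
    use_bias=False,
)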

Thank you for the excellent work and for your attention.

@lluevano lluevano changed the title DoReFa quantizer with high number of MACs/Ops on LCE DoReFa quantizer / grouped convs with high number of MACs/Ops on LCE Jul 26, 2022
@lluevano lluevano changed the title DoReFa quantizer / grouped convs with high number of MACs/Ops on LCE DoReFa quantizer with higher number of MACs/Ops, Grouped convs as custom ops on LCE 0.7.0 Jul 26, 2022
@lluevano
Author

I noticed an issue only with the latest version (0.7.0), not the one before (0.6.2):
Grouped convolutions (FP or binary) are converted as custom ops in the latest version.

Example:
Grouped (g=2) convs converter output:

2022-07-26 13:06:17.469686: W external/org_tensorflow/tensorflow/compiler/mlir/lite/flatbuffer_export.cc:1903] The following operation(s) need TFLite custom op implementation(s):
Custom ops: Conv2D
Details:
tf.Conv2D(tensor<1x32x32x64xf32>, tensor<5x5x32x32xf32>) -> (tensor<1x11x11x32xf32>) : {data_format = "NHWC", dilations = [1, 1, 1, 1], explicit_paddings = [], padding = "SAME", strides = [1, 3, 3, 1], use_cudnn_on_gpu = true}
See instructions: https://www.tensorflow.org/lite/guide/ops_custom
2022-07-26 13:06:17.469772: I external/org_tensorflow/tensorflow/compiler/mlir/lite/flatbuffer_export.cc:1963] Estimated count of arithmetic ops: 5792 ops, equivalently 2896 MACs

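A minimal sketch that reproduces this warning on 0.7.0 (the shapes are chosen to match the log above, and lce.convert_keras_model is the standard converter entry point):

import tensorflow as tf
import larq_compute_engine as lce

# Plain float Conv2D with groups=2; shapes match the log above.
model = tf.keras.Sequential([
    tf.keras.Input((32, 32, 64)),
    tf.keras.layers.Conv2D(32, 5, strides=3, padding="same", groups=2),
])

# On LCE 0.7.0 (TF 2.8) this emits the custom-op warning above;
# on 0.6.2 the grouped convolution converts to a builtin op.
flatbuffer = lce.convert_keras_model(model)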

Small quantizer example (2 QuantConv2D layers):

Example with ste_sign as the weight (kernel) quantizer:

2022-07-26 13:14:57.680246: I external/org_tensorflow/tensorflow/compiler/mlir/lite/flatbuffer_export.cc:1963] Estimated count of arithmetic ops: 1.164 M ops, equivalently 0.582 M MACs


Changing to DoReFa mode="weights":

2022-07-26 13:16:05.771057: I external/org_tensorflow/tensorflow/compiler/mlir/lite/flatbuffer_export.cc:1963] Estimated count of arithmetic ops: 1.663 M ops, equivalently 0.831 M MACs

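For cross-checking these converter estimates, larq's own profiler can print per-layer statistics from the Keras side; a sketch, using a hypothetical 2-layer stand-in for the model being compared:

import larq as lq
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.Input((32, 32, 3)),
    lq.layers.QuantConv2D(64, 3, padding="same",
                          kernel_quantizer="ste_sign",
                          kernel_constraint="weight_clip",
                          use_bias=False),
    lq.layers.QuantConv2D(64, 3, padding="same",
                          input_quantizer="ste_sign",
                          kernel_quantizer="ste_sign",
                          kernel_constraint="weight_clip",
                          use_bias=False),
])

# Prints a per-layer profile (parameter counts, memory and, in recent
# larq versions, MAC counts) independent of the LCE converter's estimate.
lq.models.summary(model)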

I was able to successfully benchmark my model (DoReFa quantizer plus grouped convolutions) when converted with version 0.6.2, with better-than-expected efficiency, but not the one converted with version 0.7.0.
I am using TensorFlow 2.8.0 and larq 0.12.2.

@lgeiger
Member

lgeiger commented Jul 26, 2022

Sorry for the late reply.

I noticed an issue only with the latest version (0.7.0), not the one before (0.6.2):
Grouped convolutions (FP or binary) are converted as custom ops in the latest version.

Unfortunately this was an issue with TensorFlow 2.8, which LCE 0.7.0 uses under the hood. This has been fixed on master since we upgraded to 2.9, but we haven't published a new release with it yet. Sorry about that. For now, I'd recommend sticking with 0.6.2 if grouped convolution support is required.

Is the "ste_sign" quantizer the only viable option for efficient inference?

For binarised convolutions, this is the recommended activation quantiser. You can also use custom activation quantisers, but to make sure they convert correctly they should be implemented with larq.math.sign, which unfortunately is not the case for DoReFa. Regarding weight quantisation, other quantisers should work fine as long as they binarise to {-1, 1} or {-alpha, alpha}.
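
To illustrate, a minimal sketch of a custom activation quantiser built on larq.math.sign (the clipped straight-through gradient below is just one common choice, not a prescription):

import tensorflow as tf
import larq as lq

def custom_sign(x):
    # Binary quantizer built on larq.math.sign so LCE can pattern-match it.
    @tf.custom_gradient
    def _quantize(x):
        def grad(dy):
            # Straight-through estimator: pass gradients where |x| <= 1.
            return tf.where(tf.abs(x) <= 1.0, dy, tf.zeros_like(dy))
        return lq.math.sign(x), grad
    return _quantize(x)

layer = lq.layers.QuantConv2D(
    32, 3,
    input_quantizer=custom_sign,   # activations binarised via lq.math.sign
    kernel_quantizer="ste_sign",
    kernel_constraint="weight_clip",
)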

I recommend looking at the converted model in Netron to make sure the conversion worked as intended.
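
E.g., write the converted flatbuffer to disk and open it in Netron (the model here is an arbitrary stand-in):

import tensorflow as tf
import larq_compute_engine as lce

# Any Keras model works; a trivial stand-in for illustration.
model = tf.keras.Sequential([tf.keras.Input((32, 32, 3)),
                             tf.keras.layers.Conv2D(8, 3)])

# Save the converted flatbuffer, then open model.tflite in Netron.
with open("model.tflite", "wb") as f:
    f.write(lce.convert_keras_model(model))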

@lgeiger
Member

lgeiger commented Aug 25, 2022

I noticed an issue only with the latest version (0.7.0), not the one before (0.6.2):
Grouped convolutions (FP or binary) are converted as custom ops in the latest version.

@lluevano sorry for the delay. We just released v0.8.0, which includes a fix for this. Let me know if that works for you.
