Are the "efficient-kan" and "official-kan" equivalent in terms of algorithms? #18
Comments
As far as I know, they are almost the same; only the official version appears to have an additional bias after each layer. I am also not sure whether the initialization is the same. In addition, the regularization loss is changed because of optimizations.
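A minimal sketch of that bias difference, assuming a generic layer interface (the `transform` below is a placeholder for the real base + spline machinery, not code from either repo):

```python
import torch
import torch.nn as nn

class KANLayerSketch(nn.Module):
    """Toy stand-in for a KAN layer; only the bias handling is the point.
    The official pykan applies a learnable bias after each layer, which
    efficient-kan (at the time of this thread) omitted."""

    def __init__(self, in_features, out_features, use_bias=True):
        super().__init__()
        # Placeholder for the base + spline transform of a real KAN layer.
        self.transform = nn.Linear(in_features, out_features, bias=False)
        self.bias = nn.Parameter(torch.zeros(out_features)) if use_bias else None

    def forward(self, x):
        y = self.transform(x)   # stands in for base(x) + spline(x)
        if self.bias is not None:
            y = y + self.bias   # the extra per-layer bias in official pykan
        return y
```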
@Indoxer Thanks, you are so kind.
No, I'm not quite sure. *Including the use of the official LBFGS training strategy.
I think this is acceptable; after all, the model is very efficient, and some loss is normal. It would be strange if there were no loss at all. It effectively retains the characteristics of the official model while also incorporating training optimizations.
@WhatMelonGua, are you sure that you didn't train spline_scaler and base_weights? Also, did you use the same parameters in the LBFGS optimizer (number of steps, etc.)?
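For anyone reproducing the comparison, a check along these lines can confirm what is actually being trained, followed by a closure-style LBFGS step. The model is a stand-in and the hyperparameter values are placeholders, not the official settings; they would need to match the official run for a fair comparison:

```python
import torch
import torch.nn as nn

model = nn.Linear(4, 4)  # stand-in for a KAN model

# List every parameter and whether it receives gradients; in efficient-kan
# the names to look for would be spline_scaler and base_weight (per the thread).
for name, p in model.named_parameters():
    print(name, "requires_grad =", p.requires_grad)

# LBFGS needs a closure that re-evaluates the loss; lr, max_iter,
# history_size, and line_search_fn all affect the result.
optimizer = torch.optim.LBFGS(model.parameters(), lr=1.0, max_iter=20,
                              history_size=10, line_search_fn="strong_wolfe")

x, y = torch.randn(8, 4), torch.randn(8, 4)

def closure():
    optimizer.zero_grad()
    loss = nn.functional.mse_loss(model(x), y)
    loss.backward()
    return loss

optimizer.step(closure)
```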
Oh, yes, forgive me for forgetting.
reg_ is the regularization loss.
Here are my results and code, so you can compare.
AFAIK the only difference is that the "efficient" regularization loss is different from the official one. But I'm not sure if the parallel associativity will introduce numerical error that's large enough to break some important features. |
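A hedged sketch of that substitute regularization (an L1 term over the spline coefficients plus an entropy term over their normalized magnitudes, in place of the official L1 over input activations); the tensor shape and epsilon handling here are assumptions, so check the repo for the exact code:

```python
import torch

def efficient_style_regularization(spline_weight,
                                   regularize_activation=1.0,
                                   regularize_entropy=1.0):
    # spline_weight: (out_features, in_features, n_coeffs), assumed shape.
    l1_per_edge = spline_weight.abs().mean(-1)    # mean |coeff| per edge
    reg_l1 = l1_per_edge.sum()                    # L1 surrogate
    p = l1_per_edge / (reg_l1 + 1e-12)            # normalized edge importances
    reg_entropy = -(p * (p + 1e-12).log()).sum()  # entropy of importances
    return regularize_activation * reg_l1 + regularize_entropy * reg_entropy

# Example:
w = torch.randn(5, 3, 8)
print(efficient_style_regularization(w))
```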
Just found that I missed the bias term after each layer. Will update that soon. I scanned over this long thread a few days ago and totally missed the comment by @Indoxer lol
This issue was moved to a discussion.
You can continue the conversation there. Go to discussion →