
Are the "efficient-kan" and "official-kan" equivalent in terms of algorithms? #18

Closed
yuedajiong opened this issue May 11, 2024 · 12 comments

Comments

@yuedajiong

As the title asks.

@Indoxer

Indoxer commented May 12, 2024

As far as I know they are almost the same; only the official version appears to have an additional bias after each layer. Also, I am not sure whether the initialization is the same. In addition, the regularization loss is changed because of the optimizations.
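To make the bias difference concrete, here is a minimal, hypothetical PyTorch sketch (not code from either repository) of a KAN-style layer whose output is the sum of a base branch and a spline branch, plus the extra per-layer bias that the official version is described as having:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyKANLayer(nn.Module):
    """Illustrative only: a stand-in layer showing where the extra bias would sit."""

    def __init__(self, in_features, out_features):
        super().__init__()
        self.base_weight = nn.Parameter(torch.randn(out_features, in_features) * 0.1)
        # Placeholder for the B-spline branch; the real layers evaluate learned splines here.
        self.spline_branch = nn.Linear(in_features, out_features, bias=False)
        # The additional per-layer bias the official implementation applies.
        self.bias = nn.Parameter(torch.zeros(out_features))

    def forward(self, x):
        base = F.linear(F.silu(x), self.base_weight)   # base activation branch
        spline = self.spline_branch(x)                 # spline branch (placeholder)
        return base + spline + self.bias               # drop `+ self.bias` to mimic the pre-fix efficient-kan
```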

@yuedajiong
Author

@Indoxer Thanks, you are so kind.

@WhatMelonGua

No, I'm not quite sure.
I tried the official tutorial at the following link: Tutorial

*Including the use of the official LBFGS training strategy.
The results showed that after completing all the training in a single run, the model was almost identical to the official one.
But if training is conducted in phases, it cannot be fitted perfectly (the model is still effective, just slightly underperforming).
[figure: official KAN result]
[figure: Eff-KAN result]
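For readers trying to reproduce this comparison, the sketch below illustrates, under assumed placeholder names (model, x_train, y_train), what "one-time" training versus phased training with torch.optim.LBFGS could look like; the chunking is only one possible reading of "training in phases" and is not the official tutorial code.

```python
import torch

def fit_lbfgs(model, x, y, steps=20):
    """Run a plain LBFGS fit on (x, y) with an MSE objective."""
    opt = torch.optim.LBFGS(model.parameters(), lr=1.0, max_iter=20,
                            history_size=10, line_search_fn="strong_wolfe")
    loss_fn = torch.nn.MSELoss()
    for _ in range(steps):
        def closure():
            opt.zero_grad()
            loss = loss_fn(model(x), y)
            loss.backward()
            return loss
        opt.step(closure)

# One-time training: a single LBFGS run over the whole dataset.
#   fit_lbfgs(model, x_train, y_train)
#
# Phased training: several shorter runs, e.g. over successive data chunks,
# which is where the thread reports a slightly worse fit.
#   for x_chunk, y_chunk in zip(x_train.chunk(4), y_train.chunk(4)):
#       fit_lbfgs(model, x_chunk, y_chunk, steps=5)
```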

@WhatMelonGua

WhatMelonGua commented May 13, 2024

I think this is acceptable; after all, the model is very efficient, and some loss is normal. It would be strange if there were no loss at all. It effectively retains the characteristics of the official model while also incorporating training optimizations.

@Indoxer

Indoxer commented May 13, 2024

@WhatMelonGua, are you sure that you didn't train spline_scaler and base_weights? Also, did you use the same parameters in the LBFGS optimizer (number of steps, etc.)?
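For reference, freezing those two parameter groups might look like the sketch below; the substrings spline_scaler and base_weight follow the names used in this thread, and the actual attribute names may differ between implementations.

```python
import torch

def freeze_spline_scaler_and_base(model):
    """Disable gradients for parameters whose names match the thread's wording."""
    frozen = []
    for name, param in model.named_parameters():
        if "spline_scaler" in name or "base_weight" in name:
            param.requires_grad = False
            frozen.append(name)
    return frozen

# Build the optimizer only over the parameters that remain trainable, e.g.:
#   trainable = [p for p in model.parameters() if p.requires_grad]
#   optimizer = torch.optim.LBFGS(trainable, lr=1.0)
```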

@Indoxer

Indoxer commented May 13, 2024

[figure: spline_scaler not trained, base_weights not trained]
[figure: spline_scaler trained, base_weights trained]

(I am using my modified version (but the same algorithm as efficient kan), so I am not sure)

@WhatMelonGua

@WhatMelonGua, are you sure that you didn't train spline_scaler and base_weights? Also, did you use the same parameters in the LBFGS optimizer (number of steps, etc.)?

Oh, yes, forgive me for forgetting.
There are no such parameters, so for that reg_ variable (I don't know what it is) I simply took the default value of 1 and fixed many errors (perhaps I was fixing it blindly, just making it work).
The result was that the official "LBFGS" training routine cannot be migrated here directly.

@WhatMelonGua

[figure: spline_scaler not trained, base_weights not trained]
[figure: spline_scaler trained, base_weights trained]

(I am using my modified version (but the same algorithm as efficient kan), so I am not sure)

It seems our approaches are similar.
What a coincidence! 🤗

@Indoxer

Indoxer commented May 13, 2024

@WhatMelonGua, are you sure that you didn't train spline_scaler and base_weights? Also, did you use the same parameters in the LBFGS optimizer (number of steps, etc.)?

Oh, yes, forgive me for forgetting. There are no such parameters, so for that reg_ variable (I don't know what it is) I simply took the default value of 1 and fixed many errors (perhaps I was fixing it blindly, just making it work). The result was that the official "LBFGS" training routine cannot be migrated here directly.

reg_ is the regularization loss: loss = train_loss + lamb * reg_. For continual learning, lamb = 0.0, so loss = train_loss.
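A minimal sketch of that objective, assuming a model that exposes a regularization_loss() method in the style of efficient-kan (adapt the call to whatever your implementation provides):

```python
import torch
import torch.nn.functional as F

def total_loss(model, pred, target, lamb=0.0):
    """loss = train_loss + lamb * reg_; with lamb = 0.0 it reduces to the plain training loss."""
    train_loss = F.mse_loss(pred, target)
    reg_ = model.regularization_loss() if lamb != 0.0 else pred.new_zeros(())
    return train_loss + lamb * reg_
```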

@Indoxer

Indoxer commented May 13, 2024

Here are my results and code, so you can compare

@Blealtan
Owner

AFAIK the only difference is that the "efficient" regularization loss is different from the official one. But I'm not sure whether the parallel associativity will introduce numerical error large enough to break some important features.
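For context, here is a rough, illustrative sketch of the two regularization styles being contrasted: the official KAN penalizes the mean absolute spline activation over the input samples (plus an entropy term), while efficient-kan approximates this with an L1/entropy penalty on the spline coefficients themselves, avoiding the need to materialize per-sample activations. Tensor shapes and names below are assumptions, not library code.

```python
import torch

def activation_based_reg(spline_acts):
    """Official-style: spline_acts has shape (batch, out_features, in_features)."""
    l1 = spline_acts.abs().mean(dim=0)          # mean |activation| per edge
    p = l1 / (l1.sum() + 1e-12)
    entropy = -(p * (p + 1e-12).log()).sum()
    return l1.sum() + entropy

def weight_based_reg(spline_weight):
    """Efficient-kan-style approximation: spline_weight has shape (out_features, in_features, n_coeffs)."""
    l1 = spline_weight.abs().mean(dim=-1)       # mean |coefficient| per edge
    p = l1 / (l1.sum() + 1e-12)
    entropy = -(p * (p + 1e-12).log()).sum()
    return l1.sum() + entropy
```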

@Blealtan
Copy link
Owner

Blealtan commented May 20, 2024

Just found that I missed the bias term after each layer. Will update that soon.

I scanned over this long thread a few days ago and totally missed the comment by @Indoxer lol

Repository owner locked and limited conversation to collaborators May 20, 2024
@Blealtan Blealtan converted this issue into discussion #35 May 20, 2024

This issue was moved to a discussion. You can continue the conversation there.
