[python-package] How to refit a classifier? #6461

TheLegendAli · 2024-05-19T21:03:21Z

I have time-series data that I need to fit a classifier on, and I would like to retrain it every month as new data comes in. I would like to keep some consistency, so I would prefer to warm-start from the previous boosting trees and refit on the new data, or just run a few more iterations to get the results I need. In other words, I want a more or less similar tree structure.

It seems like the refit function is the best way to do this, but unfortunately it doesn't seem to be available in the scikit-learn API, only on Boosters. What is the best way to approach this? So far I have done the following:

import lightgbm as lgb

params = {
    'objective': 'multiclassova',
    'num_class': 6,
    'boosting_type': 'gbdt',
    'reg_alpha': 0.0,
    'reg_lambda': 0.0,
    'num_leaves': 103,
    'feature_fraction': 0.9311573062675359,
    'bagging_fraction': 0.9568372729883741,
    'bagging_freq': 1,
    'min_child_samples': 54,
    'learning_rate': 0.056834238901176865,
    'max_depth': 99,
    'min_data_in_leaf': 2,
    'min_gain_to_split': 2.272224629201629,
    'drop_rate': 0.6143619198733702,
    'n_estimators': 232,
    'force_col_wise': True,
    'class_weight': {
        1: 1.2466666666666666,
        5: 0.29088888888888886,
        3: 0.3177184466019417,
        0: 0.6233333333333333,
        4: 1.3357142857142859,
        2: 3.85,
    },
    'seed': 42,
    'random_state': 42,
    'verbose': -1,
    'eval_metric': 'precision',
}

classifier_obj = lgb.LGBMClassifier(**params)
classifier_obj.fit(X_train, y_train, categorical_feature=categorical_data + binary_data)

booster = lgb.train(params={}, train_set=data_set['train_data'], categorical_feature=categorical_data + binary_data, init_model=classifier_obj, keep_training_booster=True, num_boost_round=1)

booster.refit(data=X_train, label=y_train)

I'm sure this is wrong just from looking at the outputs. Can anyone point me in the right direction? Thanks.


jameslamb · 2024-05-26T22:28:40Z

Thanks for using LightGBM.

Is it absolutely necessary to "refit" (modify the values of the leaf nodes without changing the total number of trees)? Or would it be acceptable to add more trees, trained on the newly-arrived data? If you clarify that precisely, it would help us to offer some advice.

Please also see this explanation: https://stackoverflow.com/questions/73664093/lightgbm-train-vs-update-vs-refit/73669068#73669068

jameslamb · 2024-05-26T22:28:50Z

Also...I see that you double-posted this here and on Stack Overflow (link). Please do not do that.

Maintainers here also monitor the [lightgbm] tag on Stack Overflow. I could have been spending time preparing an answer here while another maintainer was spending time answering your Stack Overflow post, which would have been a waste of maintainers' limited attention that could otherwise have been spent improving this project. Double-posting also makes it less likely that others with a similar question will find the relevant discussion and answer.

TheLegendAli · 2024-05-26T22:39:38Z

Hi James, thanks for the response. It would be best if we can use the same exact "refit", if not I can use the update function?

Also, I posted on Stack Overflow only about 30 minutes ago. I waited a few days in case this got backlogged and would take longer than expected to get a response. Out of respect for you, I will link this issue on Stack Overflow. Thanks in advance.

jameslamb · 2024-05-26T22:46:16Z

> It would be best if we can use the same exact "refit", if not I can use the update function?

What is preventing you from using Booster.refit()? You showed an example using that and said "this is wrong by just looking at the outputs", but didn't share those outputs or explain what is "wrong" about them.

We would be happy to help but need your help to understand what specifically you are looking for.

jameslamb changed the title ~~How to refit a classifier?~~ [python-package] How to refit a classifier? May 20, 2024

jameslamb added the question label May 20, 2024

jameslamb added the awaiting response label May 26, 2024

github-actions bot removed the awaiting response label May 26, 2024
