Category: xgboost

I have a text dump like the one below with 1000 trees, and I want to reconstruct the xgboost model from the dump and later test a new file on the reconstructed model.

    booster[0]:
    0:[sincelastrun<23.2917] yes=1,no=2,missing=2
      1:[sincelastrun<18.0417] yes=3,no=4,missing=4
        3:leaf=-0.0965415
        4:leaf=-0.0679503
      2:[sincelastrun<695.025] yes=5,no=6,missing=6
        5:leaf=-0.0992546
        6:leaf=-0.0984374
    ...
    booster[1000]:

Thanks for the help. Source: Python..

Read more
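A note on the question above: the text dump produced by Booster.dump_model() is meant for inspection and, as far as I know, cannot be loaded back into a booster. A minimal sketch of the supported round trip, assuming the original training step can be re-run and using made-up data and file names (the JSON format needs a reasonably recent xgboost):

    import numpy as np
    import pandas as pd
    import xgboost as xgb

    # Tiny synthetic stand-in for the real training data (assumption).
    X = pd.DataFrame({"sincelastrun": np.random.rand(200) * 1000})
    y = np.random.randint(0, 2, 200)
    dtrain = xgb.DMatrix(X, label=y)

    bst = xgb.train({"objective": "binary:logistic"}, dtrain, num_boost_round=10)
    bst.save_model("model.json")  # loadable format, usable for reconstruction
    bst.dump_model("dump.txt")    # human-readable dump only; not loadable

    # Later: reconstruct the booster and score new rows on it
    bst2 = xgb.Booster()
    bst2.load_model("model.json")
    preds = bst2.predict(xgb.DMatrix(X))  # replace X with the new file's frame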

I have trained an xgboost model with pre-processing + hyper-parameter tuning. I save the model using pickle.dump and then upload it to the ML engine. When I try to make a prediction / run inference, I get the following error / exception:

    version: python 3.6, xgboost 0.90

    TypeError('no supported conversion for types: %r' % (args,))
    TypeError: can ..

Read more
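For what it's worth, this kind of unpickling/prediction error often comes down to the serving environment not matching the training environment (xgboost and numpy versions). A minimal sketch, with a hypothetical model standing in for the tuned pipeline, of pickling plus the more portable save_model route:

    import pickle
    import numpy as np
    import xgboost as xgb

    # Hypothetical stand-in for the tuned model from the question.
    X = np.random.rand(200, 5).astype(np.float32)
    y = np.random.randint(0, 2, 200)
    model = xgb.XGBClassifier(n_estimators=20).fit(X, y)

    # Pickled models only unpickle cleanly when the serving side runs the
    # same xgboost (and numpy) versions as the training side.
    with open("model.pkl", "wb") as f:
        pickle.dump(model, f)

    # More portable option: save the underlying Booster in xgboost's own format.
    model.get_booster().save_model("model.bst")

    with open("model.pkl", "rb") as f:
        restored = pickle.load(f)
    print(restored.predict(X[:5]))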

I am using xgboost to make some predictions. We do some pre-processing and hyper-parameter tuning before fitting the model. While performing model diagnostics, we'd like to plot feature importances with feature names. Here are the steps we've taken.

    # split df into train and test
    X_train, X_test, y_train, y_test = train_test_split(df.iloc[:, 0:21], df.iloc[:, -1], test_size=0.2)
    X_train.shape
    (1671, 21)
    ..

Read more
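A common reason importances come back labelled f0, f1, ... is fitting on a NumPy array rather than on the DataFrame itself. A minimal sketch, with a made-up 21-column frame, of pairing feature_importances_ with the column names for plotting:

    import numpy as np
    import pandas as pd
    import matplotlib.pyplot as plt
    from xgboost import XGBClassifier

    # Hypothetical frame standing in for the question's df (21 features).
    rng = np.random.default_rng(0)
    df = pd.DataFrame(rng.random((100, 21)), columns=[f"feat_{i}" for i in range(21)])
    y = rng.integers(0, 2, 100)

    model = XGBClassifier(n_estimators=20).fit(df, y)

    # Pair importances with the DataFrame's column names and plot them sorted.
    imp = pd.Series(model.feature_importances_, index=df.columns).sort_values()
    imp.plot(kind="barh")
    plt.tight_layout()
    plt.show()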

I need to use an older version of the xgboost package for compatibility reasons. I am attempting to do that with pkg_resources:

    import pkg_resources
    pkg_resources.require("xgboost==0.90")
    import xgboost
    from xgboost import XGBRegressor

I get the following error:

    VersionConflict: (xgboost 1.5.0 (/Users/anaconda3/lib/python3.8/site-packages), Requirement.parse('xgboost==0.90'))

When I try:

    print(xgboost.__version__)
    1.5.0

I have tried to pip uninstall and then pip install ..

Read more
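One thing worth noting: pkg_resources.require() only checks that an already-installed distribution satisfies the requirement; it cannot swap 1.5.0 out for 0.90 at import time. The usual route is a separate environment pinned to the old version; a minimal sketch of the in-code side (the environment commands in the comments are illustrative only):

    # pkg_resources.require() cannot downgrade a package; create an isolated
    # environment with the pinned version instead, e.g. (shell, illustrative):
    #   python -m venv xgb090-env
    #   xgb090-env/bin/pip install "xgboost==0.90"
    # Inside that environment, a simple guard catches accidental version drift:
    import xgboost

    assert xgboost.__version__ == "0.90", (
        f"expected xgboost 0.90, found {xgboost.__version__}"
    )
    from xgboost import XGBRegressor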

For a dataframe df_content that looks like this:

    rated_object  feature_1  feature_2  feature_n  rating
    o1            2.02       0          90.40      0
    o2            3.70       1          NaN        1
    o3            3.45       0          70.50      1
    o4            7.90       1          40.30      0
    ...

I wrote the following function:

    import xgboost as xgb
    from sklearn.model_selection import train_test_split
    from xgboost import XGBClassifier
    from sklearn.metrics import accuracy_score
    ..

Read more
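In case it helps, a minimal sketch of such a function on a made-up df_content: xgboost handles the NaN natively, so rows with missing feature values can be kept, and the identifier column is dropped before fitting. All names here are hypothetical.

    import numpy as np
    import pandas as pd
    from xgboost import XGBClassifier
    from sklearn.model_selection import train_test_split
    from sklearn.metrics import accuracy_score

    # Hypothetical data mirroring the question's layout.
    df_content = pd.DataFrame({
        "rated_object": ["o1", "o2", "o3", "o4"],
        "feature_1": [2.02, 3.70, 3.45, 7.90],
        "feature_2": [0, 1, 0, 1],
        "feature_n": [90.40, np.nan, 70.50, 40.30],
        "rating": [0, 1, 1, 0],
    })

    def fit_rating_model(df):
        # Drop the identifier column; keep NaN as-is (xgboost treats it as missing).
        X = df.drop(columns=["rated_object", "rating"])
        y = df["rating"]
        X_train, X_test, y_train, y_test = train_test_split(
            X, y, test_size=0.5, random_state=0, stratify=y
        )
        model = XGBClassifier(n_estimators=10).fit(X_train, y_train)
        return model, accuracy_score(y_test, model.predict(X_test))

    model, acc = fit_rating_model(df_content)
    print(acc)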

    import pandas as pd
    from xgboost import XGBRegressor
    from sklearn.model_selection import train_test_split
    from sklearn.compose import ColumnTransformer
    from sklearn.preprocessing import OneHotEncoder

    # split df into train and test
    X_train, X_test, y_train, y_test = train_test_split(df.iloc[:, 0:21], df.iloc[:, -1], test_size=0.2, random_state=42)

    # check shape
    X_train.shape
    (1485, 21)
    X_test.shape
    (372, 21)

    # Encode categorical variables
    cat_vars = ['cat1', 'cat2', ...]
    cat_transform = ColumnTransformer([('cat', ..

Read more
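A minimal sketch of how that ColumnTransformer can be wired into a pipeline with XGBRegressor, using made-up columns: remainder="passthrough" keeps the non-categorical features, and handle_unknown="ignore" protects against categories seen only at prediction time.

    import numpy as np
    import pandas as pd
    from xgboost import XGBRegressor
    from sklearn.compose import ColumnTransformer
    from sklearn.preprocessing import OneHotEncoder
    from sklearn.pipeline import Pipeline
    from sklearn.model_selection import train_test_split

    # Hypothetical frame with two categorical and one numeric feature plus a target.
    rng = np.random.default_rng(0)
    df = pd.DataFrame({
        "cat1": rng.choice(["a", "b", "c"], 200),
        "cat2": rng.choice(["x", "y"], 200),
        "num1": rng.random(200),
        "target": rng.random(200),
    })

    cat_vars = ["cat1", "cat2"]
    pre = ColumnTransformer(
        [("cat", OneHotEncoder(handle_unknown="ignore"), cat_vars)],
        remainder="passthrough",  # keep the numeric columns untouched
    )
    model = Pipeline([("pre", pre), ("xgb", XGBRegressor(n_estimators=50))])

    X_train, X_test, y_train, y_test = train_test_split(
        df.drop(columns="target"), df["target"], test_size=0.2, random_state=42
    )
    model.fit(X_train, y_train)
    print(model.score(X_test, y_test))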

The docs say: Data Matrix used in XGBoost. DMatrix is an internal data structure that is used by XGBoost, which is optimized for both memory efficiency and training speed. You can construct DMatrix from multiple different sources of data. I get this bit, but what is the difference, and why use a DMatrix instead of a pandas DataFrame? Source: ..

Read more
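On the DMatrix question: the scikit-learn wrapper (XGBClassifier / XGBRegressor) accepts a pandas DataFrame directly and builds the DMatrix internally, while the native xgb.train API expects you to wrap the data yourself. A minimal sketch with made-up data:

    import numpy as np
    import pandas as pd
    import xgboost as xgb

    # Hypothetical frame standing in for the question's data.
    df = pd.DataFrame(np.random.rand(100, 3), columns=["f0", "f1", "f2"])
    y = np.random.randint(0, 2, 100)

    # Native API: wrap the frame in a DMatrix, xgboost's internal container
    # optimized for memory use and training speed (as the quoted docs say).
    dtrain = xgb.DMatrix(df, label=y)
    bst = xgb.train({"objective": "binary:logistic"}, dtrain, num_boost_round=10)

    # scikit-learn wrapper: pass the DataFrame directly; the internal
    # representation is built for you under the hood.
    clf = xgb.XGBClassifier(n_estimators=10).fit(df, y)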