For a dataframe df_content that looks like this:
    rated_object  feature_1  feature_2  feature_n  rating
    o1                 2.02          0      90.40       0
    o2                 3.70          1        NaN       1
    o3                 3.45          0      70.50       1
    o4                 7.90          1      40.30       0
    ...
I wrote the following function:
    import xgboost as xgb
    from sklearn.model_selection import train_test_split
    from xgboost import XGBClassifier
    from sklearn.metrics import accuracy_score
    import pandas as pd

    def predict_cn(df, rated_object):
        is_target = (df['rated_object'] == rated_object)
        target = df[is_target].iloc
        cols_to_drop = ['rated_object']
        df.drop(cols_to_drop, axis=1, inplace=True)
        X = df.drop('rating', axis=1)
        y = df['rating']
        X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.20, random_state=5)
        model = XGBClassifier()
        model.fit(X_train, y_train)
        prediction=model.predict(target['rated_object'], verbose=False)
        return prediction
But calling it with an input like predict_cn(df_content, 'o3') gives me the error:
    TypeError                                 Traceback (most recent call last)
    <ipython-input-10-08dcbb77df37> in <module>
    ----> 1 predict_cn(df_content, 'o3')

    <ipython-input-9-18667675e17b> in predict_cn(df, rated_object)
          6     X = df.drop('rating', axis=1)
          7     y = df['rating']
    ----> 8     X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.20, random_state=5)
          9     model = XGBClassifier()
         10     model.fit(X_train, y_train)

    TypeError: train_test_split() got multiple values for argument 'test_size'
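For reference, this kind of TypeError is raised by Python itself whenever one parameter receives a value both positionally and by keyword. The sketch below reproduces the same message with a hypothetical stand-in function that happens to be called train_test_split. It is not scikit-learn's function, and its signature is an assumption made purely for illustration.

```python
# Hypothetical stand-in, NOT scikit-learn's train_test_split: its second
# positional parameter is test_size, which is what triggers the error below.
def train_test_split(data, test_size=0.25, random_state=None):
    n_test = int(len(data) * test_size)
    return data[n_test:], data[:n_test]

X = [[1.0], [2.0], [3.0], [4.0], [5.0]]
y = [0, 1, 1, 0, 1]

try:
    # y is bound positionally to test_size, then test_size=0.20 binds it again
    train_test_split(X, y, test_size=0.20, random_state=5)
except TypeError as exc:
    message = str(exc)
    print(message)
```

scikit-learn's own train_test_split collects its positional inputs as *arrays, so passing X and y positionally should not produce this message there. One possible explanation, offered only as a guess, is that the name train_test_split was rebound in the notebook session to a different function; inspecting train_test_split.__module__ in that session may help confirm or rule this out.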
I find it strange because when I run such a model separately on the whole dataframe it works fine, and this is also the syntax from the documentation. I also don't know whether the rest of my code is correct if I want to input a rated_object and obtain its predicted rating.
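On the second part of the question, here is a minimal sketch of how the feature row for one rated_object might be selected without modifying the caller's dataframe. The helper name split_features_and_target is hypothetical, the toy data simply copies the four rows shown in the table, and model training is deliberately left out.

```python
import pandas as pd

# Toy data copied from the table in the question
df_content = pd.DataFrame({
    "rated_object": ["o1", "o2", "o3", "o4"],
    "feature_1": [2.02, 3.70, 3.45, 7.90],
    "feature_2": [0, 1, 0, 1],
    "feature_n": [90.40, float("nan"), 70.50, 40.30],
    "rating": [0, 1, 1, 0],
})

def split_features_and_target(df, rated_object):
    """Hypothetical helper: return (X, y, target_features) without mutating df."""
    is_target = df["rated_object"] == rated_object
    # drop() without inplace=True returns a new frame, so df keeps its columns
    X = df.drop(columns=["rated_object", "rating"])
    y = df["rating"]
    # The boolean mask yields a one-row DataFrame with the same columns as X
    target_features = X[is_target]
    return X, y, target_features

X, y, target_features = split_features_and_target(df_content, "o3")
print(target_features)
```

Under these assumptions, target_features has the same columns the model would be trained on, so it could be passed to model.predict in place of target['rated_object']. Avoiding inplace=True also means a second call on the same dataframe still finds the rated_object column.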