TypeError: train_test_split() got multiple values for argument 'test_size', but only when called inside a function

  classification, dataframe, python, xgboost

For a dataframe df_content that looks like this:

rated_object     feature_1    feature_2    feature_n    rating
o1               2.02         0            90.40        0
o2               3.70         1            NaN          1
o3               3.45         0            70.50        1
o4               7.90         1            40.30        0
...

I wrote the following function:

import xgboost as xgb
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier
from sklearn.metrics import accuracy_score
import pandas as pd

def predict_cn(df, rated_object):
    is_target = (df['rated_object'] == rated_object)
    target = df[is_target].iloc[0]
    cols_to_drop = ['rated_object'] 
    df.drop(cols_to_drop, axis=1, inplace=True)
    X = df.drop('rating', axis=1)  
    y = df['rating'] 
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.20, random_state=5)
    model = XGBClassifier() 
    model.fit(X_train, y_train)
    prediction = model.predict(target['rated_object'], verbose=False)
    return prediction

However, calling predict_cn(df_content, 'o3') raises this error:

TypeError                                 Traceback (most recent call last)
<ipython-input-10-08dcbb77df37> in <module>
----> 1 predict_cn(df_content, 'o3')

<ipython-input-9-18667675e17b> in predict_cn(df, rated_object)
      6     X = df.drop('rating', axis=1)
      7     y = df['rating']
----> 8     X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.20, random_state=5)
      9     model = XGBClassifier()
     10     model.fit(X_train, y_train)

TypeError: train_test_split() got multiple values for argument 'test_size'
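
Python raises this message when an argument passed by position has already filled the parameter named test_size and the call then supplies test_size= again. scikit-learn's train_test_split collects its positional arguments into *arrays, so the call shown should not trigger it. One possible explanation, and only an assumption since the other notebook cells are not shown, is that the name train_test_split has been rebound in the session to a different function. The sketch below uses a hypothetical user-defined helper to reproduce the same message:

```python
# Hypothetical helper, standing in for a function that an earlier notebook
# cell might have defined under the same name as scikit-learn's function.
def train_test_split(data, test_size=0.2, random_state=None):
    """Split a list into (train, test) by position; illustration only."""
    cut = int(len(data) * (1 - test_size))
    return data[:cut], data[cut:]


X = [[2.02, 0], [3.70, 1], [3.45, 0], [7.90, 1]]
y = [0, 1, 1, 0]

try:
    # y is passed by position and lands in the test_size parameter, so the
    # keyword test_size=0.20 becomes a second value for the same parameter.
    train_test_split(X, y, test_size=0.20, random_state=5)
except TypeError as err:
    message = str(err)
    print(message)
```

If this is the cause, printing train_test_split.__module__ in the notebook should show something other than a sklearn module, and re-running the import cell or restarting the kernel would restore the library function.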

I find this strange: the same call works when I run it outside the function on the whole dataframe, and it matches the syntax in the documentation. I am also unsure whether the rest of my code is correct for taking a rated_object as input and returning its predicted rating.
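
On the second point, a few things in the function would probably fail once the split works. df.drop(..., inplace=True) alters the caller's dataframe, so a second call can no longer find the rated_object column. target['rated_object'] is the object's identifier rather than its feature values. And verbose appears to be an option of fit() rather than predict() in XGBClassifier. A minimal sketch of one possible rewrite follows; the optional model parameter and the choice to keep the target row out of training are assumptions added for illustration, not part of the original code.

```python
import pandas as pd
from sklearn.model_selection import train_test_split


def predict_cn(df: pd.DataFrame, rated_object: str, model=None):
    """Train on every other row, then predict the rating of `rated_object`."""
    if model is None:
        # Imported lazily so any other fit/predict estimator can be passed in.
        from xgboost import XGBClassifier
        model = XGBClassifier()

    is_target = df["rated_object"] == rated_object
    if not is_target.any():
        raise ValueError(f"unknown rated_object: {rated_object!r}")

    # drop() returns a copy here, so the caller's dataframe is left unchanged.
    features = df.drop(columns=["rated_object", "rating"])
    target_X = features[is_target].iloc[[0]]  # one-row DataFrame, not a Series
    X = features[~is_target]                  # keep the target out of training
    y = df.loc[~is_target, "rating"]

    # X_test and y_test are held out but not used in this sketch; they could
    # feed accuracy_score to check the model before trusting the prediction.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.20, random_state=5
    )
    model.fit(X_train, y_train)
    return model.predict(target_X)[0]
```

Called as predict_cn(df_content, 'o3'), this would return a single predicted label and could be called repeatedly on the same dataframe.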

Source: Python Questions
