Category : classification

I have mixed data features (3 continuous, 2 binomial, and 3 ordinal categorical) and a binomial (1,0) target. I have label-encoded the ordinal categorical features (e.g. instead of very low/low/medium/high/very high there are 1,2,3,4,5 in a column). So it is a binomial classification problem. I have the following code: #Initialize a tree (Decision Tree with ..

I have a question on the parameters of the mutual_info_classif function as a feature selection method. https://scikit-learn.org/stable/modules/generated/sklearn.feature_selection.mutual_info_classif.html#sklearn.feature_selection.mutual_info_classif I have a binomial target Y which takes values (0,1) and X-features of mixed type. The features are binomial (0,1), continuous and categorical. Within categorical, I have ordinal features (e.g. low, medium, high encoded as 1,2,3) and non-ordinal (e.g. ..

I am looking for some robust classification/clustering models, e.g. decision trees, that would utilise hierarchical information present in the dataset. The dataset consists of unique rows (customer IDs) and purchased products (columns). The columns are 3-level and hierarchical, with the hierarchy being class – product – product type. An example being ‘Bedroom’ (class) – ‘Beds’ ..

I created a logistic regression model to predict customer subscription churn based on my company’s subscription data. For most, if not all, instances, the training data and the data overall consist of instances where people either: a. don’t churn and remain subscribers (0) or b. churn and cancel their subscription (1). There are ..
