I have a data frame df for a classification problem with 3 classes. Most of the columns are categorical, and the dataset is imbalanced. I am trying to generate a synthetic dataset that replicates the characteristics and features of the original data frame.
Q1. Can scikit-learn be used to generate synthetic data to balance the imbalanced dataset?
Q2. Is datasets.make_classification used only for random data generation, rather than for reproducing data similar to an existing dataset?
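To make the balancing goal concrete, here is a minimal sketch of random oversampling, one simple technique that works with categorical columns because it only resamples existing rows. It is not taken from scikit-learn; the function name random_oversample, the column names, and the toy rows are all hypothetical stand-ins for df.

```python
import random
from collections import Counter, defaultdict


def random_oversample(rows, label_key, seed=0):
    """Resample minority-class rows with replacement until every
    class has as many rows as the largest class."""
    rng = random.Random(seed)

    # Group the rows by their class label.
    by_class = defaultdict(list)
    for row in rows:
        by_class[row[label_key]].append(row)

    # Every class is grown to the size of the largest one.
    target = max(len(group) for group in by_class.values())

    balanced = []
    for group in by_class.values():
        balanced.extend(group)
        extra = target - len(group)
        # Copy randomly chosen existing rows; no new values are invented.
        balanced.extend(dict(rng.choice(group)) for _ in range(extra))
    return balanced


# Hypothetical toy data: 3 classes, categorical features, imbalanced 6/3/1.
rows = (
    [{"color": "red", "size": "S", "label": "a"}] * 6
    + [{"color": "blue", "size": "M", "label": "b"}] * 3
    + [{"color": "green", "size": "L", "label": "c"}] * 1
)

balanced = random_oversample(rows, label_key="label")
counts = Counter(row["label"] for row in balanced)
print(counts)
```

This only duplicates rows that already exist, so it balances the class counts without creating genuinely new samples. Generating new rows that mix categorical and numeric features is what SMOTENC in the separate imbalanced-learn package is meant for, which may be worth checking against the real data.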