Can deep learning handle imbalanced data?
Yes. Class imbalance is a challenge for most machine learning models, but there are a number of broadly applicable techniques that can improve classification metrics like recall, F1, and ROC AUC. Note that nothing in the techniques listed below is algorithm-specific, so they can certainly be used to improve the results of deep learning models.
Resampling: oversampling the minority class, undersampling the majority class, or generating new synthetic samples with a method such as SMOTE.
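As a minimal sketch, here is what SMOTE looks like with the imbalanced-learn package (my choice of library here; any resampling implementation follows the same pattern):

```python
import numpy as np
from collections import Counter
from sklearn.datasets import make_classification
from imblearn.over_sampling import SMOTE

# Toy imbalanced dataset: roughly 5% positives.
X, y = make_classification(n_samples=2000, weights=[0.95, 0.05], random_state=0)
print(Counter(y))  # e.g. Counter({0: 1895, 1: 105})

# SMOTE synthesizes new minority samples by interpolating between
# a minority point and its nearest minority-class neighbors.
X_res, y_res = SMOTE(random_state=0).fit_resample(X, y)
print(Counter(y_res))  # classes are now balanced
```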
Weighting the cost function: you can assign weights to the class labels so that the cost function penalizes mistakes on certain classes more heavily. This can make the model adapt better to the characteristics of the minority class.
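Here is one way this might look in Keras, whose fit method accepts per-class weights; the tiny network and synthetic dataset are stand-ins, not a recommendation:

```python
import numpy as np
import tensorflow as tf
from sklearn.datasets import make_classification
from sklearn.utils.class_weight import compute_class_weight

X, y = make_classification(n_samples=2000, weights=[0.95, 0.05], random_state=0)

# "balanced" assigns each class a weight inversely proportional to
# its frequency, so errors on the rare class cost the loss more.
classes = np.unique(y)
w = compute_class_weight(class_weight="balanced", classes=classes, y=y)
class_weight = {int(c): float(v) for c, v in zip(classes, w)}

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(X.shape[1],)),
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy")
model.fit(X, y, epochs=5, class_weight=class_weight, verbose=0)
```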
Adjusting the decision threshold: once you have predicted probabilities, rather than using the default 0.5 cutoff to assign predicted labels, you can try lowering the threshold to improve recall on the minority class or to maximize F1.
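A sketch of threshold tuning on a held-out validation set; logistic regression stands in for whatever probabilistic model you actually use:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import precision_recall_curve
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, weights=[0.95, 0.05], random_state=0)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, stratify=y, random_state=0)

probs = LogisticRegression(max_iter=1000).fit(X_tr, y_tr).predict_proba(X_val)[:, 1]

# Sweep all candidate thresholds and pick the F1 maximizer
# instead of defaulting to 0.5.
precision, recall, thresholds = precision_recall_curve(y_val, probs)
f1 = 2 * precision * recall / (precision + recall + 1e-12)
best = thresholds[np.argmax(f1[:-1])]  # last P/R point has no threshold
y_pred = (probs >= best).astype(int)
print(f"best threshold: {best:.3f}")
```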
As with any other modeling choice, the use of these techniques, as well as their parameters, should be cross-validated.
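One subtlety worth a sketch: when cross-validating a resampling step, it must be applied only to the training folds, so the validation folds keep their true class distribution. imbalanced-learn's pipeline (assuming you use that package) handles this for you:

```python
from imblearn.over_sampling import SMOTE
from imblearn.pipeline import Pipeline
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score

X, y = make_classification(n_samples=2000, weights=[0.95, 0.05], random_state=0)

# The sampler runs inside each training fold only, so the
# reported scores are not optimistically biased.
pipe = Pipeline([
    ("smote", SMOTE(random_state=0)),
    ("clf", LogisticRegression(max_iter=1000)),
])
scores = cross_val_score(pipe, X, y, scoring="f1", cv=StratifiedKFold(5))
print(scores.mean())
```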
I would actually argue that in some domains deep learning is particularly well suited to imbalanced classes. For instance, with image data it is common to augment the dataset by performing rotations, flips, shears, etc., so there is a very natural way to add synthetic observations of the minority class.
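For example, a sketch using Keras's ImageDataGenerator (one long-standing augmentation utility; newer Keras versions prefer preprocessing layers, and the random array here is a stand-in for real minority-class images):

```python
import numpy as np
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Stand-in minority-class images: 50 RGB images of size 32x32.
X_min = np.random.rand(50, 32, 32, 3).astype("float32")

# Random rotations, shears, and flips yield plausible new
# minority-class examples without touching the majority class.
aug = ImageDataGenerator(rotation_range=20, shear_range=10.0,
                         horizontal_flip=True)

# Draw one augmented batch; loop to generate as many as needed,
# then append them to the training set.
extra_minority = next(aug.flow(X_min, batch_size=32, shuffle=True))
print(extra_minority.shape)  # (32, 32, 32, 3)
```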