
Dealing with highly imbalanced datasets using Tensorflow Dataset and Keras Tuner

Original post: 2020-10-12 09:31:11. Tags: python, tensorflow, keras, imbalanced-data, keras-tuner

I have a highly imbalanced dataset (3% Yes, 87% No) of text documents, each with a title and an abstract feature. I have converted these documents into tf.data.Dataset objects with padded batches, and I am now trying to train a deep learning model on them. model.fit() in TensorFlow has the class_weight parameter for dealing with class imbalance. However, I am searching for the best hyperparameters with the keras-tuner library, and its hyperparameter tuners do not seem to have such an option. I am therefore looking for other ways to deal with class imbalance.
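For reference, the dictionary that class_weight expects can be derived from the label counts. The sketch below uses the common "balanced" heuristic (n_samples / (n_classes * class_count)); the helper name and the toy 3-versus-97 label vector are illustrative assumptions, not taken from the post.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Return {class: weight} using n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


# Hypothetical labels: 3 positive ("Yes" = 1) documents out of 100.
labels = [1] * 3 + [0] * 97
class_weight = balanced_class_weights(labels)
print(class_weight)  # the rare class receives the larger weight
```

Keras Tuner's tuner.search() is documented as taking the same arguments as model.fit(), so passing class_weight=class_weight to search() may already be enough. This is worth verifying against the installed keras-tuner version, since it is not confirmed in the post.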

Is there an option to use class weights in keras-tuner? In addition, I am already using the precision@recall metric. I could also try a data resampling method, such as imblearn.over_sampling.SMOTE, but as this Kaggle post mentions:

It appears that SMOTE does not help improve the results. However, it makes the network learning faster. Moreover, there is one big problem, this method is not compatible larger datasets. You have to apply SMOTE on embedded sentences, which takes way too much memory.

2 answers

If you are looking for other methods to deal with imbalanced data, you may consider generating synthetic minority-class samples with SMOTE or ADASYN (both are available in the imbalanced-learn package). This usually works. I see you have already considered this as an option to explore.
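To make the idea concrete, here is a minimal NumPy sketch of SMOTE-style interpolation: each synthetic point lies on the line segment between a minority sample and one of its nearest minority neighbours. It is a simplified illustration, not the imbalanced-learn implementation, and the function name and toy data are assumptions.

```python
import numpy as np


def smote_like_oversample(X_min, n_new, k=5, seed=0):
    """Create n_new synthetic points by interpolating between minority
    samples and their k nearest minority-class neighbours."""
    rng = np.random.default_rng(seed)
    X_min = np.asarray(X_min, dtype=float)
    n = len(X_min)
    k = min(k, n - 1)  # cannot have more neighbours than other samples
    # Dense pairwise Euclidean distances between minority samples.
    dists = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=-1)
    np.fill_diagonal(dists, np.inf)  # a point is not its own neighbour
    neighbours = np.argsort(dists, axis=1)[:, :k]
    # Pick a random base sample and one of its neighbours for each new point.
    base = rng.integers(0, n, size=n_new)
    picked = neighbours[base, rng.integers(0, k, size=n_new)]
    gap = rng.random((n_new, 1))  # interpolation factor in [0, 1)
    return X_min[base] + gap * (X_min[picked] - X_min[base])


# Hypothetical 2-D minority-class features (e.g. tiny document embeddings).
X_minority = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
X_synthetic = smote_like_oversample(X_minority, n_new=6, k=2)
print(X_synthetic.shape)
```

Note that this sketch builds a dense pairwise distance matrix, so its memory use grows quadratically with the number of minority samples. For text it also has to run on numeric features such as embeddings rather than on raw token IDs.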

You could change the evaluation metric to the F-beta score (for example fbeta_score in scikit-learn), which is a weighted F-score.
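As a small illustration, the F-beta score combines precision P and recall R as (1 + beta^2) * P * R / (beta^2 * P + R), so beta > 1 weights recall more heavily. The pure-Python function below is a sketch for a binary problem; its name and the toy predictions are assumptions made for this example.

```python
def fbeta(y_true, y_pred, beta=1.0, positive=1):
    """F-beta score for the positive class of a binary problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical predictions: the model finds only 1 of the 3 positives.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0]
f1 = fbeta(y_true, y_pred, beta=1.0)  # precision 1.0, recall 1/3
f2 = fbeta(y_true, y_pred, beta=2.0)  # recall-heavy, so the score drops
print(f1, f2)
```

With a rare positive class, a recall-weighted choice such as beta = 2 penalises missed positives more strongly than plain F1 does.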

Or, if the dataset is large enough, you can try undersampling the majority class.
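Random undersampling can be sketched in a few lines: keep every sample of the rarest class and draw an equally sized random subset from each other class. The helper name and the toy documents below are hypothetical.

```python
import random
from collections import Counter, defaultdict


def random_undersample(samples, labels, seed=0):
    """Downsample every class to the size of the rarest class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for sample, label in zip(samples, labels):
        by_class[label].append(sample)
    n_keep = min(len(items) for items in by_class.values())
    kept = []
    for label, items in by_class.items():
        for sample in rng.sample(items, n_keep):
            kept.append((sample, label))
    rng.shuffle(kept)  # avoid leaving the classes in contiguous blocks
    return [s for s, _ in kept], [y for _, y in kept]


# Hypothetical corpus: 3 positive and 97 negative documents.
docs = [f"doc{i}" for i in range(100)]
labels = [1] * 3 + [0] * 97
docs_bal, labels_bal = random_undersample(docs, labels)
print(Counter(labels_bal))
```

Undersampling should be applied to the training split only, so that validation and test metrics still reflect the original class distribution.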



