How to transform multi-label to multi-class in Python?

Let's say I have the following samples, each with its own set of labels, where X1, X2, X3, X4, X5, X6 are samples and Y1, Y2, Y3, Y4 are labels:

X1 : {Y2, Y3}
X2 : {Y1}
X3 : {Y2}
X4 : {Y2, Y3}
X5 : {Y1, Y2, Y3, Y4}
X6 : {Y2}

How do I transform these into a single class label per sample, like this?

X1 : y1
X2 : y2
X3 : y3
X4 : y1
X5 : y4
X6 : y3

As I understand it, this is the transformation that the Label Powerset method performs. However, I do not want to classify using that method; I only want to convert the labels.

We have MultiLabelBinarizer to convert multi-label data to a binary indicator format, but it only produces 0s and 1s.

If you just want to map sequences of labels to a new label, you could convert those sequences to their string representation and use the LabelEncoder from sklearn:

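from sklearn import preprocessing

# Each label combination is a tuple of labels in a fixed order.
Y = [(1, 2), (1, 2, 3, 4), (1,)]

# Fit the encoder on the string form of each combination.
le = preprocessing.LabelEncoder()
le.fit([str(y) for y in Y])

# Classes are numbered in sorted string order, not in order of appearance.
le.transform([str((1,)), str((1, 2))])
# array([2, 0])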

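Be wary, though: the label encoder does not support any label combination it did not see during fitting, and transform raises a ValueError for one. This suggestion also assumes that the labels within each sequence appear in a consistent order and are not repeated; otherwise the same set of labels produces different strings and therefore different classes.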
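If the ordering and repetition assumptions are a concern, one possible alternative is to skip the string conversion and canonicalize each label set yourself. The following is only a minimal sketch in plain Python, not part of sklearn: the helper name labelsets_to_classes and the 1-based numbering by first appearance are assumptions chosen to mirror the example in the question.

```python
def labelsets_to_classes(label_sets):
    """Map each multi-label set to one integer class id (hypothetical helper).

    Ids are 1-based and assigned in order of first appearance.
    """
    class_of = {}   # canonical label set -> class id
    classes = []
    for labels in label_sets:
        # Sorting a de-duplicated copy makes the key independent of
        # label order and of repeated labels.
        key = tuple(sorted(set(labels)))
        if key not in class_of:
            class_of[key] = len(class_of) + 1
        classes.append(class_of[key])
    return classes, class_of


# The samples from the question.
samples = {
    "X1": ["Y2", "Y3"],
    "X2": ["Y1"],
    "X3": ["Y2"],
    "X4": ["Y2", "Y3"],
    "X5": ["Y1", "Y2", "Y3", "Y4"],
    "X6": ["Y2"],
}

classes, mapping = labelsets_to_classes(samples.values())
result = dict(zip(samples, classes))
print(result)
# {'X1': 1, 'X2': 2, 'X3': 3, 'X4': 1, 'X5': 4, 'X6': 3}
```

The returned mapping can be kept to translate class ids back to label sets. Combinations that first appear at prediction time would still need a policy of their own, for example a dedicated "unknown" class.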
