Preprocessing data in Multi-label classification Python

Question

My dataset structure:

Text: 'Good service, nice view, location'
Tag: '{SERVICE#GENERAL, positive}, {HOTEL#GENERAL, positive}, {LOCATION#GENERAL, positive}'

The problem is that I don't know how to structure my data frame. Any recommendations would be much appreciated. Thank you.

Answer 1

Separate the adjectives (good, bad, etc.) from the hotel attributes (service, view, location). You can start by creating a custom dictionary, then automatically detect new words and use them as categories. You could use named entity recognition (NER) to do so; here are some articles:

https://towardsdatascience.com/named-entity-recognition-with-nltk-and-spacy-8c4a7d88e7da
https://towardsdatascience.com/a-review-of-named-entity-recognition-ner-using-automatic-summarization-of-resumes-5248a75de175

Personally, I have used the Stanford one, and it is pretty cool.
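
As for structuring the data frame itself, one possible layout is a text column plus one 0/1 column per label. The sketch below is only an illustration under a few assumptions: the Tag column always follows the `{ASPECT#CATEGORY, polarity}` pattern shown in the question, each label is named `ASPECT#CATEGORY:polarity` (a naming choice made up here), and the helper names `parse_tags` and `to_multi_hot_rows` are hypothetical. The second sample row is invented purely to show a differing label.

```python
import re

# Matches one "{ASPECT#CATEGORY, polarity}" group, as in the question's Tag column.
TAG_PATTERN = re.compile(r"\{\s*([^,{}]+?)\s*,\s*([^,{}]+?)\s*\}")


def parse_tags(tag_string):
    """Return the (aspect, polarity) pairs found in a raw Tag string."""
    return [(aspect, polarity) for aspect, polarity in TAG_PATTERN.findall(tag_string)]


def to_multi_hot_rows(samples):
    """Turn (text, tag_string) pairs into rows with one 0/1 column per label.

    A label is the string "ASPECT#CATEGORY:polarity"; this naming is an assumption.
    """
    parsed = [(text, parse_tags(tags)) for text, tags in samples]
    # Collect every label seen anywhere in the data, in a stable order.
    labels = sorted({f"{a}:{p}" for _, pairs in parsed for a, p in pairs})
    rows = []
    for text, pairs in parsed:
        present = {f"{a}:{p}" for a, p in pairs}
        row = {"text": text}
        row.update({label: int(label in present) for label in labels})
        rows.append(row)
    return labels, rows


samples = [
    ("Good service, nice view, location",
     "{SERVICE#GENERAL, positive}, {HOTEL#GENERAL, positive}, {LOCATION#GENERAL, positive}"),
    # Invented row, for illustration only.
    ("Rude staff", "{SERVICE#GENERAL, negative}"),
]

labels, rows = to_multi_hot_rows(samples)
for row in rows:
    print(row)
```

If pandas is available, `pandas.DataFrame(rows)` should turn these rows into a frame where the label columns can be fed to a multi-label classifier as the target matrix. Whether to fold polarity into the label name, as done here, or to keep one column per aspect with a polarity value, depends on the model you plan to use.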

1 answers

solution1
0 2019-10-08 05:07:43