
Split a sentence into words in pandas and keep tags

I have a Pandas dataframe like this:

Text                  label      value
board members         A1          NaN
a really long sent    A2          B2

Result: I would like to unnest the sentences into one row per word and keep the label and value on each row, like this:

Sentence    Text         label      value
   1        board          A1        NaN
   1        members        A1        NaN
   2          a            A2        B2
   2        really         A2        B2 
   2         long          A2        B2 
   2         sent          A2        B2

Extra: If possible, I would like to extract a POS (Part of Speech) tag for each word into a new column:

Sentence    Text         label      value    POS
   1        board          A1        NaN     Something
   1        members        A1        NaN     Something
   2          a            A2        B2      Something
   2        really         A2        B2      etc
   2         long          A2        B2 
   2         sent          A2        B2

You can convert each Text value to a list of words and then explode it:

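# Split each sentence on whitespace into a list of words
df['Text'] = df['Text'].str.split()
# Give each word its own row; label and value are repeated for every word
df = df.explode("Text")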

print(df)

      Text label value
0    board    A1   NaN
0  members    A1   NaN
1        a    A2    B2
1   really    A2    B2
1     long    A2    B2
1     sent    A2    B2
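
The output above keeps the original row index (0, 0, 1, 1, ...) and has neither the Sentence column nor the POS column from the question. Below is a minimal sketch of one way to add both. The helper names explode_words and add_pos and the tagger parameter are hypothetical and not part of the original answer. The tagger is assumed to be any callable that takes a list of words and returns a list of (word, tag) pairs.

```python
import pandas as pd

# Sample data mirroring the frame in the question.
df = pd.DataFrame(
    {
        "Text": ["board members", "a really long sent"],
        "label": ["A1", "A2"],
        "value": [float("nan"), "B2"],
    }
)


def explode_words(frame, text_col="Text"):
    """Return one row per whitespace-separated word, plus a 1-based Sentence id."""
    out = frame.reset_index(drop=True).copy()
    # Number the sentences before exploding so every word keeps its sentence id.
    out.insert(0, "Sentence", range(1, len(out) + 1))
    out[text_col] = out[text_col].str.split()
    return out.explode(text_col).reset_index(drop=True)


def add_pos(frame, tagger, text_col="Text"):
    """Add a POS column by tagging the words of each sentence together.

    `tagger` is an assumed callable: list of words -> list of (word, tag) pairs.
    The frame is expected to have a unique index, as explode_words produces.
    """
    out = frame.copy()
    pos = pd.Series(index=out.index, dtype=object)
    for _, words in out.groupby("Sentence")[text_col]:
        # Tag a whole sentence at once so the tagger can use its context.
        pos.loc[words.index] = [tag for _, tag in tagger(list(words))]
    out["POS"] = pos
    return out


words = explode_words(df)
print(words)
```

If NLTK and its tagger data are installed (an assumption, since the answer does not mention a tagging library), add_pos(words, nltk.pos_tag) should fill the POS column, because nltk.pos_tag takes a list of tokens and returns (word, tag) pairs. Tagging per sentence rather than per word is a design choice here: taggers generally use the neighbouring words as context.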
