Splitting Bot records from Chatter records

Question

I have raw chatbot transcripts, and before doing any sentiment analysis I would like to separate the Bot records from the Chatter records.

The data is already in a dataframe and looks like the following:

Conversation_ID | Transcript
abcdef | BOT: Some text. CHATTER: Some text. BOT: Some text. BOT: Some text. CHATTER: Some text. BOT: Some text. BOT: Some text.

The result should look like:

Conversation_ID | Transcript_BOT | Transcript_CHATTER
abcdef | Some text. Some text. Some text. Some text. Some text. | Some text. Some text.

Answer 1

If I understand correctly,

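import pandas as pd

# Read the copied data from the clipboard; judging by the output below,
# this assumes one "SPEAKER: text" entry per line after the conversation ID.
df = pd.read_clipboard(sep='|')
df['Transcript'] = df['Conversation_ID'].str.split(':', expand=True)[1]  # text after the delimiter
df['Conversation_ID'] = df['Conversation_ID'].str.split(':', expand=True)[0]  # label before the delimiter
print(df)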
    Conversation_ID Transcript
0   abcdef  None
1   BOT Some text.
2   CHATTER Some text.
3   BOT Some text.
4   BOT Some text.
5   CHATTER Some text.
6   BOT Some text.
7   BOT Some text.

And for your intended result:

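# Pivot: the conversation ID (first row) becomes the row label, each speaker
# label becomes a column, and aggfunc='sum' concatenates that speaker's text.
new_df = pd.crosstab(
    df['Conversation_ID'].iloc[0],
    df['Conversation_ID'].iloc[1:],
    values=df['Transcript'],
    aggfunc='sum',
).rename_axis('')
print(new_df)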


Conversation_ID     BOT          CHATTER
abcdef              Some text. Some text. Some text. Some text. S...    Some text. Some text.
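
As an alternative that starts from the dataframe described in the question rather than from the clipboard, here is a minimal sketch using a regular expression. It is not part of the answer above. It assumes that the only speaker labels are `BOT:` and `CHATTER:` and that neither label appears inside the message text; the helper name `split_speakers` and the pattern `TURN` are made up for illustration.

```python
import re

import pandas as pd

# Sample data shaped like the dataframe in the question.
df = pd.DataFrame({
    "Conversation_ID": ["abcdef"],
    "Transcript": [
        "BOT: Some text. CHATTER: Some text. BOT: Some text. BOT: Some text. "
        "CHATTER: Some text. BOT: Some text. BOT: Some text."
    ],
})

# Capture a speaker label and its text, stopping at the next label or at the
# end of the string. Assumption: labels never occur inside a message.
TURN = re.compile(r"\b(BOT|CHATTER):\s*(.*?)\s*(?=\b(?:BOT|CHATTER):|$)", flags=re.S)


def split_speakers(transcript):
    """Return one joined string per speaker for a single transcript."""
    parts = {"BOT": [], "CHATTER": []}
    for speaker, text in TURN.findall(transcript):
        parts[speaker].append(text)
    return pd.Series({
        "Transcript_BOT": " ".join(parts["BOT"]),
        "Transcript_CHATTER": " ".join(parts["CHATTER"]),
    })


# Apply per row and attach the two new columns next to the conversation ID.
result = pd.concat(
    [df[["Conversation_ID"]], df["Transcript"].apply(split_speakers)],
    axis=1,
)
print(result)
```

This keeps one row per conversation, so the result can be fed directly into a per-speaker sentiment step. If real transcripts use other labels or contain the labels inside messages, the pattern would need adjusting.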

Solution 1
0 2019-06-12 21:33:14
