How do I match text against the strings in a list and extract the sections in Python?

Question

I am trying to build a structure from earnings call transcripts similar to the following example:

"Operator

Ladies and gentlemen, thank you for standing by. And welcome to XYZ Fourth Quarter 2019 Earning Conference Call. At this time, all participants are in a listen-only mode. After the speaker presentation, there will be a question-and-answer session. [Operator Instructions] Please be advised that today’s conference is being recorded. [Operator Instructions]
I would now like to hand the conference to your speaker today,Person1, Head of Investor Relations. Please go ahead, ma’am**

Person1

Hello everyone, blablablablabla. Now let's see what Person2 has to say.

Person2

Thank you and hello everyone. Blablablabla

Person3

I have no further remarks....thank you once again"

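From this I generated a list, list1 = ['Person1','Person2','Person3']. I also created an empty DataFrame whose column names are Person1, Person2, and Person3. I now need to extract the text beneath Person1, Person2, and Person3, based on the values in the list, and use it to populate the DataFrame. Is that possible?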

Answer 1

text="""OperatorLadies and gentlemen, thank you for standing by. And welcome to XYZ Fourth Quarter 2019 Earning Conference Call. At this time, all participants are in a listen-only mode. After the speaker presentation, there will be a question-and-answer session. [Operator Instructions] Please be advised that today’s conference is being recorded. [Operator Instructions]I would now like to hand the conference to your speaker today,Person1, Head of Investor Relations. Please go ahead, ma’am**Person1Hello everyone, blablablablabla. Now let's see what Person2 has to say.Person2Thank you and hello everyone. BlablablablaPerson3I have no further remarks....thank you once again"""

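import pandas as pd

# 'Person1' and 'Person2' are each mentioned once in an earlier speech before
# their own header, so index [2] picks the text after the header occurrence.
# rsplit keeps the mention of Person2 inside Person1's own speech intact.
say1 = text.split('Person1')[2].rsplit('Person2', 1)[0]  # text of Person1
say2 = text.split('Person2')[2].split('Person3')[0]  # text of Person2
say3 = text.split('Person3')[1]  # text of Person3

# convert to a one-row DataFrame
pd.DataFrame({'Person1': say1, 'Person2': say2, 'Person3': say3}, index=[1])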
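The hard-coded split indices above depend on how often each name happens to be mentioned. A more general variant, offered only as a sketch, is to split on names that stand alone on a line. It assumes the transcript still has its line breaks (as in the question, not the flattened string above); the helper name split_by_headers and the shortened sample transcript are made up for illustration.

```python
import re

# Shortened sample based on the question; each speaker name sits on its own line.
transcript = """Operator

Ladies and gentlemen, welcome. I now hand over to Person1, Head of Investor Relations.

Person1

Hello everyone. Now let's see what Person2 has to say.

Person2

Thank you and hello everyone.

Person3

I have no further remarks."""

list1 = ['Person1', 'Person2', 'Person3']


def split_by_headers(text, names):
    """Return {name: section text} for names that appear alone on a line."""
    # Builds ^(Person1|Person2|Person3)[ \t]*$ ; re.escape guards special characters.
    pattern = r'^(' + '|'.join(re.escape(n) for n in names) + r')[ \t]*$'
    # With a capturing group, re.split keeps the matched headers in the result:
    # [text before first header, header1, body1, header2, body2, ...]
    parts = re.split(pattern, text, flags=re.MULTILINE)
    sections = {}
    for name, body in zip(parts[1::2], parts[2::2]):
        # If a speaker has several turns, join them with a newline.
        sections[name] = (sections.get(name, '') + '\n' + body.strip()).strip()
    return sections


sections = split_by_headers(transcript, list1)
```

Because the pattern is anchored to whole lines, a name mentioned in the middle of someone else's sentence is not treated as a header. The resulting dict can be turned into a one-row DataFrame with pd.DataFrame([sections], columns=list1).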

Answer 2

from collections import defaultdict

import pandas as pd

# Data is the transcript string and People is the list of speaker names;
# both are expected to be defined before this point.
data_list = Data.split("\n")
People_Names = [name.strip() for name in People]

# Collect the lines that follow each speaker header
data_dict = defaultdict(list)
key = None  # current speaker; stays None until the first header is seen
for line in data_list:
    if line.strip() in People_Names:
        key = line.strip()
    elif key is not None:
        data_dict[key].append(line)

# Join each speaker's lines into one string, dropping surrounding blank lines
df_dict = {key: "\n".join(val).strip() for key, val in data_dict.items()}

# DataFrame.append was removed in pandas 2.0, so build the single row directly
df = pd.DataFrame([df_dict], columns=People_Names)
#print(df)
df.to_csv("People_Data.csv")
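Both answers store one cell per speaker, so if a speaker talks more than once (as often happens in a question-and-answer session) the separate turns are merged. One possible alternative, sketched here as an assumption rather than as part of either answer, is to keep one record per turn in transcript order. The function name iter_turns and the sample string are hypothetical.

```python
def iter_turns(text, names):
    """Yield (speaker, text) for each turn, in transcript order."""
    wanted = {n.strip() for n in names}
    speaker, buffer = None, []
    for line in text.split("\n"):
        stripped = line.strip()
        if stripped in wanted:
            # A new header closes the previous turn, if there was one.
            if speaker is not None:
                yield speaker, "\n".join(buffer).strip()
            speaker, buffer = stripped, []
        elif speaker is not None:
            # Lines before the first header (e.g. the Operator) are skipped.
            buffer.append(line)
    # Emit the final turn once the text is exhausted.
    if speaker is not None:
        yield speaker, "\n".join(buffer).strip()


# Hypothetical sample in which Person1 speaks twice.
sample = "Operator\nWelcome.\nPerson1\nHello.\nPerson2\nThanks.\nPerson1\nOne more point."
turns = list(iter_turns(sample, ["Person1", "Person2"]))
```

If a table is needed, pd.DataFrame(turns, columns=["speaker", "text"]) gives one row per turn instead of one column per speaker.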

Answer 1
0 2020-02-21 18:47:38

Answer 2
0 Accepted 2020-02-28 05:37:19
