
Extract the index of a sentence where a keyword is matched using Python

I want to extract the index of the sentence in which a keyword is matched in a text, using Python regular expressions. The keyword is "I can help you with that", and the text data is:

keyword=["I can help you with that"]

str1=[nv707g]: Agent 'nv707g' enters chat (as Sandra) * [nv707g]: Hi. My name is Sandra. How can I help you? * [nv707g]: Sure, please don't worry. I can help you with that. *** [nv707g]: Can I have a contact number so that we can reach you.

str2=[ta250h]: Agent 'ta250h' enters chat (as Steve) * [ta250h]: Hi. My name is Steve. How can I help you? * [ta250h]: I can help you with that.

str3= * [virtualAssistant.nina]: Hmmm. Could you rephrase your question? Virtual Assistants understand simple questions best. [virtualAssistant.nina]: You will now be connected to a specialist for your issue. [sv0573]: Agent 'sv0573' enters chat (as Rosen) Agent 'virtualAssistant.nina' exits chat [sv0573]: Hello, my name is Rosen. With whom do I have the pleasure of speaking with today? [sv0573]: Hi, Jerone. [sv0573]: I am sorry to know that you have issues with the E-mail. * [sv0573]: I apologize for the inconvenience. I can help you with that. *** [sv0573]: Can I have a contact number so that we can reach you by phone or text with information about your AT&T services?

str4= [sm0036]: Agent 'sm0036' enters chat (as Sean) * [sm0036]: Hi. My name is Sean. How can I help you? [sm0036]: I can see you are typing I am waiting for your response. [sm0036]: I apologize for the inconvenience. I can help you with that. * [sm0036]: I'll find out what is happening and will help you resolve this.

Use a for loop over every string and extract the sentence index whenever the keyword is matched.

Thanks in advance.

Convert your conversations into lists by splitting the strings at the *, then check each element for the keyword and return the index of the element that contains it:

str1="[nv707g]: Agent 'nv707g' enters chat (as Sandra) * [nv707g]: Hi. My name is Sandra. How can I help you? * [nv707g]: Sure, please don't worry. I can help you with that. *** [nv707g]: Can I have a contact number so that we can reach you."

keyword = "I can help you with that"

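# Split the conversation into segments at each "*".
a = str1.strip().split('*')

def f(L, key_word):
    # Return the index of the first segment containing key_word (None if absent).
    for idx, segment in enumerate(L):
        if key_word in segment:
            return idx

print(f(a, keyword))

>>> 2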

The function returns None if the keyword is not in the conversation.

Edit: Since the * doesn't appear consistently in all strings to mark a new speaker, you should probably split your strings on "[" instead.

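def f_new(convo, key_word, splitter="["):
    # Split on the given splitter and drop empty segments
    # (e.g. the empty string before a leading "[").
    c = [e for e in convo.strip().split(splitter) if e != '']
    for idx, segment in enumerate(c):
        if key_word in segment:
            return idx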

The default splitter is now "[", but you can optionally change it when calling the function.
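To illustrate the optional splitter, here is a minimal self-contained sketch. The name `find_segment_index` is hypothetical (it is not from the answer), and unlike the function above it also drops whitespace-only segments, which is an assumption on my part about what is wanted when a conversation contains runs like `***`.

```python
def find_segment_index(convo, key_word, splitter="["):
    """Return the index of the first non-empty segment containing key_word, else None."""
    # Assumption: whitespace-only segments (e.g. between "***") are not counted.
    segments = [s for s in convo.strip().split(splitter) if s.strip()]
    for idx, segment in enumerate(segments):
        if key_word in segment:
            return idx
    return None


str2 = ("[ta250h]: Agent 'ta250h' enters chat (as Steve) * [ta250h]: Hi. "
        "My name is Steve. How can I help you? * [ta250h]: I can help you with that.")
keyword = "I can help you with that"

by_bracket = find_segment_index(str2, keyword)               # default: split on "["
by_star = find_segment_index(str2, keyword, splitter="*")    # override: split on "*"
missing = find_segment_index(str2, "not in the chat")        # keyword absent

print(by_bracket, by_star, missing)  # prints: 2 2 None
```

For this particular conversation both splitters happen to give the same index, because every speaker turn is preceded by both a "*" and a "[".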

As for your comment, here's a pointer: define all your strings cleanly and put them in a list:

convos = [str1, str2, str3, str4]

Then simply loop over them:

for i in convos: 
    print(f_new(i, keyword))
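The functions above return the index of a speaker turn, not of an individual sentence. Since the question asks for sentence indices via regular expressions, a rough sketch of that variant might look like the following. The names `SPLIT_PATTERN` and `sentence_indices` are hypothetical, and the sentence splitting is deliberately naive: it assumes a sentence ends at ".", "!" or "?" followed by whitespace, and that speaker tags like "[nv707g]:" and runs of "*" also act as boundaries.

```python
import re

# Boundaries (assumed): end-of-sentence punctuation followed by whitespace,
# runs of "*", or a speaker tag such as "[nv707g]:".
SPLIT_PATTERN = re.compile(r"(?<=[.!?])\s+|\s*\*+\s*|\s*\[[^\]]*\]:\s*")


def sentence_indices(text, key_word):
    """Return the indices of all sentences in text that contain key_word."""
    sentences = [s.strip() for s in SPLIT_PATTERN.split(text) if s.strip()]
    return [i for i, s in enumerate(sentences) if key_word in s]


str1 = ("[nv707g]: Agent 'nv707g' enters chat (as Sandra) * [nv707g]: Hi. "
        "My name is Sandra. How can I help you? * [nv707g]: Sure, please don't worry. "
        "I can help you with that. *** [nv707g]: Can I have a contact number "
        "so that we can reach you.")
str2 = ("[ta250h]: Agent 'ta250h' enters chat (as Steve) * [ta250h]: Hi. "
        "My name is Steve. How can I help you? * [ta250h]: I can help you with that.")
keyword = "I can help you with that"

for convo in [str1, str2]:
    print(sentence_indices(convo, keyword))
# prints [5] and then [4]
```

Returning a list rather than a single index covers conversations where the keyword appears more than once; an empty list means no sentence matched.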
