Extract conversations using regex
I have text like this:
[agent]:Welcome to ABC bank My name is Asif. How may I help you [cust]:I got additional charge in my credit card, I will not be paying this, please remove it [agent]:Okay can I place the call on hold [cust]:This is very unresponsive behaviour on banks side
The conversations are not line separated. I need to extract only what the customer said and ignore what the agent said, so that I can analyze customer sentiment. Please help with this regex.
Either:
\[cust\]:((?:(?!\[\w+\]:).)*)
or
(?s)\[cust\]:(.*?)(?=\[\w+\]:|$)
https://regex101.com/r/RT2O4y/1
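For illustration, here is a minimal sketch of how both patterns might be applied in Python. The question does not say which language is in use, so Python's `re` module is an assumption here, and `customer_turns` is a hypothetical helper name, not something from the original answer.

```python
import re

# Sample text from the question, kept on a single line as in the original.
text = (
    "[agent]:Welcome to ABC bank My name is Asif. How may I help you "
    "[cust]:I got additional charge in my credit card, I will not be paying this, please remove it "
    "[agent]:Okay can I place the call on hold "
    "[cust]:This is very unresponsive behaviour on banks side"
)

# Pattern 1: take one character at a time as long as no "[tag]:" marker starts there.
tempered = re.compile(r"\[cust\]:((?:(?!\[\w+\]:).)*)")

# Pattern 2: lazy match that stops just before the next "[tag]:" marker or at end of input.
lazy = re.compile(r"(?s)\[cust\]:(.*?)(?=\[\w+\]:|$)")

def customer_turns(pattern, s):
    """Return the customer's utterances (hypothetical helper), trimmed of surrounding spaces."""
    return [turn.strip() for turn in pattern.findall(s)]

for turn in customer_turns(lazy, text):
    print(turn)
```

Both patterns give the same two customer turns on this input. One difference to keep in mind: the first pattern has no `(?s)` flag, so its dot does not cross newlines, while the second one does.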
Benchmarks:
Regex1: \[cust\]:((?:(?!\[\w+\]:).)*)
Options: < none >
Completed iterations: 50 / 50 ( x 1000 )
Matches found per iteration: 2
Elapsed Time: 1.37 s, 1372.69 ms, 1372693 µs
Matches per sec: 72,849
Regex2: (?s)\[cust\]:(.*?)(?=\[\w+\]:|$)
Options: < none >
Completed iterations: 50 / 50 ( x 1000 )
Matches found per iteration: 2
Elapsed Time: 0.92 s, 918.17 ms, 918175 µs
Matches per sec: 108,911
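The figures above come from a separate benchmarking tool. As a rough sketch only, a similar comparison could be tried in Python with `timeit`; the choice of Python, the `time_pattern` helper, and the run count are all assumptions, and the timings will differ from those above because the regex engine and machine are different.

```python
import re
import timeit

sample = (
    "[agent]:Welcome to ABC bank My name is Asif. How may I help you "
    "[cust]:I got additional charge in my credit card, I will not be paying this, please remove it "
    "[agent]:Okay can I place the call on hold "
    "[cust]:This is very unresponsive behaviour on banks side"
)

patterns = {
    "Regex1": re.compile(r"\[cust\]:((?:(?!\[\w+\]:).)*)"),
    "Regex2": re.compile(r"(?s)\[cust\]:(.*?)(?=\[\w+\]:|$)"),
}

RUNS = 10000  # arbitrary run count chosen for this sketch

def time_pattern(pattern, s, number=RUNS):
    """Total seconds taken to run findall `number` times (hypothetical helper)."""
    return timeit.timeit(lambda: pattern.findall(s), number=number)

timings = {name: time_pattern(p, sample) for name, p in patterns.items()}
for name, seconds in timings.items():
    print(f"{name}: {seconds:.3f} s for {RUNS} runs")
```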