How to get back the original position of a word in a preprocessed sentence using Python?
I take a sentence from the user and preprocess it on the backend, using regex to remove special characters. I then need to send back the position of a particular word so that it can be highlighted for the user. The problem is that the word's position in the preprocessed sentence differs from its position in the original sentence.
Is there a good way to solve this in Python?
For example:
import re

def text_preprocessing(input_text, string_to_find):
    print("Original text is:", input_text)
    # Replace every character outside the allowed set with a space
    cleaned_text = re.sub('[^a-zA-Z0-9#.+]', " ", input_text)
    # Collapse runs of spaces into a single space
    cleaned_text = re.sub(' +', " ", cleaned_text)
    print("preprocessed text is:", cleaned_text)
    # The position is relative to the cleaned text, not the original
    position = cleaned_text.find(string_to_find)
    position = [position, position + len(string_to_find)]
    return position

input_text = 'Hi! Hello'
string_to_find = 'Hello'
position = text_preprocessing(input_text, string_to_find)
print(position)
Actual Output
Original text is: Hi! Hello
preprocessed text is: Hi Hello
[3, 8]
Original sentence = 'Hi! Hello'
Preprocessed sentence = 'Hi Hello' (only the '!' symbol was removed)
To highlight the word "Hello", the backend returns the position (3, 8), but the actual position in the UI is (4, 9).
Expected Output
Original text is: Hi! Hello
preprocessed text is: Hi Hello
[4, 9]
OS: Windows 10, Python 3.7; regex is used for preprocessing.
The first character in a string is at position 0, so Hello is at position 3 in the string 'Hi Hello'.
H is at 0
i is at 1
(space) is at 2
H is at 3
e is at 4
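One possible way to recover the original position is to record, while cleaning, which index of the original text each kept character came from, and then translate the match back through that list. The sketch below is only an illustration under that assumption: the function names preprocess_with_offsets and find_original_position are hypothetical, and the cleaning rule is reimplemented character by character to mirror the two re.sub calls in the question.

```python
import re

def preprocess_with_offsets(text):
    """Clean `text` and record where each kept character came from."""
    cleaned_chars = []
    offsets = []  # offsets[i] = index in `text` of cleaned character i
    for index, char in enumerate(text):
        # Same rule as the question: anything outside the allowed set
        # becomes a space.
        if not re.match(r'[a-zA-Z0-9#.+]', char):
            char = ' '
        # Collapse runs of spaces by skipping a space that follows a space.
        if char == ' ' and cleaned_chars and cleaned_chars[-1] == ' ':
            continue
        cleaned_chars.append(char)
        offsets.append(index)
    return ''.join(cleaned_chars), offsets

def find_original_position(text, string_to_find):
    """Return [start, end) of the match in the ORIGINAL text, or None."""
    cleaned, offsets = preprocess_with_offsets(text)
    start = cleaned.find(string_to_find)
    if start == -1 or not string_to_find:
        return None
    last = start + len(string_to_find) - 1  # last matched cleaned index
    # Map both ends back to the original text; the end stays exclusive.
    return [offsets[start], offsets[last] + 1]

print(find_original_position('Hi! Hello', 'Hello'))  # [4, 9]
```

For 'Hi! Hello' the cleaned text is 'Hi Hello' and the offset list is [0, 1, 2, 4, 5, 6, 7, 8], so the match at cleaned position 3 maps back to position 4 in the original. If the search string itself spans removed characters, the returned range covers them in the original text as well.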