
Adding text between 2 html tags

I am a second-year student and I am working on text mining.

In general terms, the code first accepts a PDF and converts its text into a doc.txt file. I then process that data over a couple of hundred lines, after which I store all sentences from that text in a list called all_text (for future use). I also select some of the sentences and store them in a list called summary.

The problem is in this part.

The summary list looks like this:

summary=['Artificial Intelligence (AI) is a science and a set of computational technologies that are inspired by—but typically operate quite differently from—the ways people use their nervous systems and bodies to sense, learn, reason, and take action.','In reality, AI is already changing our daily lives, almost entirely in ways that improve human health, safety,and productivity.','AI is also changing how people interact with technology.']

What I want is to read doc.txt sentence by sentence and, if a sentence is in the summary list, wrap it in a bold tag, <b>the sentence</b>, for every sentence in the summary list. Here is the small piece of code I tried for that specific part. It is not very helpful, but here it is:

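i = 0  # start index; lis and txt are defined earlier in the program
while i < len(lis):
    if lis[i] in txt:
        # wrap every exact occurrence of the sentence in a bold tag
        txt = txt.replace(lis[i], "<b>" + lis[i] + "</b>")
        print(lis[i])
    i += 1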

This code did not work as I expected. It works for some short sentences, but it doesn't work for sentences like the ones above. I have no idea why it's not working. Can you help me, please?

For that purpose you might use a list comprehension, for example:

summary = ['sentenceE','sentenceA']
text = ['sentenceA','sentenceB','sentenceC','sentenceD','sentenceE']
output = ['<b>'+i+'</b>' if (i in summary) else i for i in text]
print(output) #prints ['<b>sentenceA</b>', 'sentenceB', 'sentenceC', 'sentenceD', '<b>sentenceE</b>']

Note that summary and text should both be lists of strs.
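One possible reason the exact-match approach fails on long sentences is that text converted from a PDF often contains line breaks or extra spaces inside a sentence, so the string in summary is not an exact substring of the document text. This is only a guess, since the question does not show doc.txt. The sketch below assumes that cause and matches each sentence while tolerating any whitespace between its words. The function name bold_summary_sentences and the sample doc string are hypothetical, made up for illustration.

```python
import re

def bold_summary_sentences(txt, summary):
    """Wrap each summary sentence found in txt in <b>...</b>.

    Words are matched with any run of whitespace between them, so a
    sentence broken across lines in txt is still found.
    """
    for sentence in summary:
        words = sentence.split()
        if not words:
            continue  # skip empty entries
        # Escape each word so characters like "(" or "." are matched literally.
        pattern = r"\s+".join(re.escape(w) for w in words)
        # A function replacement keeps the matched text exactly as it
        # appears in txt and avoids backslash handling in the replacement.
        txt = re.sub(pattern, lambda m: "<b>" + m.group(0) + "</b>", txt)
    return txt

# Hypothetical document text with a line break and extra spaces inside the sentence.
doc = "AI is also changing\nhow people interact   with technology. Another sentence."
summary = ["AI is also changing how people interact with technology."]

result = bold_summary_sentences(doc, summary)
print(result)
```

If the mismatch has another cause, such as different punctuation or hyphenation introduced by the PDF conversion, this sketch will not help, and comparing repr() of a failing sentence with the corresponding part of the document text should show the actual difference.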
