
Search for a sentence in a large corpus of text sentences

Asked 2018-05-26 14:31:03 · Tags: string, algorithm, text

I am a beginner, and I want to know if there is a way to search for a sentence in a large collection of text (say, 1 million sentences) so that variant spellings are matched. For example, when a user types:

I shouldn't be there

then it should search for a sequence like this:

I should not be there

Similarly, this:

I gonna go there.

to

I going to go there.

I have been thinking about this for a couple of days, trying to figure out a solution.

If you know anything about how to deal with this problem, please provide a solution; even just a hint would be more than enough. Thank you.

1 Answer

I would first go through both the sentence and the text and replace all contractions with their long forms. After that, use Knuth-Morris-Pratt to search for the normalized sentence in the normalized text.
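A minimal Python sketch of that idea, under some assumptions: the `CONTRACTIONS` table is a tiny hypothetical sample (a real one would need many more entries and would have to handle curly apostrophes), text is lowercased and whitespace-collapsed before matching, and the helper names `normalize`, `kmp_find_all`, and `search` are made up for illustration. The returned offsets refer to the normalized text, not the original.

```python
import re

# Hypothetical, deliberately small contraction table (an assumption for this
# sketch); extend it to cover the contractions that occur in your data.
CONTRACTIONS = {
    "shouldn't": "should not",
    "gonna": "going to",
    "can't": "cannot",
    "won't": "will not",
    "i'm": "i am",
}

_CONTRACTION_RE = re.compile(
    r"\b(" + "|".join(re.escape(key) for key in CONTRACTIONS) + r")\b"
)


def normalize(text):
    """Lowercase, collapse whitespace, and expand the known contractions."""
    text = " ".join(text.lower().split())
    return _CONTRACTION_RE.sub(lambda m: CONTRACTIONS[m.group(1)], text)


def build_failure(pattern):
    """KMP failure table: fail[i] is the length of the longest proper
    prefix of pattern[:i + 1] that is also a suffix of it."""
    fail = [0] * len(pattern)
    k = 0
    for i in range(1, len(pattern)):
        while k > 0 and pattern[i] != pattern[k]:
            k = fail[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        fail[i] = k
    return fail


def kmp_find_all(text, pattern):
    """Return the start index of every (possibly overlapping) match."""
    if not pattern:
        return []
    fail = build_failure(pattern)
    hits = []
    k = 0  # number of pattern characters currently matched
    for i, ch in enumerate(text):
        while k > 0 and ch != pattern[k]:
            k = fail[k - 1]
        if ch == pattern[k]:
            k += 1
        if k == len(pattern):
            hits.append(i - k + 1)
            k = fail[k - 1]  # keep going to find overlapping matches
    return hits


def search(query, corpus_text):
    """Normalize both sides, then run KMP on the normalized strings."""
    return kmp_find_all(normalize(corpus_text), normalize(query))
```

With this sketch, `search("I shouldn't be there", "Well, I should not be there.")` finds a match because both sides normalize to the same long form. KMP runs in time linear in the combined length of text and pattern, so normalizing the corpus once and reusing it across queries keeps each search cheap.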
