Linq for getting words in sentences
I have a list of words and a list of sentences. I want to know which words can be found in which sentences.
Here is my code:
List<string> sentences = new List<string>();
List<string> words = new List<string>();
sentences.Add("Gallia est omnis divisa in partes tres, quarum unam incolunt Belgae, aliam Aquitani, tertiam qui ipsorum lingua Celtae, nostra Galli appellantur.");
sentences.Add("Alea iacta est.");
sentences.Add("Libenter homines id, quod volunt, credunt.");
words.Add("est");
words.Add("homines");
List<string> myResults = sentences
.Where(sentence => words
.Any(word => sentence.Contains(word)))
.ToList();
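What I need is a list of tuples: each sentence together with the word that was found in it.
First, we have to define what a word is. Let it be any combination of letters and apostrophes.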
Regex regex = new Regex(@"[\p{L}']+");
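To see what this pattern treats as a word, here is a minimal sketch that tokenizes one of the sentences from the question. The class name WordTokenizer and the helper Tokenize are hypothetical names introduced only for this illustration; the pattern itself is the one shown above.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class WordTokenizer
{
    // Same pattern as above: one or more Unicode letters or apostrophes.
    private static readonly Regex WordRegex = new Regex(@"[\p{L}']+");

    // Tokenize is a hypothetical helper name, not part of the original answer.
    public static string[] Tokenize(string sentence)
    {
        return WordRegex
            .Matches(sentence)
            .Cast<Match>()
            .Select(match => match.Value)
            .ToArray();
    }

    public static void Main()
    {
        // Commas, the full stop and spaces are not part of any match.
        string[] tokens = Tokenize("Libenter homines id, quod volunt, credunt.");
        Console.WriteLine(string.Join("|", tokens));
        // Libenter|homines|id|quod|volunt|credunt
    }
}
```

Because punctuation is excluded from the matches, "est." at the end of a sentence yields the token "est", which is what makes the later exact lookup work.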
Second, we should decide how to handle letter case. Let's implement a case-insensitive routine:
HashSet<string> wordsToFind = new HashSet<string>(StringComparer.OrdinalIgnoreCase) {
"est",
"homines"
};
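As a quick check of what the comparer buys you, the sketch below probes such a set with differently cased input. BuildWordSet and the class name are hypothetical names used only here; the HashSet constructor and StringComparer.OrdinalIgnoreCase are the standard .NET members.

```csharp
using System;
using System.Collections.Generic;

public static class CaseInsensitiveLookup
{
    // BuildWordSet is a hypothetical helper, introduced for illustration only.
    public static HashSet<string> BuildWordSet(params string[] words)
    {
        // The comparer is applied to both hashing and equality, so lookups ignore case.
        return new HashSet<string>(words, StringComparer.OrdinalIgnoreCase);
    }

    public static void Main()
    {
        HashSet<string> wordsToFind = BuildWordSet("est", "homines");

        Console.WriteLine(wordsToFind.Contains("EST"));     // True
        Console.WriteLine(wordsToFind.Contains("Homines")); // True
        Console.WriteLine(wordsToFind.Contains("esto"));    // False: whole strings are compared, not substrings
    }
}
```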
Then we can use Regex to match the words in each sentence and Linq to query the sentences:
Code:
var actualWords = sentences
.Select((text, index) => new {
text = text,
index = index,
words = regex
.Matches(text)
.Cast<Match>()
.Select(match => match.Value)
.ToArray()
})
.SelectMany(item => item.words
.Where(word => wordsToFind.Contains(word))
.Select(word => Tuple.Create(word, item.index + 1)));
string report = string.Join(Environment.NewLine, actualWords);
Console.Write(report);
Result:
(est, 1) // est appears in the 1st sentence
(est, 2) // est appears in the 2nd sentence as well
(homines, 3) // homines appears in the 3rd sentence
If you want a Tuple<string, string> holding the word and the sentence, just change the last Select from
Tuple.Create(word, item.index + 1)
to
Tuple.Create(word, item.text)
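For the word/sentence variant, the whole pipeline might be sketched as follows. This is one possible arrangement under the same assumptions as above (words are runs of letters and apostrophes, matching ignores case); the class name WordSentencePairs and the method Find are hypothetical names, and the index is dropped because it is no longer needed.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class WordSentencePairs
{
    // Find is a hypothetical helper: it returns (word, sentence) pairs.
    public static List<Tuple<string, string>> Find(
        IEnumerable<string> sentences, IEnumerable<string> words)
    {
        var regex = new Regex(@"[\p{L}']+");
        var wordsToFind = new HashSet<string>(words, StringComparer.OrdinalIgnoreCase);

        return sentences
            .SelectMany(sentence => regex
                .Matches(sentence)
                .Cast<Match>()
                .Select(match => match.Value)
                .Where(word => wordsToFind.Contains(word))   // keep only wanted words
                .Select(word => Tuple.Create(word, sentence)))
            .ToList();
    }

    public static void Main()
    {
        var sentences = new List<string>
        {
            "Alea iacta est.",
            "Libenter homines id, quod volunt, credunt."
        };
        var words = new List<string> { "est", "homines" };

        foreach (Tuple<string, string> pair in Find(sentences, words))
        {
            Console.WriteLine(pair.Item1 + " -> " + pair.Item2);
        }
    }
}
```

Note that a word occurring twice in the same sentence produces two pairs; adding Distinct() before ToList() would collapse them, if that is what you want.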
You can try it like this:
var result = from sentence in sentences
from word in words
where sentence.Contains(word)
select Tuple.Create(sentence, word);
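One thing to keep in mind with this query: string.Contains does a case-sensitive substring search, so a word can be reported even when it only appears inside a longer word. The sketch below contrasts the two behaviours; the sentence "Tempestas venit." is a made-up example that is not in the question, and both method names are hypothetical.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class SubstringVersusWord
{
    // Substring test, as used by the query above.
    public static bool ContainsSubstring(string sentence, string word)
    {
        return sentence.Contains(word);
    }

    // Whole-word test: split into letter/apostrophe runs, then compare ignoring case.
    public static bool ContainsWholeWord(string sentence, string word)
    {
        return Regex.Matches(sentence, @"[\p{L}']+")
            .Cast<Match>()
            .Any(match => string.Equals(match.Value, word, StringComparison.OrdinalIgnoreCase));
    }

    public static void Main()
    {
        string sentence = "Tempestas venit."; // made-up sentence, for illustration only

        Console.WriteLine(ContainsSubstring(sentence, "est")); // True: "est" is inside "Tempestas"
        Console.WriteLine(ContainsWholeWord(sentence, "est")); // False: no standalone word "est"
    }
}
```

For the three sentences in the question both approaches happen to give the same pairs, so the simpler query may be enough if substring matches are acceptable.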