Count occurrence of whole word in a string
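I want to find the number of occurrences of a particular word in a string.
I searched online and found many answers, but none of them gave me accurate results.
What I want is:
Input: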
I have asked the question in StackOverflow. Therefore i can expect answer here.
Output for the keyword "The":
The keyword count: 2
Note: it should not count the "The" that occurs inside other words in the sentence, such as "Therefore".
Basically, I want to match the whole word and get the count.
Try it like this:
var searchText=" the ";
var input="I have asked the question in StackOverflow. Therefore i can expect answer here.";
var arr=input.Split(new char[]{' ','.'});
var count=Array.FindAll(arr, s => s.Equals(searchText.Trim())).Length;
Console.WriteLine(count);
Edit
For your search phrase:
var sentence ="I have asked the question in StackOverflow. Therefore i can expect answer here.";
var searchText="have asked";
char [] split=new char[]{',',' ','.'};
var splitSentence=sentence.ToLower().Split(split);
var splitText=searchText.ToLower().Split(split);
Console.WriteLine("Search Sentence {0}",splitSentence.Length);
Console.WriteLine("Search Text {0}",splitText.Length);
var count=0;
for (var i = 0; i < splitSentence.Length; i++) {
    if (splitSentence[i] == splitText[0]) {
        var index = i;
        var found = true;
        var j = 0;
        for (j = 0; j < splitText.Length; j++) {
            // Stop if the sentence runs out before the whole phrase is matched.
            if (index >= splitSentence.Length || splitSentence[index++] != splitText[j])
            {
                found = false;
                break;
            }
        }
        if (found) {
            Console.WriteLine("Index J {0} ", j);
            count++;
            i = index > i ? index - 1 : i;
        }
    }
}
Console.WriteLine("Total found {0} substring",count);
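One possible solution is to use a regular expression (note the verbatim string: in a regular C# string literal, "\b" is a backspace character, not a word boundary):
var count = Regex.Matches(input.ToLower(), String.Format(@"\b{0}\b", "the")).Count;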
Try it like this (way 1):
string SpecificWord = " the ";
string sentence = "I have asked the question in StackOverflow. Therefore i can expect answer here.";
int count = 0;
foreach (Match match in Regex.Matches(sentence, SpecificWord, RegexOptions.IgnoreCase))
{
count++;
}
Console.WriteLine("{0}" + " Found " + "{1}" + " Times", SpecificWord, count);
Or like this (way 2):
string SpecificWord = " the ";
string sentence = "I have asked the question in StackOverflow. Therefore i can expect answer here.";
int WordPlace = sentence.IndexOf(SpecificWord);
Console.WriteLine(sentence);
int TimesRep;
for (TimesRep = 0; WordPlace > -1; TimesRep++)
{
sentence = (sentence.Substring(0, WordPlace) +sentence.Substring(WordPlace +SpecificWord.Length)).Replace(" ", " ");
WordPlace = sentence.IndexOf(SpecificWord);
}
Console.WriteLine("this word Found " + TimesRep + " time");
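You can use a while loop: search for the index of the first occurrence, then search again starting one position past the index you found, and increment a counter at the end of each iteration. The while loop continues until index == -1.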
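A minimal sketch of that loop, in case it helps. The class and method names (WordCounter, CountWholeWord) are made up for illustration, and it assumes that a "word character" means a letter or a digit, so a hit only counts when the characters on both sides are not letters or digits. The comparison is case-insensitive.

```csharp
using System;

public static class WordCounter
{
    // Hypothetical helper: counts whole-word occurrences of `word` in `text`
    // by calling IndexOf repeatedly, each time starting just past the last hit.
    public static int CountWholeWord(string text, string word)
    {
        if (string.IsNullOrEmpty(text) || string.IsNullOrEmpty(word))
            return 0;

        int count = 0;
        int index = text.IndexOf(word, StringComparison.OrdinalIgnoreCase);
        while (index != -1)
        {
            int end = index + word.Length;
            // Assumption: a word boundary is the start/end of the text
            // or any character that is not a letter or a digit.
            bool startOk = index == 0 || !char.IsLetterOrDigit(text[index - 1]);
            bool endOk = end == text.Length || !char.IsLetterOrDigit(text[end]);
            if (startOk && endOk)
                count++;

            // Resume the search one character after the current hit.
            index = text.IndexOf(word, index + 1, StringComparison.OrdinalIgnoreCase);
        }
        return count;
    }
}
```

Calling WordCounter.CountWholeWord(sentence, "the") skips the "The" in "Therefore" because the character right after it is a letter.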
Try it like this:
string Text = "I have asked the question in StackOverflow. Therefore i can expect answer here.";
Text = Text.ToLower();
Dictionary<string, int> frequencies = null;
frequencies = new Dictionary<string, int>();
string[] words = Regex.Split(Text, "\\W+");
foreach (string word in words)
{
if (frequencies.ContainsKey(word))
{
frequencies[word] += 1;
}
else
{
frequencies[word] = 1;
}
}
foreach (KeyValuePair<string, int> entry in frequencies)
{
string word = entry.Key;
int frequency = entry.Value;
Response.Write(word + "," + frequency + "<br/>");
}
And to search for a specific word, try it like this:
string Text = "I have asked the question in StackOverflow. Therefore the i can expect answer here.";
Text = Text.ToLower();
string searchtext = "the";
searchtext = searchtext.ToLower();
string[] words = Regex.Split(Text, "\\W+");
int count = 0; // the counter must be declared before the loop
foreach (string word in words)
{
    if (searchtext.Equals(word))
    {
        count = count + 1;
    }
}
Response.Write(count);
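Well, the problem is not as simple as you might think; there are several things to watch out for, such as punctuation, letter case, and how to identify word boundaries. But using the N-gram concept, I offer the following solution:
1- Determine how many words the key contains. Call it N.
2- Extract all sequences of N consecutive words (N-grams) from the text.
3- Count the occurrences of the key among the N-grams.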
string text = "I have asked the question in StackOverflow. Therefore i can expect answer here.";
string key = "the question";
int gram = key.Split(' ').Count();
var parts = text.Split(' ');
List<string> n_grams = new List<string>();
for (int i = 0; i < parts.Count(); i++)
{
    if (i <= parts.Count() - gram)
    {
        string sequence = "";
        for (int j = 0; j < gram; j++)
        {
            sequence += parts[i + j] + " ";
        }
        // Drop the trailing space added by the loop above.
        if (sequence.Length > 0)
            sequence = sequence.Remove(sequence.Length - 1, 1);
        n_grams.Add(sequence);
    }
}
// The result
int count = n_grams.Count(p => p == key);
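For example, for key = the question, and treating a single space as the word boundary, the following bigrams are extracted:
I have
have asked
asked the
the question
question in
in StackOverflow.
StackOverflow. Therefore
Therefore i
i can
can expect
expect answer
answer here.
The number of times the question appears in the text is then evident: 1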
This solution should work wherever the word is in the string:
var str = "I have asked the question in StackOverflow. Therefore i can expect answer here.";
var numMatches = Regex.Matches(str.ToUpper(), "THE")
.Cast<Match>()
.Count(match =>
(match.Index == 0 || str[match.Index - 1] == ' ') &&
(match.Index + match.Length == str.Length ||
!Regex.IsMatch(
str[match.Index + match.Length].ToString(),
"[a-zA-Z]")));
string input = "I have asked the question in StackOverflow. Therefore i can expect answer here.";
string pattern = @"\bthe\b";
var matches = Regex.Matches(input, pattern, RegexOptions.IgnoreCase);
Console.WriteLine(matches.Count);
See the regular expression anchors, specifically \b (word boundary).
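If the search word comes from user input rather than a literal, it may contain characters that mean something in a regular expression. Here is a rough sketch that wraps the same \b pattern in a helper and escapes the word with Regex.Escape first. The names RegexWordCounter and CountWholeWord are only for illustration. Keep in mind that \b only behaves as expected when the word itself starts and ends with a word character.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RegexWordCounter
{
    // Hypothetical helper: counts case-insensitive whole-word matches of `word`.
    public static int CountWholeWord(string input, string word)
    {
        // Regex.Escape makes characters such as '.' or '+' match literally.
        // The verbatim strings keep \b as a word-boundary anchor.
        string pattern = @"\b" + Regex.Escape(word) + @"\b";
        return Regex.Matches(input, pattern, RegexOptions.IgnoreCase).Count;
    }
}
```

Without the escaping, a word like a.b would also match axb, because an unescaped dot matches any character.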
There are many ways to count occurrences of a whole word in a string.
For example
First:
string name = "pappu kumar sdffnsd sdfnsdkfbsdf sdfjnsd fsdjkn fsdfsd sdfsd pappu kumar";
var res= name.Contains("pappu kumar");
var splitval = name.Split("pappu kumar").Length-1;
Second:
var r = Regex.Matches(name, "pappu kumar").Count;
How about this (it seems more efficient than the other solutions):
public static int CountOccurences(string haystack, string needle)
{
return (haystack.Length - haystack.Replace(needle, string.Empty).Length) / needle.Length;
}
Try this; it also works for structured data.
var splitStr = inputStr.Split(' ');
int result_count = splitStr.Count(str => str.Contains("userName"));