Counting words in a text with regex

Question

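Hello :) I have to find all occurrences of a word in a given text, with the following restrictions:

Matching should be case-insensitive. Not every matching substring is a word, and only words should be counted. A word is a sequence of letters delimited by punctuation or by the start/end of the text. The output should be a single integer.

I have already solved it with StringComparison and a for loop.

The code below is my attempt to do the same with a regex (C#). It only gives me the number of occurrences of the pattern word and does not take the restrictions into account.

Could you give me some hints on how to improve my regex pattern?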

string patternWord = Console.ReadLine();
string[] inputSentence = Console.ReadLine().Split();
int count = 0;
string pattern = @"(?:\b\w+\ \s|\S)*" + patternWord + @"(?:\b\w+\b\ \s|\S)?";
Regex rx = new Regex(pattern, RegexOptions.IgnoreCase);
for (int i = 0; i < inputSentence.Length; i++)
{
    var mc = rx.Matches(inputSentence[i]);
    foreach (Match m in mc)
    {
        count++;
    }
}
Console.WriteLine("{0}", count);

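Edit:

Example:

Input word: hi

Input sentence: Hidden networks say "Hi" only to Hitachi devices. Hi, said Matuhi. Hi!

I only need the bolded ones (the standalone occurrences of "Hi").

Edit 2: I have also edited the restrictions.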

Answer 1

How about a simple word-boundary regex?

\bhi\b

In C#, this can be implemented as follows:

private static int WordCount(string word, string text)
{
    // Regex.Escape keeps any regex metacharacters in the word literal.
    var regex = new Regex(string.Format(@"\b{0}\b", Regex.Escape(word)),
                      RegexOptions.IgnoreCase);
    return regex.Matches(text).Count;
}
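
One caveat: `\b` treats digits and underscores as word characters, so it would not count the "hi" in "hi5" or "hi_there". If a word should be strictly a run of letters, as the question describes, lookarounds on `\p{L}` are one option. The following is only a sketch under that assumption; the class and method names are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LetterWordCounter
{
    // Counts occurrences of `word` that are not directly adjacent to another letter.
    public static int CountWord(string word, string text)
    {
        // (?<!\p{L}) : no letter immediately before the match
        // (?!\p{L})  : no letter immediately after the match
        string pattern = @"(?<!\p{L})" + Regex.Escape(word) + @"(?!\p{L})";
        return Regex.Matches(text, pattern, RegexOptions.IgnoreCase).Count;
    }

    public static void Main()
    {
        string text = "Hidden networks say \"Hi\" only to Hitachi devices. Hi, said Matuhi. Hi!";
        Console.WriteLine(CountWord("hi", text)); // prints 3
    }
}
```

On the example sentence from the question this counts the three standalone occurrences and skips "Hidden", "Hitachi" and "Matuhi". Whether digits should act as delimiters is not stated in the question, so pick whichever boundary rule fits your input.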

Answer 2

Sorry for not answering your exact question, but why use a regex at all? LINQ and a few utility methods from the Char class are enough:

using System;
using System.Collections.Generic;
using System.Linq;

public class Test
{
    static void Main(string[] args)
    {
        string patternWord = Console.ReadLine();
        string inputSentence = Console.ReadLine();
        var words = GetWords(inputSentence);
        var count = words.Count(word => string.Equals(patternWord, word, StringComparison.InvariantCultureIgnoreCase));
        Console.WriteLine(count);
        Console.ReadLine();
    }

    private static IEnumerable<string> GetWords(string sentence)
    {
        while (!string.IsNullOrEmpty(sentence))
        {
            var word = new string(sentence.TakeWhile(Char.IsLetterOrDigit).ToArray());
            yield return word;
            sentence = new string(sentence.Skip(word.Length).SkipWhile(c => !Char.IsLetterOrDigit(c)).ToArray());
        }
    }
}
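
The same tokenize-then-compare idea can also be written with a regex doing the tokenizing, which may be a middle ground between the two answers. This is a sketch, not part of the original answer: it assumes a word is a run of Unicode letters (`\p{L}+`, so digits are excluded, unlike `Char.IsLetterOrDigit` above), and the class name is hypothetical.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class TokenCounter
{
    // Extracts every run of letters, then counts the runs equal to `word`.
    public static int CountWord(string word, string text)
    {
        return Regex.Matches(text, @"\p{L}+")
                    .Cast<Match>() // MatchCollection is non-generic on older frameworks
                    .Count(m => string.Equals(m.Value, word,
                                              StringComparison.InvariantCultureIgnoreCase));
    }

    public static void Main()
    {
        Console.WriteLine(CountWord("hi", "Hi, said Matuhi. Hi!")); // prints 2
    }
}
```

Because each token is compared as a whole, the pattern word never needs to be escaped or embedded in the regex.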
