
C# Regex partial string match

Everyone, I've got the function below, which returns true if the input is a bad word.

public bool isAdultKeyword(string input)
{
    if (input == null || input.Length == 0)
    {
        return false;
    }
    else
    {
        Regex regex = new Regex(@"\b(badword1|badword2|anotherbadword)\b");
        return regex.IsMatch(input);
    }
}

The function above only matches whole words, i.e. if the input is badword it won't match, but it will when the input is badword1.

What I'm trying to do is get a match when part of the input contains one of the bad words.

Try:

Regex regex = new Regex(@"(\bbadword1\b|\bbadword2\b|\banotherbadword\b)"); 
return regex.IsMatch(input);

Your method seems to be working fine. Can you clarify what is wrong with it? My tester program below shows it passing a number of tests with no failures.

using System;
using System.Text.RegularExpressions;

namespace CSharpConsoleSandbox {
  class Program {
    public static bool isAdultKeyword(string input) {
      if (input == null || input.Length == 0) {
        return false;
      } else {
        Regex regex = new Regex(@"\b(badword1|badword2|anotherbadword)\b");
        return regex.IsMatch(input);
      }
    }

    private static void test(string input) {
      string matchMsg = "NO : ";
      if (isAdultKeyword(input)) {
        matchMsg = "YES: ";
      }
      Console.WriteLine(matchMsg + input);
    }

    static void Main(string[] args) {
      // These cases should match
      test("YES badword1");
      test("YES this input should match badword2 ok");
      test("YES this input should match anotherbadword. ok");

      // These cases should not match
      test("NO badword5");
      test("NO this input will not matchbadword1 ok");
    }
  }
}

Output:

YES: YES badword1
YES: YES this input should match badword2 ok
YES: YES this input should match anotherbadword. ok
NO : NO badword5
NO : NO this input will not matchbadword1 ok

So under your logic, would you match "as" to "ass"?

Also, remember the classic place name Scunthorpe: your adult filter needs to be able to let this word through.
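To make the trade-off concrete, here is a minimal sketch that compares a substring pattern with a whole-word pattern. The class name BoundaryDemo and its two methods are hypothetical, introduced only for illustration; the single word in the pattern is the one mentioned above.

```csharp
using System;
using System.Text.RegularExpressions;

public static class BoundaryDemo
{
    // Substring match: no word boundaries, so the word is found inside longer words.
    static readonly Regex Loose = new Regex("ass", RegexOptions.IgnoreCase);

    // Whole-word match: \b on both sides, so longer words are left alone.
    static readonly Regex Strict = new Regex(@"\bass\b", RegexOptions.IgnoreCase);

    public static bool LooseMatch(string input) => Loose.IsMatch(input);

    public static bool StrictMatch(string input) => Strict.IsMatch(input);
}
```

With the input "a first-class passenger", LooseMatch flags the text because of "class" and "passenger", while StrictMatch lets it through. A place name like Scunthorpe runs into the same problem with a substring filter.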

You probably don't have to do it in such a complex way, but you can try to implement Knuth-Morris-Pratt. I had tried using it in one of my failed (totally my fault) OCR enhancer modules.
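For reference, a rough sketch of a Knuth-Morris-Pratt substring search in C# might look like the following. The class and method names (KmpSearch, IndexOf, BuildFailureTable) are made up for this illustration, and the search is case-sensitive and handles one pattern at a time, so it would have to be run once per bad word.

```csharp
using System;

public static class KmpSearch
{
    // Builds the failure table: fail[i] is the length of the longest proper
    // prefix of pattern[0..i] that is also a suffix of pattern[0..i].
    static int[] BuildFailureTable(string pattern)
    {
        var fail = new int[pattern.Length];
        int k = 0; // length of the currently matched prefix
        for (int i = 1; i < pattern.Length; i++)
        {
            while (k > 0 && pattern[i] != pattern[k]) k = fail[k - 1];
            if (pattern[i] == pattern[k]) k++;
            fail[i] = k;
        }
        return fail;
    }

    // Returns the index of the first occurrence of pattern in text, or -1 if absent.
    public static int IndexOf(string text, string pattern)
    {
        if (pattern.Length == 0) return 0;
        int[] fail = BuildFailureTable(pattern);
        int k = 0; // number of pattern characters matched so far
        for (int i = 0; i < text.Length; i++)
        {
            // On a mismatch, fall back in the pattern without re-reading text.
            while (k > 0 && text[i] != pattern[k]) k = fail[k - 1];
            if (text[i] == pattern[k]) k++;
            if (k == pattern.Length) return i - k + 1;
        }
        return -1;
    }
}
```

A call such as KmpSearch.IndexOf(input, "badword1") >= 0 would then report whether the word occurs anywhere in the input. For a short word list, the built-in string.Contains or a Regex without \b is usually simpler.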

Is \b the word boundary in a regular expression?

In that case your regular expression is only looking for entire words. Removing these will match any occurrences of the bad words, including where one appears as part of a larger word.

Regex regex = new Regex(@"(bad|awful|worse)", RegexOptions.IgnoreCase);
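Applied to the function from the question, that idea could be sketched roughly as follows. The names PartialMatcher and ContainsBadWord are hypothetical, the word list reuses the question's placeholders, and Regex.Escape is an extra precaution in case a word contains regex metacharacters.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class PartialMatcher
{
    // Placeholder word list taken from the question.
    static readonly string[] BadWords = { "badword1", "badword2", "anotherbadword" };

    // No \b anchors, so a bad word is found anywhere in the input,
    // including inside a longer word. Built once and reused.
    static readonly Regex Pattern = new Regex(
        string.Join("|", BadWords.Select(w => Regex.Escape(w))),
        RegexOptions.IgnoreCase);

    public static bool ContainsBadWord(string input)
    {
        // Null or empty input never matches.
        return !string.IsNullOrEmpty(input) && Pattern.IsMatch(input);
    }
}
```

With this version, the test input "NO this input will not matchbadword1 ok" from the tester program above is reported as a match, while "NO badword5" still is not.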
