
Compare two values using RegEx

If I have two values, e.g. ABC001 and ABC100, or A0B0C1 and A1B0C0, is it possible to use RegEx to make sure both values have the same pattern?

If you don't know the pattern in advance, but will only encounter two groups of characters (letters and digits), you could do the following:

Write some C# that parses the first string, looks at each char to determine whether it's a letter or a digit, and then generates a regex from that pattern.
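A minimal sketch of that idea, assuming letters should map to the Unicode letter class `\p{L}`, digits to `\p{Nd}`, and any other character should be matched literally. The names `PatternRegex`, `FromTemplate`, and `MatchesTemplate` are hypothetical and chosen only for illustration:

```csharp
using System;
using System.Text;
using System.Text.RegularExpressions;

public static class PatternRegex
{
    // Builds an anchored regex from a template string. Each letter becomes
    // \p{L}, each decimal digit becomes \p{Nd}, and every other character is
    // escaped so that it must appear literally at the same position.
    public static Regex FromTemplate(string template)
    {
        var sb = new StringBuilder(@"\A");
        foreach (char c in template)
        {
            if (char.IsLetter(c))
            {
                sb.Append(@"\p{L}");
            }
            else if (char.IsDigit(c))
            {
                sb.Append(@"\p{Nd}");
            }
            else
            {
                sb.Append(Regex.Escape(c.ToString()));
            }
        }
        sb.Append(@"\z");
        return new Regex(sb.ToString());
    }

    // True when candidate has a letter wherever template has a letter and
    // a digit wherever template has a digit.
    public static bool MatchesTemplate(string template, string candidate)
    {
        return FromTemplate(template).IsMatch(candidate);
    }
}
```

With this sketch, `PatternRegex.MatchesTemplate("ABC001", "ABC100")` and `PatternRegex.MatchesTemplate("A0B0C1", "A1B0C0")` both succeed, while a candidate of a different length or with a digit in a letter position does not. Note that this version only compares character classes; it does not require the letters themselves to be identical.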

You may find that there's no point writing code to generate a regex, as it could be just as simple to check the second string against the first directly.

Alternatively, without regex:

First check that the strings are the same length. Then loop through both strings at the same time, char by char. If char[x] from string 1 is alpha and char[x] from string 2 is the same, your patterns match.

Try this; it should cope if a string sneaks in some symbols. Edited to compare character values, and to use Char.IsLetter and Char.IsDigit.

private bool matchPattern(string string1, string string2)
{
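    // Strings of different lengths cannot share a pattern. Return early so the
    // loop below never indexes past the end of the shorter string.
    if (string1.Length != string2.Length)
    {
        return false;
    }
    bool result = true;
    char[] chars1 = string1.ToCharArray();
    char[] chars2 = string2.ToCharArray();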

    for (int i = 0; i < string1.Length; i++)
    {
        if (Char.IsLetter(chars1[i]) != Char.IsLetter(chars2[i]))
        {
            result = false;
        }
        if (Char.IsLetter(chars1[i]) && (chars1[i] != chars2[i]))
        {   
            //Characters must be identical
            result = false;
        }
        if (Char.IsDigit(chars1[i]) != Char.IsDigit(chars2[i]))
            result = false;
    }
    return result;
}

Well, here's my shot at it. This doesn't use regular expressions, and assumes s1 and s2 only contain letters or digits:

public static bool SamePattern(string s1, string s2)
{
   if (s1.Length == s2.Length)
   {
      char[] chars1 = s1.ToCharArray();
      char[] chars2 = s2.ToCharArray();

      for (int i = 0; i < chars1.Length; i++)
      {
         if (!Char.IsDigit(chars1[i]) && chars1[i] != chars2[i])
         {
            return false;
         }
         else if (Char.IsDigit(chars1[i]) != Char.IsDigit(chars2[i]))
         {
            return false;
         }
      }

      return true;
   }
   else
   {
      return false;
   }
}

A description of the algorithm is as follows:

  1. If the strings have different lengths, return false.
  2. Otherwise, check the characters in the same position in both strings:
    1. If they are both digits, or they are the same non-digit character, move on to the next iteration.
    2. If they aren't digits and aren't the same, return false.
    3. If one is a digit and the other is a letter, return false.
  3. If all characters in both strings were checked successfully, return true.

Consider using Char.GetUnicodeCategory.
You can write a helper class for this task:

using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

public class Mask
{
    public Mask(string originalString)
    {
        OriginalString = originalString;
        CharCategories = originalString.Select(Char.GetUnicodeCategory).ToList();
    }

    public string OriginalString { get; private set; }
    public IEnumerable<UnicodeCategory> CharCategories { get; private set; }

    public bool HasSameCharCategories(Mask other)
    {
        // A null mask has no categories to compare against.
        if (other == null)
        {
            return false;
        }
        return CharCategories.SequenceEqual(other.CharCategories);
    }
}

Use it like this:

Mask mask1 = new Mask("ab12c3");
Mask mask2 = new Mask("ds124d");
MessageBox.Show(mask1.HasSameCharCategories(mask2).ToString());

I don't know C# syntax, but here is some pseudocode:

  • split each string on '' (into individual characters)
  • sort the 2 arrays
  • join each array with ''
  • compare the 2 strings
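The four steps above could be sketched in C# roughly as follows. The class and method names (`SortedCompare`, `SortChars`, `SameCharacters`) are hypothetical. Note the assumption built into this approach: it checks that both strings contain the same characters in any order, which is a looser test than comparing position by position.

```csharp
using System;

public static class SortedCompare
{
    // Returns the characters of s sorted in ascending ordinal order.
    public static string SortChars(string s)
    {
        char[] chars = s.ToCharArray();   // "split" into characters
        Array.Sort(chars);                // "sort" the array
        return new string(chars);         // "join" back into a string
    }

    // True when both strings consist of exactly the same characters,
    // regardless of the order in which they appear.
    public static bool SameCharacters(string a, string b)
    {
        return SortChars(a) == SortChars(b);
    }
}
```

For the values in the question, `SortedCompare.SameCharacters("ABC001", "ABC100")` holds because both strings sort to the same sequence, whereas `"ABC001"` and `"ABC002"` would not compare equal.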

A general-purpose solution with LINQ can be achieved quite easily. The idea is:

  1. Sort the two strings (reordering the characters).
  2. Compare the sorted strings as character sequences using SequenceEqual.

This scheme enables a short, graceful and configurable solution, for example:

// We will be using this in SequenceEqual
class MyComparer : IEqualityComparer<char>
{
    public bool Equals(char x, char y)
    {
        return x.Equals(y);
    }

    public int GetHashCode(char obj)
    {
        return obj.GetHashCode();
    }
}

// and then:
var s1 = "ABC0102";
var s2 = "AC201B0";

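// Order by the character itself so that letters are sorted too.
// char.GetNumericValue returns -1 for every non-numeric character, which
// would leave the letters in their original relative order.
Func<char, char> orderFunction = c => c;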
var comparer = new MyComparer();
var result = s1.OrderBy(orderFunction).SequenceEqual(s2.OrderBy(orderFunction), comparer);

Console.WriteLine("result = " + result);

As you can see, it's all in 3 lines of code (not counting the comparer class). It's also very easy to configure.

  • The code as it stands checks if s1 is a permutation of s2.
  • Do you want to check if s1 has the same number and kind of characters as s2, but not necessarily the same characters (e.g. "ABC" to be equal to "ABB")? No problem, change MyComparer.Equals to return char.GetUnicodeCategory(x).Equals(char.GetUnicodeCategory(y));
  • By changing the values of orderFunction and comparer you can configure a multitude of other comparison options.
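As one illustration of such a configuration, the sketch below compares the kinds of characters rather than the characters themselves. `CategoryComparer` and `CategoryDemo.SameKinds` are hypothetical names, and ordering by Unicode category (rather than by character) is an assumption made here so that characters of the same kind always line up after sorting:

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

// Treats two characters as equal when they share a Unicode category,
// e.g. any two uppercase letters, or any two decimal digits.
public class CategoryComparer : IEqualityComparer<char>
{
    public bool Equals(char x, char y)
    {
        return char.GetUnicodeCategory(x) == char.GetUnicodeCategory(y);
    }

    public int GetHashCode(char obj)
    {
        // Must agree with Equals: same category, same hash code.
        return char.GetUnicodeCategory(obj).GetHashCode();
    }
}

public static class CategoryDemo
{
    // True when both strings contain the same number of characters of each
    // Unicode category, in any order.
    public static bool SameKinds(string s1, string s2)
    {
        Func<char, UnicodeCategory> orderFunction = c => char.GetUnicodeCategory(c);
        return s1.OrderBy(orderFunction)
                 .SequenceEqual(s2.OrderBy(orderFunction), new CategoryComparer());
    }
}
```

Under these assumptions, `CategoryDemo.SameKinds("ABC", "ABB")` succeeds, while `"ABC"` against `"AB1"` or against `"abc"` fails, because uppercase letters, lowercase letters, and digits fall into different categories.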

And finally, since I don't find it very elegant to define a MyComparer class just to enable this scenario, you can also use the technique described in this question:

Wrap a delegate in an IEqualityComparer

to define your comparer as an inline lambda. This would result in a configurable solution contained in 2-3 lines of code.
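A rough sketch of what such a wrapper might look like is shown here. `DelegateComparer<T>` is a hypothetical helper written for this illustration, not the code from the linked question, and `InlineComparerDemo.IsPermutation` is likewise an assumed name:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical adapter that turns a pair of lambdas into an IEqualityComparer<T>.
public sealed class DelegateComparer<T> : IEqualityComparer<T>
{
    private readonly Func<T, T, bool> _equals;
    private readonly Func<T, int> _hash;

    public DelegateComparer(Func<T, T, bool> equals, Func<T, int> hash)
    {
        _equals = equals ?? throw new ArgumentNullException(nameof(equals));
        _hash = hash ?? throw new ArgumentNullException(nameof(hash));
    }

    public bool Equals(T x, T y)
    {
        return _equals(x, y);
    }

    public int GetHashCode(T obj)
    {
        return _hash(obj);
    }
}

public static class InlineComparerDemo
{
    // Same permutation check as before, with the comparer defined inline.
    public static bool IsPermutation(string s1, string s2)
    {
        var comparer = new DelegateComparer<char>((x, y) => x == y, c => c.GetHashCode());
        return s1.OrderBy(c => c).SequenceEqual(s2.OrderBy(c => c), comparer);
    }
}
```

The adapter is written once and reused; each comparison then only needs the lambdas, so swapping in a different equality rule is a one-line change.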
