
Normalize Two Strings Then Compare

I have two strings, both of which are some kind of reference number (a prefix followed by digits).

string a = "R&D123";
string b = "R&D 123";

string a and string b are two different user inputs, and I'm trying to check whether the two strings match.

I know I can use String.Compare() to check whether two strings are the same, but as in the example above, the strings can differ textually and still refer to the same thing.

Because they are both user inputs (from different users), they can come in several different formats.

"R&D123"
"R&D 123" //with space in between
"R.D.123 " //using period or other character
"r&d123" //different case
"RD123" //no special character
...etc

Is there a way I can somehow "normalize" the two strings first and then compare them?

I know an easy-to-understand way is to use string.Replace() to replace special characters and spaces with an empty string, and string.ToLower() so I don't have to worry about case. The problem with this method is that if I have many special characters, I'll be calling .Replace() quite a few times, which is not ideal.

Another problem is that R&D is not the only prefix I need to worry about; there are others such as AP, KD, etc. I'm not sure whether this makes a difference :/

Any help is appreciated, thanks!

If you want just letters and digits, you can do it with LINQ:

// requires: using System.Linq;
var array1 = a.Where(x => char.IsLetterOrDigit(x)).ToArray();
var array2 = b.Where(x => char.IsLetterOrDigit(x)).ToArray();
var normalizedStr1 = new String(array1).ToLower();
var normalizedStr2 = new String(array2).ToLower();

// String.Compare returns 0 when the two strings are equal
bool isMatch = String.Compare(normalizedStr1, normalizedStr2) == 0;
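The same idea can be wrapped in a small helper so the filtering and the comparison live in one place. This is only a sketch: the class name ReferenceNumbers and the methods Normalize and AreSame are made up for illustration, and it assumes that an ordinal, case-insensitive comparison is what you want for reference numbers.

```csharp
using System;
using System.Linq;

// Hypothetical helper class; the names are illustrative, not from the question.
public static class ReferenceNumbers
{
    // Keeps only letters and digits; null is treated as an empty string (an assumption).
    public static string Normalize(string input)
    {
        if (input == null)
        {
            return string.Empty;
        }
        return new string(input.Where(c => char.IsLetterOrDigit(c)).ToArray());
    }

    // Compares the normalized forms, ignoring case, without culture-specific rules.
    public static bool AreSame(string a, string b)
    {
        return string.Equals(Normalize(a), Normalize(b), StringComparison.OrdinalIgnoreCase);
    }
}
```

With this, ReferenceNumbers.AreSame("R&D123", "R&D 123") and ReferenceNumbers.AreSame("R.D.123 ", "r&d123") both return true, while "RD123" and "AP123" do not match. Using StringComparison.OrdinalIgnoreCase also removes the need for the separate ToLower() calls.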

This might not be the prettiest way to do it, but it's probably the fastest, since it makes a single pass over the string with a StringBuilder:

    // requires: using System.Text;
    static void Main(string[] args)
    {
        string sampleResult = NormalizeAlphaNumeric("Hello world 3242348&&))&)*^&#R&#&R#)R#@)R#@R#R#@");
    }

    // Keeps only letters and digits, then lower-cases the result.
    public static string NormalizeAlphaNumeric(string someValue)
    {
        var sb = new StringBuilder(someValue.Length);
        foreach (var ch in someValue)
        {
            if (char.IsLetterOrDigit(ch))
            {
                sb.Append(ch);
            }
        }
        return sb.ToString().ToLower();
    }
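To see how this behaves on the formats listed in the question, here is a self-contained variant of the same loop. It is a sketch, not a benchmarked implementation: the class name AlphaNumericKey and the method Create are hypothetical, and it lower-cases each character as it is appended (using the invariant culture) instead of calling ToLower() on the finished string.

```csharp
using System;
using System.Text;

// Hypothetical helper; the names are illustrative only.
public static class AlphaNumericKey
{
    // Builds a comparison key: letters and digits only, lower-cased.
    public static string Create(string value)
    {
        var sb = new StringBuilder(value.Length);
        foreach (char ch in value)
        {
            if (char.IsLetterOrDigit(ch))
            {
                // Lower-case per character so no second string is allocated.
                sb.Append(char.ToLowerInvariant(ch));
            }
        }
        return sb.ToString();
    }
}
```

All five sample inputs from the question ("R&D123", "R&D 123", "R.D.123 ", "r&d123", "RD123") produce the key "rd123", so two inputs match when AlphaNumericKey.Create(a) == AlphaNumericKey.Create(b). The other prefixes need no special handling, because the key does not depend on which letters form the prefix.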

Try this:

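// requires: using System.Text.RegularExpressions;
// [^a-zA-Z0-9]+ matches any run of characters that are not ASCII letters or digits
string s2 = Regex.Replace(s, @"[^a-zA-Z0-9]+", String.Empty);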

It removes all the special characters and gives you the normalized string. Note that it does not change the case, so you still need ToLower() or a case-insensitive comparison.
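If you go the regex route, the pattern and the case handling can be combined in one helper. The sketch below makes a few assumptions: the class name RegexNormalizer and its Normalize method are invented for illustration, the Regex is created once and reused, and only ASCII letters and digits are kept (unlike char.IsLetterOrDigit, which also accepts non-ASCII letters and digits).

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper; the names are illustrative only.
public static class RegexNormalizer
{
    // Matches any run of characters that are not ASCII letters or digits.
    private static readonly Regex NonAlphaNumeric =
        new Regex("[^a-zA-Z0-9]+", RegexOptions.Compiled);

    // Strips the non-alphanumeric characters, then lower-cases the result.
    public static string Normalize(string input)
    {
        return NonAlphaNumeric.Replace(input, string.Empty).ToLowerInvariant();
    }
}
```

Two inputs are then considered the same reference number when RegexNormalizer.Normalize(a) == RegexNormalizer.Normalize(b); for example, "R&D 123" and "r.d.123" both normalize to "rd123".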
