String.Contains() method - incorrect when only numbers in string

Some months ago, I wrote a small method in my program that searches for strings within a chosen column of a 2D array (of String type). It works perfectly, except that it fails very badly on strings containing numbers or dot-separated numbers.

    private void gather_matches()
    {
        SearchFor = null;
        SearchFor = tb_text.Text.ToLower();
        Int32 Column = cb_main.SelectedIndex;
        Int32 Counter = 0;
        for (Int32 i = 0; i < DYL; i++)
        {
            if (Data[i, XUniprotID] == null) break;
            else
            {
                if (Data[i, Column] == null) continue;
                if (Data[i, Column].ToLower().Contains(SearchFor))
                {

                    for (Int32 j = 0; j < DXL; j++)
                    {
                        Found[Counter, j] = Data[i, j];

                    }
                    Counter++;
                }
            }
        }
    }

Very simple code, and it works, except for those columns (yes, I checked that the index is still correct). This is the input: [image]

When searching for "3" in the Cath Class column, it spits out 3, 2, 1 and empty cells. When searching for "30" in Cath Architecture, it spits out everything that contains a 3 and a 0. When searching for "3.40" in Cath Architecture, it reports that it found nothing.

What might be the problem? I haven't seen anything on the internet about that method struggling with length or special characters.

Edit1:

How that data was created:

    private void cut_cath()
    {
        for (Int32 i = 0; i < DYL; i++)
        {
            if (Data[i, XUniprotID] == null) break;
            try
            {
                Datapath = startupPath + "\\cath+" + Data[i, XUniprotID] + ".txt";
                using (StreamReader Read = new StreamReader(Datapath))
                {
                    String Reader = Read.ReadToEnd();
                    String[] Parts = Regex.Split(Reader, "\t");
                    Data[i, Xcath] = Parts[0];
                    String[] CathParts = Parts[0].Split('.');
                    Data[i, XcaCl] = CathParts[0];
                    Data[i, XcaArch] = CathParts[0]+"."+CathParts[1];
                    Data[i, XcaTopo] = CathParts[0]+"."+CathParts[1]+"."+CathParts[2];
                    Data[i, XcaHomo] = Parts[0];
                    Data[i, XcaDom] = Parts[1];
                    Read.Close();
                }
            }
            catch
            {
                continue;
            }
        }

    }

Edit2:

Output when searching for "3.40" in the Cath Architecture column: [image]

As you can see, it's mostly correct, but some rows that don't match are still listed.

Edit3:

Added Code:

     public bool Kontainser(String Value, String Input) //yeah, I know, stupid name...
     {
         return Input.IndexOf(Value, StringComparison.OrdinalIgnoreCase) >= 0;
     }

[...]

                if (Data[i, Column] == null) continue;

                if (Kontainser(SearchFor, Data[i, Column]))
                {

                    for (Int32 j = 0; j < DXL; j++)
                    {
                        Found[Counter, j] = Data[i, j];

                    }
                    Counter++;
                }

Now it works perfectly for half of the search and then decides to ignore the IF. The search was "3.40.50" in the CathTopology column.

Output: [image]

All that drama just in these CATH and Genome3D columns... nowhere else.

AND I SOLVED IT... can't believe it was that simple:

    String Helper = Data[i, Column].ToLower();
    if (Helper.Contains(SearchFor))

I added just one line of code out of the blue. It seems that ToLower() and Contains() had a little conflict. oO Although it still does some strange stuff when it meets queries that don't match exactly...

In the future, you can do a case-insensitive bar.Contains(foo) like this:

    if (bar.IndexOf(foo, StringComparison.OrdinalIgnoreCase) >= 0)
    {

    }
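As a quick check that digits and dots are treated like any other character by this comparison, here is a minimal sketch. The helper name `ContainsIgnoreCase` and the sample strings are made up for illustration; only `String.IndexOf(String, StringComparison)` comes from the framework.

```csharp
using System;

static class ContainsDemo
{
    // Hypothetical helper: wraps the IndexOf call shown above.
    static bool ContainsIgnoreCase(string source, string value)
    {
        return source.IndexOf(value, StringComparison.OrdinalIgnoreCase) >= 0;
    }

    static void Main()
    {
        // Dot-separated numbers are matched as plain substrings.
        Console.WriteLine(ContainsIgnoreCase("3.40.50", "3.40"));    // True
        Console.WriteLine(ContainsIgnoreCase("3.30.70", "3.40"));    // False
        Console.WriteLine(ContainsIgnoreCase("2", "3"));             // False

        // Letters are compared without regard to case.
        Console.WriteLine(ContainsIgnoreCase("Alpha Beta", "BETA")); // True
    }
}
```

If a search for "3" still lists rows whose cell is "2" or empty, the comparison itself is unlikely to be the cause, and the surrounding code is worth a second look.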

Internally, the code for Contains(string value) is as follows (retrieved from the reference source):

    [TargetedPatchingOptOut("Performance critical to inline across NGen image boundaries")]
    [__DynamicallyInvokable]
    public bool Contains(string value)
    {
      return this.IndexOf(value, StringComparison.Ordinal) >= 0;
    }

So the performance should be fairly close to just using Contains itself, and will likely be much better than using ToLower(), which allocates a new lowercased string for every comparison.
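Separately from the comparison, the leftover non-matching rows described in the question would also be consistent with the result array keeping rows from an earlier search. This is only a guess, because the question does not show how Found is declared or whether it is cleared. A minimal sketch that rules this out builds a fresh result list on every call. The name `GatherMatches` and the sample data are hypothetical, and the early `break` on an empty ID column from the original loop is left out for brevity.

```csharp
using System;
using System.Collections.Generic;

static class SearchSketch
{
    // Returns only the rows whose cell in `column` contains `searchFor`,
    // ignoring case. A new list is created per call, so nothing from a
    // previous search can remain in the result.
    static List<string[]> GatherMatches(string[,] data, int column, string searchFor)
    {
        var matches = new List<string[]>();
        int rows = data.GetLength(0);
        int cols = data.GetLength(1);

        for (int i = 0; i < rows; i++)
        {
            string cell = data[i, column];
            if (cell == null) continue; // skip empty cells
            if (cell.IndexOf(searchFor, StringComparison.OrdinalIgnoreCase) < 0) continue;

            // Copy the whole matching row into the result.
            var row = new string[cols];
            for (int j = 0; j < cols; j++) row[j] = data[i, j];
            matches.Add(row);
        }
        return matches;
    }

    static void Main()
    {
        // Made-up sample rows: { CATH class, CATH architecture }.
        var data = new string[,]
        {
            { "3", "3.40" },
            { "3", "3.30" },
            { "2", "2.60" },
            { null, null },
        };

        Console.WriteLine(GatherMatches(data, 1, "3.40").Count); // 1
        Console.WriteLine(GatherMatches(data, 0, "3").Count);    // 2
    }
}
```

Keeping the fixed-size Found array would also work if it is cleared, for example with Array.Clear, before each search, and if only the first Counter rows are displayed.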
