String.Contains gives wrong values

I'm trying to create an application that searches through files, much like the search in Windows XP. I'm using 4 threads that search the specified directories and open every file to look for a string. This is done by calling a static method on a static class. The method works out the file extension and passes the file to a private method depending on which extension it finds. So far I've only implemented reading plain text files. Here is the code:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.IO;

namespace Searcher
{
    static public class Searching 
    {
        static public bool Query(string file, string q)
        {
            file = file.ToLower();

            if (file.EndsWith(".txt")) // plain textfiles
            {
                return txt(file, q);
            } // #####################################
            else if (file.EndsWith(".doc"))
            {
                return false;
            } // #####################################
            else if (file.EndsWith(".dll")) // Ignore these
            {
                return false;
            }
            else if (file.EndsWith(".exe")) // Ignore these
            {
                return false;
            }
            else // will try reading as a textfile
            {
                return txt(file, q);
            }
        }

        static private bool txt(string file, string q)
        {
            string contents;
            TextReader read = new StreamReader(file);
            contents = read.ReadToEnd();
            read.Dispose();
            read.Close();

            return contents.ToLower().Contains(q);
        }

        static private bool docx(string file, string q)
        {
            return false;
        }
    }
}

Query reads the extension and then forwards the processing. As I have only implemented plain text files so far, there is not much to choose from. Before the search begins I also tell my program to read every file it can.

Now here is my problem: although the reader can only handle plain text files, it also reads images and applications (.exe/.dll). This is expected, as it tries to read everything. The weird thing is that it reports a match. I've searched the files in Notepad++ and found no matches. I also pulled out the content at a breakpoint right after the file is read into the 'contents' variable and searched that, but again found no match. So it looks as if String.Contains() is not searching the content correctly and wrongly believes the given query is in the file.

I hope someone knows what the problem could be. The string I searched for was "test", and the program works when searching text files.

Glad you found a solution.

I'd still like to see some of the offending "false positive" files so I can have a look.

In the meantime, and a bit of a tangent but still relevant, I'd change your txt function to:

static private bool txt(string file, string q)
{
    string contents = "";
    using (TextReader read = new StreamReader(file))
    {
        contents = read.ReadToEnd();
    }

    return contents.ToLower().Contains(q);
}

It's cleaner that way.

Edit:
Well, the reason they return true is that those files do contain the string "test" (ignoring case). Specifically: [CCP_TESTRMCCPSearchValidateProductIDSetODBCFoldersAllocateRegistrySpaceNOT] in the MSI and [OnUpdateString] in the dll, where "teSt" spans the end of "Update" and the start of "String". So String.Contains() is working properly.
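To make the match easy to reproduce, here is a minimal sketch run against the identifier quoted above. The names `ContainsDemo` and `ContainsIgnoreCase` are made up for illustration. It uses `String.IndexOf` with `StringComparison.OrdinalIgnoreCase` instead of lower-casing, which also covers the case where `q` itself contains capitals (the original `txt` lower-cases the contents but not `q`).

```csharp
using System;

static class ContainsDemo
{
    // Case-insensitive substring check that does not rely on the caller
    // lower-casing the query first.
    public static bool ContainsIgnoreCase(string contents, string q)
    {
        return contents.IndexOf(q, StringComparison.OrdinalIgnoreCase) >= 0;
    }
}
```

With this, `ContainsDemo.ContainsIgnoreCase("OnUpdateString", "test")` is true, because the substring search has no notion of word boundaries.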

So, back to filtering what you search. Either keep a list of known text file extensions, or let the user choose which ones to search.
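A list of known text extensions could be sketched roughly like this. The class name `TextFileFilter`, the method `IsSearchable`, and the extensions in the set are assumptions for illustration, not a complete list; the set could just as well be filled from user input.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

static class TextFileFilter
{
    // Hypothetical whitelist; extend it or populate it from user settings.
    private static readonly HashSet<string> TextExtensions =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase)
        {
            ".txt", ".log", ".csv", ".xml", ".ini", ".cs"
        };

    // True only when the file's extension is in the whitelist.
    // Files without an extension yield "" and are therefore skipped.
    public static bool IsSearchable(string file)
    {
        return TextExtensions.Contains(Path.GetExtension(file));
    }
}
```

Query could call `TextFileFilter.IsSearchable(file)` first and return false for everything else, rather than falling through to `txt` for unknown extensions. The case-insensitive comparer also removes the need to lower-case the path.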

Another thing you might want to consider is searching only for whole words, so that "test" doesn't match inside OnUpdateString :)
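One way to do a whole-word search is a regular expression with `\b` word boundaries. This is only a sketch; `WholeWordSearch` and `ContainsWord` are hypothetical names, and it assumes the query begins and ends with a letter, digit, or underscore, since `\b` only marks a boundary next to such characters.

```csharp
using System;
using System.Text.RegularExpressions;

static class WholeWordSearch
{
    // True when q occurs in contents as a whole word, ignoring case.
    public static bool ContainsWord(string contents, string q)
    {
        // Regex.Escape keeps characters such as '.' or '*' in q literal.
        string pattern = @"\b" + Regex.Escape(q) + @"\b";
        return Regex.IsMatch(contents, pattern, RegexOptions.IgnoreCase);
    }
}
```

Here `ContainsWord("OnUpdateString", "test")` is false, while `ContainsWord("run the Test now", "test")` is true. Note that the underscore counts as a word character, so the `CCP_TEST...` identifier from the MSI does not match either.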

Text extensions: on wiki, on fileinfo

I tried this with a .dll and an .exe file, and it worked fine for me. You are getting true because the value you are searching for is present in the file. Try opening the file in Notepad and searching for the value.

Also try searching for some other string, such as "eafrd", instead of "test" (a dictionary word that can easily appear in dll or exe files). That returned false for me.

Also pick any random word from the file you opened in Notepad and try searching for it.
