Specific word extraction from a text file in C#

I am currently building a method that goes through a text file using a StreamReader. I want to use either a regex or something similar to change the current method, which is shown just below.

using (StreamReader fs = File.OpenText(FilePath))
{

    int count = 0; //counts the number of times wordResponse is found.
    int lineNumber = 0;
    while (!fs.EndOfStream)
    {
        string line = fs.ReadLine();
        lineNumber++;
        int position = line.IndexOf(WordSearch);
        if (position != -1)
        {
            count++;
            Console.WriteLine("Match#{0} line {1}: {2}", count, lineNumber, line);
        }
    }

    if (count == 0)
    {
        Console.WriteLine("your word was not found!");
    }
    else
    {
        Console.WriteLine("Your word was found " + count + " times!");
    }
    Console.WriteLine("Press enter to quit.");
    Console.ReadKey();
}

The output I get from the current method is:

Match#1 line 3: Proin eleifend tortor velit, **True** quis aliquam arcu congue ut. Fusce sed mattis purus, sed vehicula diam. Nullam in leo sit amet massa pharetra semper et vel diam.
Match#2 line 7: lobortis nisl. Fusce dignissim ligula **True** a nunc maximus, vitae sollicitudin erat dictum. Vivamus commodo massa a tellus gravida posuere.
Match#3 line 17: **True** Sed pellentesque ipsum vel neque accumsan, quis fermentum augue pretium. Praesent fermentum risus nec ultricies sodales.
Match#4 line 24: Fusce nulla risus, ornare in eleifend id, **True** tincidunt eu sem. Donec enim sapien, rhoncus vitae ex lobortis, sagittis molestie libero.
Your word was found 4 times!
Press enter to quit.

As you can see, I get the entire line, when all I want is a single word from each sentence. The word it is searching for right now is True.

I believe it is the line string line = fs.ReadLine(); that I have to put through a few extra steps to get the result I want.

Any tips or pointers would be appreciated.

Is it as simple as this?

Console.WriteLine("Match#{0} line {1}: {2}", count, lineNumber, WordSearch);

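You just need to add this after the int position = ... line, inside the if (position != -1) check, so that Substring is never called with a position of -1:

var word = line.Substring(position, WordSearch.Length);

Then print it instead of the whole line: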

Console.WriteLine("Match#{0} line {1}: {2}", count, lineNumber, word);
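
To show where the Substring call sits relative to the IndexOf check, here is a minimal sketch rather than a drop-in replacement. It assumes C# 9 or later (top-level statements). FindWord and the sample lines are hypothetical names introduced only for illustration, and an in-memory array stands in for the text file.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not from the original post): returns one entry per
// line that contains wordSearch, holding the line number and the matched text.
static List<(int LineNumber, string Word)> FindWord(IEnumerable<string> lines, string wordSearch)
{
    var hits = new List<(int LineNumber, string Word)>();
    int lineNumber = 0;
    foreach (string line in lines)
    {
        lineNumber++;
        // Ordinal comparison: an exact, case-sensitive search.
        int position = line.IndexOf(wordSearch, StringComparison.Ordinal);
        if (position != -1)
        {
            // Cut the word out only after the check; Substring throws for -1.
            hits.Add((lineNumber, line.Substring(position, wordSearch.Length)));
        }
    }
    return hits;
}

// Sample lines standing in for the text file (an assumption for this sketch).
string[] sample =
{
    "Lorem ipsum dolor sit amet.",
    "Proin eleifend tortor velit, True quis aliquam arcu.",
    "Fusce dignissim ligula True a nunc maximus.",
};

var found = FindWord(sample, "True");
for (int i = 0; i < found.Count; i++)
{
    Console.WriteLine("Match#{0} line {1}: {2}", i + 1, found[i].LineNumber, found[i].Word);
}
Console.WriteLine("Your word was found " + found.Count + " times!");
// Prints:
// Match#1 line 2: True
// Match#2 line 3: True
// Your word was found 2 times!
```

Like the original loop, this reports at most one hit per line, because IndexOf returns only the first occurrence.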

I want to use either regex or something similar...

Since you mention an interest in changing your current implementation to use a regular expression, I'll offer up this snippet:

// Group 1 captures the search word; \\b in the string is the regex \b word boundary.
var match = Regex.Match(line, $".*({WordSearch})\\b.*", RegexOptions.IgnoreCase);
if (match.Success)
{
    count++;
    Console.WriteLine($"Match#{count} line {lineNumber}: {match.Groups[1]}");
}

Passing RegexOptions.IgnoreCase to Regex.Match seemed appropriate, along with adding \\b (the regex word boundary \b, escaped for a C# string) to the expression to limit partial matches.
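
If several occurrences per line should be reported, or the search term might contain regex metacharacters, a variation along these lines may help. It is only a sketch under stated assumptions: C# 9 or later (top-level statements), a search term that begins and ends with a letter or digit so that \b behaves as expected, and a hypothetical helper named FindWholeWord with made-up sample lines.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper (not from the original post): collects every
// whole-word, case-insensitive occurrence of wordSearch with its line number.
static List<(int LineNumber, string Word)> FindWholeWord(IEnumerable<string> lines, string wordSearch)
{
    // Regex.Escape keeps characters such as '.' or '+' in the term literal;
    // \b on both sides rejects partial matches such as "Truest".
    var pattern = new Regex($@"\b{Regex.Escape(wordSearch)}\b", RegexOptions.IgnoreCase);
    var hits = new List<(int LineNumber, string Word)>();
    int lineNumber = 0;
    foreach (string line in lines)
    {
        lineNumber++;
        // Matches returns every non-overlapping occurrence on the line.
        foreach (Match m in pattern.Matches(line))
        {
            hits.Add((lineNumber, m.Value));
        }
    }
    return hits;
}

// Sample lines standing in for the text file (an assumption for this sketch).
string[] sample =
{
    "Lorem ipsum.",
    "It is true that Truest is not True.",
};

var found = FindWholeWord(sample, "True");
for (int i = 0; i < found.Count; i++)
{
    Console.WriteLine($"Match#{i + 1} line {found[i].LineNumber}: {found[i].Word}");
}
// Prints:
// Match#1 line 2: true
// Match#2 line 2: True
```

Because m.Value is taken from the line itself, the output keeps the casing found in the text rather than the casing of the search term.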

