
Get specific words from a string in C#

I am working on a final year project. I have a file that contains some text. I need to get the words from this file that carry the "//jj" tag, e.g. abc//jj, bcd//jj, etc.

Suppose the file contains the following text:

ffafa adada//bb adad ssss//jj aad adad adadad aaada dsdsd//jj dsdsd sfsfhf//vv dfdfdf

I need all the words that are tagged with //jj. I have been stuck on this for the past few days. This is the code I am trying:

        // Create OpenFileDialog
        Microsoft.Win32.OpenFileDialog dlg = new Microsoft.Win32.OpenFileDialog();

        // Set filter for file extension and default file extension
        dlg.DefaultExt = ".txt";
        dlg.Filter = "Text documents (.txt)|*.txt";

        // Display OpenFileDialog by calling ShowDialog method
        Nullable<bool> result = dlg.ShowDialog();

        // Get the selected file name and display in a TextBox
        string filename = string.Empty;
        if (result == true)
        {
            // Open document
            filename = dlg.FileName;
            FileNameTextBox.Text = filename;
        }

        string text;
        using (var streamReader = new StreamReader(filename, Encoding.UTF8))
        {
            text = streamReader.ReadToEnd();
        }

        string FilteredText = string.Empty;

        string pattern = @"(?<before>\w+) //jj (?<after>\w+)";

        MatchCollection matches = Regex.Matches(text, pattern);

        for (int i = 0; i < matches.Count; i++)
        {
            FilteredText="before:" + matches[i].Groups["before"].ToString();
            //Console.WriteLine("after:" + matches[i].Groups["after"].ToString());
        }

        textbx.Text = FilteredText;

I can't get the result I expect. Please help.

With LINQ you could do this in one line:

string[] taggedwords = input.Split(' ').Where(x => x.EndsWith(@"//jj")).ToArray();

And all your //jj words will be there...
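
As a slightly fuller sketch of the same idea: text read from a file usually contains line breaks and repeated spaces, so splitting on a single space can leave a tagged word glued to a newline. The version below splits on any whitespace instead. The names `TagFilter` and `WordsWithTag` are made up for illustration and are not part of the answer above.

```csharp
using System;
using System.Linq;

public static class TagFilter
{
    // Hypothetical helper: returns every whitespace-separated token that ends with `tag`.
    public static string[] WordsWithTag(string text, string tag)
    {
        // A null separator array makes Split break on any whitespace
        // (spaces, tabs, newlines); empty tokens are dropped.
        return text
            .Split((char[])null, StringSplitOptions.RemoveEmptyEntries)
            .Where(word => word.EndsWith(tag, StringComparison.Ordinal))
            .ToArray();
    }
}
```

It could be called as `string[] tagged = TagFilter.WordsWithTag(text, "//jj");`, where `text` is the string read from the file.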

Personally I think Regex is overkill if that's definitely how the string will look. You haven't specified that you definitely need to use Regex so why not try this instead?

// A list that will hold the words ending with '//jj'
List<string> results = new List<string>();

// The text you provided
string input = @"ffafa adada//bb adad ssss//jj aad adad adadad aaada dsdsd//jj dsdsd sfsfhf//vv dfdfdf";

// Split the string on the space character to get each word
string[] words = input.Split(' ');

// Loop through each word
foreach (string word in words)
{
    // Does it end with '//jj'?
    if(word.EndsWith(@"//jj"))
    {
        // Yes, add to the list
        results.Add(word);
    }
}

// Show the results
foreach(string result in results)
{
    MessageBox.Show(result);
}

Results are:

ssss//jj
dsdsd//jj

Obviously this is not quite as robust as a regex, but you didn't provide any more detail for me to go on.

Your regex has an extra space: it assumes there's a space before "//jj". What you want is:

 string pattern = @"(?<before>\w+)//jj (?<after>\w+)";
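
One more thing in the question's code may be worth checking: the loop assigns to `FilteredText` with `=`, so each iteration overwrites the previous one and only the last match would be displayed. A minimal sketch of collecting every match instead, assuming the goal is to show all tagged words (`TaggedWordFormatter` and `WordsBeforeTag` are hypothetical names):

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class TaggedWordFormatter
{
    // Hypothetical helper: collects the "before" group of every match
    // instead of overwriting a single string on each iteration.
    public static List<string> WordsBeforeTag(string text)
    {
        // Same pattern as above: no space between the word and //jj.
        const string pattern = @"(?<before>\w+)//jj (?<after>\w+)";
        var words = new List<string>();
        foreach (Match match in Regex.Matches(text, pattern))
        {
            words.Add(match.Groups["before"].Value);
        }
        return words;
    }
}
```

The result could then be shown with `textbx.Text = string.Join(Environment.NewLine, TaggedWordFormatter.WordsBeforeTag(text));`. Because this pattern still requires a space and another word after the tag, a tagged word at the very end of the text would not match.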

This regular expression will yield the words you are looking for:

string pattern = "(\\S*)\\/\\/jj";

A bit nicer without backslash escaping:

(\S*)\/\/jj

Each match will include the //jj, but you can get the bare word from the first capture group.
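
A minimal sketch of reading that first group, assuming the text has already been loaded into a string (`JjExtractor` and `Extract` are hypothetical names). It uses a C# verbatim string, so the backslash needs no doubling, and it drops the `\/` escapes, since `/` has no special meaning in .NET regular expressions:

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class JjExtractor
{
    // Hypothetical helper: returns the words tagged with //jj, without the tag.
    public static List<string> Extract(string text)
    {
        var words = new List<string>();
        foreach (Match match in Regex.Matches(text, @"(\S*)//jj"))
        {
            // Groups[0] is the whole match (word plus //jj);
            // Groups[1] is the part captured by (\S*), i.e. the word alone.
            words.Add(match.Groups[1].Value);
        }
        return words;
    }
}
```

Note that `\S*` can also match nothing, so a bare `//jj` with no word in front of it would yield an empty string; using `\S+` instead would skip that case.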
