
Regular expression for a task currently done with string functions

The code below does the following, and I intend to integrate it into a larger application.

  1. It splits a large input string at every dot (.) character.
  2. It stores the resulting substrings in the array result[].
  3. In the foreach loop, each substring is checked for an occurrence of the keyword.
  4. If a match occurs, up to 300 characters of the original input string are printed, starting from the position of the matched substring.

      string[] result = input.Split('.');
      foreach (string str in result)
      {
          //Console.WriteLine(str);
          Match m = Regex.Match(str, keyword);
          if (m.Success)
          {
              int start = input.IndexOf(str);
              if ((input.Length - start) < 300)
              {
                  Console.WriteLine(input.Substring(start, input.Length - start));
                  break;
              }
              else
              {
                  Console.WriteLine(input.Substring(start, 300));
                  break;
              }
          }
      }

The input is in fact a large amount of text, and I think this should be done with a regular expression. Being a novice, I am not able to put everything together using regular expressions.

To match the keyword: Match m = Regex.Match(str, keyword);

To print 300 characters starting from a dot (.), i.e. starting from the matched sentence: "^.\\w{0,300}"

What I intend to do is:

  1. Search for the keyword in the input text.

  2. As soon as a match is found, start from the sentence containing the keyword and print up to 300 characters of the input string.

How should I proceed? Please help.

If I understood correctly, all you need to do is find your keyword and capture everything that follows it, until the first dot or until the maximum number of characters is reached:

@"keyword([^\.]{0,300})"


C# code:

var regex = new Regex(@"keyword([^\.]{0,300})");

foreach (Match match in regex.Matches(input))
{
   var result = match.Groups[1].Value;

   // work with the result
}
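
If the output should instead start at the beginning of the sentence that contains the keyword, as the question describes, one possible variation is sketched below. This is only a sketch under assumptions: the class and method names (SentenceExcerpt, FromKeywordSentence) are hypothetical, a "sentence" is taken to be whatever lies between two dots, and the keyword is treated as literal text by passing it through Regex.Escape.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original answer.
public static class SentenceExcerpt
{
    // Returns up to maxLength characters of input, starting at the beginning
    // of the first sentence that contains keyword. Returns an empty string
    // when the keyword does not occur at all.
    public static string FromKeywordSentence(string input, string keyword, int maxLength = 300)
    {
        // [^.]* lets the match begin right after the previous dot (or at the
        // start of the text); Regex.Escape makes the keyword match literally.
        Match m = Regex.Match(input, @"[^.]*" + Regex.Escape(keyword));
        if (!m.Success)
        {
            return string.Empty;
        }

        // Clamp the length so Substring never runs past the end of the input.
        int length = Math.Min(maxLength, input.Length - m.Index);
        return input.Substring(m.Index, length);
    }
}
```

Calling SentenceExcerpt.FromKeywordSentence(input, keyword) replaces both the Split loop and the two Substring branches from the question; the Math.Min call covers the case where fewer than 300 characters remain.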

Try this regex:

(?<=\.?)([\w\s]{0,300}keyword.*?)(?=\.)

Explanation:

(?= subexpression) Zero-width positive lookahead assertion.

(?<= subexpression) Zero-width positive lookbehind assertion.

*? Matches the previous element zero or more times, but as few times as possible.

A simple example, using print as the keyword:

foreach (Match match in Regex.Matches(input, 
                                      @"(?<=\.?)([\w\s]{0,300}print.*?)(?=\.)"))
{
    Console.WriteLine(match.Groups[1].Value);
}
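
To see the three constructs explained above in isolation, here is a small sketch. The class name, the method names, and the sample text in the comments are made up for illustration; the patterns are deliberately simpler than the regex in this answer.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical demo class; the sample text used in the comments is
// "One two. Three four."
public static class LookaroundDemo
{
    // Lookahead (?=\.): matches a word only if a dot follows it.
    // The dot itself is not included in the match ("two" for the sample).
    public static string WordBeforeDot(string text) =>
        Regex.Match(text, @"\w+(?=\.)").Value;

    // Lookbehind (?<=\. ): matches a word only if "dot, space" precedes it.
    // The preceding characters are not included ("Three" for the sample).
    public static string WordAfterDot(string text) =>
        Regex.Match(text, @"(?<=\. )\w+").Value;

    // Lazy .*? stops at the first dot ("One two." for the sample).
    public static string UpToFirstDot(string text) =>
        Regex.Match(text, @".*?\.").Value;

    // Greedy .* runs to the last dot (the whole sample text).
    public static string UpToLastDot(string text) =>
        Regex.Match(text, @".*\.").Value;
}
```

Comparing UpToFirstDot with UpToLastDot on the same text shows why the regex in this answer uses .*? before the closing lookahead: the lazy form ends the match at the nearest dot instead of the last one.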

