
Get string plus a certain amount of characters after from text

I couldn't seem to find an answer to my question after looking around for a bit.

I have a piece of text: "this is some text level 190 this is some more text sells for 1999"

I need to get the 190 value. I would just extract all the numbers, but most texts contain more than one set of numbers.

I've tried taking a substring at "level 190", but I end up with extra characters after it. How can I drop everything after "level 190" and pull out only that specific text?

Code example:

string CurrentItem = "ID 12321 ITEM name is ITEM level 100 cost some gold around 129";
string a = "level";

var index = CurrentItem.ToLower().IndexOf(a);
var final = index + 9; // Index of "level" is 27; 9 = length of "level" (5) + 1 space + 3 digits
string CurrentItemSub = CurrentItem.Substring(index, final); // Sub it

MessageBox.Show(CurrentItemSub);

Here are two different ways of getting the value of "level"--assuming that the word "level" precedes the desired value.

Given the following user input: ID 12321 has level 100 and the cost is around 129.

Option 1 (using Regex)

Add using statement:

using System.Text.RegularExpressions;

Create a Match

Note :

  • ^ indicates the match must start at the beginning of the string; in multiline mode, it must start at the beginning of the line.
  • $ indicates the match must occur at the end of the string or before \n at the end of the string; in multiline mode, it must occur before the end of the line or before \n at the end of the line.
  • . matches any single character except \n. To match a literal period character (. or \u002E), you must precede it with the escape character (\.).
  • * matches the previous element 0 or more times
  • + matches the previous element 1 or more times
  • ? matches the previous element 0 or 1 time
  • *? matches the previous element 0 or more times, but as few times as possible
  • +? matches the previous element 1 or more times, but as few times as possible
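The difference between the greedy and lazy quantifiers in the list above is easiest to see side by side. The following is a minimal sketch; the class name QuantifierDemo, the helper FirstMatch, and the sample text are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class QuantifierDemo
{
    // Hypothetical helper: returns the text of the first match, or "" if there is none.
    public static string FirstMatch(string input, string pattern)
    {
        return Regex.Match(input, pattern).Value;
    }

    public static void Main()
    {
        string input = "level 100 cost 129";

        // Greedy: .+ takes as many characters as it can, so the match
        // runs all the way to the last digit in the string.
        Console.WriteLine(FirstMatch(input, @"level.+\d"));   // level 100 cost 129

        // Lazy: .+? takes as few characters as it can, so the match
        // stops at the first digit it reaches.
        Console.WriteLine(FirstMatch(input, @"level.+?\d"));  // level 1
    }
}
```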

We can use the following pattern which uses a named group:

^.+level\s+(?<level>\d+).+$

Note : There may be other patterns that one can write that will also result in the desired data.

^ indicates that the match must start at the beginning of the string (or line).

.+ matches any character, except \n, 1 or more times.

level matches the word "level".

\s+ matches 1 or more whitespace characters.

The format for a named group is (?<nameOfGroup>patternToMatch). So (?<level>\d+) matches 1 or more digits and places them in a group named "level".

.+ matches any character, except \n, 1 or more times. Because of the leading and trailing .+, at least one character must come before "level" and at least one must follow the digits.

$ indicates the match must occur at the end of the string or before \n at the end of the string; in multiline mode, it must occur before the end of the line or before \n at the end of the line.

See Regular Expression Language - Quick Reference

Match match = Regex.Match(userInput, "^.+level\\s+(?<level>\\d+).+$", RegexOptions.IgnoreCase);
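One caveat with this pattern: the trailing .+ needs at least one character after the digits, so when the level is the last thing in the text the regex engine backtracks and gives up the final digit. The sketch below compares it with an unanchored pattern. That second pattern is an alternative I am assuming fits the use case, not part of the original answer, and LevelPattern and Find are hypothetical names.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LevelPattern
{
    // Hypothetical helper: returns the "level" group, or "" when the pattern does not match.
    public static string Find(string input, string pattern)
    {
        Match m = Regex.Match(input, pattern, RegexOptions.IgnoreCase);
        return m.Success ? m.Groups["level"].Value : string.Empty;
    }

    public static void Main()
    {
        const string anchored = @"^.+level\s+(?<level>\d+).+$";
        // Assumed alternative: no anchors and no surrounding .+
        const string unanchored = @"level\s+(?<level>\d+)";

        string atEnd = "ID 12321 ITEM level 100";

        // The trailing .+ must consume one character, so \d+ keeps only "10".
        Console.WriteLine(Find(atEnd, anchored));    // 10

        // Nothing is required after the digits, so \d+ keeps all of them.
        Console.WriteLine(Find(atEnd, unanchored));  // 100
    }
}
```

When text follows the number, as in the sample input, both patterns return the same value.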

Check if there are any matches and do something with the result:

if (match.Success && match.Groups.Count > 1)
{
    for (int i = 0; i < match.Groups.Count; i++)
    {
        Group group = match.Groups[i];
        System.Diagnostics.Debug.WriteLine("group [" + i + "]:  Name: '" + group.Name + "' Value: " + group.Value);
    }

    System.Diagnostics.Debug.WriteLine("Level: '" + match.Groups["level"].ToString() + "'");
}

Here's a method that implements the above:

GetLevelRegex :

private string GetLevelRegex(string userInput)
{
    string level = string.Empty;

    Match match = Regex.Match(userInput, "^.+level\\s+(?<level>\\d+).+$", RegexOptions.IgnoreCase);

    if (match.Success && match.Groups.Count > 1)
    {
        for (int i = 0; i < match.Groups.Count; i++)
        {
            Group group = match.Groups[i];
            System.Diagnostics.Debug.WriteLine("group [" + i + "]:  Name: '" + group.Name + "' Value: " + group.Value);
        }

        level = match.Groups["level"].ToString();
        //System.Diagnostics.Debug.WriteLine("Level: '" +  level + "'");
    }

    return level;
}
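If the level is needed as a number rather than a string, the captured digits can be passed to int.TryParse. This is a rough sketch only: LevelReader and TryGetLevel are hypothetical names, and it assumes a shorter, unanchored pattern instead of the one used in GetLevelRegex.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LevelReader
{
    // Hypothetical helper: true when a level was found and fits in an int.
    public static bool TryGetLevel(string text, out int level)
    {
        level = 0;

        if (string.IsNullOrEmpty(text))
        {
            return false;
        }

        // Assumed pattern: the word "level", whitespace, then the digits to capture.
        Match m = Regex.Match(text, @"level\s+(?<level>\d+)", RegexOptions.IgnoreCase);

        return m.Success && int.TryParse(m.Groups["level"].Value, out level);
    }

    public static void Main()
    {
        string text = "this is some text level 190 this is some more text sells for 1999";

        if (TryGetLevel(text, out int level))
        {
            Console.WriteLine(level); // 190
        }
    }
}
```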

Option 2 (without Regex)

Declare a variable:

string level = string.Empty;

Ensure user input is not null or empty and contains the word "level":

if (!String.IsNullOrEmpty(userInput) && userInput.IndexOf("level", StringComparison.OrdinalIgnoreCase) >= 0)
{

}

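Multiple spaces after "level" do not need to be collapsed first, because TrimStart() in the next step removes all leading whitespace. Note that String.Replace does a literal replacement, not a regex one, so userInput.Replace(@"\s+", " ") would only replace the literal text \s+ and leave the spaces untouched.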

Get substring starting after the word "level"; remove space

level = userInput.Substring(userInput.IndexOf("level", StringComparison.OrdinalIgnoreCase) + 5).TrimStart();

Which results in the following string: 100 and the cost is around 129.

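Now, the desired value is at the beginning of the string and ends when a space occurs, or at the end of the string if there is no space. Get the desired value, checking for -1 so that Substring doesn't throw when nothing follows the value:

int spaceIndex = level.IndexOf(" ");
if (spaceIndex >= 0)
{
    level = level.Substring(0, spaceIndex);
}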

"level" now contains the following string: 100

Here's a method that implements the above:

GetLevel :

private string GetLevel(string userInput)
{
    string level = string.Empty;

    //ensure user input isn't null or empty AND user input contains the word "level"
    if (!String.IsNullOrEmpty(userInput) && userInput.IndexOf("level", StringComparison.OrdinalIgnoreCase) >= 0)
    {
        //get substring starting after the word "level"; TrimStart removes any whitespace that follows it
        level = userInput.Substring(userInput.IndexOf("level", StringComparison.OrdinalIgnoreCase) + 5).TrimStart();

        //get text until a space is encountered (keep everything if there is no space)
        int spaceIndex = level.IndexOf(" ");
        if (spaceIndex >= 0)
        {
            level = level.Substring(0, spaceIndex);
        }

        System.Diagnostics.Debug.WriteLine("level: '" + level + "'");
    }

    return level;
}
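Cutting at the first space keeps any punctuation attached to the number (for example "42," in "level 42, rare"). A possible variation, sketched here under the assumption that the level is always a run of digits, is to scan characters with char.IsDigit instead. LevelScanner and GetLevelDigits are hypothetical names.

```csharp
using System;

public static class LevelScanner
{
    // Hypothetical helper: returns the digits that follow the word "level", or "" if none are found.
    public static string GetLevelDigits(string input)
    {
        if (string.IsNullOrEmpty(input))
        {
            return string.Empty;
        }

        int start = input.IndexOf("level", StringComparison.OrdinalIgnoreCase);
        if (start < 0)
        {
            return string.Empty;
        }

        // Move past the word "level" and any whitespace after it.
        start += "level".Length;
        while (start < input.Length && char.IsWhiteSpace(input[start]))
        {
            start++;
        }

        // Advance over consecutive digits; stop at the first non-digit or at the end of the string.
        int end = start;
        while (end < input.Length && char.IsDigit(input[end]))
        {
            end++;
        }

        return input.Substring(start, end - start);
    }

    public static void Main()
    {
        Console.WriteLine(GetLevelDigits("ID 12321 has level 100 and the cost is around 129.")); // 100
        Console.WriteLine(GetLevelDigits("ITEM level 100"));                                     // 100
        Console.WriteLine(GetLevelDigits("level 42, rare"));                                     // 42
    }
}
```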
