Extract substring from string with Regex

Imagine that users are inserting strings in several computers.

On one computer, the pattern in the configuration will extract some characters of that string, let's say positions 4 to 5 (counting from 0). On another computer, the extract pattern will return other characters, for instance the last 3 characters of the string.

These configurations (the Regex patterns) are different for each computer, and should be available for change by the administrator, without having to change the source code.

Some examples:

         Original_String       Return_Value
User1 -  abcd78defg123         78
User2 -  abcd78defg123         78g1
User3 -  mm127788abcd          12
User4 -  123456pp12asd         ppsd

Can it be done with Regex? Thanks.

Why do you want to use regex for this? What is wrong with:

string foo = s.Substring(4,2);
string bar = s.Substring(s.Length-3,3);

(you can wrap those up to do a bit of bounds-checking on the length easily enough)
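As a rough sketch of the bounds-checking mentioned above, the two calls could be wrapped like this. The class and method names (SafeSubstring, Slice, Last) are made up for illustration, and returning an empty string for out-of-range input is an assumption rather than anything the question asked for.

```csharp
using System;

public static class SafeSubstring
{
    // Returns up to `length` characters starting at zero-based `start`,
    // or an empty string when `start` lies outside the input.
    public static string Slice(string s, int start, int length)
    {
        if (string.IsNullOrEmpty(s) || start < 0 || length <= 0 || start >= s.Length)
            return string.Empty;
        // Clamp the length so Substring never runs past the end of the string.
        return s.Substring(start, Math.Min(length, s.Length - start));
    }

    // Returns the last `count` characters, or the whole string if it is shorter.
    public static string Last(string s, int count)
    {
        if (string.IsNullOrEmpty(s) || count <= 0)
            return string.Empty;
        return s.Length <= count ? s : s.Substring(s.Length - count);
    }
}
```

With the sample string "abcd78defg123", SafeSubstring.Slice(value, 4, 2) gives "78" and SafeSubstring.Last(value, 3) gives "123"; a too-short input such as "abc" simply yields an empty string from Slice instead of throwing.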

If you really want, you could wrap it up in a Func<string,string> to put somewhere - not sure I'd bother, though:

Func<string, string> get4and5 = s => s.Substring(4, 2);
Func<string,string> getLast3 = s => s.Substring(s.Length - 3, 3);
string value = "abcd78defg123";
string foo = getLast3(value);
string bar = get4and5(value);

If you really want to use regex (skip four characters, then capture the next two, which matches Substring(4, 2)):

^.{4}(..)

And:

.*(...)$
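For completeness, here is a minimal sketch of how such a pattern might be applied in C# and the capture read back. The helper name FirstGroup is hypothetical, and returning an empty string when nothing matches is just one possible choice. The first pattern used in the note below skips four characters so that the capture lines up with Substring(4, 2).

```csharp
using System.Text.RegularExpressions;

public static class RegexExtract
{
    // Returns the text of the first capture group of `pattern` in `input`,
    // or an empty string when the pattern does not match.
    public static string FirstGroup(string input, string pattern)
    {
        Match m = Regex.Match(input, pattern);
        // Groups[0] is the whole match; Groups[1] is the first parenthesised group.
        return m.Success ? m.Groups[1].Value : string.Empty;
    }
}
```

RegexExtract.FirstGroup("abcd78defg123", "^.{4}(..)") returns "78", and RegexExtract.FirstGroup("abcd78defg123", ".*(...)$") returns "123".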

I'm not sure what you are hoping to get by using RegEx. RegEx is used for pattern matching. If you want to extract based on position, just use substring.

It seems to me that Regex really isn't the solution here. To return a section of a string beginning at position pos (starting at 0) and of length length, you simply call the Substring method:

string section = str.Substring(pos, length);

Grouping. You could match on /^.{4}(.{2})/ and then look at group 1 ($1), which gives 78 for abcd78defg123.

The question is why? Normal string handling, i.e. the actual substring methods, is going to be faster and clearer in intent.

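To have a regex capture values for further use you typically use parentheses (). Depending on the regex flavor they may need escaping, e.g. \( \) in POSIX basic regular expressions. .NET uses plain (); square brackets [] define a character class there, not a capture.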

Example

User4 -  123456pp12asd         ppsd  

is the most interesting in that you have two separate capture areas here. Is there some default rule for joining them together, or would you want to be able to specify how the result is built?

Perhaps something like

r/......(..)...(..)/\1\2/  for ppsd
r/......(..)...(..)/\2-\1/ for sd-pp

Do you want to run a regex to get the captures and handle them yourself, or do you want to run more advanced manipulation commands?
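In .NET, one way to approximate the pattern-plus-template idea is Match.Result, which expands $1, $2, ... against a match. The sketch below assumes the administrator stores two strings per computer, a pattern and an output template; the class name ConfiguredExtractor and both parameter names are hypothetical, and how the values are loaded from configuration is left out.

```csharp
using System.Text.RegularExpressions;

public static class ConfiguredExtractor
{
    // `pattern` and `template` stand in for values an administrator would
    // keep in a per-computer configuration; both names are hypothetical.
    public static string Extract(string input, string pattern, string template)
    {
        Match m = Regex.Match(input, pattern);
        // Match.Result substitutes $1, $2, ... in the template with the
        // corresponding capture groups of this match.
        return m.Success ? m.Result(template) : string.Empty;
    }
}
```

With the pattern "^.{6}(..).{3}(..)", the input "123456pp12asd" gives "ppsd" for the template "$1$2" and "sd-pp" for "$2-$1". The User2 case works the same way: "^.{4}(..).{3}(..)" with "$1$2" turns "abcd78defg123" into "78g1".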
