RegEx: remove part of a string and replace another part

I have a challenge getting the desired result with RegEx (using C#) and I hope that the community can help.

I have a URL in the following format: https://somedomain.com/subfolder/category/?abc=text:value&ida=0&idb=1

I want to make two modifications, specifically:

1) Remove everything after 'value', e.g. '&ida=0&idb=1'

2) Replace 'category' with, e.g., 'newcategory'

So the result is: https://somedomain.com/subfolder/newcategory/?abc=text:value

I can handle 1) with a pattern such as ^[^&]+, which matches everything before the first '&', but I have been unable to figure out how to replace the 'category' substring.

Any help or guidance would be much appreciated.

Thank you in advance.

Use the following:

  • Find: /(category/.+?value)&.+
  • Replace: /new$1 or /new\1 depending on your regex flavor
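As a minimal sketch of how the find/replace pair above behaves, here it is applied with Python's re.sub, which uses the \1 form of the backreference. The variable names are only illustrative. In C#, Regex.Replace with $1 in the replacement string should behave the same way.

```python
import re

url = "https://somedomain.com/subfolder/category/?abc=text:value&ida=0&idb=1"

# Group 1 captures "category/...value"; the trailing "&.+" is matched but not captured,
# so it is dropped, and "new" is prepended to the captured text.
result = re.sub(r'/(category/.+?value)&.+', r'/new\1', url)

print(result)  # https://somedomain.com/subfolder/newcategory/?abc=text:value
```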


Update according to comment.

If the new name is completely_different_name, use the following:

  • Find: /category(/.+?value)&.+
  • Replace: /completely_different_name$1
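The same idea can be sketched in Python with the new name passed in as a parameter. The helper name rename_category is hypothetical, not part of the answer. A function is used as the replacement so that any backslashes in the new name are not interpreted as escapes.

```python
import re

def rename_category(url, new_name):
    """Hypothetical helper: swap 'category' for new_name and drop everything after 'value'."""
    # Group 1 captures "/...value"; "category" itself sits outside the group and is discarded.
    return re.sub(
        r'/category(/.+?value)&.+',
        lambda m: '/' + new_name + m.group(1),
        url,
    )

url = "https://somedomain.com/subfolder/category/?abc=text:value&ida=0&idb=1"
renamed = rename_category(url, "completely_different_name")
print(renamed)  # https://somedomain.com/subfolder/completely_different_name/?abc=text:value
```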

You haven't specified a language here. I mainly work in Python, so this solution is in Python.

import re
url = re.sub('category', 'newcategory', re.search('^https.*value', value).group(0))  # value holds the original URL

Explanation: re.sub(a, b, c) replaces every match of pattern a with b in string c.

re.search finds the first match of a pattern in a string and returns a match object; calling .group(0) on it returns the whole matched text. In the code above, re.search matches everything from "https" through "value", and group(0) returns that text.
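To make the two steps easier to follow, here is a sketch that splits them apart and guards against re.search returning None when the pattern does not match. The fallback of leaving the URL unchanged is an assumption, not something the question specifies.

```python
import re

value = "https://somedomain.com/subfolder/category/?abc=text:value&ida=0&idb=1"

match = re.search(r'^https.*value', value)
if match is not None:
    trimmed = match.group(0)  # text from "https" through "value"
    url = re.sub('category', 'newcategory', trimmed)
else:
    url = value  # assumption: keep the input as-is when nothing matches

print(url)  # https://somedomain.com/subfolder/newcategory/?abc=text:value
```

Note that .* is greedy, so if "value" appeared more than once the match would run to the last occurrence.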

Using Python and only built-in string methods (there is no need for regular expressions here):

url = r"https://somedomain.com/subfolder/category/?abc=text:value&ida=0&idb=1"

new_url = (url.split('value')[0] + "value").replace("category", 'newcategory')

print(new_url)

Outputs:

https://somedomain.com/subfolder/newcategory/?abc=text:value
