Regex to capture the string between a word and the first occurrence of a character

I want to capture the string after the last slash and before the first occurrence of a backslash (\).

Sample data:

  1. sessionId=30a793b1-ed7e-464a-a630; Url=https://www.example.com/mybook/order/newbooking/itemSummary; sid=KJ4dgQGdhg7dDn1h0TLsqhsdfhsfhjhsdjfhjshdjfhjsfddscg139bjXZQdkbHpzf9l6wy1GdK5XZp; ,"myreferer":"https://www.example.com/mybook/order/newbooking/itemSummary/amex","Accept":"application/json, application/javascript","sessionId":"ggh76734", targetUrl=https://www.example.com/mybook/order/newbooking/page1?id=122;

  2. sessionId=sfdsdfsd-ba57-4e21-a39f-34; Url=https://www.example.com/mybook/order/newbooking/itemList?id=76734&para=jhjdfhj&type=new&ordertype=kjkf&memberid=273647632&iSearch=true; sid=Q4hWgR1GpQb8xWTLpQB2yyyzmYRgXgFlJLGTc0QJyZbW; ,"myreferer":"https://www.example.com/mybook/order/newbooking/itemList/basket","Accept":"application/json, application/javascript","sessionId":"ggh76734", targetUrl=https://www.example.com/ mybook/order/newbooking/page1?id=123;

  3. sessionId=0e1acab1-45b8-sdf3454fds-afc1-sdf435sdfds; Url=https://www.example.com/mybook/order/newbooking/; sid=hkm2gRSL2t5ScKSJKSJn3vg2sfdsfdsfdsfdsfdfdsfdsfdsfvJZkDD3ng0kYTjhNQw8mFZMn; ,"myreferer":"https://www.example.com/mybook/order/newbooking/itemList/","Accept":"application/json, application/javascript","sessionId":"ggh76734",targetUrl=https://www.example.com/mybook/order/newbooking/page1?id=343;List item

  4. sessionId=sfdsdfsd-ba57-4e21-a39f-34; Url=https://www.example.com/mybook/order/newbooking/itemList?id=76734&para=jhjdfhj&type=new&ordertype=kjkf&memberid=273647632&iSearch=true; sid=Q4hWgR1GpQb8xWTLpQB2yyyzmYRgXgFlJLGTc0QJyZbW; ,"myreferer":"https://www.example.com/mybook/order/newbooking/itemList/basket?id=76734&para=jhjdfhj&type=new&ordertype=kjkf", "Accept":"application/json, application/javascript","sessionId":"ggh76734", targetUrl=https://www.example.com/ mybook/order/newbooking/page1?id=123;

Expected output:

  1. amex
  2. basket
  3. '' (empty string)
  4. basket

I have built the regex below to capture it, but it is not 100% accurate: it captures some additional text.

Regex

\bmyreferer\\\":\\\"\S+\/(.*?)\\\",

Could you please help me improve the regex to get the desired output?

You could use a negated character class with a capture group:

\bmyreferer":"[^"]+/([^/"]*)"
  • \bmyreferer":" Match myreferer":" literally, preceded by a word boundary
  • [^"]+/ Match any character except " one or more times, followed by a /
  • ( Start capture group 1
    • [^/"]* Match any character except / and " zero or more times (so an empty string also matches)
  • )" Close group 1 and match "


Example code

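import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Same pattern as above, escaped for a Java string literal
String regex = "\\bmyreferer\":\"[^\"]+/([^/\"]*)\"";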
String string = "sessionId=30a793b1-ed7e-464a-a630; Url=https://www.example.com/mybook/order/newbooking/itemSummary; sid=KJ4dgQGdhg7dDn1h0TLsqhsdfhsfhjhsdjfhjshdjfhjsfddscg139bjXZQdkbHpzf9l6wy1GdK5XZp; ,\"myreferer\":\"https://www.example.com/mybook/order/newbooking/itemSummary/amex\",\"Accept\":\"application/json, application/javascript\",\"sessionId\":\"ggh76734\", targetUrl=https://www.example.com/mybook/order/newbooking/page1?id=122;\n\n"
+ "sessionId=sfdsdfsd-ba57-4e21-a39f-34; Url=https://www.example.com/mybook/order/newbooking/itemList?id=76734&para=jhjdfhj&type=new&ordertype=kjkf&memberid=273647632&iSearch=true; sid=Q4hWgR1GpQb8xWTLpQB2yyyzmYRgXgFlJLGTc0QJyZbW; ,\"myreferer\":\"https://www.example.com/mybook/order/newbooking/itemList/basket\",\"Accept\":\"application/json, application/javascript\",\"sessionId\":\"ggh76734\", targetUrl=https://www.example.com/ mybook/order/newbooking/page1?id=123;\n\n"
+ "sessionId=0e1acab1-45b8-sdf3454fds-afc1-sdf435sdfds; Url=https://www.example.com/mybook/order/newbooking/; sid=hkm2gRSL2t5ScKSJKSJn3vg2sfdsfdsfdsfdsfdfdsfdsfdsfvJZkDD3ng0kYTjhNQw8mFZMn; ,\"myreferer\":\"https://www.example.com/mybook/order/newbooking/itemList/\",\"Accept\":\"application/json, application/javascript\",\"sessionId\":\"ggh76734\",targetUrl=https://www.example.com/mybook/order/newbooking/page1?id=343;List item";

Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(string);

while (matcher.find()) {
    System.out.println("Group 1 value: " + matcher.group(1));
}

Output

Group 1 value: amex
Group 1 value: basket
Group 1 value: 
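
Note that the fourth sample has a query string after the last path segment, so the pattern above would capture basket?id=76734&para=jhjdfhj&type=new&ordertype=kjkf for it rather than basket. One possible adjustment, which is not part of the original answer, is to exclude ? from both character classes and drop the closing quote, so the match stops at the query string. The following is a minimal sketch under that assumption; the class name RefererSegment and the method lastSegment are made up for illustration, and the input strings are shortened versions of the samples.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RefererSegment {
    // Assumed variant of the pattern above: '?' is excluded from both
    // character classes, so neither the path nor the capture can run
    // into a query string.
    private static final Pattern LAST_SEGMENT =
            Pattern.compile("\\bmyreferer\":\"[^\"?]+/([^/\"?]*)");

    // Returns the text after the last slash of the myreferer path,
    // or null when the line has no myreferer entry.
    static String lastSegment(String line) {
        Matcher m = LAST_SEGMENT.matcher(line);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String base = "https://www.example.com/mybook/order/newbooking/";
        String plain = ",\"myreferer\":\"" + base + "itemSummary/amex\",\"Accept\":\"application/json\"";
        String trailingSlash = ",\"myreferer\":\"" + base + "itemList/\",\"Accept\":\"application/json\"";
        String withQuery = ",\"myreferer\":\"" + base + "itemList/basket?id=76734&para=jhjdfhj\", \"Accept\":\"application/json\"";

        System.out.println("[" + lastSegment(plain) + "]");         // [amex]
        System.out.println("[" + lastSegment(trailingSlash) + "]"); // []
        System.out.println("[" + lastSegment(withQuery) + "]");     // [basket]
    }
}
```

If the URLs can also carry a # fragment, that character would need to be added to both character classes in the same way.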
