Java regex to find a word pattern in the String

I am trying to extract a specific value from a string, but I can't find a regex that matches it exactly. The string can take one of two forms:

https://www.test.com/vgi-bin/tmpscr?cmd=_temp-out&useraction=commit&token=EC-1J942953KU425764F

https://www.test.com/vgi-bin/tmpscr?cmd=_temp-out&useraction=commit&token=EC-1J942953KU425764F&paymentid=PAY-12345K4776H687987R

I need a pattern that extracts the token value. I tried the regex (?<=token\\=).* and it gives me the token for the first string but not for the second. The output should look like this:

EC-1J942953KU425764F

The .* matches any character (except line terminators) zero or more times. Because it is greedy, in your regex it matches all the way to the end of the string, so for the second URL the match also includes &paymentid=PAY-12345K4776H687987R.

You could keep your positive lookbehind and follow it with a negated character class, [^&\\n]+, which matches one or more characters that are neither an ampersand nor a newline. You do not have to escape the equals sign.

(?<=token=)[^&\\n]+
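As a minimal sketch of how the lookbehind pattern above could be used from Java: the class name TokenLookbehindDemo and the helper extractToken are hypothetical names chosen for illustration, and the URLs are the two from the question.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original answer.
class TokenLookbehindDemo {
    // Lookbehind for "token=", then one or more characters that are not '&' or a newline.
    private static final Pattern TOKEN = Pattern.compile("(?<=token=)[^&\\n]+");

    // Returns the first token value found, or an empty Optional if there is none.
    static Optional<String> extractToken(String url) {
        Matcher m = TOKEN.matcher(url);
        return m.find() ? Optional.of(m.group()) : Optional.empty();
    }

    public static void main(String[] args) {
        String first = "https://www.test.com/vgi-bin/tmpscr?cmd=_temp-out&useraction=commit&token=EC-1J942953KU425764F";
        String second = first + "&paymentid=PAY-12345K4776H687987R";
        // Both lines print EC-1J942953KU425764F
        System.out.println(extractToken(first).orElse("(no token)"));
        System.out.println(extractToken(second).orElse("(no token)"));
    }
}
```

Because the lookbehind consumes no characters, the whole match (group 0) is already the token value, so no capture group is needed.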


You don't need the lookbehind if you define a capture group instead, which is a little easier to read IMO.

Also note that the semicolon used to be an allowed URL parameter separator according to the spec, so you may want to exclude it as well when you match parameter values, in case you need to support an older or inconsistent platform:

token=([^&;\n]+)

Capture group 1 then holds the token itself (group 0 is the whole match, including token=).
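A short sketch of the capture-group variant in Java, under the assumption that only the first token parameter matters. The class name TokenCaptureGroupDemo and the method extractToken are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original answer.
class TokenCaptureGroupDemo {
    // Group 1 captures everything after "token=" up to the next '&', ';' or newline.
    private static final Pattern TOKEN = Pattern.compile("token=([^&;\\n]+)");

    // Returns the captured token value, or null if the URL has no token parameter.
    static String extractToken(String url) {
        Matcher m = TOKEN.matcher(url);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String url = "https://www.test.com/vgi-bin/tmpscr?cmd=_temp-out&useraction=commit"
                + "&token=EC-1J942953KU425764F&paymentid=PAY-12345K4776H687987R";
        // Prints EC-1J942953KU425764F
        System.out.println(extractToken(url));
    }
}
```

Here matcher.group() would return the full match including the token= prefix, which is why the value is read from group 1.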

Alternatively, you can skip the regex and use UriComponentsBuilder from spring-web:

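import org.springframework.util.MultiValueMap;
import org.springframework.web.util.UriComponentsBuilder;

String url = "https://www.test.com/vgi-bin/tmpscr?cmd=_temp-out&useraction=commit&token=EC-1J942953KU425764F&paymentid=PAY-12345K4776H687987R";
MultiValueMap<String, String> queryParams =
        UriComponentsBuilder.fromUriString(url).build().getQueryParams();
// get("token") returns a List<String>; getFirst returns the first value, or null if the parameter is absent
String token = queryParams.getFirst("token");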

Or you can use URIBuilder from Apache HttpClient:

// The URIBuilder(String) constructor throws URISyntaxException if the URL is malformed
List<NameValuePair> queryParams = new URIBuilder(url)
                .getQueryParams();
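If neither library is on the classpath, the same parse-then-look-up idea can be sketched with only the JDK. This is a swapped-in technique, not what either library does internally: it splits the raw query string on & and keeps the first value for each name. The class QueryParamDemo and the method parseQuery are hypothetical, and as a simplifying assumption the values are left percent-encoded rather than decoded.

```java
import java.net.URI;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical demo class using only java.net.URI from the JDK.
class QueryParamDemo {
    // Splits the raw query string into name/value pairs, keeping the first value per name.
    // Assumption: values are returned as-is, without percent-decoding.
    static Map<String, String> parseQuery(String url) {
        Map<String, String> params = new LinkedHashMap<>();
        String query = URI.create(url).getRawQuery();
        if (query == null) {
            return params; // URL has no query part
        }
        for (String pair : query.split("&")) {
            if (pair.isEmpty()) {
                continue; // skip empty segments such as a trailing '&'
            }
            int eq = pair.indexOf('=');
            String name = eq >= 0 ? pair.substring(0, eq) : pair;
            String value = eq >= 0 ? pair.substring(eq + 1) : "";
            params.putIfAbsent(name, value);
        }
        return params;
    }

    public static void main(String[] args) {
        String url = "https://www.test.com/vgi-bin/tmpscr?cmd=_temp-out&useraction=commit"
                + "&token=EC-1J942953KU425764F&paymentid=PAY-12345K4776H687987R";
        // Prints EC-1J942953KU425764F
        System.out.println(parseQuery(url).get("token"));
    }
}
```

Parsing the query into a map avoids depending on the order of the parameters, which is the main advantage over matching a fixed substring.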

If the format is always one of these two, and you don't specifically want to use a regex, then something like this may suffice:

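int val = str.indexOf("paymentid");
// Start just after "token=" so that only the value is printed, not the parameter name
int start = str.indexOf("token=") + "token=".length();
System.out.println(str.substring(start, (val != -1) ? val - 1 : str.length()));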

Or, of course, you can inline str.indexOf("paymentid") in place of val and do it in one line.
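A slightly more general variant of the indexOf approach, sketched under the assumption that the token value ends at the next & or at the end of the string, so it does not depend on paymentid being the parameter that follows. The class TokenIndexOfDemo and the method extractToken are hypothetical names.

```java
// Hypothetical demo class, not part of the original answer.
class TokenIndexOfDemo {
    // Returns the text between "token=" and the next '&' (or the end of the string),
    // or null if the string contains no "token=".
    static String extractToken(String str) {
        String key = "token=";
        int start = str.indexOf(key);
        if (start == -1) {
            return null;
        }
        start += key.length();
        int end = str.indexOf('&', start);
        return str.substring(start, end != -1 ? end : str.length());
    }

    public static void main(String[] args) {
        String first = "https://www.test.com/vgi-bin/tmpscr?cmd=_temp-out&useraction=commit&token=EC-1J942953KU425764F";
        String second = first + "&paymentid=PAY-12345K4776H687987R";
        // Both lines print EC-1J942953KU425764F
        System.out.println(extractToken(first));
        System.out.println(extractToken(second));
    }
}
```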

How about using this regex pattern:

[&?]token=([^&\r\n]*)

Then just extract capture group 1:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

// The leading [&?] ensures "token" is matched only as a full parameter name
String regex = "[&?]token=([^&\r\n]*)";
String input =
        "https://www.test.com/vgi-bin/tmpscr?cmd=_temp-out&useraction=commit&token=EC-1J942953KU425764F\n" +
        "https://www.test.com/vgi-bin/tmpscr?cmd=_temp-out&useraction=commit&token=EC-1J942953KU425764F&paymentid=PAY-12345K4776H6879";
Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(input);
// find() advances to each successive match, one per URL in the input
while(matcher.find())
{
    System.out.printf("Token is %s%n", matcher.group(1));
}
