Java Regex Pattern to extract data

I have incoming data that looks something like this:

http://localhost:1111/search?id=10&time=3200&type=abc
http://localhost:1111/search?time=3200&id=11&type=abc
http://localhost:1111/search?id=12
http://localhost:1111/search?id=13&time=3200&type=abc

The data varies, but it is not completely random or unpredictable.

So basically, how do we extract the ID from each incoming string while ignoring the rest of the junk?

You can try using the regex id=(\\d+) and extract the value of the first capturing group:

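import java.util.regex.Matcher;
import java.util.regex.Pattern;

String url = "http://localhost:1111/search?id=10&time=3200&type=abc";

// The first capturing group holds the digits that follow "id="
Pattern id = Pattern.compile("id=(\\d+)");

Matcher m = id.matcher(url);
if (m.find())
    System.out.println(m.group(1)); // prints 10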

See Pattern and Matcher.
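
To run the pattern id=(\\d+) over every sample line from the question, something like the following could work. It is only a minimal sketch: the class name IdExtractor and the helper firstId are made up for illustration, and returning an empty Optional when no id is present is an assumption about what the caller wants.

```java
import java.util.List;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class IdExtractor {
    // Compiled once and reused for every URL.
    private static final Pattern ID = Pattern.compile("id=(\\d+)");

    // Hypothetical helper: returns the first id found in the URL, if any.
    public static Optional<String> firstId(String url) {
        Matcher m = ID.matcher(url);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        List<String> urls = List.of(
            "http://localhost:1111/search?id=10&time=3200&type=abc",
            "http://localhost:1111/search?time=3200&id=11&type=abc",
            "http://localhost:1111/search?id=12",
            "http://localhost:1111/search?id=13&time=3200&type=abc");
        for (String url : urls) {
            // Prints 10, 11, 12 and 13, one per line.
            System.out.println(firstId(url).orElse("(no id)"));
        }
    }
}
```

Note that this unanchored pattern would also match inside a longer parameter name that happens to end in "id".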

What if several IDs are passed (which is valid)?

IMHO I would rather do something more like this:

URL url = new URL(<your link>);
String queryString = url.getQuery();

Parse queryString into a map, for example a Map<String, List<String>>, and get the value of the id key.
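
One way the map-based parsing could look is sketched below. The class QueryParser and the helper parseQuery are hypothetical names, and the sketch uses java.net.URI instead of URL so that no checked exception has to be handled; URL decoding of keys and values is an added assumption, not something the answer asked for.

```java
import java.net.URI;
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class QueryParser {
    // Hypothetical helper: groups query parameters by name, keeping repeated keys.
    public static Map<String, List<String>> parseQuery(String url) {
        Map<String, List<String>> params = new LinkedHashMap<>();
        String query = URI.create(url).getRawQuery();
        if (query == null || query.isEmpty()) {
            return params; // no query string at all
        }
        for (String pair : query.split("&")) {
            if (pair.isEmpty()) {
                continue; // skip empty segments such as "a=1&&b=2"
            }
            String[] kv = pair.split("=", 2); // limit 2 keeps '=' inside values
            String key = URLDecoder.decode(kv[0], StandardCharsets.UTF_8);
            String value = kv.length > 1
                ? URLDecoder.decode(kv[1], StandardCharsets.UTF_8)
                : "";
            params.computeIfAbsent(key, k -> new ArrayList<>()).add(value);
        }
        return params;
    }

    public static void main(String[] args) {
        Map<String, List<String>> params =
            parseQuery("http://localhost:1111/search?id=10&time=3200&id=11");
        System.out.println(params.get("id"));   // [10, 11]
        System.out.println(params.get("type")); // null, parameter not present
    }
}
```

Keeping a list per key is what lets this handle several id parameters in one URL.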

(?<=[?&])id=(\\d+)(?=(?:\\&|$))

works in RegexBuddy under the Java and Perl flavors, but not in TextPad, which uses the Boost regex engine. Boost has issues with back-references.

(?<=(?:
   [?&]    //PRECEDED BY a question-mark or ampersand
))          
   id=(\d+) //"id=[one-or-more-digits]"
(?=(?:
   \&|$     //FOLLOWED BY an ampersand or the end of the input
))

This captures only the digits, and avoids matching the wrong field in input such as:

anotherid=123sometext
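
As a rough check of that behaviour in Java, the sketch below runs the anchored expression in a find() loop. The class AnchoredIdRegex and the helper allIds are illustrative names only, and the test URLs are invented for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class AnchoredIdRegex {
    // The expression from this answer, written as a Java string literal.
    private static final Pattern ID =
        Pattern.compile("(?<=[?&])id=(\\d+)(?=(?:\\&|$))");

    // Hypothetical helper: collects every id value in the URL.
    public static List<String> allIds(String url) {
        List<String> ids = new ArrayList<>();
        Matcher m = ID.matcher(url);
        while (m.find()) {
            ids.add(m.group(1)); // group 1 is the digits only
        }
        return ids;
    }

    public static void main(String[] args) {
        // Two id parameters: both are found.
        System.out.println(allIds("http://localhost:1111/search?id=10&time=3200&id=11")); // [10, 11]
        // "anotherid" is not preceded by '?' or '&' at the "id=" position, so nothing matches.
        System.out.println(allIds("http://localhost:1111/search?anotherid=123sometext")); // []
    }
}
```

Because the lookahead requires an ampersand or the end of the input, a value such as id=123sometext is rejected as well.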

Why exactly do you want to use a regular expression to do this?

I would do it like this:

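String url = "http://localhost:1111/search?id=13&time=3200&type=abc";
// Keep only the query string (everything after '?') before splitting it into parameters
String[] split = url.substring(url.indexOf('?') + 1).split("&");
String id = "";
for (String s : split) {
    // startsWith avoids false matches on names that merely contain "id", such as "anotherid"
    if (s.startsWith("id=")) {
        id = s.substring("id=".length());
    }
}

System.out.println(id); // prints 13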

Expanding on @user1631616's answer:

Here is some sample code:

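import java.net.MalformedURLException;
import java.net.URL;
import java.util.HashMap;

public static void main(String[] args) throws MalformedURLException {
    URL aURL = new URL("http://localhost:1111/search?id=10&time=3200&type=abc");

    HashMap<String, String> params = new HashMap<>();
    String[] query = aURL.getQuery().split("&");
    for (String s : query) {
        // A limit of 2 keeps any '=' inside the value; a parameter without a value maps to ""
        String[] split = s.split("=", 2);
        params.put(split[0], split.length > 1 ? split[1] : "");
    }
    System.out.println(params.get("id"));   // 10
    System.out.println(params.get("type")); // abc
    System.out.println(params.get("time")); // 3200
}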

That way, if params.get(...) returns null, you know that value was not set in the query string.

You also don't have to worry about the ordering of the parameters.

Something like this should do what you want:

(?<=id=)\\d+
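
A possible way to use this lookbehind form from Java is sketched below. The class LookbehindId and the helper firstId are hypothetical names, and returning null when nothing matches is just one choice.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LookbehindId {
    private static final Pattern ID = Pattern.compile("(?<=id=)\\d+");

    // Hypothetical helper: returns the first run of digits that follows "id=", or null.
    public static String firstId(String url) {
        Matcher m = ID.matcher(url);
        // No capturing group is needed: the lookbehind is not part of the match,
        // so group() is already just the digits.
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(firstId("http://localhost:1111/search?time=3200&id=11&type=abc")); // 11
        System.out.println(firstId("http://localhost:1111/search?type=abc"));                 // null
    }
}
```

Since the lookbehind only checks for the literal text "id=", this form also matches the digits in a parameter such as anotherid=123.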
