Extract two double values from a String using RegEx in Java

I am reading a file line by line and need to extract the latitude and longitude from each line. This is what the lines can look like:

DE  83543   Rott am Inn Bayern  BY  Oberbayern      Landkreis Rosenheim 47.983  12.1278 
DE  21147   Hamburg Hamburg HH          Kreisfreie Stadt Hamburg    53.55   10  

What is certain is that no dots are surrounded by digits except the ones in the doubles. Unfortunately, some values have no dot at all, so it is probably best to look for the numbers from the end of the String.

Thanks for your help!

If you can use java.lang.String#split():

import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Split by tab
String[] values = myTextLineByLine.split("\t");
List<String> list = Arrays.asList(values);
// Reverse the list so that longitude and latitude are the first two elements
Collections.reverse(list);

String longitude = list.get(0);
String latitude = list.get(1);

Is it a tab-separated CSV table? Then I'd suggest looking at String#split and simply choosing the last two fields from the resulting String array.

... anyway, even if it is not CSV, split on whitespace characters and take the last two fields of the String array. Those are the lat/lon values, and you can convert them with Double#parseDouble.
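
A minimal sketch of that suggestion, assuming the coordinates are always the last two whitespace-separated fields and that a malformed line should raise an exception. The class name LastTwoFields and the helper parseLatLon are made up for illustration:

```java
import java.util.Arrays;

class LastTwoFields {
    // Hypothetical helper: returns {latitude, longitude} taken from the last two
    // whitespace-separated fields of a line.
    static double[] parseLatLon(String line) {
        // trim() avoids a leading empty field; \\s+ treats tabs and runs of spaces alike
        String[] fields = line.trim().split("\\s+");
        if (fields.length < 2) {
            throw new IllegalArgumentException("Expected at least two fields: " + line);
        }
        double latitude = Double.parseDouble(fields[fields.length - 2]);
        double longitude = Double.parseDouble(fields[fields.length - 1]);
        return new double[] { latitude, longitude };
    }

    public static void main(String[] args) {
        String line = "DE\t21147\tHamburg\tHamburg\tHH\t\t\tKreisfreie Stadt Hamburg\t53.55\t10\t";
        System.out.println(Arrays.toString(parseLatLon(line))); // prints [53.55, 10.0]
    }
}
```

Double.parseDouble throws a NumberFormatException if one of the last two fields is not a number, so a caller may want to catch that for lines that do not follow the expected layout.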

I think this is the correct pattern for getting the latitude and longitude out of a string that has to match, for example, 45.23423,15.23423 (with or without a space after the comma).

This answer is based on aioobe's answer:

Pattern p = Pattern.compile("^(\\d+\\.?\\d*),\\s?(\\d+\\.?\\d*)$");
Matcher m = p.matcher(s1);
if (m.matches()) {
    // group 1 is the value before the comma (latitude), group 2 the value after it
    System.out.println("Lat: " + Double.parseDouble(m.group(1)));
    System.out.println("Long: " + Double.parseDouble(m.group(2)));
}

Cheers

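    Pattern p = Pattern.compile(".*?(\\d+\\.?\\d*)\\s+(\\d+\\.?\\d*)");
    // matches() must consume the whole input, so strip trailing whitespace first
    Matcher m = p.matcher(s1.trim());
    if (m.matches()) {
        // In the sample lines the latitude comes first, then the longitude
        System.out.println("Lat: " + Double.parseDouble(m.group(1)));
        System.out.println("Long: " + Double.parseDouble(m.group(2)));
    }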
  1. .*? consumes characters reluctantly
  2. (\\d+\\.?\\d*) some digits, an optional decimal point, then optionally more digits
  3. \\s+ at least one whitespace character (such as a tab character)
  4. (\\d+\\.?\\d*) some digits, an optional decimal point, then optionally more digits
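
The same building blocks can also be anchored to the end of the line, which is what the question hints at. The following is a sketch under a few assumptions rather than the answer's own code: it uses find() instead of matches(), adds a trailing \\s*$ so padding after the last value is tolerated, requires digits after a decimal point, and only handles non-negative values like those in the samples. The names TrailingDoubles and extract are hypothetical:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TrailingDoubles {
    // Two numbers at the very end of the line, each with an optional fractional part.
    // The trailing \\s*$ tolerates whitespace after the last value.
    private static final Pattern LAT_LON =
        Pattern.compile("(\\d+(?:\\.\\d+)?)\\s+(\\d+(?:\\.\\d+)?)\\s*$");

    // Hypothetical helper: returns {latitude, longitude},
    // or null if the line does not end in two numbers.
    static double[] extract(String line) {
        Matcher m = LAT_LON.matcher(line);
        if (!m.find()) {
            return null;
        }
        return new double[] {
            Double.parseDouble(m.group(1)),
            Double.parseDouble(m.group(2))
        };
    }

    public static void main(String[] args) {
        String[] lines = {
            "DE  83543   Rott am Inn Bayern  BY  Oberbayern      Landkreis Rosenheim 47.983  12.1278 ",
            "DE  21147   Hamburg Hamburg HH          Kreisfreie Stadt Hamburg    53.55   10  "
        };
        for (String line : lines) {
            double[] latLon = extract(line);
            System.out.println(latLon[0] + ", " + latLon[1]);
        }
    }
}
```

Because the pattern is anchored with $, earlier numbers in the line (such as the postal code) cannot be picked up as coordinates.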

This solution uses Scanner.findWithinHorizon and capturing groups:

    import java.util.*;
    import java.util.regex.*;
    //...

    String text = 
        "DE  83543 Blah blah blah 47.983  12.1278\n" +
        "DE\t21147 100% hamburger beef for 4.99 53.55 10\n";

    Scanner sc = new Scanner(text);
    Pattern p = Pattern.compile(
        "(\\w+) (\\d+) (.*) (decimal) (decimal)"
            .replace("decimal", "\\d+(?:\\.\\d+)?")
            .replace(" ", "\\s+")
    );
    while (sc.findWithinHorizon(p, 0) != null) {
        MatchResult mr = sc.match();
        System.out.printf("[%s|%s] %-30s [%.4f:%.4f]%n",
            mr.group(1),
            mr.group(2),
            mr.group(3),
            Double.parseDouble(mr.group(4)),
            Double.parseDouble(mr.group(5))
        );
    }

This prints:

[DE|83543] Blah blah blah                 [47.9830:12.1278]
[DE|21147] 100% hamburger beef for 4.99   [53.5500:10.0000]

Note the meta-regex approach of using replace to generate the "final" regex. This is done for readability of the "big picture" pattern.
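
To make the expansion step visible, here is a small sketch that builds a shorter template the same way and prints the generated regex. The class MetaRegexDemo and the helper expand are illustrative names, not part of the answer above:

```java
import java.util.regex.Pattern;

class MetaRegexDemo {
    // Hypothetical helper: expands the readable template into the final regex source.
    static String expand(String template) {
        return template
            // String.replace is a literal replacement, so the backslashes are kept as-is
            .replace("decimal", "\\d+(?:\\.\\d+)?")
            // every space in the template becomes "one or more whitespace characters"
            .replace(" ", "\\s+");
    }

    public static void main(String[] args) {
        String regex = expand("(decimal) (decimal)");
        System.out.println(regex); // prints (\d+(?:\.\d+)?)\s+(\d+(?:\.\d+)?)
        System.out.println(Pattern.compile(regex).matcher("53.55\t10").matches()); // prints true
    }
}
```

The order of the two replace calls matters only if a placeholder expansion contains a space, which would then be rewritten too. A side effect of this approach is that the template can no longer express a literal space.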

I have tried this:

    public static void main(String[] args)
    {
        String str  = "DE 83543   Rott am Inn Bayern  BY  Oberbayern  Landkreis Rosenheim 47.983  12.1278";
        String str1 = "DE  21147   Hamburg Hamburg HH          Kreisfreie Stadt Hamburg    53.55   10  ";

        // Process both sample lines (str1 first)
        for (String line : new String[] { str1, str }) {
            // Split on runs of spaces and tabs; trailing empty strings are dropped by split
            String[] fields = line.split("[ \t]+");

            // The coordinates are the last two fields
            double latitude = Double.parseDouble(fields[fields.length - 2]);
            double longitude = Double.parseDouble(fields[fields.length - 1]);

            System.out.println(latitude + ", " + longitude);
        }
    }

It splits the string wherever it encounters whitespace. Since the coordinates are always the last two elements, it can print them without any problem. Below is the output.

53.55, 10.0
47.983, 12.1278
