Extract two double values from a String using RegEx in Java
I am reading a file line by line and need to extract the latitude and longitude from each line. This is what the lines can look like:
DE 83543 Rott am Inn Bayern BY Oberbayern Landkreis Rosenheim 47.983 12.1278
DE 21147 Hamburg Hamburg HH Kreisfreie Stadt Hamburg 53.55 10
What is certain is that the only dots surrounded by digits are the ones in the doubles. Unfortunately, some values have no dot at all, so it is probably best to look for the numbers starting from the end of the String.
Thanks for your help!
If you can use java.lang.String#split():
//Split by tab
String values[] = myTextLineByLine.split("\t");
List<String> list = Arrays.asList(values);
//Reverse the list so that longitude and latitude are the first two elements
Collections.reverse(list);
String longitude = list.get(0);
String latitude = list.get(1);
Is it a tab-separated CSV table? Then I'd suggest looking at String#split and simply taking the last two fields from the resulting String array.
... anyway, even if it is not CSV, split on whitespace and take the last two fields of the String array. Those are the lat/lon values, and you can convert them with Double#parseDouble.
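The suggestion above can be sketched as a small helper. This is only an illustration under assumptions: the class name LastTwoFields and the method parseLatLon are made up for this example, and the sample line in main is written with tabs on the guess that the file is tab-separated.

```java
import java.util.Arrays;

class LastTwoFields {
    // Hypothetical helper: returns {latitude, longitude} taken from the
    // last two whitespace-separated fields of a line.
    static double[] parseLatLon(String line) {
        // trim() avoids an empty leading field; "\\s+" covers tabs and spaces
        String[] fields = line.trim().split("\\s+");
        if (fields.length < 2) {
            throw new IllegalArgumentException("Expected at least two fields: " + line);
        }
        double lat = Double.parseDouble(fields[fields.length - 2]);
        double lon = Double.parseDouble(fields[fields.length - 1]);
        return new double[] { lat, lon };
    }

    public static void main(String[] args) {
        // Assumed sample line, tab-separated
        String line = "DE\t21147\tHamburg\tHamburg\tHH\t\tKreisfreie Stadt Hamburg\t53.55\t10";
        System.out.println(Arrays.toString(parseLatLon(line)));
    }
}
```

If one of the last two fields is not a number, Double.parseDouble throws a NumberFormatException, which may be a convenient way to spot malformed lines.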
I think this is the correct pattern for getting the latitude and longitude out of a string such as 45.23423,15.23423 (with or without a space after the comma).
This is based on aioobe's answer:
Pattern p = Pattern.compile("^(\\d+\\.?\\d*),\\s?(\\d+\\.?\\d*)$");
Matcher m = p.matcher(s1);
if (m.matches()) {
    // Group 1 is the first number (latitude), group 2 the second (longitude)
    System.out.println("Lat: " + Double.parseDouble(m.group(1)));
    System.out.println("Long: " + Double.parseDouble(m.group(2)));
}
Cheers
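To see what the comma-separated pattern accepts, here is a minimal sketch that runs it against a few inputs. The class name CommaPair and the helper parse are assumptions made for this example. Because the pattern is anchored with ^ and $ and only allows digits and a dot, inputs wrapped in parentheses or carrying a minus sign are rejected.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CommaPair {
    // Same pattern as in the answer: two unsigned decimals separated by a
    // comma and at most one whitespace character.
    static final Pattern PAIR = Pattern.compile("^(\\d+\\.?\\d*),\\s?(\\d+\\.?\\d*)$");

    // Hypothetical helper: returns {first, second}, or null if the input does not match.
    static double[] parse(String s) {
        Matcher m = PAIR.matcher(s);
        if (!m.matches()) {
            return null;
        }
        return new double[] {
            Double.parseDouble(m.group(1)),
            Double.parseDouble(m.group(2))
        };
    }

    public static void main(String[] args) {
        String[] inputs = {
            "45.23423,15.23423",   // matches
            "45.23423, 15.23423",  // matches (one space after the comma)
            "(45.23423,15.23423)", // no match: parentheses are not part of the pattern
            "-45.2,15.2"           // no match: the pattern has no sign
        };
        for (String s : inputs) {
            System.out.println(s + " -> " + (parse(s) == null ? "no match" : "match"));
        }
    }
}
```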
Pattern p = Pattern.compile(".*?(\\d+\\.?\\d*)\\s+(\\d+\\.?\\d*)");
Matcher m = p.matcher(s1);
if (m.matches()) {
    // In the sample lines the latitude comes first, then the longitude
    System.out.println("Lat: " + Double.parseDouble(m.group(1)));
    System.out.println("Long: " + Double.parseDouble(m.group(2)));
}
.*?  eat characters reluctantly
(\\d+\\.?\\d*)  some digits, an optional decimal point, some more digits
\\s+  at least one white-space character (such as a tab character)
(\\d+\\.?\\d*)  some digits, an optional decimal point, some more digits
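Since matches() requires the whole line to match, a line with trailing whitespace would fail with the pattern above. A possible variant, sketched here under assumptions, uses find() with the two numbers anchored at the end of the line and allows trailing whitespace. The class name TrailingPair and the helper parse are made up for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TrailingPair {
    // Two unsigned numbers at the very end of the line, optionally followed
    // by whitespace. The fractional part is optional, so "10" is accepted.
    static final Pattern LAST_TWO =
            Pattern.compile("(\\d+(?:\\.\\d+)?)\\s+(\\d+(?:\\.\\d+)?)\\s*$");

    // Hypothetical helper: returns {latitude, longitude}, or null if no match.
    static double[] parse(String line) {
        Matcher m = LAST_TWO.matcher(line);
        if (!m.find()) {
            return null;
        }
        return new double[] {
            Double.parseDouble(m.group(1)),
            Double.parseDouble(m.group(2))
        };
    }

    public static void main(String[] args) {
        // Note the trailing space on this sample line
        double[] latLon = parse("DE 21147 Hamburg Hamburg HH Kreisfreie Stadt Hamburg 53.55 10 ");
        if (latLon != null) {
            System.out.println("Lat: " + latLon[0] + ", Long: " + latLon[1]);
        }
    }
}
```

Anchoring at the end follows the idea from the question of checking for numbers from the end of the String, so numbers earlier in the line (such as the postal code) are not picked up.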
This solution uses Scanner.findWithinHorizon and capturing groups:
import java.util.*;
import java.util.regex.*;
//...
String text =
    "DE 83543 Blah blah blah 47.983 12.1278\n" +
    "DE\t21147 100% hamburger beef for 4.99 53.55 10\n";
Scanner sc = new Scanner(text);
Pattern p = Pattern.compile(
    "(\\w+) (\\d+) (.*) (decimal) (decimal)"
        .replace("decimal", "\\d+(?:\\.\\d+)?")
        .replace(" ", "\\s+")
);
while (sc.findWithinHorizon(p, 0) != null) {
    MatchResult mr = sc.match();
    System.out.printf("[%s|%s] %-30s [%.4f:%.4f]%n",
        mr.group(1),
        mr.group(2),
        mr.group(3),
        Double.parseDouble(mr.group(4)),
        Double.parseDouble(mr.group(5))
    );
}
This prints:
[DE|83543] Blah blah blah                 [47.9830:12.1278]
[DE|21147] 100% hamburger beef for 4.99   [53.5500:10.0000]
Note the meta-regex approach of using replace to generate the "final" regex. This is done for readability of the "big picture" pattern.
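To make the meta-regex step concrete, the sketch below builds the same pattern and prints the generated regex, then applies it to a single line with Matcher.matches() instead of a Scanner. The class name MetaRegex and the method buildLinePattern are assumptions made for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MetaRegex {
    // Hypothetical helper: expands the readable template into the final regex.
    static Pattern buildLinePattern() {
        String readable = "(\\w+) (\\d+) (.*) (decimal) (decimal)";
        String regex = readable
                // String.replace substitutes literal text; it does not interpret regex syntax
                .replace("decimal", "\\d+(?:\\.\\d+)?")
                .replace(" ", "\\s+");
        return Pattern.compile(regex);
    }

    public static void main(String[] args) {
        Pattern p = buildLinePattern();
        // Shows the expanded regex that the template turns into
        System.out.println(p.pattern());

        Matcher m = p.matcher("DE\t21147 100% hamburger beef for 4.99 53.55 10");
        if (m.matches()) {
            System.out.println(m.group(3) + " | " + m.group(4) + " | " + m.group(5));
        }
    }
}
```

The order of the two replace calls matters here: the "decimal" placeholder is expanded first, and its expansion contains no spaces, so the second call only touches the separators in the template.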
I have tried this:
public static void main(String[] args)
{
    String str = "DE 83543 Rott am Inn Bayern BY Oberbayern Landkreis Rosenheim 47.983 12.1278";
    String str1 = "DE 21147 Hamburg Hamburg HH Kreisfreie Stadt Hamburg 53.55 10 ";
    // Process both sample lines; the coordinates are always the last two tokens
    for (String line : new String[] { str1, str }) {
        // Split on runs of spaces or tabs; trailing empty tokens are dropped by split
        String[] tokens = line.split("[ \t]+");
        double latitude = Double.parseDouble(tokens[tokens.length - 2]);
        double longitude = Double.parseDouble(tokens[tokens.length - 1]);
        System.out.println(latitude + ", " + longitude);
    }
}
It splits the string wherever it encounters whitespace. Since the coordinates are always the last two elements, it should be able to print them without any problem. Below is the output.
53.55, 10.0
47.983, 12.1278