Java String replaceAll regex to remove everything except digits, dots and spaces

I've browsed a lot of regex topics, but none of them led me to success in my particular situation.

(Using Java) I have a character sequence from which I want to extract the numbers into an array of doubles.

asdsad 59 asdf .2 asdf 56 89 .a 2.4 wef 95 asdf.

I want to use a regex to remove the extra parts and compose the following string:

59 2 56 89 2.4 95

Then I could just use .split(" ") and put the parts into an array of doubles.

Until now I have used the following expression:

[^0-9.\s]

but it leaves the extra dots and is therefore not reliable. Now I'm trying something like

[^0-9.\s]|([^0-9]\.[^0-9])

but it's not working at all. I'm not really good with regex, so could you explain why the last expression is not working and how to fix it?

Have you tried string.replaceAll("[^\\d\\. ]","")?

You can see the results here: https://regex101.com/r/X6gLaY/2

String string = "asdsad 59 asdf 2 asdf 56 89 .a 2.4 wef 95 asdf.";
String regex = "[^\\d\\. ]| \\.|\\.$";
System.out.println(string.replaceAll(regex,""));

Java example: http://ideone.com/w4BWOZ

Outputs: 59 2 56 89 2.4 95
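Since the goal is an array of doubles, here is a minimal sketch of how the cleaned string could be parsed. It assumes the regex from the snippet above; the class name CleanAndParse and the helper parse are made up for illustration. The replacement can leave leading, trailing and repeated spaces, so the sketch trims and splits on runs of whitespace instead of a single space.

```java
import java.util.Arrays;

public class CleanAndParse {
    // Regex from the answer above: drop every char that is not a digit, dot or space,
    // plus a dot that directly follows a space and a dot at the very end of the input.
    private static final String REGEX = "[^\\d\\. ]| \\.|\\.$";

    // Hypothetical helper (not part of the original answer).
    public static double[] parse(String input) {
        String cleaned = input.replaceAll(REGEX, "").trim();
        if (cleaned.isEmpty()) {
            // Nothing numeric left; avoid calling parseDouble on "".
            return new double[0];
        }
        return Arrays.stream(cleaned.split("\\s+")) // tolerate repeated spaces
                .mapToDouble(Double::parseDouble)
                .toArray();
    }

    public static void main(String[] args) {
        double[] numbers = parse("asdsad 59 asdf 2 asdf 56 89 .a 2.4 wef 95 asdf.");
        System.out.println(Arrays.toString(numbers)); // [59.0, 2.0, 56.0, 89.0, 2.4, 95.0]
    }
}
```

Note that Double.parseDouble still throws NumberFormatException for tokens such as "1.2.3", which this regex does not rule out.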

Add an alternative that removes any dot that is not preceded by a digit which is itself followed by a dot and a digit, i.e. any dot that does not sit between two digits:

[^\d\s.]+|(?<!\d(?=\.\d))\.

See this regex demo.

Details:

  • [^\d\s.]+ - 1+ chars other than digits, whitespaces and dots
  • | - or
  • (?<!\d(?=\.\d)) - a location not preceded with a digit that is followed with a dot and a digit
  • \. - a dot.

Sample code:

String re = "[^\\d\\s.]+|(?<!\\d(?=\\.\\d))\\.";
System.out.println("asdsad 59 asdf 2 asdf 56 89 .a 2.4 wef 95 asdf.".replaceAll(re, ""));
System.out.println("asdsad 59 asdf .2 asdf 56 89 .a 2.4 wef 95 asdf.".replaceAll(re, ""));

Java demo

I played with regex for half a day until I came up with this.

Apparently the order of the alternatives really matters. The regex engine tries the alternatives from left to right at each position and consumes the first one that matches, so characters consumed by an earlier alternative are no longer available to a later one. I therefore changed the regular expression to:

  1. exclude every dot preceded by a non-digit (together with that non-digit)

  2. exclude every dot followed by a non-digit (together with that non-digit)

  3. exclude all left-over characters that are not digits, dots or whitespace

[^0-9]\.|\.[^0-9]|[^0-9.\s]

Now it works like a charm. Hope it helps someone. :)
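To see the effect of the ordering, here is a small sketch that runs the expression above next to a reordered variant on the input from the question. The class name AlternationOrder, the constant names and the reordered variant are illustrative assumptions, not part of the original answer.

```java
public class AlternationOrder {
    // Order used in the answer: the two dot rules come before the catch-all.
    public static final String DOT_RULES_FIRST = "[^0-9]\\.|\\.[^0-9]|[^0-9.\\s]";
    // Same alternatives with the catch-all moved to the front (for comparison only).
    public static final String CATCH_ALL_FIRST = "[^0-9.\\s]|[^0-9]\\.|\\.[^0-9]";

    public static String clean(String input, String regex) {
        return input.replaceAll(regex, "");
    }

    public static void main(String[] args) {
        String input = "asdsad 59 asdf .2 asdf 56 89 .a 2.4 wef 95 asdf.";
        // Brackets make leading and trailing whitespace visible.
        System.out.println("[" + clean(input, DOT_RULES_FIRST) + "]");
        // Here the final "f" is consumed by the catch-all on its own, so the
        // trailing "." has no non-digit neighbour left to match and survives.
        System.out.println("[" + clean(input, CATCH_ALL_FIRST) + "]");
    }
}
```

Even with the working order the result can contain runs of spaces (the letters between two numbers are removed but the spaces around them stay), so trimming and splitting on "\\s+" is safer than split(" ").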

You can split directly on this regex:

([^\d.]|\B\.|\.\B)+

i.e.

String[] parts = str.split("([^\\d.]|\\B\\.|\\.\\B)+");

Note that this may leave a leading empty string in the array.

To go directly to a double[] in one line (handling the leading empty string):

double[] numbers = Arrays.stream(str.split("([^\\d.]|\\B\\.|\\.\\B)+"))
    .filter(s -> !s.isEmpty())
    .mapToDouble(Double::parseDouble)
    .toArray();

See live demo.
