Creating a regular expression for splitting

I'm currently studying informatics and know what a regex is (though not in Java). I have an input like:

"String1  ( Nr = 323) String2 String3 (Nr  = 3)"

I wanted to split it using:

split("[ ()=]");

because I think this would split on every " ", "(", ")" and "=". Am I right, or do I need to put a + behind it? And if this is already right, could I add a * so that I can also split on something like "(("?

If this isn't the problem, then my other question regarding regex in Java is how I can check whether my String contains only numbers.

I tried:

contains(".*\\d+.*")
matches(".*\\d+.*")

But I'm pretty sure one of them is working, so my problem should be with the splitting regex.

My original problem is that I get a NumberFormatException for my split String array at index 2, which normally should be "323".

Can I use my regex with a *, like "[ ()=]*"?

Thanks in advance

Yes, it will split on those characters, but it will not produce the expected results. You need to use a quantifier with your character class. I recommend using +, meaning "one or more" times.

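import java.util.Arrays;

String s = "String1  ( Nr = 323) String2 String3 (Nr  = 3)";
// "+" makes a run of adjacent separator characters count as one delimiter
String[] parts = s.split("[ ()=]+");
System.out.println(Arrays.toString(parts));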

Output

[String1, Nr, 323, String2, String3, Nr, 3]
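To see why the quantifier matters, here is a minimal sketch that runs both patterns on the input from the question. The class name `SplitQuantifierDemo` and its helper methods are made up for illustration. Without `+`, each separator character is a delimiter of its own, so adjacent separators leave empty strings in the array and index 2 holds `""` rather than `"323"`. Passing that empty string to `Integer.parseInt` would throw a NumberFormatException, which is a plausible explanation for the error described in the question.

```java
import java.util.Arrays;

public class SplitQuantifierDemo {
    static final String INPUT = "String1  ( Nr = 323) String2 String3 (Nr  = 3)";

    // Each separator character is its own delimiter; adjacent ones yield "".
    static String[] splitSingle(String s) {
        return s.split("[ ()=]");
    }

    // A run of separator characters counts as a single delimiter.
    static String[] splitRuns(String s) {
        return s.split("[ ()=]+");
    }

    public static void main(String[] args) {
        String[] single = splitSingle(INPUT);
        String[] runs = splitRuns(INPUT);

        System.out.println(Arrays.toString(single));
        System.out.println(Arrays.toString(runs));

        // Index 2 is empty without the quantifier, so parseInt would fail on it.
        System.out.println(single[2].isEmpty());        // true
        System.out.println(Integer.parseInt(runs[2]));  // 323
    }
}
```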

A regex that is useful for splitting a string

  • must describe all separators between the strings you want to have
  • must not match the empty string
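The second point is why `"[ ()=]*"` is a poor choice for splitting: `*` also allows zero occurrences, so the pattern matches the empty string between any two characters. A small sketch, assuming Java 8 or later (older versions additionally return a leading empty string); the class name `EmptyMatchDemo` is hypothetical:

```java
import java.util.Arrays;

public class EmptyMatchDemo {
    // "*" permits zero separators, so the regex also matches between letters.
    static String[] splitStar(String s) {
        return s.split("[ ()=]*");
    }

    public static void main(String[] args) {
        // The token is cut into single characters.
        System.out.println(Arrays.toString(splitStar("Nr"))); // [N, r]
    }
}
```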

You have non-space separators ()= surrounded by spaces between the strings you want to have. You could be generous and use

"[ ()=]+"   any mixture

or fiddly (requiring one of the ()= ) and do

"\\s*[()=]\\s*"

to make sure it is only *one* of the trio ()= . With this regex, a split of "foo (( bar" would give you three strings.
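The difference between the two styles is easiest to see side by side. The sketch below is only an illustration; the class name `SeparatorStyleDemo` and the constant names are invented here. Note that the fiddly pattern requires one of `()=`, so a space by itself does not split and "String2 String3" stays in one piece.

```java
import java.util.Arrays;

public class SeparatorStyleDemo {
    static final String GENEROUS = "[ ()=]+";       // any mixture of separators
    static final String FIDDLY = "\\s*[()=]\\s*";   // exactly one of ( ) = plus optional spaces

    public static void main(String[] args) {
        String input = "String1  ( Nr = 323) String2 String3 (Nr  = 3)";

        // [String1, Nr, 323, String2, String3, Nr, 3]
        System.out.println(Arrays.toString(input.split(GENEROUS)));
        // [String1, Nr, 323, String2 String3, Nr, 3]
        System.out.println(Arrays.toString(input.split(FIDDLY)));

        // [foo, bar]
        System.out.println(Arrays.toString("foo (( bar".split(GENEROUS)));
        // [foo, , bar]  (the empty string sits between the two parentheses)
        System.out.println(Arrays.toString("foo (( bar".split(FIDDLY)));
    }
}
```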

To check whether a string contains only numbers, you'll have to define "numbers". There is a general confusion between "digit" and "number", and a "number" might be signed or unsigned, integer or fraction, ... The plural "numbers" implies at least one space between them.

"\\s*(\\d+)(\\s+\\d+)*\\s*"

describes unsigned integers separated and surrounded by (optional) spaces. In this simple case, also

"[\\s\\d]+"

will do, although it also accepts a string that consists of whitespace only.
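As a rough sketch of these checks: `String.matches` succeeds only if the whole string matches the pattern, so `".*\\d+.*"` from the question accepts any string that contains a digit somewhere, and `String.contains` takes literal text rather than a regex. The class `NumberCheckDemo` and its method names below are hypothetical, and "number" is assumed to mean an unsigned integer.

```java
public class NumberCheckDemo {
    // Exactly one unsigned integer: digits only, at least one of them.
    static boolean isUnsignedInteger(String s) {
        return s.matches("\\d+");
    }

    // One or more unsigned integers separated by whitespace.
    static boolean isIntegerList(String s) {
        return s.matches("\\s*(\\d+)(\\s+\\d+)*\\s*");
    }

    // Looser variant: only digits and whitespace, in any arrangement.
    static boolean isDigitsOrSpaces(String s) {
        return s.matches("[\\s\\d]+");
    }

    public static void main(String[] args) {
        System.out.println(isUnsignedInteger("323"));      // true
        System.out.println(isUnsignedInteger(""));         // false
        System.out.println("Nr1".matches(".*\\d+.*"));     // true, although "Nr1" is not a number
        System.out.println(isIntegerList(" 12 34 "));      // true
        System.out.println(isIntegerList("   "));          // false
        System.out.println(isDigitsOrSpaces("   "));       // true
    }
}
```

Guarding a parse with `isUnsignedInteger` rules out empty and non-digit input; very long digit strings can still overflow `Integer.parseInt`.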
