
What is the pattern for an empty string?

I need to validate input: valid values are either a number or an empty string. What is the corresponding regular expression?

String pattern = "\\d+|<what should be here?>";

UPD: please don't suggest "\\d*"; I'm just curious how to express "empty string" in a regexp.

In this particular case, ^\d*$ would work, but generally speaking, to match pattern or an empty string, you can use:

^$|pattern

Explanation

  • ^ and $ are the beginning-of-string and end-of-string anchors respectively.
  • | is used to denote alternates, e.g. this|that.
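
As a rough illustration of ^$|pattern in Java, here is a minimal sketch that uses \d+ as the pattern and Matcher.find(), so that the anchors actually do the work. The class and method names are made up for this example and are not part of the original answer.

```java
import java.util.regex.Pattern;

// Hypothetical helper: accepts either the empty string or a run of digits.
class EmptyOrNumber {
    // "^$" matches only the empty string; "^\\d+$" matches one or more digits.
    private static final Pattern EMPTY_OR_NUMBER = Pattern.compile("^$|^\\d+$");

    static boolean isValid(String input) {
        // find() does not anchor implicitly, so the pattern must carry its own anchors.
        return EMPTY_OR_NUMBER.matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(isValid(""));     // true
        System.out.println(isValid("123"));  // true
        System.out.println(isValid("12a"));  // false
    }
}
```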


Note on multiline mode

In the so-called multiline mode (Pattern.MULTILINE or (?m) in Java), ^ and $ match the beginning and end of each line instead. The anchors for the beginning and end of the string are then \A and \Z respectively (in Java, \Z also matches just before a final line terminator; \z is the strict end of input).

If you're in multiline mode, then the empty string is matched by \A\Z instead. ^$ would match an empty line within the string.


Examples

Here are some examples to illustrate the above points:

String numbers = "012345";

System.out.println(numbers.replaceAll(".", "<$0>"));
// <0><1><2><3><4><5>

System.out.println(numbers.replaceAll("^.", "<$0>"));
// <0>12345

System.out.println(numbers.replaceAll(".$", "<$0>"));
// 01234<5>

numbers = "012\n345\n678";
System.out.println(numbers.replaceAll("^.", "<$0>"));       
// <0>12
// 345
// 678

System.out.println(numbers.replaceAll("(?m)^.", "<$0>"));       
// <0>12
// <3>45
// <6>78

System.out.println(numbers.replaceAll("(?m).\\Z", "<$0>"));     
// 012
// 345
// 67<8>
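
The difference between an empty line and an empty string in multiline mode can be sketched the same way. This is only an illustration; the class and method names are assumptions made for the example.

```java
import java.util.regex.Pattern;

// Hypothetical helper contrasting (?m)^$ with \A\Z.
class MultilineEmpty {
    static boolean hasEmptyLine(String s) {
        // In multiline mode, ^$ can match an empty line inside the input.
        return Pattern.compile("(?m)^$").matcher(s).find();
    }

    static boolean isEmptyInput(String s) {
        // \A and \Z anchor to the whole input even when (?m) is set.
        // Note: \Z also tolerates a single final line terminator.
        return Pattern.compile("(?m)\\A\\Z").matcher(s).find();
    }

    public static void main(String[] args) {
        System.out.println(hasEmptyLine("a\n\nb"));  // true
        System.out.println(hasEmptyLine("a\nb"));    // false
        System.out.println(isEmptyInput(""));        // true
        System.out.println(isEmptyInput("a\n\nb"));  // false
    }
}
```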

Note on Java matches

In Java, matches attempts to match a pattern against the entire string.

This is true for String.matches, Pattern.matches and Matcher.matches.

This means that anchors can sometimes be omitted for Java matches, even though they would be necessary in other flavors and/or with other Java regex methods.
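
A small sketch of that difference, assuming the unanchored pattern \d* and hypothetical method names: matches() behaves as if the pattern were anchored, while find() is satisfied by any substring, including an empty one.

```java
import java.util.regex.Pattern;

// Hypothetical helper comparing matches() with find() for the same unanchored pattern.
class MatchesVsFind {
    static boolean wholeMatch(String s) {
        // String.matches requires the entire string to match.
        return s.matches("\\d*");
    }

    static boolean partialMatch(String s) {
        // find() succeeds on any substring, here even an empty one.
        return Pattern.compile("\\d*").matcher(s).find();
    }

    public static void main(String[] args) {
        System.out.println(wholeMatch(""));       // true
        System.out.println(wholeMatch("123"));    // true
        System.out.println(wholeMatch("12a"));    // false
        System.out.println(partialMatch("12a"));  // true
    }
}
```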


/^\d*$/

Matches 0 or more digits with nothing before or after.

Explanation:

'^' means start of line. '$' means end of line. '*' matches 0 or more occurrences. So the pattern matches an entire line with 0 or more digits.

To explicitly match the empty string, use \A\Z.

You will also often see ^$, which works fine unless the option is set that lets the ^ and $ anchors match not only at the start or end of the string but also at the start and end of each line. If your input can never contain newlines, then of course ^$ is perfectly OK.

Some regex flavors don't support the \A and \Z anchors (notably JavaScript).

If you want to allow "empty" as in "nothing or only whitespace", then go for \A\s*\Z or ^\s*$.
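
For the "nothing or only whitespace" case, a Java sketch might look like the following. The helper name is hypothetical, and find() is used so that the \A and \Z anchors are what enforce the whole-input check.

```java
import java.util.regex.Pattern;

// Hypothetical helper: true for empty or whitespace-only input.
class BlankCheck {
    private static final Pattern BLANK = Pattern.compile("\\A\\s*\\Z");

    static boolean isBlankOrEmpty(String s) {
        return BLANK.matcher(s).find();
    }

    public static void main(String[] args) {
        System.out.println(isBlankOrEmpty(""));      // true
        System.out.println(isBlankOrEmpty("  \t"));  // true
        System.out.println(isBlankOrEmpty(" a "));   // false
    }
}
```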

Just as a funny solution, you can do:

\d+|\d{0}

A digit, zero times. Yes, it does work.
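
A quick way to check this in Java, as a sketch with a hypothetical class name (String.matches anchors the pattern to the whole input):

```java
// Hypothetical helper: "\\d{0}" is a digit repeated zero times, i.e. the empty string.
class ZeroDigits {
    static boolean isValid(String s) {
        return s.matches("\\d+|\\d{0}");
    }

    public static void main(String[] args) {
        System.out.println(isValid(""));    // true
        System.out.println(isValid("42"));  // true
        System.out.println(isValid("4x"));  // false
    }
}
```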

One way to view the set of regular languages is as the closure of the following:

  1. The special <EMPTY_STRING> is a regular language
  2. Any symbol from the alphabet is a valid regular language
  3. Any concatenation of two valid regular languages is a regular language
  4. Any union of two valid regular languages is a regular language
  5. Any transitive closure (repetition) of a regular language is a regular language

A concrete regular language is a concrete element of this closure.


I didn't find an empty-string symbol in the POSIX standard to express the regular language idea from step (1).

But there is an extra construct there, the question mark, which by the POSIX definition is equivalent to the following:

(regexp|<EMPTY_STRING>)

So you can do the following in bash, Perl, and Python:

echo 9023 | grep -E "(1|90)?23"
perl -e "print 'PASS' if (qq(23) =~ /(1|90)?23/)"
python -c "import re; print(bool(re.match('^(1|90)?23$', '23')))"
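
The same check can be sketched in Java, reusing the pattern from the one-liners above. The class name is hypothetical.

```java
// Hypothetical helper: (1|90)? behaves like (1|90|<EMPTY_STRING>).
class OptionalPrefix {
    static boolean matches(String s) {
        // String.matches anchors to the whole input, so no ^ or $ is needed.
        return s.matches("(1|90)?23");
    }

    public static void main(String[] args) {
        System.out.println(matches("23"));    // true
        System.out.println(matches("9023"));  // true
        System.out.println(matches("823"));   // false
    }
}
```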

Just "\\d+|" should work without any problem.
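
A minimal sketch of the empty-alternative form in Java, assuming String.matches is used for validation (the class name is made up for the example):

```java
// Hypothetical helper: the right-hand side of | is empty, so it matches the empty string.
class EmptyAlternative {
    static boolean isValid(String s) {
        return s.matches("\\d+|");
    }

    public static void main(String[] args) {
        System.out.println(isValid(""));    // true
        System.out.println(isValid("7"));   // true
        System.out.println(isValid("7a"));  // false
    }
}
```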

To make any pattern that matches an entire string optional, i.e. to allow a pattern to match an empty string, use an optional group:

^(pattern)?$
^^       ^^^


If the regex engine allows (as in Java), prefer a non-capturing group, since its main purpose is only to group subpatterns, not to keep the subvalues captured:

^(?:pattern)?$

The ^ will match the start of a string (or \A can be used in many flavors for this), $ will match the end of the string (or \z can be used to match the very end in many flavors, Java included), and the (....)? will match 1 or 0 (due to the ? quantifier) sequences of the subpatterns inside parentheses.

A Java usage note: when used in matches(), the initial ^ and trailing $ can be omitted, so you can use:

String pattern = "(?:\\d+)?";
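
To see the optional non-capturing group at work, here is a short sketch. The class and method names are assumptions; groupCount() is used only to show that (?:...) creates no capturing group.

```java
import java.util.regex.Pattern;

// Hypothetical helper around the optional-group pattern.
class OptionalGroup {
    static boolean isValid(String s) {
        // matches() anchors implicitly, so ^ and $ are omitted.
        return s.matches("(?:\\d+)?");
    }

    static int groups(String regex) {
        // Number of capturing groups declared in the pattern.
        return Pattern.compile(regex).matcher("").groupCount();
    }

    public static void main(String[] args) {
        System.out.println(isValid(""));          // true
        System.out.println(isValid("2024"));      // true
        System.out.println(isValid("20x"));       // false
        System.out.println(groups("(\\d+)?"));    // 1
        System.out.println(groups("(?:\\d+)?"));  // 0
    }
}
```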
