
Perl split pattern

According to perldoc, the syntax for split is:

 split /PATTERN/,EXPR,LIMIT 

But the PATTERN can also be a single- or double-quoted string: split "PATTERN", EXPR. What difference does it make?

Edit: A difference I'm aware of is splitting on backslashes: split /\\/ vs split '\\'. The second form doesn't work.

It looks like it uses that as "an expression to specify patterns":

The pattern /PATTERN/ may be replaced with an expression to specify patterns that vary at runtime. (To do runtime compilation only once, use /$variable/o.)
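As a minimal sketch of what a runtime pattern expression can look like, the snippet below passes split a pattern held in a variable, first as a plain string and then as a precompiled qr// object. The variable names ($sep, $re, $literal) and the sample data are made up for illustration. It also shows quotemeta, which is one way to get a literal match when the string contains regex metacharacters.

```perl
use strict;
use warnings;

my $line = 'a:b:c,d,e';

# Pattern held in a plain string; split compiles it as a regex at runtime.
my $sep = '[:,]';
my @from_string = split $sep, $line;

# The same pattern precompiled with qr//.
my $re = qr/[:,]/;
my @from_qr = split $re, $line;

# To split on the literal text of a string, escape it with quotemeta first.
my $literal = quotemeta '[:,]';
my @from_literal = split $literal, 'x[:,]y';

print "@from_string\n";   # a b c d e
print "@from_qr\n";       # a b c d e
print "@from_literal\n";  # x y
```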

Edit: I tested it with this:

my $foo = 'a:b:c,d,e';
print join(' ', split("[:,]", $foo)), "\n";
print join(' ', split(/[:,]/, $foo)), "\n";
print join(' ', split(/\Q[:,]\E/, $foo)), "\n";

Except for the ' ' special case, it looks just like a regular expression.

PATTERN is always interpreted as... well, a pattern -- never as a literal value. It can be either a regex [1] or a string. Strings are compiled to regexes. For the most part the behavior is the same, but there can be subtle differences caused by the double interpretation.

The string '\\' only contains a single backslash. When interpreted as a pattern, it's as if you had written /\/, which is invalid:

C:\>perl -e "print join ':', split '\\', 'a\b\c'"
Trailing \ in regex m/\/ at -e line 1.

Oops!
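To make the double interpretation concrete, here is a small sketch of forms that do split on a literal backslash. The sample string is invented for illustration. The point is that a string pattern passes through the string-literal parser before it reaches the regex compiler, so it needs twice as many backslashes as the regex literal does.

```perl
use strict;
use warnings;

# Single quotes: the string holds the five characters a \ b \ c.
my $path = 'a\b\c';

# Regex literal: \\ is one escaped backslash.
my @by_regex = split /\\/, $path;

# String pattern: the string literal halves the backslashes first,
# so four in the source leave two for the regex compiler.
my @by_string = split '\\\\', $path;

# qr// behaves like the regex literal.
my @by_qr = split qr/\\/, $path;

print join(':', @by_regex),  "\n";  # a:b:c
print join(':', @by_string), "\n";  # a:b:c
print join(':', @by_qr),     "\n";  # a:b:c
```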

Additionally, there are two special cases:

  • The empty pattern //, which splits on the empty string.
  • A single space ' ', which splits on whitespace after first trimming any leading or trailing whitespace.
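The two special cases can be checked with a short sketch like the following; the input strings are arbitrary examples.

```perl
use strict;
use warnings;

# Empty pattern: splits between every character.
my @chars = split //, 'abc';

# Single-space string: leading whitespace is skipped, then the
# string is split on runs of whitespace.
my @words = split ' ', "  foo  bar\tbaz \n";

print join('|', @chars), "\n";  # a|b|c
print join('|', @words), "\n";  # foo|bar|baz
```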

1. Regexes can be supplied either inline /.../ or via a precompiled qr// quoted string.

I believe there's no difference. A string pattern is also interpreted as a regular expression.

perl -e 'print join("-",split("[a-e]","regular"))';
r-gul-r

As you see, the delimiter is interpreted as a regular expression, not a string literal.

So, it's mostly the same, with one important exception: split(" ", ...) and split(/ /, ...) are different.

I prefer to use /PATTERN/ to avoid confusion; otherwise it's easy to forget that it's a regexp.

Two observable rules:

  • the special case split(" ") behaves like split(/\s+/), except that leading whitespace is skipped first, so no empty leading field is produced.
  • for everything else (it seems, though don't hold me to it), split("something") is equivalent to split(/something/).
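A quick way to test these rules is to run the three whitespace forms against a string with leading and repeated spaces and count the fields. This is only a sketch; the test string and array names are arbitrary.

```perl
use strict;
use warnings;

my $text = '  a  b';   # two leading spaces, two spaces between a and b

my @special   = split ' ',   $text;  # special case: skips leading whitespace
my @one_space = split / /,   $text;  # every single space is a separator
my @ws_runs   = split /\s+/, $text;  # runs of whitespace; leading empty field kept

printf "%d fields: [%s]\n", scalar @special,   join('][', @special);    # 2 fields: [a][b]
printf "%d fields: [%s]\n", scalar @one_space, join('][', @one_space);  # 5 fields: [][][a][][b]
printf "%d fields: [%s]\n", scalar @ws_runs,   join('][', @ws_runs);    # 3 fields: [][a][b]
```

The empty strings in the last two results come from separators that match at the very start of the string, or directly after one another.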
