Splitting Two Characters In a String - Perl

I'm trying to split this string. Here's the code:

 my $string = "585|487|314|1|1,651|365|302|1|1,585|487|314|1|1,651|365|302|1|1,656|432|289|1|1,136|206|327|1|1,585|487|314|1|1,651|365|302|1|1,585|487|314|1|1,651|365|302|1|1%656|432|289|1|1%136|206|327|1|1%654|404|411|1|1";
 my @ids = split(",", $string);

What I want is to split only on % and , in the string. I was told that I could use a pattern, something like this? /[^a-zA-Z0-9_]/

Character classes can be used to represent a group of possible single characters that can match. The ^ symbol at the beginning of a character class negates the class, saying "Anything matches except for ...." In the context of split, whatever matches is considered the delimiter.

That being the case, [^a-zA-Z0-9_] would match any character except for the ASCII letters 'a' through 'z', 'A' through 'Z', the digits '0' through '9', and underscore. In your case, while this would correctly split on "," and "%" (since they're not included in a-z, A-Z, 0-9, or _), it would mistakenly also split on "|", as well as on any other character not included in the character class you attempted.
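To see the over-splitting concretely, here is a minimal sketch. It assumes a shortened sample string in the same format as the one in the question (three records instead of thirteen); the shortened string is made up for illustration.

```perl
use strict;
use warnings;

# Shortened sample in the same format as the question's string (illustrative only).
my $string = "585|487|314|1|1,651|365|302|1|1%656|432|289|1|1";

# The negated class treats every character that is not an ASCII letter,
# digit, or underscore as a delimiter, so "|" splits just like "," and "%".
my @parts = split /[^a-zA-Z0-9_]/, $string;

print scalar(@parts), " fields: @parts\n";
```

With this sample the result is fifteen single numbers rather than three records, because the "|" separators were consumed as delimiters too.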

In your case it makes a lot more sense to be specific about which delimiters to use, and not to use a negated class; you want to specify the exact delimiters rather than the entire set of characters that delimiters cannot be. So, as mpapec stated in his comment, a better choice would be [%,].

So your solution would look like this:

my @ids = split /[%,]/, $string;
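As a quick check of that line, the following sketch runs it on a shortened sample string (an assumption for illustration, in the same format as the question's string) and prints one record per line.

```perl
use strict;
use warnings;

# Shortened sample in the same format as the question's string (illustrative only).
my $string = "585|487|314|1|1,651|365|302|1|1%656|432|289|1|1";

# Split only on "%" or ","; the "|" characters stay inside each record.
my @ids = split /[%,]/, $string;

print "$_\n" for @ids;
```

Each printed line is one complete pipe-delimited record, such as 585|487|314|1|1, so the inner structure is left intact.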

Once you split on '%' and ',', you'll be left with a bunch of substrings that look like this: 585|487|314|1|1 (or some variation on those numbers). In each case, it's five positive integers separated by '|' characters. It seems possible to me that you'll end up wanting to break those down as well by splitting on '|'.

You could build a single data structure represented by a list of lists, where each top-level element represents a [,%]-delimited field and is a reference to an anonymous array holding the pipe-delimited fields. The following code will build that structure:

my @ids = map { [ split /\|/, $_ ] } split /[%,]/, $string;

When that is run, you will end up with something like this:

@ids = ( 
    [ '585', '487', '314', '1', '1' ],
    [ '651', '365', '302', '1', '1' ],
    # ...
);

Now each field within an ID can be inspected and manipulated individually.
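A short sketch of what that inspection might look like follows. The sample string is shortened for illustration, and since the question does not say what each field means, the indices used here are arbitrary.

```perl
use strict;
use warnings;

# Shortened sample in the same format as the question's string (illustrative only).
my $string = "585|487|314|1|1,651|365|302|1|1%656|432|289|1|1";

# Build the list of lists: outer split on "%" or ",", inner split on "|".
my @ids = map { [ split /\|/, $_ ] } split /[%,]/, $string;

# Read one field: the second field of the first record.
my $field = $ids[0][1];
print "second field of first record: $field\n";

# Change one field in place: the fourth field of the second record.
$ids[1][3] = 0;

# Walk every record and rebuild its pipe-delimited form.
for my $record (@ids) {
    print join('|', @$record), "\n";
}
```

Because each top-level element is an array reference, $ids[0][1] is shorthand for $ids[0]->[1], and @$record dereferences a whole record at once.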

To understand more about how character classes work, you could check perlrequick, which has a nice introduction to character classes. And for more information on split, there's always perldoc -f split (as mentioned by mpapec). split is also discussed in chapter nine of the O'Reilly book, Learning Perl, 6th Edition.
