Negating a bracketed character class in Perl regular expressions and grep

Question

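I'm trying to solve a very simple problem: finding the strings in an array that contain only certain letters. However, I've run into something in the behaviour of regular expressions and/or grep that I don't understand.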

#!/usr/bin/perl

use warnings;
use strict;

my @test_data = qw(ant bee cat dodo elephant giraffe horse);

# Words wanted include these letters only. Hardcoded for demonstration purposes
my @wanted_letters = qw/a c d i n o t/;

# Subtract those letters from the alphabet to find the letters to eliminate.
# Interpolate array into a negated bracketed character class, positive grep
# against a list of the lowercase alphabet: fine, gets befghjklmpqrsuvwxyz.
my @unwanted_letters = grep(/[^@wanted_letters]/, ('a' .. 'z'));

# The desired result can be simulated by hardcoding the unwanted letters into a
# bracketed character class then doing a negative grep: matches ant, cat, and dodo.
my @works = grep(!/[befghjklmpqrsuvwxyz]/, @test_data);

# Doing something similar but moving the negation into the bracketed character
# class fails and matches almost everything (every word except bee).
my @fails1 = grep(/[^befghjklmpqrsuvwxyz]/, @test_data);

# Doing the same thing that produced the array of unwanted letters also fails.
my @fails2 = grep(/[^@unwanted_letters]/, @test_data);

print join ' ', @works; print "\n";
print join ' ', @fails1; print "\n";
print join ' ', @fails2; print "\n";

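Questions:

Why does @works get the correct result and not @fails1? The grep docs suggest the former, and the negation section of perlrecharclass suggests the latter, although it uses =~ in its examples. Is this something specifically to do with using grep?
Why does @fails2 not work? Is it something to do with array vs. list context? Otherwise it looks the same as the subtraction step.
Beyond that, is there a pure-regex way to do this that avoids the subtraction step?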

Answer 1

Both are fixed by adding the anchors ^ and $ and the quantifier +.

Both of these work:

my @fails1 = grep(/^[^befghjklmpqrsuvwxyz]+$/, @test_data);
my @fails2 = grep(/^[^@unwanted_letters]+$/, @test_data);

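Remember that /[^befghjklmpqrsuvwxyz]/ or /[^@unwanted_letters]/ matches only ONE character, anywhere in the string. Adding + means as many as possible. Adding ^ and $ means every character from the start of the string to the end.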

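With /[@wanted_letters]/ you get a match if there is a single wanted character, even if there are unwanted characters as well; this is logically equivalent to any. With /^[@wanted_letters]+$/, every letter must be in the @wanted_letters set; this is equivalent to all.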
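The any/all distinction can be checked with a minimal sketch that reuses the data from the question. The variable names @any and @all are illustrative only and do not appear in the original post.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @test_data      = qw(ant bee cat dodo elephant giraffe horse);
my @wanted_letters = qw/a c d i n o t/;

# "any": true as soon as one character of the word is in the class.
my @any = grep { /[@wanted_letters]/ } @test_data;

# "all": every character from start to end must be in the class.
my @all = grep { /^[@wanted_letters]+$/ } @test_data;

print "any: @any\n";   # any: ant cat dodo elephant giraffe horse
print "all: @all\n";   # all: ant cat dodo
```

Only bee is rejected by the unanchored pattern, because it is the only word that has no wanted letter at all.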

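Demo1: only one character is matched, so the grep fails.

Demo2: the quantifier means more than one character, but there are no anchors, so the grep fails.

Demo3: anchors and quantifier, the expected result.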
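The three demos can be approximated locally with the sketch below. It is not the content of the linked demos, only a reconstruction from their descriptions; $outside and the three array names are hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @test_data = qw(ant bee cat dodo elephant giraffe horse);

# One character that is NOT an unwanted letter.
my $outside = qr/[^befghjklmpqrsuvwxyz]/;

# Demo1 style: a single character anywhere in the word is enough.
my @one_char = grep { /$outside/ } @test_data;

# Demo2 style: a run of such characters, still free to start and stop anywhere.
my @no_anchor = grep { /$outside+/ } @test_data;

# Demo3 style: the run has to cover the word from start to end.
my @anchored = grep { /^$outside+$/ } @test_data;

print "one char:  @one_char\n";   # ant cat dodo elephant giraffe horse
print "no anchor: @no_anchor\n";  # ant cat dodo elephant giraffe horse
print "anchored:  @anchored\n";   # ant cat dodo
```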

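Once you understand that a character class matches only one character, that the anchors tie the match to the whole string, and that the quantifier extends the match to everything between the anchors, you can simply grep with the wanted letters directly: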

my @wanted = grep(/^[@wanted_letters]+$/, @test_data);
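One detail worth knowing when interpolating an array into a character class: the elements are joined with the list separator $" (a space by default), so the class also accepts a space. That does not matter for the test data here, but the sketch below shows the effect and one possible way around it, building the class with join and quotemeta. The names $class, $loose and $strict are illustrative and not from the original answer.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @test_data      = qw(ant bee cat dodo elephant giraffe horse);
my @wanted_letters = qw/a c d i n o t/;

# The interpolated class is effectively [a c d i n o t], space included,
# so a string containing a space still passes.
my $loose = "a t" =~ /^[@wanted_letters]+$/ ? 1 : 0;

# Assumption: joining without a separator (and escaping each element)
# gives a class with exactly the wanted characters.
my $class  = join '', map { quotemeta($_) } @wanted_letters;
my $strict = "a t" =~ /^[$class]+$/ ? 1 : 0;

my @wanted = grep { /^[$class]+$/ } @test_data;

print "loose:  $loose\n";    # loose:  1
print "strict: $strict\n";   # strict: 0
print "wanted: @wanted\n";   # wanted: ant cat dodo
```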

Answer 2

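You are matching anything in the string that is outside the character set. But the string can still contain characters from the set elsewhere. For example, if the test word is elephant, the negated character class matches the a character.

If you want to test the whole string, you need to quantify the class and anchor it at both ends.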

grep(/^[^befghjklmpqrsuvwxyz]*$/, @test_data);

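Translated into English, it's the difference between "the word does not contain a character in the set" and "the word contains a character not in the set".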
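The two English statements can be put side by side in a small sketch. An empty string is added to the test data (an assumption, it is not in the question) to show where the * quantifier used here and the + quantifier from the other answer differ. The names $in_set and $not_in_set are hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# The question's words plus an empty string as an edge case.
my @test_data = (qw(ant bee cat dodo elephant giraffe horse), '');

my $in_set     = qr/[befghjklmpqrsuvwxyz]/;    # one unwanted letter
my $not_in_set = qr/[^befghjklmpqrsuvwxyz]/;   # one character outside the set

# "word does not contain a character in the set"
my @negated_match = grep { !/$in_set/ } @test_data;

# "word contains a character not in the set"
my @negated_class = grep { /$not_in_set/ } @test_data;

# Anchored and quantified: * accepts the empty string, + does not.
my @star = grep { /^$not_in_set*$/ } @test_data;
my @plus = grep { /^$not_in_set+$/ } @test_data;

print join(',', map { "'$_'" } @negated_match), "\n";  # 'ant','cat','dodo',''
print join(',', map { "'$_'" } @negated_class), "\n";  # 'ant','cat','dodo','elephant','giraffe','horse'
print join(',', map { "'$_'" } @star), "\n";           # 'ant','cat','dodo',''
print join(',', map { "'$_'" } @plus), "\n";           # 'ant','cat','dodo'
```

With *, the anchored pattern selects the same items as the negated match, including the empty string; with +, the empty string is left out.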
