Ruby regex: whitespace at the beginning and end of a string

Question

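I would like to find all users whose name has a space at the beginning or end of the string. It may look like " Juliette" or "Juliette ". For now I only have a regex that matches when the space is at the end of the string: ^[ab]:[[:space:]]|$ I have not found how to match a space at the beginning of the string, and I don't know whether both conditions can be handled in a single regex. Thanks for your help.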

Answer 1

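Testing for Strippable Whitespace Without Regexp

There's a little trick you can use with String#strip!, which returns nil if it finds no whitespace to strip. For example: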

# returns a Hash mapping str to true if it has leading/trailing
# whitespace, and to false otherwise
def strippable? str
  { str => !!str.dup.strip! }
end

# leading space, trailing space, no space
test_values = [ ' foo', 'foo ', 'foo' ]

test_values.map { |str| strippable? str }
#=> [{" foo"=>true}, {"foo "=>true}, {"foo"=>false}]

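This doesn't rely on regular expressions, but on a property of String: #strip! returns nil when there is nothing to strip, and the double negation (!!) turns that result into a Boolean. Regardless of whether the Ruby engine uses regular expressions under the hood, these kinds of String methods are often faster than comparable regexp matches. Your mileage and specific use case may vary.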
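
As a small illustration of the strip! behavior described above, the following sketch applies it to the names from the question. The padded? helper and the sample list are assumptions made for this example; they are not part of the original answer.

```ruby
# Hypothetical helper: true when +name+ has leading or trailing whitespace.
# String#strip! returns nil when there is nothing to strip, so a non-nil
# result means something was removed. dup keeps the caller's string intact.
def padded?(name)
  !name.dup.strip!.nil?
end

names = [' Juliette', 'Juliette ', 'Juliette']

# Keep only the names that would change if stripped.
padded_names = names.select { |name| padded?(name) }
p padded_names
#=> [" Juliette", "Juliette "]
```

Returning a plain Boolean rather than a Hash makes the helper usable directly with select, reject, or any?.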

Regexp-Based Alternative

Using the same test data as above, you can do something similar with a regular expression. For example:

# leading space, trailing space, no space
test_values = [ ' foo', 'foo ', 'foo' ]

# test start/end of string
test_values.grep(/\A\s+|\s+\z/)
#=> [" foo", "foo "]

# test start/end of line
test_values.grep(/^\s+|\s+$/)
#=> [" foo", "foo "]
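
The two patterns above agree on single-line input but can differ once a string contains a newline. The sketch below uses a made-up multi-line value (an assumption for illustration) to show the difference between the string anchors \A and \z and the line anchors ^ and $.

```ruby
# Hypothetical input: no whitespace at either end of the whole string,
# but the second line starts with a space.
multiline = "foo\n bar"

# \A and \z anchor to the start and end of the entire string.
string_anchored = multiline.match?(/\A\s+|\s+\z/)

# ^ and $ anchor to the start and end of every line within the string.
line_anchored = multiline.match?(/^\s+|\s+$/)

puts "string anchors: #{string_anchored}"  # false
puts "line anchors:   #{line_anchored}"    # true
```

For validating a whole value such as a user name, the string anchors are usually the safer choice.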

Benchmarks

require 'benchmark'

ITERATIONS  = 1_000_000
TEST_VALUES = [ ' foo', 'foo ', 'foo' ]

def regex_grep array
  array.grep(/^\s+|\s+$/)
end

def string_strip array
  array.map { |str| { str => !!str.dup.strip! } }
end

Benchmark.bmbm do |x|
  n = ITERATIONS
  x.report('regex') { n.times { regex_grep   TEST_VALUES } }
  x.report('strip') { n.times { string_strip TEST_VALUES } }
end

            user     system      total        real
regex   1.539269   0.001325   1.540594 (  1.541438)
strip   1.256836   0.001357   1.258193 (  1.259955)

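A quarter of a second over a million iterations may not seem like much of a difference, but with larger data sets or more iterations it can add up. Whether that is enough to matter for this particular use case is up to you, but the general pattern is that native String methods (however the interpreter implements them under the hood) are often faster than regexp pattern matching. There are certainly edge cases, but that's what benchmarks are for!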
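
To check this on your own machine, a sketch along the following lines could compare a few Boolean predicates directly. The candidate set, the labels, and the iteration count are assumptions chosen for illustration, and no timings are claimed here. Note that the start_with?/end_with? variant only looks for a literal space, so it is not equivalent to the others for tabs or newlines.

```ruby
require 'benchmark'

values = [' foo', 'foo ', 'foo']

# Hypothetical candidates; each lambda returns true when the string has
# leading or trailing whitespace (a literal space only, for the last one).
candidates = {
  'strip'     => ->(s) { s != s.strip },
  'match?'    => ->(s) { s.match?(/\A\s|\s\z/) },
  'start/end' => ->(s) { s.start_with?(' ') || s.end_with?(' ') }
}

iterations = 100_000

# Benchmark.bm prints one timing row per label; 10 is the label column width.
Benchmark.bm(10) do |x|
  candidates.each do |label, check|
    x.report(label) { iterations.times { values.each { |s| check.call(s) } } }
  end
end
```

Results depend on the Ruby version and the input data, so treat any single run as indicative only.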

Answer 2

You can use

/\A([a-zA-Z]+ | [a-zA-Z]+)\z/
/\A(?:[[:alpha:]]+[[:space:]]|[[:space:]][[:alpha:]]+)\z/
/\A(?:\p{L}+[\p{Z}\t]|[\p{Z}\t]\p{L}+)\z/

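See the Rubular demo (it uses line anchors instead of string anchors for demo purposes).

Details:

\A - start-of-string anchor
(...) - a capturing group
(?:...) - a non-capturing group (preferred here, since you are only validating, not extracting)
[a-zA-Z]+ - one or more ASCII letters
\p{L}+ - one or more Unicode letters
| - or
\z - end-of-string anchor.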
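
As a quick check of how these patterns behave, the sketch below runs the POSIX-bracket variant against a few sample names. The sample list is an assumption for illustration. Each alternative allows exactly one whitespace character on one side, so a name padded on both sides is not matched.

```ruby
# Second pattern from the answer above: letters followed by one whitespace
# character, or one whitespace character followed by letters.
pattern = /\A(?:[[:alpha:]]+[[:space:]]|[[:space:]][[:alpha:]]+)\z/

# Hypothetical sample data: leading space, trailing space, clean, both sides.
names = [' Juliette', 'Juliette ', 'Juliette', ' Juliette ']

matches = names.grep(pattern)
p matches
#=> [" Juliette", "Juliette "]
```

If names padded on both sides, or names that contain inner spaces, should also be found, a whitespace-only test such as /\A\s|\s\z/ is a looser alternative.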

Answer 1
1 2020-12-11 14:04:18

Answer 2
0 2020-12-11 10:14:48