Ruby regex: extract the number from a string that contains only one number and trim the part after the comma

Question

I have a method that uses a regular expression to extract a number from a string, like this:

def format 
  str = "R$ 10.000,00 + Benefits"
  str.split(/[^\d]/).join
end

It returns 1000000. I need to modify the regex so that it returns 10000, dropping the zeros after the comma.

Answer 1

You can use

str.gsub(/(?<=\d),\d+|\D/, '')

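See the regex demo.

Regex details

(?<=\d),\d+ - a comma immediately preceded by a digit ((?<=\d) is a positive lookbehind), followed by one or more digits
| - or
\D - any non-digit character

An important point is the order of the alternatives: \D must come last. Otherwise \D would match the comma first, the digits after it would be left in place, and the solution would not work.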
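
To see the effect of the ordering, here is a minimal sketch that compares the pattern from this answer with a reordered variant. The method names strip_ordered and strip_reordered are made up for illustration and are not part of the original answer.

```ruby
# Pattern from the answer: the comma-plus-digits alternative is tried first.
def strip_ordered(str)
  str.gsub(/(?<=\d),\d+|\D/, '')
end

# Hypothetical reordered variant: \D is tried first and consumes the comma,
# so the digits after it are left in place.
def strip_reordered(str)
  str.gsub(/\D|(?<=\d),\d+/, '')
end

str = "R$ 10.000,00 + Benefits"
puts strip_ordered(str)    # prints 10000
puts strip_reordered(str)  # prints 1000000
```

The reordered variant reproduces the unwanted result from the question, which is why \D has to be the last alternative.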

Answer 2

str = "R$ 10.000,00 R$1.200.000,03 R$ 0,09 R$ 4.00,10 R$ 3.30005,00 R$ 6.700 R$ 6, R$ 6,0 R$ 00,20 R$6,001 US$ 5.122,00 Benefits"

R = /(?:(?<=\bR\$)|(?<=\bR\$ ))(?:0|[1-9]\d{0,2}(?:\.\d{3})*),\d{2}(?!\d)/

str.scan(R).map { |s| s.delete('.') }
  #=> ["10000,00", "1200000,03", "0,09"]

None of the following substrings match, because their format is invalid: "4.00,10", "3.30005,00", "6.700", "6,", "6,0", "00,20", "6,001" and "5.122,00" (the last one because it is not preceded by "R$" or "R$ ").
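
The question asks for 10000 rather than "10000,00". As a possible follow-up step, which is not part of the original answer, each matched amount can be reduced to its integer part by deleting the thousands separators and then the comma with its two digits. The method names amount_pattern and integer_parts are hypothetical.

```ruby
# Same pattern as R above, wrapped in a method so the sketch is self-contained.
def amount_pattern
  /(?:(?<=\bR\$)|(?<=\bR\$ ))(?:0|[1-9]\d{0,2}(?:\.\d{3})*),\d{2}(?!\d)/
end

# Hypothetical helper: returns the integer part of every valid "R$" amount.
def integer_parts(str)
  str.scan(amount_pattern).map do |amount|
    # Drop the '.' separators, then the trailing ',dd'.
    amount.delete('.').sub(/,\d{2}\z/, '')
  end
end

p integer_parts("R$ 10.000,00 + Benefits")        # prints ["10000"]
p integer_parts("R$1.200.000,03 R$ 0,09 R$ 6,0")  # prints ["1200000", "0"]
```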

The regular expression can be written in free-spacing mode (/x) to make it self-documenting.

R = /
    (?:            # begin non-capture group
      (?<=\bR\$)   # positive lookbehind asserts match is preceded by 'R$'
                   #   that is preceded by a word break
      |            # or
      (?<=\bR\$\ ) # positive lookbehind asserts match is preceded by 'R$ '
                   #   that is preceded by a word break
    )              # end non-capture group
    (?:            # begin non-capture group
      0            # match zero
      |            # or
      [1-9]        # match a digit other than zero
      \d{0,2}      # match 0-2 digits
      (?:\.\d{3})  # match '.' followed by three digits in a non-capture group 
      *            # execute preceding non-capture group 0+ times
    )              # end non-capture group
    ,\d{2}         # match ',' followed by two digits
    (?!\d)         # negative lookahead asserts match is not followed by a digit
    /x
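
One way to check that a free-spacing rewrite still behaves like the one-line pattern is to run both against the same input. The sketch below uses a condensed free-spacing version with fewer comments than the one above; the method names and the sample string are assumptions made for illustration.

```ruby
# One-line form of the pattern, as given earlier in this answer.
def compact_pattern
  /(?:(?<=\bR\$)|(?<=\bR\$ ))(?:0|[1-9]\d{0,2}(?:\.\d{3})*),\d{2}(?!\d)/
end

# Condensed free-spacing form. Under x, a literal space must be escaped.
def free_spacing_pattern
  /
    (?:(?<=\bR\$)|(?<=\bR\$\ ))     # preceded by 'R$' or 'R$ '
    (?:0|[1-9]\d{0,2}(?:\.\d{3})*)  # '0', or 1-3 digits then groups of '.ddd'
    ,\d{2}                          # ',' followed by two digits
    (?!\d)                          # not followed by another digit
  /x
end

sample = "R$ 10.000,00 R$1.200.000,03 R$ 0,09 R$ 6,001 US$ 5.122,00"
p sample.scan(compact_pattern) == sample.scan(free_spacing_pattern)  # prints true
```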

Answer 3

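Here is a slightly longer but perhaps simpler and easier to understand solution. You can use it as an alternative to Wiktor Stribiżew's excellent and concise answer and to Cary Swoveland's very thorough and complete answer. Note that my answer may not work for some (more complex) strings, as described in Cary's comment below.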

str = "R$ 10.000,00 + Benefits"
puts str.gsub(/^.*?(\d+[\d.]*).*$/, '\1').gsub(/[.]/, '')
# => 10000

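Here gsub is applied to the input string twice:

gsub(/^.*?(\d+[\d.]*).*$/, '\1'): grabs the 10.000 part.
^ is the start of the string.
.*? is any character repeated 0 or more times, non-greedy (i.e. as few times as possible).
(\d+[\d.]*) is one or more digits followed by any number of digits or literal dots (.). The parentheses capture it into the first capture group (used later as '\1' in the replacement string).
.* is any character repeated 0 or more times, greedy (i.e. as many times as possible).
$ is the end of the string.
So we replace the entire string with the first captured group, '\1', which here is 10.000. Remember to use single quotes around \1, or else escape the backslash like so: "\\1".
gsub(/[.]/, ''): removes all literal dots (.) from the string.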
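
The two steps can be made visible by keeping the intermediate result. This is a small sketch; the helper name extraction_steps is hypothetical and not part of the original answer.

```ruby
# Hypothetical helper that returns the result of each gsub step.
def extraction_steps(str)
  step1 = str.gsub(/^.*?(\d+[\d.]*).*$/, '\1')  # keep only the first run of digits and dots
  step2 = step1.gsub(/[.]/, '')                  # drop the literal dots
  [step1, step2]
end

step1, step2 = extraction_steps("R$ 10.000,00 + Benefits")
puts step1  # prints 10.000
puts step2  # prints 10000

# With double quotes the backslash must be escaped to get the same replacement.
puts "R$ 10.000,00 + Benefits".gsub(/^.*?(\d+[\d.]*).*$/, "\\1")  # prints 10.000
```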

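Note that this code makes the expected replacements for a number of similar strings (but does nothing fancier; for example, it leaves 001 as is):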

['R$ 10.000,00 + Benefits',
 'R$      0,00 + Benefits',
 'R$   .001,00 + Benefits',
 '.  10.000,00 + Benefits',].each do |str|
  puts [str, str.gsub(/^.*?(\d+[\d.]*).*$/, '\1').gsub(/[.]/, '')].join(" => ")
end

Output:

R$ 10.000,00 + Benefits => 10000
R$      0,00 + Benefits => 0
R$   .001,00 + Benefits => 001
.  10.000,00 + Benefits => 10000

Solution 1
2 2020-10-05 16:58:44

Solution 2
2 2020-10-05 18:55:49

Solution 3
1 2020-10-05 17:41:43
