Get the last occurrence and the remainder using Python and regex

Question

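I am trying to use Python and a regular expression to get the last group of integers in a file name (a string). The approach does what I need, but I would also like to return the inverse, or remainder, of the regex match. How can I do that?

Here is the regex: ([0-9]+|#+)(?!.*([0-9]+|#+))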

import re

values = [
    'image.0001',
    'image###',
    '###image###',
    'image001',
    'image_001',
    '001',
    '0001.image',
    '001image',
    '001_image',
    'image',
    '01_image01',
    '03_image01',
]

pattern = '([0-9]+|#+|@+)'
# Match a group that is not followed by another such group later in the string
regex = '{0}(?!.*{0})'.format(pattern)

for v in values:
    result = re.search(regex, v)
    if result:
        print(result.groups())

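Currently it returns something like ('01', None). I would like it to return something like ('image', '0001').

Update

Alternatively, is there a way to split the string by groups of digits? For example: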

'image.0001' > ['image.', '0001']
'image###' > ['image', '###']
'###image###' > ['###', 'image', '###']
'image001' > ['image', '001']
'image_001' > ['image_', '001']
'001' > ['001']
'0001.image' > ['0001', '.image']
'001image' > ['001', 'image']
'001_image' > ['001', '_image']
'image' > ['image']
'01_image01' > ['01', '_image', '01']
'03_image01' > ['03', '_image', '01']

Answer 1

Edit:

Use

re.findall(r'\d+|#+|@+|[^#@\d]+', v)

See the proof.

Explanation

--------------------------------------------------------------------------------
  \d+                      digits (0-9) (1 or more times (matching
                           the most amount possible))
--------------------------------------------------------------------------------
 |                        OR
--------------------------------------------------------------------------------
  #+                       '#' (1 or more times (matching the most
                           amount possible))
--------------------------------------------------------------------------------
 |                        OR
--------------------------------------------------------------------------------
  @+                       '@' (1 or more times (matching the most
                           amount possible))
--------------------------------------------------------------------------------
 |                        OR
--------------------------------------------------------------------------------
  [^#@\d]+                 any character except: '#', '@', digits (0-
                           9) (1 or more times (matching the most
                           amount possible))
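As a minimal sketch of how the findall pattern above behaves on a few of the question's inputs (the names TOKEN_RE and split_groups are hypothetical, introduced only for illustration):

```python
import re

# Alternation: a run of digits, a run of '#', a run of '@',
# or a run of any other characters.
TOKEN_RE = re.compile(r'\d+|#+|@+|[^#@\d]+')


def split_groups(name):
    """Split a string into consecutive counter and non-counter runs.

    split_groups is a hypothetical helper, not part of the re module.
    """
    return TOKEN_RE.findall(name)


for v in ['image.0001', '###image###', '01_image01', 'image']:
    print(v, '>', split_groups(v))
```

Because every character belongs to exactly one of the four alternatives, joining the returned pieces should give back the original string.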

Original: use re.split, adding a capturing group so that the captured part is kept in the result:

import re

values = [
    'image.0001',
    'image###',
    '###image###',
    'image001',
    'image_001',
    '001',
    '0001.image',
    '001image',
    '001_image',
    'image',
    '01_image01',
    '03_image01',
]

pattern = '[0-9]+|#+|@+'
regex = re.compile(r'({0})(?!.*(?:{0}))'.format(pattern))
for v in values:
    print(regex.split(v))

See the Python proof.

Result:

['image.', '0001', '']
['image', '###', '']
['###image', '###', '']
['image', '001', '']
['image_', '001', '']
['', '001', '']
['', '0001', '.image']
['', '001', 'image']
['', '001', '_image']
['image']
['01_image', '01', '']
['03_image', '01', '']
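The split results above already contain the remainder, but the empty strings can be awkward to handle. A minimal sketch of an alternative, assuming a (before, last group, after) tuple is an acceptable return shape and that file names contain no newlines (since . does not match them by default): reuse the same regex with re.search and slice the string around the match. The helper name split_last_group is hypothetical.

```python
import re

PATTERN = '[0-9]+|#+|@+'
# Same regex as in the answer: a group not followed by any later group.
LAST_GROUP_RE = re.compile(r'({0})(?!.*(?:{0}))'.format(PATTERN))


def split_last_group(name):
    """Return (before, last_group, after), or None if no group is found.

    split_last_group is a hypothetical helper for illustration.
    """
    m = LAST_GROUP_RE.search(name)
    if m is None:
        return None
    # Slice the original string around the matched span.
    return name[:m.start()], m.group(1), name[m.end():]


for v in ['image.0001', '0001.image', '01_image01', 'image']:
    print(v, '>', split_last_group(v))
```

Returning None for strings without a group mirrors what re.search does, so callers can test the result before unpacking it.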

See the regex proof.

Explanation

--------------------------------------------------------------------------------
  (                        group and capture to \1:
--------------------------------------------------------------------------------
    [0-9]+                   any character of: '0' to '9' (1 or more
                             times (matching the most amount
                             possible))
--------------------------------------------------------------------------------
   |                        OR
--------------------------------------------------------------------------------
    #+                       '#' (1 or more times (matching the most
                             amount possible))
--------------------------------------------------------------------------------
   |                        OR
--------------------------------------------------------------------------------
    @+                       '@' (1 or more times (matching the most
                             amount possible))
--------------------------------------------------------------------------------
  )                        end of \1
--------------------------------------------------------------------------------
  (?!                      look ahead to see if there is not:
--------------------------------------------------------------------------------
    .*                       any character except \n (0 or more times
                             (matching the most amount possible))
--------------------------------------------------------------------------------
    (?:                      group, but do not capture:
--------------------------------------------------------------------------------
      [0-9]+                   any character of: '0' to '9' (1 or
                               more times (matching the most amount
                               possible))
--------------------------------------------------------------------------------
     |                        OR
--------------------------------------------------------------------------------
      #+                       '#' (1 or more times (matching the
                               most amount possible))
--------------------------------------------------------------------------------
     |                        OR
--------------------------------------------------------------------------------
      @+                       '@' (1 or more times (matching the
                               most amount possible))
--------------------------------------------------------------------------------
    )                        end of grouping
--------------------------------------------------------------------------------
  )                        end of look-ahead
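The breakdown above uses a non-capturing group inside the look-ahead. A short sketch of why that matters: a capturing group placed inside a negative look-ahead can never hold a value in a successful match, because the match only succeeds when the look-ahead's contents fail to match. This is where the None in the question's output comes from. The variable names below are hypothetical.

```python
import re

# The question's pattern: the second capturing group is inside the look-ahead.
question_re = re.compile(r'([0-9]+|#+|@+)(?!.*([0-9]+|#+|@+))')
# The answer's pattern: a non-capturing group inside the look-ahead.
answer_re = re.compile(r'([0-9]+|#+|@+)(?!.*(?:[0-9]+|#+|@+))')

print(question_re.search('01_image01').groups())  # ('01', None)
print(answer_re.search('01_image01').groups())    # ('01',)
```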

Answer 2

import re

values = [
    'image.0001',
    'image###',
    '###image###',
    'image001',
    'image_001',
    '001',
    '0001.image',
    '001image',
    '001_image',
    'image',
    '01_image01',
    '03_image01',
]

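for v in values:
    # Keep only the ASCII letters
    print(re.sub(r"[^A-Za-z]+", "", v), end=" ")
    # Strip an optional prefix ending in '_' or '.', then a run of non-digits
    print(re.sub(r"(.+[_.]){0,1}[^0-9]+", "", v))

Output: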

image 0001
image 
image 
image 001
image 001
 001
image 
image 001
image 
image 
image 01
image 01
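If a tuple closer to the ('image', '0001') shape from the question is preferred over printed text, the two substitutions above could be wrapped in a small function. This is only a sketch; the name letters_and_digits is hypothetical, and it inherits the behaviour shown in the output above (separators are dropped, and '#' or leading counters yield an empty second element).

```python
import re


def letters_and_digits(name):
    """Return (letters, digits) using the two substitutions from this answer.

    letters_and_digits is a hypothetical helper for illustration.
    """
    # Keep only the ASCII letters.
    letters = re.sub(r"[^A-Za-z]+", "", name)
    # Strip an optional prefix ending in '_' or '.', then a run of non-digits.
    digits = re.sub(r"(.+[_.]){0,1}[^0-9]+", "", name)
    return letters, digits


for v in ['image.0001', 'image###', '001_image', '01_image01']:
    print(v, '>', letters_and_digits(v))
```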


Answer 1
1 accepted 2021-01-07 21:24:50

Answer 2
0 2021-01-07 21:12:51
