
Creating two lists from string excluding and including strings between brackets

Suppose we have a string like this:

s = u'apple banana lemmon (hahaha) dog cat whale (hehehe) red blue black'

I want to create the following lists:

including = ['hahaha', 'hehehe']
excluding = ['apple banana lemmon (', ') dog cat whale (', ') red blue black']

The first one is straightforward with regex:

import re

including = re.findall(r'\((.*?)\)', s)

But I couldn't find anything similar for the other list. Could you help me? Thanks in advance!

Using regex

a = re.findall(r'\)?[^()]*\(?', s)
excluded = a[::2]
included = a[1::2]
print(included, excluded, sep='\n')

['hahaha', 'hehehe', '']
['apple banana lemmon (', ') dog cat whale (', ') red blue black']

Taking care of the empty string

a = re.findall(r'\)?[^()]*\(?', s)
excluded = [*filter(bool, a[::2])]
included = [*filter(bool, a[1::2])]
print(included, excluded, sep='\n')

['hahaha', 'hehehe']
['apple banana lemmon (', ') dog cat whale (', ') red blue black']
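To see why slicing with [::2] and [1::2] works, it can help to look at the raw match list first. The following is only a sketch, assuming the parentheses are balanced, not nested, and never empty; the name tokens is introduced here for illustration and is not part of the answer above.

```python
import re

s = 'apple banana lemmon (hahaha) dog cat whale (hehehe) red blue black'

# Each match is either an "outside" chunk (optionally starting with ')' and
# optionally ending with '(') or an "inside" chunk with no parentheses.
# Under the stated assumptions the two kinds strictly alternate, starting
# with an outside chunk, and the scan ends with one empty match.
tokens = re.findall(r'\)?[^()]*\(?', s)
print(tokens)

# Even positions are outside chunks, odd positions are inside chunks.
# The comprehension drops the trailing empty match.
excluded = [t for t in tokens[::2] if t]
included = [t for t in tokens[1::2] if t]
print(included, excluded, sep='\n')
```

If the input can contain nested or empty parentheses, the alternation no longer holds, so the slices would need to be replaced by something stricter.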

Without regex

from itertools import cycle

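def f(s):
  c = cycle('()')         # alternate between searching for '(' and ')'
  a = {'(': 1, ')': 0}    # keep '(' at the end of a chunk; leave ')' for the next one
  while s:
    d = next(c)
    i = s.find(d)
    if i > -1:
      j = a[d]
      yield d, s[:i + j]
      s = s[i + j:]       # drop the part that was just yielded
    else:
      yield d, s          # no delimiter left: the rest is the final chunk
      break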

included = []
excluded = []

for k, v in f(s):
  if k == '(':
    excluded.append(v)
  else:
    included.append(v)

print(included, excluded, sep='\n')

['hahaha', 'hehehe']
['apple banana lemmon (', ') dog cat whale (', ') red blue black']

Same idea, without overwriting s

from itertools import cycle

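def f(s):
  c = cycle('()')         # alternate between searching for '(' and ')'
  a = {'(': 1, ')': 0}    # keep '(' at the end of a chunk; leave ')' for the next one
  j = 0                   # start index of the current chunk within s
  while True:
    d = next(c)
    i = s.find(d, j)
    if i > -1:
      k = a[d]
      yield d, s[j:i + k]
      j = i + k           # advance the start index instead of slicing s
    else:
      yield d, s[j:]      # no delimiter left: the rest is the final chunk
      break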

included = []
excluded = []

for k, v in f(s):
  if k == '(':
    excluded.append(v)
  else:
    included.append(v)

print(included, excluded, sep='\n')

['hahaha', 'hehehe']
['apple banana lemmon (', ') dog cat whale (', ') red blue black']

Reusing the including list from the question:

excluding = re.split('|'.join(including), s)

This works for simple cases where you know that including won't contain special characters or regex syntax.

If you're not sure that's the case:

re.split('|'.join(map(re.escape, including)), s)

This escapes special regex characters that would otherwise make re.split misbehave.
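A small illustration of the difference, using a made-up input (not from the question) whose bracketed parts contain the regex metacharacters + and . so that the unescaped pattern goes wrong:

```python
import re

# Hypothetical input: the bracketed strings contain regex metacharacters.
s = 'total (a+b) average (c.d) end'
including = re.findall(r'\((.*?)\)', s)   # ['a+b', 'c.d']

# Unescaped: 'a+b' means "one or more 'a', then 'b'", so the literal text
# 'a+b' is never matched and stays inside the first piece.
naive = re.split('|'.join(including), s)

# Escaped: each captured string is matched literally.
safe = re.split('|'.join(map(re.escape, including)), s)

print(naive)
print(safe)
```

Here naive leaves 'a+b' unsplit, while safe gives the three outside pieces as intended.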

You can use a positive lookbehind and a positive lookahead to split on the text between the parentheses:

>>> re.split(r'(?<=\().*?(?=\))', s)
['apple banana lemmon (', ') dog cat whale (', ') red blue black']
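Because the lookarounds are zero-width, the parentheses are checked but not consumed, so they stay in the split pieces. As a sketch, assuming balanced, non-nested parentheses, the same pattern can also produce the other list with re.findall; the name inside is introduced here for illustration only.

```python
import re

s = 'apple banana lemmon (hahaha) dog cat whale (hehehe) red blue black'

# (?<=\() requires a '(' immediately before the match and
# (?=\))  requires a ')' immediately after it; neither is consumed.
inside = r'(?<=\().*?(?=\))'

included = re.findall(inside, s)   # the text between the parentheses
excluded = re.split(inside, s)     # everything else, parentheses kept

print(included, excluded, sep='\n')
```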
