Splitting the parentheses in a string using a regular expression

Question

I need to split a polynomial at its parentheses, like this:
'ac*(ab+(2ab+4ac))' --> ['ac*',['ab+',['2ab+4ac']]]
I tried the following regular expression, but it does not give that result: \[[^\]]*\]|\([^\)]*\)|\"[^\"]*\"|\S+
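
For reference, here is a quick check of what that pattern does on the sample input. This is only a sketch using Python's re.findall; the variable names are illustrative. Because the input contains no whitespace, the \S+ alternative matches the entire string at position 0, and the \([^\)]*\) alternative on its own stops at the first ')' instead of the matching one:

```python
import re

s = 'ac*(ab+(2ab+4ac))'
pattern = r'\[[^\]]*\]|\([^\)]*\)|\"[^\"]*\"|\S+'

# The full pattern: \S+ wins at position 0 and swallows everything.
whole = re.findall(pattern, s)
print(whole)   # ['ac*(ab+(2ab+4ac))']

# The parenthesis alternative alone: it cannot count nesting levels,
# so it ends at the first ')' it sees.
paren_only = re.findall(r'\([^\)]*\)', s)
print(paren_only)   # ['(ab+(2ab+4ac)']
```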

Answer 1

Edit 2:

I translated @Cary Swoveland's Ruby code into Python to demonstrate a recursive way of doing this:

def polyparse(string):
  curr_idx = 0
  arr = []
  while curr_idx != len(string):
    try:
      lft_idx = string.index('(', curr_idx)
    except ValueError:
      # No more '(': the rest of the string is plain text.
      arr.append(string[curr_idx:])
      break
    if lft_idx > curr_idx:
      # Plain text that precedes the '('.
      arr.append(string[curr_idx:lft_idx])
    rt_idx = find_matching(string, lft_idx+1)
    if rt_idx is None:
      raise ValueError("unmatched '(' at index %d" % lft_idx)
    # Recurse into the group unless it is empty, i.e. "()".
    if rt_idx > lft_idx + 1:
      arr.append(polyparse(string[lft_idx+1:rt_idx]))
    curr_idx = rt_idx + 1
  return arr

def find_matching(string, start_idx):
  # Return the index of the ')' closing the '(' just before start_idx,
  # or None if it is never closed.
  nbr_unmatched = 0
  for i in range(start_idx, len(string)):
    c = string[i]
    if c == ')':
      if nbr_unmatched == 0:
        return i
      nbr_unmatched -= 1
    elif c == '(':
      nbr_unmatched += 1
  return None

print(polyparse("ac*(ab+(2ab+4ac))"))

print(polyparse("ac*(ab+(2ab+4*(ac+bd)))+((x+2)*3)"))

Returns:

['ac*', ['ab+', ['2ab+4ac']]]
['ac*', ['ab+', ['2ab+4*', ['ac+bd']]], '+', [['x+2'], '*3']]
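
The same nested list can also be built without recursion. The following is a minimal sketch and is not taken from either answer: it keeps a stack of the lists currently being filled, and the name polyparse_stack and its local variables are hypothetical. Like polyparse, it drops empty "()" groups.

```python
def polyparse_stack(expr):
    """Iterative variant: a stack holds the lists that are still open."""
    root = []
    stack = [root]
    buf = []  # characters of the plain-text piece being collected

    def flush():
        # Move the collected text, if any, into the innermost open list.
        if buf:
            stack[-1].append(''.join(buf))
            buf.clear()

    for i, ch in enumerate(expr):
        if ch == '(':
            flush()
            child = []
            stack[-1].append(child)
            stack.append(child)
        elif ch == ')':
            flush()
            if len(stack) == 1:
                raise ValueError(f"unmatched ')' at index {i}")
            closed = stack.pop()
            if not closed:
                # The group was "()": remove the empty list from its parent.
                stack[-1].pop()
        else:
            buf.append(ch)
    flush()
    if len(stack) != 1:
        raise ValueError("unmatched '('")
    return root

print(polyparse_stack("ac*(ab+(2ab+4ac))"))
# ['ac*', ['ab+', ['2ab+4ac']]]
```

One pass over the string is enough here, whereas the recursive version rescans each nested group once per level of nesting.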

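Edit 1: My original approach did not work for more complex polynomials (thanks to @Cary Swoveland for pointing this out). The idea is similar to before, though: convert the expression into the string representation of a list, then use json to parse it safely into a list: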

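import json
import re

def to_list(polynomial_exp):
  # Split on parentheses, keeping them as tokens; drop empty strings.
  tokens = [t for t in re.split(r'([()])', polynomial_exp) if t != '']
  # '(' opens a JSON array, ')' closes one, anything else is a JSON string.
  parts = ['[' if t == '(' else '],' if t == ')' else json.dumps(t) + ',' for t in tokens]
  # Drop the comma left before each ']' (assumes the text never contains ',]').
  return json.loads(('[' + ''.join(parts) + ']').replace(',]', ']'))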

# original example: 
to_list('ac*(ab+(2ab+4ac))')

# more complex example:
to_list("ac*(ab+(2ab+4*(ac+bd)))+((x+2)*3)")

Output:

>>> to_list('ac*(ab+(2ab+4ac))')
['ac*', ['ab+', ['2ab+4ac']]]
>>> to_list("ac*(ab+(2ab+4*(ac+bd)))+((x+2)*3)")
['ac*', ['ab+', ['2ab+4*', ['ac+bd']]], '+', [['x+2'], '*3']]
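
One way to sanity-check a result like this is to turn the nested list back into a string and compare it with the input. A minimal sketch; unparse is a hypothetical helper name, not part of the answer:

```python
def unparse(nested):
    """Rebuild the expression: strings are kept, sub-lists are wrapped in ()."""
    return ''.join(
        piece if isinstance(piece, str) else '(' + unparse(piece) + ')'
        for piece in nested
    )

print(unparse(['ac*', ['ab+', ['2ab+4ac']]]))
# ac*(ab+(2ab+4ac))
```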

Answer 2

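Assuming the string can have parentheses nested to any depth, I don't think a regular expression is the right tool for producing the desired array. I don't know Python, so I am offering a (recursive) solution in Ruby. Since the two languages are similar in many respects, I am hoping a reader can provide a Python solution using an algorithm similar to mine. (Even readers who don't know Ruby will probably be able to figure out my algorithm.) If a Python solution is posted, I will delete my answer.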

def polyparse(str)
  start_idx = 0
  curr_idx = 0
  arr = []
  loop do
    return arr if curr_idx == str.size 
    lft_idx = str.index('(', curr_idx)        
    return arr << str[curr_idx..-1] if lft_idx.nil?
    arr << str[curr_idx..lft_idx-1] if lft_idx > curr_idx
    rt_idx = find_matching(str, lft_idx+1)
    raise ArgumentError, "unmatched '(' at index #{lft_idx}" if rt_idx.nil?
    # Recurse into the group unless it is empty, i.e. "()".
    arr << polyparse(str[lft_idx+1..rt_idx-1]) if rt_idx > lft_idx + 1
    curr_idx = rt_idx + 1
  end 
end

def find_matching(str, start_idx)
  nbr_unmatched = 0
  (start_idx..str.size-1).each do |i|
    c = str[i]
    case c
    when ')'
      return i if nbr_unmatched.zero?
      nbr_unmatched -= 1
    when '('
      nbr_unmatched += 1
    end
  end
  nil
end

polyparse("ac*(ab+(2ab+4ac))")
  #=> ["ac*", ["ab+", ["2ab+4ac"]]]
polyparse("ac*(ab+(2ab+4*(ac+bd)))+((x+2)*3)")
  #=> ["ac*", ["ab+", ["2ab+4*", ["ac+bd"]]], "+", [["x+2"], "*3"]]

See String#index, and note in particular its second (optional) argument.
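
In Python, the closest counterpart is str.index, whose optional second argument likewise sets where the search starts. A small sketch of how the scan advances on the sample string (the variable names are illustrative):

```python
s = "ac*(ab+(2ab+4ac))"

first = s.index('(', 0)            # search from the start
second = s.index('(', first + 1)   # resume just past the first '('
print(first, second)               # 3 7

# Ruby's String#index returns nil when nothing is found; Python's
# str.index raises ValueError instead, while str.find returns -1.
try:
    s.index('(', second + 1)
except ValueError:
    print("no further '('")
print(s.find('(', second + 1))     # -1
```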

Note:

str = "ac*(ab+(2ab+4ac))"
       01234567890123456
           ^           ^
               ^      ^ 

find_matching(str, 3+1) #=> 16 
find_matching(str, 7+1) #=> 15

Answer 1
1 Accepted 2020-05-17 19:04:58

Answer 2
1 2020-05-17 19:45:16
