How to form separate blocks using regular expressions in Python?

Question

This is my code:

import re

results = re.finditer(r'([A-Z ?]+)\n+(.*)\n', inputfile, flags=re.MULTILINE)
for match in results:
    print(match.groups())

Input:

BASIC INFO

Name: John

Phone number: +91-9876543210

DOB: 21-10-1995

SKILL SET

Java

Python

Output: ('BASIC INFO', 'Name: John') ('SKILL SET', 'Java')

But the output I need is: ('BASIC INFO', 'Name: John', 'Phone number: +91-9876543210', 'DOB: 21-10-1995') ('SKILL SET', 'Java', 'Python')

Answer 1

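Replace re.MULTILINE with re.DOTALL so that your .* matches across multiple lines (yes, the flag names are a bit misleading). You will also need to split the resulting string on \n.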
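One caveat with this suggestion: a greedy .* under re.DOTALL runs to the end of the input, so the whole text would land in a single match. The sketch below is one possible way around that, not the answerer's exact code: it assumes headings are whole lines of capitals and spaces, uses a lazy .*? with a lookahead for the next heading, and combines both flags. The sample text and the names `text`, `pattern`, and `blocks` are illustrative.

```python
import re

# Illustrative sample, assuming blank lines between entries as in the question.
text = (
    "BASIC INFO\n\n"
    "Name: John\n\n"
    "Phone number: +91-9876543210\n\n"
    "DOB: 21-10-1995\n\n"
    "SKILL SET\n\n"
    "Java\n\n"
    "Python\n"
)

# Group 1: a heading line made only of capitals and spaces.
# Group 2: everything after it, matched lazily, up to the next
# heading line or the end of the input.
pattern = re.compile(
    r"^([A-Z ]+)\n+(.*?)(?=^[A-Z ]+$|\Z)",
    flags=re.MULTILINE | re.DOTALL,
)

# Split each body on newlines and drop the empty strings left by blank lines.
blocks = [
    (m.group(1),) + tuple(line for line in m.group(2).split("\n") if line)
    for m in pattern.finditer(text)
]

for block in blocks:
    print(block)
```

re.MULTILINE is kept here only so that ^ and $ can anchor the heading to a full line; re.DOTALL is what lets the body span several lines.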

In general, a regexp is probably not the best tool for this task. Something like this should work better:

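import string

results = []
for line in inputfile.splitlines():
    if not line.strip():
        continue  # skip blank lines; all() on an empty string is True
    if all(c in (string.ascii_uppercase + ' ') for c in line):
        results.append([line])  # a heading line starts a new block
    elif results:
        results[-1].append(line)  # any other line joins the current block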
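For reference, the line-by-line approach can be wrapped in a small function and run against the sample input. This is a sketch under a few assumptions: the name `split_blocks` and the sample string are illustrative, blank lines are skipped explicitly (all() over an empty string is True, so a blank line would otherwise count as a heading), and lines that appear before the first heading are dropped.

```python
import string

# Characters allowed in a heading line: capital letters and spaces.
HEADING_CHARS = set(string.ascii_uppercase + " ")


def split_blocks(text):
    """Group lines into blocks, each starting at an all-capitals heading line."""
    blocks = []
    for line in text.splitlines():
        if not line.strip():
            continue  # blank line: skip it rather than treat it as a heading
        if all(c in HEADING_CHARS for c in line):
            blocks.append([line])  # heading: start a new block
        elif blocks:
            blocks[-1].append(line)  # body line: add to the current block
    return blocks


sample = (
    "BASIC INFO\n\nName: John\n\nPhone number: +91-9876543210\n\n"
    "DOB: 21-10-1995\n\nSKILL SET\n\nJava\n\nPython\n"
)
blocks = split_blocks(sample)
for block in blocks:
    print(block)
```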

Answer 2

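Getting the whole output with a regular expression alone is difficult, because the text of the file is not simple.

But with a regular expression plus a little extra work, you can achieve this easily.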

# This regex fetches all the titles (e.g. BASIC INFO, SKILL SET, ...)
results = re.findall(r"([A-Z ]{4,})", inputfile)

After that, a little more work will get you the desired result:

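items = []
for z in results:
    # Cut off everything that precedes the current title
    pos = inputfile.index(z)
    item, inputfile = inputfile[:pos], inputfile[pos:]
    if item:
        # Keep only the non-empty lines of the block
        items.append([line for line in item.split('\n') if line])
# Whatever remains is the last block
items.append([line for line in inputfile.split('\n') if line])
print(items)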

Output:
[['BASIC INFO', 'Name: John', 'Phone number: +91-9876543210', 'DOB: 21-10-1995'],
['SKILL SET', 'Java', 'Python']
]
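The same idea can also be sketched without modifying the input string, by recording where each title starts and slicing between consecutive positions. This is a variation, not the answer's exact code: the function name `split_at_headings` is hypothetical, and the pattern is anchored to whole lines with re.MULTILINE (an assumption added here so that runs of capitals inside other lines are not mistaken for titles).

```python
import re


def split_at_headings(text):
    """Slice text into blocks that each begin at a title line."""
    # A title is assumed to be a full line of at least four capitals/spaces.
    heading = re.compile(r"^[A-Z ]{4,}$", flags=re.MULTILINE)
    starts = [m.start() for m in heading.finditer(text)]
    # Each block ends where the next one starts; the last runs to the end.
    ends = starts[1:] + [len(text)]
    return [
        [line for line in text[s:e].split("\n") if line]
        for s, e in zip(starts, ends)
    ]


sample = (
    "BASIC INFO\n\nName: John\n\nPhone number: +91-9876543210\n\n"
    "DOB: 21-10-1995\n\nSKILL SET\n\nJava\n\nPython\n"
)
items = split_at_headings(sample)
print(items)
```

Any text that comes before the first title is ignored in this version.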

Answer 1
0 2017-06-14 10:09:41

Answer 2
0 Accepted 2017-06-14 10:28:21
