
How to find repeated pattern in multiline text with Python regex?

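I am very new to Python's regex module. I am trying to find the problem numbers and the names of the companies that asked the corresponding problems. My text looks like this: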

Text input:

text = """
# Daily Coding Problem

Solutions to problems sent by dailycodingproblem.com

---

#### Problem 1

Given a list of numbers, return whether any two sums to k.
For example, given [10, 15, 3, 7] and k of 17, return true since 10 + 7 is 17.

Bonus: Can you do this in one pass?

[Solution](solutions/problem_001.py)

---

#### Problem 2

This problem was asked by Uber.

Given an array of integers, return a new array such that each element at index i of the new array is the product of all the numbers in the original array except the one at i.

For example, if our input was [1, 2, 3, 4, 5], the expected output would be [120, 60, 40, 30, 24]. If our input was [3, 2, 1], the expected output would be [2, 3, 6].

Follow-up: what if you can't use division?

[Solution](solutions/problem_002.py)

---

#### Problem 3

This problem was asked by Google.

Given the root to a binary tree, implement serialize(root), which serializes the tree into a string, and deserialize(s), which deserializes the string back into the tree.

[Solution](solutions/problem_003.py)

---

"""

import re
from pathlib import Path

pat = r"Problem (\d+)$\n.*asked by (.*)\.$"
out = re.findall(pat,text,flags=re.MULTILINE)
print(out)


My attempt:

import re

pat = r"^Problem (\d+).* asked by (\w+[\s]\w+)."
out = re.findall(pat, text, flags=re.MULTILINE|re.DOTALL)

print(out)
# [('1', 'Google company')]

But I get the wrong output. How can I get the expected result:

problem_num = [2,3]
company = ["Uber", "Google"]

I am assuming that the line containing "asked by" always comes after the problem number. For me, it works with this pattern:

pat = r"### Problem (\d+)$\n*.*asked by ([a-zA-Z]+)\."
out = re.findall(pat,text,flags=re.MULTILINE)

$ matches at the end of each line because of the MULTILINE flag.
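
To make the effect of the flag concrete, here is a minimal sketch comparing the same pattern with and without re.MULTILINE. The `sample` string is a shortened stand-in for the question's text, not the original input.

```python
import re

# Shortened stand-in for the question's text (assumption, for brevity).
sample = "#### Problem 2\n\nThis problem was asked by Uber.\n"

# Without MULTILINE, `$` matches only at the very end of the string
# (or just before a final trailing newline), so nothing is found here.
without_flag = re.findall(r"Problem (\d+)$", sample)
print(without_flag)

# With MULTILINE, `$` also matches just before every newline,
# so the heading line "#### Problem 2" is matched.
with_flag = re.findall(r"Problem (\d+)$", sample, flags=re.MULTILINE)
print(with_flag)
```

The first call returns an empty list and the second returns the problem number as a string.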

Note that this will get "Google company" rather than "Google".
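
As a rough end-to-end sketch, the tuples returned by re.findall can be split into the two lists the question asks for. The `sample` text below is a shortened version of the question's input (an assumption for brevity), and converting the numbers to int is my own choice to match the expected `problem_num` list.

```python
import re

# Shortened version of the question's text (assumption, for brevity).
sample = """
#### Problem 1

Given a list of numbers, return whether any two sums to k.

---

#### Problem 2

This problem was asked by Uber.

---

#### Problem 3

This problem was asked by Google.
"""

# Same pattern as in the answer: heading, optional blank lines,
# then a single line containing "asked by <Company>."
pat = r"### Problem (\d+)$\n*.*asked by ([a-zA-Z]+)\."
matches = re.findall(pat, sample, flags=re.MULTILINE)

# Each match is a (number, company) tuple of strings.
problem_num = [int(num) for num, _ in matches]
company = [name for _, name in matches]

print(problem_num)
print(company)
```

Problem 1 is skipped because, without DOTALL, `.*` cannot cross a line break, so the pattern only matches when "asked by" appears on the first non-empty line after the heading.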
