How do I build a regular expression to find words that start with \n and a letter and end with a digit or a word?

Question

Here's an example string; the spacing after the digit can vary.

product_list = 'Buy:\n Milk \nYoughurt 4 \nBread  \nSausages 4     \nBanana '

I want to build a regexp that produces the following output:

import re

re.findall(r'some pattern', product_list)
['Milk', 'Youghurt 4', 'Bread', 'Sausages 4', 'Banana']

This is what I thought it should look like. However, it returns an empty list:

re.findall(r'\n(\w+\w$)', product_list)

Answer 1

The script below first strips off the leading term:\n prefix, in this case Buy:\n. Then we use re.findall with the following pattern to find all matches:

(.+?)\s*(?:\n|$)

This lazily captures everything up to any trailing whitespace, which must then be followed by a newline or the end of the string.

import re

product_list = 'Buy:\n Milk \nYoughurt 4 \nBread  \nSausages 4     \nBanana '
product_list = re.sub(r'^[^\s]*\s+', '', product_list)

matches = re.findall(r'(.+?)\s*(?:\n|$)', product_list)
print(matches)

['Milk', 'Youghurt 4', 'Bread', 'Sausages 4', 'Banana']
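As a possible alternative, the same result can probably be obtained in one pass, without stripping the prefix first. The sketch below is not part of the original answer; it assumes every item is a single word optionally followed by a number, and the variable names newline_anchored and line_anchored are made up for illustration. It also shows why $ did not help in the question's attempt: without re.MULTILINE, $ matches only at the very end of the string (or just before a trailing newline), not at the end of each line.

```python
import re

product_list = 'Buy:\n Milk \nYoughurt 4 \nBread  \nSausages 4     \nBanana '

# Variant A: anchor each item on the newline that precedes it.
# [ \t]* skips leading spaces; the optional group picks up a trailing number.
newline_anchored = re.findall(r'\n[ \t]*(\w+(?:[ \t]+\d+)?)', product_list)

# Variant B: with re.MULTILINE, ^ and $ match at the start and end of every line.
# The 'Buy:' line is skipped because ':' is neither a word character nor a space.
line_anchored = re.findall(
    r'^[ \t]*(\w+(?:[ \t]+\d+)?)[ \t]*$', product_list, re.MULTILINE
)

print(newline_anchored)
print(line_anchored)
# Both print: ['Milk', 'Youghurt 4', 'Bread', 'Sausages 4', 'Banana']
```

Items made of several words (for example "Orange juice 2") would need a looser pattern, so the lazy (.+?) approach above is the more general one.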

Answer 2

I would suggest a non-regex approach (a regex seems expensive here), provided you can guarantee the input always follows a similar pattern:

list(map(lambda x: x.strip(), product_list.split('\n')))[1:]

Code:

product_list = 'Buy:\n Milk \nYoughurt 4 \nBread  \nSausages 4     \nBanana '

print(list(map(lambda x: x.strip(), product_list.split('\n')))[1:])
# ['Milk', 'Youghurt 4', 'Bread', 'Sausages 4', 'Banana']
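Whether the regex really is more expensive is easy to measure rather than guess. Here is a rough sketch using the standard timeit module; the helper names with_split and with_regex are made up for this comparison, and the timings will vary by machine and Python version, so no numbers are claimed here.

```python
import re
import timeit

product_list = 'Buy:\n Milk \nYoughurt 4 \nBread  \nSausages 4     \nBanana '

def with_split(text):
    # Non-regex approach: split into lines, strip each, drop the 'Buy:' header.
    return [line.strip() for line in text.split('\n')][1:]

def with_regex(text):
    # Regex approach from Answer 1: strip the leading label, then capture items.
    body = re.sub(r'^[^\s]*\s+', '', text)
    return re.findall(r'(.+?)\s*(?:\n|$)', body)

# Both approaches should agree before their speed is compared.
assert with_split(product_list) == with_regex(product_list)

for fn in (with_split, with_regex):
    seconds = timeit.timeit(lambda: fn(product_list), number=10_000)
    print(f'{fn.__name__}: {seconds:.4f} s for 10,000 calls')
```

For a string this small, either approach is likely fast enough, so readability may matter more than the difference in speed.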

Answer 3

This example can be done without a regex: split on : and then on \n.

actual_list = 'Buy:\n Milk \nYoughurt 4 \nBread  \nSausages 4     \nBanana '
product_list = actual_list.split(':')[1]
processed_list = [product.strip() for product in product_list.split('\n') if product.strip() != '']
print(processed_list)
#['Milk', 'Youghurt 4', 'Bread', 'Sausages 4', 'Banana']
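One caveat with split(':')[1] is that it keeps only the text between the first and second colon, so an item that itself contains a colon would be cut short. A small variation using str.partition avoids that. This is only a sketch: the function name parse_products and the second sample string are made up for illustration.

```python
def parse_products(text):
    # partition splits on the first ':' only, so later colons stay in the items.
    _, _, items = text.partition(':')
    # Strip each line and drop the ones that are empty after stripping.
    return [line.strip() for line in items.splitlines() if line.strip()]

actual_list = 'Buy:\n Milk \nYoughurt 4 \nBread  \nSausages 4     \nBanana '
print(parse_products(actual_list))
# ['Milk', 'Youghurt 4', 'Bread', 'Sausages 4', 'Banana']

# Hypothetical input with a colon inside an item.
with_colon = 'Buy:\n Milk \nTea: green \n'
print(parse_products(with_colon))
# ['Milk', 'Tea: green']
```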

3 answers

Answer 1
1 2019-04-16 15:13:14

Answer 2
1 (accepted) 2019-04-16 15:13:33

Answer 3
0 2019-04-16 15:11:55
