Error: bad escape in Python script with regex

I am trying to search one list against another and print each value together with whether or not it matched.

import re

array = ['brasil','argentina','chile','canada']
array2 = ['brasil.sao_paulo','chile','argentina']

for x,y in zip(array,array2):
  if re.search('\\{}\\b'.format(x), y, re.IGNORECASE):
    print("Match: {}".format(x))
  else:
    print("Not match: {}".format(y))

Output:

Not match: brasil.sao_paulo
Not match: chile
Traceback (most recent call last):
  File "main.py", line 7, in <module>
    if re.search('\\{}\\b'.format(x), y, re.IGNORECASE):
  File "/usr/local/lib/python3.7/re.py", line 183, in search
re.error: bad escape \c at position 0

Desired output:

Match: brasil
Match: argentina
Match: chile
Not match: canada

If I understand correctly, you don't need regex here.

group_1 = ['brasil','argentina','chile','canada']
group_2 = ['brasil.sao_paulo','chile','argentina']

# For group 2 only, keep the part of each string that appears before the first ".".
prefixes = [y.split('.')[0] for y in group_2]

for x in group_1:
    if x in prefixes:
        print("Match: {}".format(x))
    else:
        print("Not match: {}".format(x))

which returns

Match: brasil
Match: argentina
Match: chile
Not match: canada
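
As a variation on the same idea, the prefixes could be collected once into a set and lowercased, assuming the case-insensitive behaviour suggested by re.IGNORECASE in the question is wanted. This is only a sketch; report_matches is a hypothetical helper name, not something from the question.

```python
def report_matches(needles, haystack):
    """Pair each needle with True/False depending on whether it equals
    the part before the first "." of any haystack entry (case-insensitive)."""
    # Build the lookup set once instead of once per needle.
    prefixes = {item.split('.')[0].lower() for item in haystack}
    return [(needle, needle.lower() in prefixes) for needle in needles]


group_1 = ['brasil', 'argentina', 'chile', 'canada']
group_2 = ['brasil.sao_paulo', 'chile', 'argentina']

for name, found in report_matches(group_1, group_2):
    print(("Match: {}" if found else "Not match: {}").format(name))
```

With the lists from the question this prints the same four lines as above. The set mostly pays off when the lists are long; for four items either form is fine.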

If you zip, you only compare the lists pairwise (first item with first item, second with second, and so on), and the loop stops at the end of the shorter list. Given the nature of your search, you can join the haystack into a space-delimited string, join the needles into a pattern with alternation, and let findall do the work:

>>> import re
>>> needles = ['brasil', 'argentina', 'chile', 'canada']
>>> haystack = ['brasil.sao_paulo', 'chile', 'argentina']
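>>> # Group the alternation so that \b applies to every needle, and escape each needle.
>>> re.findall(r"\b(?:%s)\b" % "|".join(map(re.escape, needles)), " ".join(haystack), re.I)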
['brasil', 'chile', 'argentina']

The intent behind the leading \\ in the original regex is unclear, so I assume you want \b on both sides of the pattern.
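
To see where the error comes from, it may help to print the pattern that the original format call actually builds. The sketch below does that; original_pattern and escaped_pattern are hypothetical helper names, and the second one reflects my assumption that a whole-word match was intended.

```python
import re


def original_pattern(word):
    # Same format string as in the question: a literal backslash,
    # the word itself, then \b.
    return '\\{}\\b'.format(word)


def escaped_pattern(word):
    # Assumed intent: match the word between word boundaries.
    # re.escape protects against regex metacharacters in the word.
    return r'\b{}\b'.format(re.escape(word))


for word in ['brasil', 'argentina', 'chile']:
    print(original_pattern(word))
# \brasil\b     -> read as \b followed by "rasil", so it never matches "brasil"
# \argentina\b  -> read as \a (the bell character) followed by "rgentina"
# \chile\b      -> \c is not a valid escape, which raises re.error

try:
    re.compile(original_pattern('chile'))
except re.error as exc:
    print("re.error:", exc)

print(bool(re.search(escaped_pattern('brasil'), 'brasil.sao_paulo', re.IGNORECASE)))
```

This also explains the first two "Not match" lines in the question's output: the backslash merges with the first letter of each word, so those patterns are valid but do not mean what was intended.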

A simple solution with the built-in any function:

array = ['brasil', 'argentina', 'chile', 'canada']
array2 = ['brasil.sao_paulo', 'chile', 'argentina']

for x in array:
    if any(x.casefold() in y.casefold() for y in array2):
        print("Match:", x)
    else:
        print("Not match:", x)


Edit: Using casefold() to make it case-insensitive.
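
One caveat: the in operator tests for a substring, so a needle also matches when it sits inside a longer word. If that matters for your data, a word-boundary regex is one alternative. The sketch below contrasts the two; the function names and the 'Chilean.wine' sample are made up for illustration.

```python
import re


def substring_match(needle, haystack):
    # Same test as in the answer above: case-insensitive substring check.
    return any(needle.casefold() in item.casefold() for item in haystack)


def whole_word_match(needle, haystack):
    # Only match the needle when it is delimited by word boundaries.
    pattern = re.compile(r'\b{}\b'.format(re.escape(needle)), re.IGNORECASE)
    return any(pattern.search(item) for item in haystack)


sample = ['Chilean.wine', 'brasil.sao_paulo']  # hypothetical data

print(substring_match('chile', sample))    # True: "chile" is inside "Chilean"
print(whole_word_match('chile', sample))   # False: "Chilean" is a different word
print(whole_word_match('brasil', sample))  # True: "brasil" is followed by "."
```

For the lists in the question both approaches give the same result, so the simpler any version is enough there.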
