Simple way to handle letter and number combination
Let's say I have two pairs of strings:
string_ex1 = 'AbC024'
string_ex2 = 'aBc24'
string_ex3 = 'AbC24'
string_ex4 = 'aBc24'
I want a comparison that treats the two strings in each pair as equal. For example, 'AbC' == 'aBc' and '024' == '24'.
I already know that if I separate the letters and the digits with [A-Za-z]+ and \d+ and convert them to lowercase and to int respectively, I can get a result saying the two strings are identical. But I want to know if there's some simpler function to do it.
import re

# Letters only ([A-Za-z]+); \w+ would also match the digits
string1_str = re.findall(r'[A-Za-z]+', string_ex1)[0].lower()
string1_int = int(re.findall(r'\d+', string_ex1)[0])
string2_str = re.findall(r'[A-Za-z]+', string_ex2)[0].lower()
string2_int = int(re.findall(r'\d+', string_ex2)[0])
if string1_str == string2_str and string1_int == string2_int:
    print('identical')
Edit: The comparison should work for both pairs: string_ex1 with string_ex2, and string_ex3 with string_ex4.
You can use a regex that removes leading zeros, then use a casefold comparison:
import re
string_ex1 = 'AbC024'
string_ex2 = 'aBc24'
string_ex1 = re.sub(r'(?<=\D)0+(?=\d)', '', string_ex1)
string_ex2 = re.sub(r'(?<=\D)0+(?=\d)', '', string_ex2)
print(string_ex1.casefold() == string_ex2.casefold())
# True
Alternatively, you can call lower on both strings when calling re.sub:
import re
string_ex1 = 'AbC024'
string_ex2 = 'aBc24'
string_ex1 = re.sub(r'(?<=\D)0+(?=\d)', '', string_ex1.lower())
string_ex2 = re.sub(r'(?<=\D)0+(?=\d)', '', string_ex2.lower())
print(string_ex1 == string_ex2)
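One thing to watch with the pattern above: the lookbehind (?<=\D) needs a non-digit character before the zeros, so zeros at the very start of a string (as in '024AbC') are left in place. If that case matters, a rough sketch of a variant is to rewrite every digit run through int() instead. This is a different technique from the lookbehind regex, and the names normalize and same are made up for illustration:

```python
import re

def normalize(text):
    """Hypothetical helper: build a comparison key for one string."""
    # int() drops leading zeros from each run of ASCII digits, wherever it appears
    without_zeros = re.sub(r'[0-9]+', lambda m: str(int(m.group())), text)
    # casefold() makes the letter comparison case-insensitive
    return without_zeros.casefold()

def same(a, b):
    """Hypothetical helper: compare two strings by their normalized keys."""
    return normalize(a) == normalize(b)

print(same('AbC024', 'aBc24'))  # True
print(same('AbC24', 'aBc24'))   # True
print(same('024AbC', '24abc'))  # True
```

Wrapping the normalization in one function also means both pairs from the question go through exactly the same code path.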
There is no built-in way to do it. I'd suggest that, for both strings, you find the groups (only letters or only digits) and compare them in lowercase and without leading zeros:
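import re

def test(str1, str2):
    # Split each string into runs of letters or runs of digits
    values1 = re.findall("([a-z]+|[0-9]+)", str1, flags=re.I)
    values2 = re.findall("([a-z]+|[0-9]+)", str2, flags=re.I)
    # Normalise each run: lowercase, no leading zeros
    clean = lambda x: x.lower().lstrip("0")
    # Compare whole lists; zip() alone would ignore extra trailing groups
    return list(map(clean, values1)) == list(map(clean, values2))

print(test('AbC024', 'aBc24')) # True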
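The same grouping idea can also be written as a key function, which makes it easy to see what is actually being compared. This is only a sketch: token_key is a hypothetical name, and it assumes that anything other than ASCII letters and digits can be ignored.

```python
import re

def token_key(text):
    """Hypothetical helper: split text into letter runs and digit runs."""
    tokens = re.findall(r"[a-z]+|[0-9]+", text, flags=re.IGNORECASE)
    # Digit runs become ints (so '024' and '24' match); letter runs are lowercased
    return [int(t) if t.isdigit() else t.lower() for t in tokens]

print(token_key('AbC024'))                        # ['abc', 24]
print(token_key('AbC024') == token_key('aBc24'))  # True
print(token_key('AbC') == token_key('aBc24'))     # False
```

Because the keys are compared as whole lists, a string that is only a prefix of the other (the last line) is not reported as equal.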