Simple way to handle letter and number combination

Let's say I have two pairs of strings:

string_ex1 = 'AbC024'
string_ex2 = 'aBc24'

string_ex3 = 'AbC24'
string_ex4 = 'aBc24'

I want the two strings in each pair to compare as equal: letters should match regardless of case, and numbers should match by value. For example, 'AbC' == 'aBc' and '024' == '24'.

I already know that if I separate the parts with [A-Za-z]+ and \d+, then convert them to lowercase and to int respectively, I can tell that the two strings are identical. But I want to know if there's a simpler function to do it.

import re

# Pull out the letter part and the number part of each string.
string1_str = re.search(r'[A-Za-z]+', string_ex1).group().lower()
string1_int = int(re.search(r'\d+', string_ex1).group())
string2_str = re.search(r'[A-Za-z]+', string_ex2).group().lower()
string2_int = int(re.search(r'\d+', string_ex2).group())

if string1_str == string2_str and string1_int == string2_int:
    print('identical')

Edit: The comparison should work for both pairs: string_ex1 with string_ex2, and string_ex3 with string_ex4.

You can use a regex that removes leading zeros, then compare the results with casefold:

import re

string_ex1 = 'AbC024'
string_ex2 = 'aBc24'

string_ex1 = re.sub(r'(?<=\D)0+(?=\d)', '', string_ex1)
string_ex2 = re.sub(r'(?<=\D)0+(?=\d)', '', string_ex2)

print(string_ex1.casefold() == string_ex2.casefold())
# True

Alternatively, you can call lower on each string as you pass it to re.sub:

import re

string_ex1 = 'AbC024'
string_ex2 = 'aBc24'

string_ex1 = re.sub(r'(?<=\D)0+(?=\d)', '', string_ex1.lower())
string_ex2 = re.sub(r'(?<=\D)0+(?=\d)', '', string_ex2.lower())

print(string_ex1 == string_ex2)
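
Note that the lookbehind (?<=\D) needs a non-digit character before the zeros, so a number at the very start of a string (for example '024') would keep its leading zero. A minimal sketch of a variant that avoids this by passing every digit run through int; the helper name normalize is hypothetical and not part of the answer above:

```python
import re

def normalize(s):
    # Hypothetical helper: casefold the text, then rewrite each ASCII digit
    # run as its integer value, so '024' and '24' produce the same text.
    return re.sub(r'[0-9]+', lambda m: str(int(m.group())), s.casefold())

print(normalize('AbC024') == normalize('aBc24'))  # True
print(normalize('AbC24') == normalize('aBc24'))   # True
print(normalize('024') == normalize('24'))        # True
```

This covers both pairs from the question, and a run of only zeros such as '000' becomes '0' rather than disappearing.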

There is no built-in way to do it. I'd suggest splitting both strings into groups of only letters or only digits, then comparing the groups in lower case and without leading zeros.

import re

def test(str1, str2):
    # Split each string into runs of letters or runs of digits.
    values1 = re.findall(r"[a-z]+|[0-9]+", str1, flags=re.I)
    values2 = re.findall(r"[a-z]+|[0-9]+", str2, flags=re.I)
    clean = lambda x: x.lower().lstrip("0")
    # Compare whole lists so that extra trailing groups count as a mismatch.
    return list(map(clean, values1)) == list(map(clean, values2))

print(test('AbC024', 'aBc24'))  # True
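
If the normalized form is needed more than once (for example as a dictionary key, or to de-duplicate a list), the same grouping idea could be written as a key function instead of a pairwise test. This is only a sketch: compare_key is a hypothetical name, and like the function above it ignores any character that is neither an ASCII letter nor a digit.

```python
import re

def compare_key(s):
    # Hypothetical helper: split into letter runs and digit runs, then
    # lowercase the letters and convert the digit runs to int.
    tokens = re.findall(r'[a-z]+|[0-9]+', s, flags=re.I)
    return tuple(int(t) if t.isdigit() else t.lower() for t in tokens)

print(compare_key('AbC024'))                          # ('abc', 24)
print(compare_key('AbC024') == compare_key('aBc24'))  # True
print(compare_key('AbC24') == compare_key('aBc24'))   # True
print(compare_key('abc24') == compare_key('abc24x'))  # False
```

Converting the digit runs to int means '024' and '24' compare by value, which matches what the question asks for.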
