How do string comparisons work when they contain both numbers and letters?
I'm trying to compare times in Python and came across some weird comparisons. I have no idea how the following statements work:
>>> "17:30" > "16:30"
True
>>> "12:30" > "13:30"
False
>>> '18:00 - asdfj' > '16:30 - asdfj'
True
My guess is that it takes the first number before the colon, but I'm not completely sure about it.
Basically, in Python, it is a lexicographical comparison.
For example, 'a' comes before 'b', hence 'a' < 'b' is True. Similarly, '2' < '3'. Hence '199' < '2' is True, because '1' comes before '2'.
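A short sketch of the same point, in case it helps: digit strings are ordered character by character, so they only sort numerically if you convert them first. The list `values` and the variable names are made up for illustration.

```python
# Digit strings are ordered character by character, not by numeric value.
string_result = '199' < '2'              # '1' sorts before '2'
number_result = int('199') < int('2')    # numeric comparison

values = ['2', '199', '10']
as_strings = sorted(values)              # lexicographic order
as_numbers = sorted(values, key=int)     # numeric order

print(string_result, number_result)  # True False
print(as_strings)                    # ['10', '199', '2']
print(as_numbers)                    # ['2', '10', '199']
```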
As others have pointed out, a comparison between strings is a question of lexicographical ordering.
What that means procedurally: two strings are compared one character at a time, the first character that differs decides the ordering, and if the strings are identical up to the end of one of them, the longer string is the larger one.
For example, 'ab' > 'a' is True, because 'a' == 'a', but the first string has an extra character. And 'abc' < 'abd' because 'c' < 'd'.
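To make the procedure concrete, here is a minimal sketch of it in plain Python. The function `lexicographic_less_than` is a hypothetical helper written for illustration only; it is not how CPython implements `<` internally, it just mirrors the behaviour described here.

```python
def lexicographic_less_than(left: str, right: str) -> bool:
    """Hypothetical helper that mirrors what `left < right` does for strings."""
    for left_char, right_char in zip(left, right):
        if left_char != right_char:
            # The first differing character decides the result.
            return ord(left_char) < ord(right_char)
    # No difference found: one string is a prefix of the other (or they are
    # equal), so the shorter string is the smaller one.
    return len(left) < len(right)


# The sketch should agree with the built-in operator on these pairs.
pairs = [('a', 'ab'), ('abc', 'abd'), ('17:30', '16:30'), ('12:30', '13:30')]
for left, right in pairs:
    print(left, right, lexicographic_less_than(left, right), left < right)
```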
'a' < 'b' because ord('a') < ord('b'). The ordinal value of a character is its Unicode code point ( https://docs.python.org/3/library/functions.html#ord ), which for plain ASCII characters is the same as the ASCII value. This also means that 'A' < 'a', because the uppercase Latin letters come before the lowercase ones in Unicode. And '1' < 'A' because digits come before letters.
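If you want to check the ordering yourself, `ord` shows the code points directly. A small sketch; the characters were picked to match the examples in this thread, and `code_points` is just an illustrative name.

```python
# Map a few characters to their Unicode code points.
code_points = {ch: ord(ch) for ch in ['1', ':', 'A', 'a']}

for ch, point in code_points.items():
    print(repr(ch), point)
# '1' 49
# ':' 58
# 'A' 65
# 'a' 97
```

So digits sort before the colon, the colon before uppercase letters, and uppercase before lowercase.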
Note that this may sometimes give surprising results (note the dots on the Ӓ):
>>> 'Ӓ' > 'a'
True
>>> 'A' > 'a'
False
There are many online tables and overviews of Unicode, but here's a fairly plain example: https://www.tamasoft.co.jp/en/general-info/unicode.html
As for your example:
>>> '18:00 - asdfj' > '16:30 - asdfj'
True
This makes sense: the first characters are both '1', and then '8' > '6', so the rest of the string doesn't matter.
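Since the original goal was comparing times: zero-padded 24-hour strings like the ones above happen to sort correctly, but an unpadded value such as '9:30' would not. One possible approach (a sketch, not the only way) is to parse the strings first. `parse_time` is a hypothetical helper name, and the "%H:%M" format is an assumption about what your data looks like.

```python
from datetime import datetime


def parse_time(text: str):
    """Hypothetical helper: parse an 'HH:MM' string into a datetime.time."""
    return datetime.strptime(text, "%H:%M").time()


as_strings = '9:30' > '17:30'                        # compares '9' with '1'
as_times = parse_time('9:30') > parse_time('17:30')  # compares actual times

print(as_strings)  # True
print(as_times)    # False
```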