
How do I compare two strings in Python?

I have two strings like

string1="abc def ghi"

and

string2="def ghi abc"

How can I determine that these two strings are the same without breaking up the words?

The question seems to be about set equality rather than string equality. To compare the strings that way, split them into words and convert the word lists to sets:

s1 = 'abc def ghi'
s2 = 'def ghi abc'
set1 = set(s1.split(' '))
set2 = set(s2.split(' '))
print(set1 == set2)

The result will be

True
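One detail worth noting about the snippet above: split(' ') splits on every single space, so repeated or trailing spaces produce empty strings, whereas split() with no argument splits on any run of whitespace. The following is a minimal sketch of the difference; the helper name same_words is made up for illustration.

```python
def same_words(a, b):
    """Return True if both strings contain the same set of words."""
    # split() with no argument collapses runs of whitespace and
    # ignores leading/trailing whitespace.
    return set(a.split()) == set(b.split())

s1 = 'abc def ghi'
s2 = 'def  ghi abc '   # extra spaces on purpose

# split(' ') keeps empty strings, so '' ends up in the second set.
print(set(s1.split(' ')) == set(s2.split(' ')))  # False
print(same_words(s1, s2))                        # True
```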

If you want to know whether the two strings are equal, you can simply do

print(string1 == string2)

But if you want to know whether they both have the same set of characters, each occurring the same number of times, you can use collections.Counter, like this:

>>> string1, string2 = "abc def ghi", "def ghi abc"
>>> from collections import Counter
>>> Counter(string1) == Counter(string2)
True
>>> s1="abc def ghi"
>>> s2="def ghi abc"
>>> s1 == s2  # For string comparison 
False
>>> sorted(list(s1)) == sorted(list(s2)) # For comparing if they have same characters. 
True
>>> sorted(list(s1))
[' ', ' ', 'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i']
>>> sorted(list(s2))
[' ', ' ', 'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i']
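Note that both checks above work on characters, not words, so two strings with different words can still compare as equal. A small sketch with made-up example strings, contrasting character-level and word-level counting:

```python
from collections import Counter

s1 = "abc def"
s2 = "abd cef"   # same letters, different words

# Character-level comparison: both strings use the same characters.
print(Counter(s1) == Counter(s2))                   # True

# Word-level comparison: the words themselves differ.
print(Counter(s1.split()) == Counter(s2.split()))   # False
```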

Equality by direct comparison:

string1 = "sample"
string2 = "sample"

if string1 == string2 :
    print("Strings are equal with text : ", string1," & " ,string2)
else :
    print ("Strings are not equal")

Equality of word sets:

string1 = 'abc def ghi'
string2 = 'def ghi abc'

set1 = set(string1.split(' '))
set2 = set(string2.split(' '))

print(set1 == set2)

if string1 == string2 :
    print("Strings are equal with text : ", string1," & " ,string2)
else :
    print ("Strings are not equal")

For that, you can use difflib from the Python standard library:

from difflib import SequenceMatcher

def similar(a, b):
    return SequenceMatcher(None, a, b).ratio()

Then call similar() as

similar(string1, string2)

It returns a similarity ratio between 0 and 1; compare it against a threshold (ratio >= threshold) to decide whether the strings match.
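The threshold check could be wrapped up roughly as follows. This is only a sketch: the function name is_match and the default threshold of 0.8 are assumptions chosen for illustration, and a suitable threshold depends on your data.

```python
from difflib import SequenceMatcher

def is_match(a, b, threshold=0.8):
    """Treat two strings as matching when their ratio reaches the threshold."""
    # ratio() returns a float in [0, 1]; 1.0 means the sequences are identical.
    return SequenceMatcher(None, a, b).ratio() >= threshold

print(is_match("Since 1958.", "Since 1958"))  # True
print(is_match("abc", "xyz"))                 # False
```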

Something like this:

if string1 == string2:
    print('they are the same')

Update: if you want to see whether each word of one string also appears in the other:

elem1 = [x for x in string1.split()]
elem2 = [x for x in string2.split()]

for item in elem1:
    if item in elem2:
        print(item)
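The same idea can be expressed with set operations, which also makes it easy to see which words are missing. A minimal sketch with made-up example strings:

```python
string1 = "abc def ghi"
string2 = "def ghi xyz"

words1 = set(string1.split())
words2 = set(string2.split())

common = words1 & words2          # words present in both strings
only_in_first = words1 - words2   # words missing from the second string

print(sorted(common))             # ['def', 'ghi']
print(sorted(only_in_first))      # ['abc']
```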

If you just need to check whether the two strings are exactly the same,

text1 = 'apple'

text2 = 'apple'

text1 == text2

The result will be

True

If you need the matching percentage,

import difflib

text1 = 'Since 1958.'

text2 = 'Since 1958'

output = str(int(difflib.SequenceMatcher(None, text1, text2).ratio()*100))

The matching percentage output will be

'95'

I am going to provide several solutions, and you can choose the one that meets your needs:

1) If you are concerned with just the characters, i.e., the same characters with equal frequencies in both strings, ignoring spaces, then use:

''.join(sorted(string1)).strip() == ''.join(sorted(string2)).strip()

2) If you are also concerned with the number of spaces (whitespace characters) in both strings, then simply use the following snippet:

sorted(string1) == sorted(string2)

3) If you care about words but not their order, and want to check that both strings contain the same words with the same frequencies, then use:

sorted(string1.split()) == sorted(string2.split())

4) Extending the above, if you are not concerned with the frequency count and just need to make sure that both strings contain the same set of words, then you can use the following:

set(string1.split()) == set(string2.split())
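To see where these options disagree, here is a short sketch with made-up inputs. The first pair differs only in its spaces, and the second pair differs only in how often each word occurs.

```python
# Options 1 and 2: same characters, different number of spaces.
a, b = "abc def", "abcdef"
print(''.join(sorted(a)).strip() == ''.join(sorted(b)).strip())  # True
print(sorted(a) == sorted(b))                                    # False

# Options 3 and 4: same words, different frequencies.
string1 = "abc abc def"
string2 = "abc def def"
print(sorted(string1.split()) == sorted(string2.split()))        # False
print(set(string1.split()) == set(string2.split()))              # True
```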

I think difflib is a good library for this job:

>>> import difflib
>>> diff = difflib.Differ()
>>> a = 'he is going home'
>>> b = 'he is goes home'
>>> list(diff.compare(a, b))
['  h', '  e', '   ', '  i', '  s', '   ', '  g', '  o', '+ e', '+ s', '- i', '- n', '- g', '   ', '  h', '  o', '  m', '  e']
>>> list(diff.compare(a.split(), b.split()))
['  he', '  is', '- going', '+ goes', '  home']

Open both files, then compare them by splitting their contents into words:

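log_file_A = 'file_A.txt'
log_file_B = 'file_B.txt'

# Read each file fully; the with-block closes the file afterwards.
with open(log_file_A, 'r') as f:
    read_A = f.read()
print(read_A)

with open(log_file_B, 'r') as f:
    read_B = f.read()
print(read_B)

# Compare the sets of words, ignoring order and repetition.
File_A_set = set(read_A.split(' '))
File_B_set = set(read_B.split(' '))
print(File_A_set == File_B_set)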

If you want a really simple answer:

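s_1 = "abc def ghi"
s_2 = "def ghi abc"
flag = 0
# Note: this only checks that every character of s_1 occurs somewhere
# in s_2. It ignores order, counts, and any extra characters in s_2.
for i in s_1:
    if i not in s_2:
        flag = 1
if flag == 0:
    print("a == b")
else:
    print("a != b")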

Try converting both strings to upper or lower case. Then you can use the == comparison operator.
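A minimal sketch of that idea with made-up inputs. The last line uses str.casefold(), which is not mentioned above; it is a built-in string method intended for caseless matching and handles some non-ASCII cases that lower() does not.

```python
string1 = "Abc Def GHI"
string2 = "abc def ghi"

print(string1 == string2)                            # False
print(string1.lower() == string2.lower())            # True

# casefold() is a more aggressive form of lower(); for example it
# maps the German 'ß' to 'ss'.
print("Straße".casefold() == "STRASSE".casefold())   # True
```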

This is a pretty basic example, but beyond the plain comparison (==) or string1.lower() == string2.lower(), it may be useful to try some of the basic distance metrics between two strings.

You can find examples of these and other metrics in many places; also try the fuzzywuzzy package ( https://github.com/seatgeek/fuzzywuzzy ).

import Levenshtein
import difflib

print(Levenshtein.ratio('String1', 'String2'))
print(difflib.SequenceMatcher(None, 'String1', 'String2').ratio())
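Levenshtein is a third-party package, so it has to be installed separately. If that is not an option, the underlying edit distance can be sketched in pure Python. The function below is an illustrative implementation of the textbook dynamic-programming recurrence, not the package's own code, and it returns a distance (number of edits) rather than a ratio; the name edit_distance is made up.

```python
def edit_distance(a, b):
    """Levenshtein distance via the classic dynamic-programming recurrence."""
    # previous[j] holds the distance between the processed prefix of a
    # and the first j characters of b.
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]

print(edit_distance("String1", "String2"))  # 1
print(edit_distance("kitten", "sitting"))   # 3
```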

You can use a simple loop to check whether two strings are equal, but ideally you would just use something like return s1 == s2.

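def strings_equal(s1, s2):
    # Strings of different lengths can never be equal.
    if len(s1) != len(s2):
        return False
    # Compare position by position; stop at the first mismatch.
    for i in range(len(s1)):
        if s1[i] != s2[i]:
            return False
    return True

s1 = 'hello'
s2 = 'hello'
print(strings_equal(s1, s2))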
