How to compare two strings as file content in Python?

I have a file at a certain location, which was generated by Python code. I want to run the same code to generate another file with the same name at the same location, so it will replace the first one. Before running the code, I saved the initial file contents to a string. After running the code, I saved the final file contents to another string. How can I compare the data_initial and data_final strings as file contents and highlight exactly which words differ between them? I tried this:

    data_initial="1234"
    data_final="12345 new thing"
    first_set = set(data_initial)
    second_set = set(data_final)
    difference = first_set.symmetric_difference(second_set)

But this gives me:

difference is
 {'t', 'n', ' ', 'i', '5', 'e', 'h', 'w', 'g'}

I would like to see the words that are different, like

12345 new thing

Also, is it possible to check each phrase that changed?

You are using set, and a string in Python is a sequence of characters, so set(data_initial) contains the individual characters of the string rather than its words. symmetric_difference therefore returns the set of characters that appear in only one of the two strings.
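
To illustrate with the values from the question, here is a minimal sketch. It contrasts a set built from the string itself with a set built after splitting on whitespace. The names char_diff, initial_words, final_words and new_words are illustrative and not part of the original code.

```python
data_initial = "1234"
data_final = "12345 new thing"

# set() iterates over the string, so each element is a single character
char_diff = set(data_initial) ^ set(data_final)

# Splitting on whitespace first gives sets of whole words instead
initial_words = set(data_initial.split())
final_words = set(data_final.split())
new_words = final_words - initial_words

print(sorted(char_diff))  # [' ', '5', 'e', 'g', 'h', 'i', 'n', 't', 'w']
print(sorted(new_words))  # ['12345', 'new', 'thing']
```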

This answer may do what you want: https://stackoverflow.com/a/30683765/18846844

I changed the solution to use a single loop. How about this:

    # Read the two strings to compare. No split() is needed: a string can
    # already be indexed and sliced character by character.
    str1 = input("Enter first string:")
    str2 = input("Enter second string:")

    # result1 collects the differing characters from string 1 and result2
    # those from string 2. If string 2 is longer, its extra characters end up
    # in result2; if string 1 is longer, they end up in result1.
    result1 = ''
    result2 = ''

    # Handle the case where one string is longer than the other
    maxlen = max(len(str1), len(str2))

    # Loop through the character positions
    for i in range(maxlen):
        # Use a slice rather than an index: past the end of the shorter
        # string a slice yields '' instead of raising IndexError
        letter1 = str1[i:i+1]
        letter2 = str2[i:i+1]
        # Build the strings of differing characters
        if letter1 != letter2:
            result1 += letter1
            result2 += letter2

    # Print the results
    print("Letters different in string 1:", result1)
    print("Letters different in string 2:", result2)
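
The same position-by-position comparison can be wrapped in a function, which makes it easier to reuse and test. This is only a sketch: the function name differing_chars is hypothetical, and itertools.zip_longest is swapped in for the slicing trick to pad the shorter string.

```python
from itertools import zip_longest


def differing_chars(a, b):
    """Return the characters of a and of b at the positions where they differ."""
    only_a, only_b = [], []
    # fillvalue="" pads the shorter string, so its missing positions
    # compare as different from the other string's extra characters
    for ch_a, ch_b in zip_longest(a, b, fillvalue=""):
        if ch_a != ch_b:
            only_a.append(ch_a)
            only_b.append(ch_b)
    return "".join(only_a), "".join(only_b)


print(differing_chars("1234", "12345 new thing"))  # ('', '5 new thing')
```

Because the comparison is positional, a character inserted near the start shifts everything after it, so every later position is reported as different.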

If you want to get the changed words, not characters, split each string into a list of words with split() and turn the lists into sets, as in the code below. The same approach works if you want to find which sentences changed in a paragraph, probably by splitting on \n or on the period (.).

    str1 = "the quick brown fox jumps over the lazy dog"
    str2 = "the slow brown bull jumps over the crazy frog"
    # split() breaks each string into a list of words; set() removes duplicates
    wset1 = set(str1.split())
    wset2 = set(str2.split())

    # Words that are in one sentence and not in the other (element order may vary)
    print(wset1.symmetric_difference(wset2))
    # {'slow', 'crazy', 'frog', 'fox', 'quick', 'bull', 'lazy', 'dog'}

    # Words in str1 and not in str2
    print(wset1.difference(wset2))
    # {'fox', 'quick', 'lazy', 'dog'}
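
For file contents, the same split-then-compare idea can be applied to whole lines. The sketch below uses made-up sample text, and the names text_initial, text_final, removed and added are illustrative. It uses lists rather than sets so the lines are reported in their original order.

```python
text_initial = "first line\nsecond line\nthird line"
text_final = "first line\nsecond line changed\nthird line"

# splitlines() breaks the content at line boundaries
lines_initial = text_initial.splitlines()
lines_final = text_final.splitlines()

# List comprehensions keep the original line order, unlike a set
removed = [line for line in lines_initial if line not in lines_final]
added = [line for line in lines_final if line not in lines_initial]

print(removed)  # ['second line']
print(added)    # ['second line changed']
```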

But if you want a more comprehensive solution, i.e. you want to know which words replaced which, you should have a look at the standard library difflib module.
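
As a rough sketch of what that could look like, difflib.SequenceMatcher can compare the two lists of words and report which runs of words were replaced, deleted or inserted. The sample sentences and the names words_initial, words_final and changes are only for illustration.

```python
import difflib

data_initial = "the quick brown fox jumps over the lazy dog"
data_final = "the slow brown bull jumps over the crazy frog"

words_initial = data_initial.split()
words_final = data_final.split()

# SequenceMatcher works on any sequences of hashable items, here lists of words
matcher = difflib.SequenceMatcher(None, words_initial, words_final)

changes = []
for tag, i1, i2, j1, j2 in matcher.get_opcodes():
    # tag is one of 'equal', 'replace', 'delete' or 'insert'
    if tag != "equal":
        changes.append((tag, words_initial[i1:i2], words_final[j1:j2]))

for change in changes:
    print(change)
# ('replace', ['quick'], ['slow'])
# ('replace', ['fox'], ['bull'])
# ('replace', ['lazy', 'dog'], ['crazy', 'frog'])
```

For a line-by-line view of whole files, difflib.unified_diff and difflib.ndiff take two lists of lines (for example from splitlines()) and yield the diff lines.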
