
Excel column comparison using Python

I have an Excel file that contains several columns.

COL 1    | COL 2    | COL 3  

ABCD     |  ABC(D)  |   CDA  
AB CD    | ABC D    |   C D - (B)  
A B C D  | (ABCD)   |   ABCD  
ABC D    | ABDC     | ABC D  
A(BC ) D |  AD B - C|   AB CD

I want to compare every column with every other column and print the similarities and differences between them.

For example:

  1. Comparing COL 1 and COL 2

    similarities:

     None

    differences:

     ABCD AB CD ABCD A(BC ) D ABC(D) ABC D (ABCD) ABDC AD B - C 

Then compare COL 2 with COL 3, and then COL 1 with COL 3. Only exact string matches count; even a whitespace difference is a mismatch. The number of columns may grow, and the comparison starts from the 2nd row of each column.

How can I implement this comparison of every column pair in Python so that it runs quickly?

You can use xlrd. First, read the content of your file. Second, store the three columns in three dictionaries (one per column, keyed by row number), since dictionary lookups are fast. Third, do the comparison and output the result.
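The first two steps could be sketched roughly as follows. This is only an illustration, not tested against your file: the function names `read_columns` and `columns_to_dicts`, and the `path`, `sheet_index` and `first_row` parameters, are made up for this example. It assumes the data sits on the first sheet and that the first row is a header. Note that recent xlrd releases read only the legacy .xls format.

```python
def read_columns(path, sheet_index=0, first_row=1):
    """Read every column of one sheet as a list of cell values.

    first_row=1 skips the header row (assumption: row 0 holds the titles).
    """
    import xlrd  # third-party package, imported here so the helper below works without it

    book = xlrd.open_workbook(path)
    sheet = book.sheet_by_index(sheet_index)
    return [sheet.col_values(col, start_rowx=first_row)
            for col in range(sheet.ncols)]


def columns_to_dicts(columns):
    """Turn each column (a list) into a {row_index: value} dict."""
    return [dict(enumerate(col)) for col in columns]
```

With a real workbook you would call something like `columns_to_dicts(read_columns("your_file.xls"))`, where the file name is a placeholder.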

I suggest you check the xlrd API and write the code yourself. Here is the link.

If you have any questions, feel free to ask.

EDIT:

Here is an example.

#!/usr/bin/env python3

# Each dict maps a row number to the cell value found in that row.
name = {1: 'a', 2: 'b', 3: 'c'}
lname = {1: 'g', 2: 'b', 3: 'v'}
common = {}
diff_name = {}
diff_lname = {}

# Compare the two columns row by row; only exact matches count as common.
# Both dicts are expected to have the same keys.
for key in name:
    if name[key] == lname[key]:
        common[key] = name[key]
    else:
        diff_name[key] = name[key]
        diff_lname[key] = lname[key]

print('common part is:', common)
print('diff_name is:', diff_name)
print('diff_lname is:', diff_lname)

An algorithm might be:

# N is the number of columns; each pair (colA, colB) is visited once
for colA in range(0, N):
    for colB in range(colA + 1, N):
        compare(colA, colB)
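As a rough, runnable version of that loop, `itertools.combinations` yields each unordered pair of columns exactly once. The sketch below uses the sample data from the question with the header row already dropped. It assumes cell values are stored without the padding shown in the table and that "similarities" means equal values in the same row. The names `compare` and `compare_all` are just illustrative.

```python
from itertools import combinations

# Sample data from the question (assumption: padding around cells is not part of the value).
columns = {
    "COL 1": ["ABCD", "AB CD", "A B C D", "ABC D", "A(BC ) D"],
    "COL 2": ["ABC(D)", "ABC D", "(ABCD)", "ABDC", "AD B - C"],
    "COL 3": ["CDA", "C D - (B)", "ABCD", "ABC D", "AB CD"],
}


def compare(col_a, col_b):
    """Row-by-row exact comparison; any whitespace difference is a mismatch."""
    same, diff = [], []
    for a, b in zip(col_a, col_b):
        if a == b:
            same.append(a)
        else:
            diff.append((a, b))
    return same, diff


def compare_all(columns):
    """Compare every unordered pair of columns exactly once."""
    return {
        (name_a, name_b): compare(columns[name_a], columns[name_b])
        for name_a, name_b in combinations(columns, 2)
    }


results = compare_all(columns)
for (name_a, name_b), (same, diff) in results.items():
    print(name_a, "vs", name_b)
    print("  similarities:", same or None)
    print("  differences:", diff)
```

Because the pairs are generated from whatever columns are present, adding a fourth or fifth column needs no code change; N columns give N * (N - 1) / 2 comparisons.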
