
Comparing list to spreadsheet (Python/xlrd)

I'm trying to compare two groups of account numbers (with Python and xlrd/xlwt).

The first is a list of my favorite accounts. The second is a list of accounts that were recorded when someone from that account called me for help. However, the second list is actually a spreadsheet and has more than just the account numbers, such as a case ID and a category. For instance, account '100' called in about 'Shoes' and was recorded as case #50. (Also assume that the spreadsheet has three columns: Account, Category, and Case #.)

My objective is to look at the spreadsheet and find every time that someone from one of my favorite accounts (from the first list) called in for help. So I basically want to use something like

myFavoriteAccounts = ['100','200','300','400','500']

and then go through the entirety of the spreadsheet, printing any instance where one of my favorite accounts appears, as well as the case ID and the category.

I've been able to find the accounts that appear in both lists:

match = set(myFavoriteAccounts) & set(spreadsheetAccountsColumn)

But I don't know how to iterate through the spreadsheet and catch each time one of those accounts appeared, along with the category and case ID.

I'd like to be able to determine, for instance, that account '100' called in on two separate occasions: once about 'Shoes' for case #50 and then again about 'Socks' for case #70.

Assuming your data is a CSV, you can use fileptr.readlines() to read it in, then split the lines on your delimiter. From there it should be very easy:

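delim = ','  # replace with your file's delimiter
with open('myfilepath', 'r') as fileptr:  # 'myfilepath' is a placeholder
    data = fileptr.readlines()
data = [d.rstrip('\n').split(delim) for d in data]
accountitems = {}

for row in data:
    if row[0] in match:  # the account number; match is the set from the question
        accountitems.setdefault(row[0], []).append(row)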

This builds a dictionary whose keys are the matching accounts and whose values are lists of all rows containing that account.
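As a rough sketch of the same grouping idea using the standard csv module: the sample data, the calls_by_account helper, and the exact header names below are assumptions made for illustration, not part of the original answer. It groups each favorite account's calls as (category, case ID) pairs.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample data standing in for the real file; the header names
# follow the three columns described in the question.
SAMPLE_CSV = """Account,Category,Case #
100,Shoes,50
250,Hats,60
100,Socks,70
300,Belts,80
"""

my_favorite_accounts = {'100', '200', '300', '400', '500'}


def calls_by_account(fileobj, favorites):
    """Map each favorite account to a list of (category, case_id) tuples."""
    calls = defaultdict(list)
    for row in csv.DictReader(fileobj):
        account = row['Account'].strip()
        if account in favorites:
            calls[account].append((row['Category'], row['Case #']))
    return dict(calls)


# With a real file, pass an open file object instead of io.StringIO.
calls = calls_by_account(io.StringIO(SAMPLE_CSV), my_favorite_accounts)
for account, entries in sorted(calls.items()):
    for category, case_id in entries:
        print("Account %s called about %s (case #%s)" % (account, category, case_id))
```

Letting csv.DictReader handle the parsing avoids hand-splitting lines, which matters if a category ever contains the delimiter inside quotes.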

You can also take a look at a modified bit of code I wrote using Python's csv module, which may be helpful: http://code.activestate.com/recipes/577996-ordered-csv-read-write-with-colum-based-lookup/

or

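import re
with open('myfilepath', 'r') as fileptr:
    data = fileptr.read()  # note: read() rather than readlines()
for fave in myFavoriteAccounts:
    # re.MULTILINE makes ^ and $ match at each line; \b stops '100' matching '1000'
    pattern = r"^%s\b.*$" % re.escape(fave)
    print("\n".join(re.findall(pattern, data, re.MULTILINE)) + "\n")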

Here's some code as a skeleton.

import xlrd

xls = xlrd.open_workbook(xlsname)  # xlsname: path to your workbook
worksheet = xls.sheet_by_name('Accounts')  # Use whatever name is on the worksheet tab in Excel
account_col = 0  # Assuming it's the first column
favorites = myFavoriteAccounts  # the list from the question

for row in range(worksheet.nrows):  # Rows are addressed from 0
    value = worksheet.cell_value(row, account_col)
    if value in favorites:
        print("Do something")
        print("I can address other cells in this row if I want")
        for col in range(worksheet.ncols):  # Columns are addressed from 0
            new_value = worksheet.cell_value(row, col)

I haven't tested this particular script, but I've used this method in my own programs.
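One thing to watch for with this approach: xlrd returns numeric Excel cells as floats, so an account stored as a number comes back as 100.0 and will not compare equal to the string '100'. A minimal sketch of one way to handle that is below. The normalize_account and favorite_rows helpers and the sample rows are hypothetical; the sketch works on plain lists, which you could presumably obtain with worksheet.row_values(i) for each row index.

```python
def normalize_account(value):
    """Return an account number as a string, e.g. 100.0 -> '100'."""
    if isinstance(value, float) and value.is_integer():
        return str(int(value))
    return str(value).strip()


def favorite_rows(rows, favorites, account_col=0):
    """Yield the rows whose account cell matches one of the favorites."""
    wanted = set(favorites)
    for row in rows:
        if normalize_account(row[account_col]) in wanted:
            yield row


# Hypothetical rows in the shape xlrd would give: numbers as floats,
# text cells as strings.
sample_rows = [
    [100.0, 'Shoes', 50.0],
    [250.0, 'Hats', 60.0],
    ['100', 'Socks', 70.0],
]

matches = list(favorite_rows(sample_rows, ['100', '200', '300', '400', '500']))
for account, category, case_id in matches:
    print("Account %s: %s, case #%s" % (
        normalize_account(account), category, normalize_account(case_id)))
```

Normalizing at comparison time keeps the favorites list as strings, as in the question, regardless of whether the spreadsheet column is stored as text or as numbers.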
