Table/Data manipulation with Python Dictionary

I need help finishing this Python script. I'm an intern at a company, and this is my first week. I was asked to develop a Python script that takes a .csv file and appends any related columns into one column, so that only the 15 or so necessary columns remain with the data in them. For example, if there are zip4, zip5, or postal code columns, they want all of those values underneath the zip code column.

I just started learning Python this week while doing this project, so please excuse my newbie question and vocabulary. I'm not looking for you guys to do this for me; I'm just looking for some guidance. In fact, I want to learn more about Python, so if anyone could point me in the right direction, please help.

I'm using dictionary keys and values. The keys are the columns in the first row. The values for each key are the remaining rows (second through roughly 3000). Right now, I'm only getting one key:value pair: the final row as my array of values, under a single key. I'm also getting a KeyError, so my keys aren't being identified correctly. My code so far is below. I'm going to keep working on this, and any help is immensely appreciated! Hopefully, I can buy the person who helps me a beer and pick their brain a little :)

Thanks for your time

# To be able to read csv formatted files, we will first have to import the csv module
import csv

# cols = line.split(',')# each column is split by a comma
#read the file
CSVreader = csv.reader(open('N:/Individual Files/Jerry/2013 customer list qc, cr, db, gb 9-19-2013_JerrysMessingWithVersion.csv', 'rb'), delimiter=',', quotechar='"')

# define open dictionary
SLSDictionary={}# no empty dictionary. Need column names to compare to. 


i=0
#top row are your keys. All other rows are your values

#adjust loop
for row in CSVreader:
# multiple loops needed here
    if i == 0:
            key = row[i]
    else:
            [values] = [row[1:]]
            SLSDictionary = dict({key: [values]}) # Dictionary is keys and array of values
    i=i+1


#print Dictionary to check errors and make sure dictionary is filled with keys and values        
print SLSDictionary

# SLSDictionary has key of zip/phone plus any characters
#SLSDictionary.has_key('zip.+')
SLSDictionary.has_key('phone.+')

#value of key are set equal to x. Values of that column set equal to x
#[x]=value

#IF SLSDictionary has the key of zip plus any characters, move values to zip key
#if true:   
#        SLSDictionary['zip'].append([x])
    #SLSDictionary['phone_home'].append([value]) # I need to append the values of the specific column, not all columns
    #move key's values  to correct, corresponding key
SLSDictionary['phone_home'].append(SLSDictionary[has_key('phone.+')])#Append the values of the key/column 'phone plus characters' to phone_home key/column in SLSDictionary
#if false:
#        print ''
    # go to next key

SLSDictionary.has_value('')

if true:
    print 'Error: No data in column'

# if there's no data in rows 1-?. Delete column
#if value <= 0:
#        del column

print SLSDictionary 

I found a couple of errors from a quick look. One thing to watch out for is that you're assigning a brand-new dictionary to the existing name every time:

SLSDictionary = dict({key: [values]})

You're rebinding SLSDictionary to a new dictionary on every pass through the loop, so at the end you only have the bottom-most entry. To add a key to the existing dictionary, do the following:

SLSDictionary[key] = values
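
As a minimal sketch of the difference (written for Python 3, with made-up keys and values that are not from the question's CSV), compare rebinding the name against assigning by key:

```python
# Hypothetical (key, value) pairs, for illustration only.
rows = [["a", 1], ["b", 2], ["c", 3]]

# Rebinding the name throws away the previous dictionary on every pass.
rebound = {}
for key, value in rows:
    rebound = dict({key: [value]})

# Assigning by key adds to (or updates) the same dictionary.
accumulated = {}
for key, value in rows:
    accumulated[key] = [value]

print(rebound)      # only the last pair survives
print(accumulated)  # all three pairs are present
```

The first loop ends with a single entry for the same reason the question's loop does; the second keeps every key.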

Also, you don't need the brackets in this line:

[values] = [row[1:]]

It should instead just be:

values = row[1:]

Most importantly, you will only ever have one key. The key is assigned only while i == 0, and because i is incremented on every row, that happens once, on the first row, and it takes only row[0], the first column name. Every later row is then assigned to that single key. Without a sample of how the CSV looks, I can't tell you exactly how to restructure the loop so that it catches all the keys.
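
To see why only one key shows up, the loop from the question can be traced on a tiny table. This is a sketch for Python 3; the rows and the names keys_seen and all_keys are made up for illustration:

```python
# Hypothetical rows standing in for the CSV.
rows = [
    ["name", "zip", "phone_home"],
    ["Ann", "12345", "555-0100"],
    ["Bob", "67890", "555-0101"],
]

# Same shape as the loop in the question: a key is taken only when i == 0.
i = 0
keys_seen = []
for row in rows:
    if i == 0:
        keys_seen.append(row[i])  # row[0] is just the first header cell
    i = i + 1

# Taking the whole first row gives one key per column instead.
all_keys = list(rows[0])

print(keys_seen)
print(all_keys)
```

Here keys_seen ends up holding only the first column name, while all_keys holds every column name.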

Assuming your CSV looks like this, as you've described:

Col1, Col2, Col3, Col4
Val1, Val2, Val3, Val4
Val11, Val22, Val33, Val44
Val111, Val222, Val333, Val444

Then you probably want something like this:

dummy = [["col1", "col2", "col3", "col4"],
         ["val1", "val2", "val3", "val4"],
         ["val11", "val22", "val33", "val44"],
         ["val111", "val222", "val333", "val444"]]

column_index = []
SLSDictionary = {}

for each in dummy[0]:
    column_index.append(each)
    SLSDictionary[each] = []

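# Walk the data rows and append each cell to the list for its column
for each in dummy[1:]:
    for i, every in enumerate(each):
        try:
            if column_index[i] in SLSDictionary:
                SLSDictionary[column_index[i]].append(every)
        except IndexError:
            # Row has more cells than the header; skip the extra cell
            pass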

print SLSDictionary

Which yields:

{'col4': ['val4', 'val44', 'val444'], 'col2': ['val2', 'val22', 'val222'], 'col3': ['val3', 'val33', 'val333'], 'col1': ['val1', 'val11', 'val111']}
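
The same column-wise layout can probably be built straight from csv.reader. The sketch below is for Python 3 and uses an in-memory io.StringIO with made-up contents as a stand-in for the real file; the names sample, header, and columns are assumptions for illustration:

```python
import csv
import io

# In-memory stand-in for the real CSV file (hypothetical contents).
sample = io.StringIO(
    "col1,col2,col3\n"
    "val1,val2,val3\n"
    "val11,val22,val33\n"
)

reader = csv.reader(sample)
header = next(reader)                     # first row holds the column names
columns = {name: [] for name in header}   # one empty list per column

for row in reader:
    # zip() pairs each cell with its column name and stops at the shorter one
    for name, cell in zip(header, row):
        columns[name].append(cell)

print(columns)
```

With a real file, the StringIO object would be replaced by a file opened with open(path, newline=''), which is what the Python 3 csv documentation recommends.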

If you want the keys to stay in order, change the dictionary type to OrderedDict() from the collections module.
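
Once the columns are in a dictionary, the merging described in the question (putting zip5-style columns underneath zip) might look roughly like the sketch below. It is written for Python 3 and is only one possible approach: merge_columns is a hypothetical helper, and the column names and values are made up. On Python 3.7 and later a plain dict also keeps insertion order, so OrderedDict is used here mainly to make the intent explicit.

```python
import re
from collections import OrderedDict

# Hypothetical table in the column-name -> values shape built earlier.
columns = OrderedDict([
    ("name", ["Ann", "Bob"]),
    ("zip", ["12345", ""]),
    ("zip5", ["", "67890"]),
    ("phone_home", ["555-0100", ""]),
])

def merge_columns(table, pattern, target):
    """Append the values of every column whose name matches pattern to target."""
    regex = re.compile(pattern)
    merged_values = list(table.get(target, []))
    result = OrderedDict()
    for name, values in table.items():
        if name == target:
            result[name] = merged_values      # keep the target's position
        elif regex.match(name):
            merged_values.extend(values)      # fold this column into target
        else:
            result[name] = list(values)       # unrelated column, copied as-is
    if target not in result:
        result[target] = merged_values        # target did not exist before
    return result

merged = merge_columns(columns, r"zip.+", "zip")
print(merged)
```

The pattern zip.+ matches names that start with zip and have at least one more character, so zip5 is folded into zip and removed as a separate column, while the other columns keep their original order.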
