Accumulating data from a CSV file using Python

Question

out_gate,useless_column,in_gate,num_connect
a,u,b,1
a,s,b,3
b,e,a,2
b,l,c,4
c,e,a,5
c,s,b,5
c,s,b,3
c,c,a,4
d,o,c,2
d,l,c,3
d,u,a,1
d,m,b,2

Shown above is a sample CSV file. My final goal is to get the answer as a CSV file like the one below:

 ,a,b,c,d 
a,0,4,0,0 
b,2,0,4,0 
c,9,8,0,0 
d,1,2,5,0

I am trying to match each out_gate (a, b, c, d) to its in_gate and sum the connections. For example, for out_gate 'c' -> in_gate 'b' the number of connections is 8 (5 + 3), and for 'c' -> 'a' it is 9 (5 + 4).

I want to solve it with lists (or tuples, dictionaries, sets) or collections.defaultdict, WITHOUT USING PANDAS OR NUMPY, and I want a solution that also works for many gates (around 10 to 40).

I understand there is a similar question, and it helped a lot, but I still have some trouble getting my code to work. Lastly, is there a way to do this using lists of columns and a for loop?

(e.g. list1 = [a, b, c, d], list2 = [b, b, a, c, a, b, b, a, c, c, a, b])

What if there are some useless columns that are unrelated to the data, but the final goal remains the same?

Thanks.

Answer 1

I'd use a Counter for this task. To keep the code simple, I'll read the data from a string, and I'll let you figure out how to produce the output as a CSV file in the format of your choice.

import csv
from collections import Counter

data = '''\
out_gate,in_gate,num_connect
a,b,1
a,b,3
b,a,2
b,c,4
c,a,5
c,b,5
c,b,3
c,a,4
d,c,2
d,c,3
d,a,1
d,b,2
'''.splitlines()

reader = csv.reader(data)
# Skip the header row
next(reader)
# A Counter to accumulate the data
counts = Counter()

# Accumulate the data
for ogate, igate, num in reader:
    counts[ogate, igate] += int(num)

# We could grab the keys from the data, but it's easier to hard-code them
keys = 'abcd'

# Display the accumulated data
for ogate in keys:
    print(ogate, [counts[ogate, igate] for igate in keys])

Output:

a [0, 4, 0, 0]
b [2, 0, 4, 0]
c [9, 8, 0, 0]
d [1, 2, 5, 0]
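
To get the CSV output the question asks for, and to ignore the extra column in the original file, one option is to look columns up by header name with csv.DictReader and write the table with csv.writer. The following is a minimal sketch, not part of the original answer: the function name accumulate_to_csv is hypothetical, and io.StringIO stands in for real files so the example is self-contained.

```python
import csv
import io
from collections import Counter


def accumulate_to_csv(in_file, out_file):
    """Sum num_connect per (out_gate, in_gate) pair and write a matrix as CSV."""
    counts = Counter()
    gates = set()
    # DictReader maps each row to its header names, so extra columns are ignored
    for row in csv.DictReader(in_file):
        counts[row['out_gate'], row['in_gate']] += int(row['num_connect'])
        gates.update((row['out_gate'], row['in_gate']))

    # Gate names come from the data itself, so this scales to any number of gates
    keys = sorted(gates)
    writer = csv.writer(out_file, lineterminator='\n')
    writer.writerow([''] + keys)
    for ogate in keys:
        # A Counter returns 0 for pairs that never occurred
        writer.writerow([ogate] + [counts[ogate, igate] for igate in keys])


# The sample file from the question, including the useless column
sample = io.StringIO(
    'out_gate,useless_column,in_gate,num_connect\n'
    'a,u,b,1\na,s,b,3\nb,e,a,2\nb,l,c,4\n'
    'c,e,a,5\nc,s,b,5\nc,s,b,3\nc,c,a,4\n'
    'd,o,c,2\nd,l,c,3\nd,u,a,1\nd,m,b,2\n'
)
result = io.StringIO()
accumulate_to_csv(sample, result)
print(result.getvalue())
```

With real files, the same function could be called with two file objects opened with newline='', the input for reading and the output with mode 'w'.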

Answer 2

If I understand your problem correctly, you could try using a nested collections.defaultdict for this:

import csv
from collections import defaultdict

d = defaultdict(lambda : defaultdict(int))

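# newline='' is what the csv module recommends when opening files
with open('gates.csv', newline='') as in_file:
    csv_reader = csv.reader(in_file)
    # Skip the header row
    next(csv_reader)
    for row in csv_reader:
        # Assumes exactly three columns: out_gate, in_gate, num_connect
        outs, ins, connect = row
        d[outs][ins] += int(connect)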

gates = sorted(d)
for outs in gates:
    print(outs, [d[outs][ins] for ins in gates])

Which outputs:

a [0, 4, 0, 0]
b [2, 0, 4, 0]
c [9, 8, 0, 0]
d [1, 2, 5, 0]
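
The question also asks whether this can be done with plain lists of columns and a for loop. One possible sketch, assuming the columns have already been read into three parallel lists (the names out_col, in_col and num_col are hypothetical), accumulates into a nested list indexed by gate position. Gate names are collected from both columns, so a gate that appears only as an in_gate still gets its own row and column.

```python
# Parallel column lists, as they might look after reading the sample file
out_col = ['a', 'a', 'b', 'b', 'c', 'c', 'c', 'c', 'd', 'd', 'd', 'd']
in_col = ['b', 'b', 'a', 'c', 'a', 'b', 'b', 'a', 'c', 'c', 'a', 'b']
num_col = [1, 3, 2, 4, 5, 5, 3, 4, 2, 3, 1, 2]

# Gate names from both columns, sorted for a stable row/column order
gates = sorted(set(out_col) | set(in_col))
# Map each gate name to its row/column position
index = {gate: i for i, gate in enumerate(gates)}

# Square matrix of zeros: matrix[row][col] holds the out_gate -> in_gate total
matrix = [[0] * len(gates) for _ in gates]

# Walk the three columns in step and accumulate each connection count
for ogate, igate, num in zip(out_col, in_col, num_col):
    matrix[index[ogate]][index[igate]] += num

for gate, row in zip(gates, matrix):
    print(gate, row)
```

The dictionary is only used to translate gate names into list positions; the totals themselves live in ordinary nested lists.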

2 answers

Answer 1
1 2018-01-25 14:13:37

Answer 2
1 2018-01-25 14:38:44
