Python — Read and compare rows and columns
0,1,foo
0,0,foo
0,1,foo
1,1,foobar
1,1,foobar
0,1,test
1,1,foobarbar
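There are about 10,000 entries.
Assume the data is in a CSV file.
I want to know how many '0's there are in the first column for foo, and how many '1's and '0's there are in the second column for foo.
Do I have to read the previous line of the file and check it? Is there a way to do this with a list comprehension? How do I maintain a counter there?
Expected output: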
Foo
Column 1:
No. of 0's = 3
No. of 1's = 0
Column 2:
No. of 1's = 2
No. of 0's = 1
from collections import defaultdict, Counter
import csv

# Python 3: open in text mode with newline='', as the csv module expects
# (on Python 2 this was open('myfile.csv', 'rb'))
with open('myfile.csv', newline='') as inf:
    incsv = csv.reader(inf)
    # one Counter per label, for each of the two columns
    col1, col2 = defaultdict(Counter), defaultdict(Counter)
    for c1, c2, label in incsv:
        col1[label][c1] += 1
        col2[label][c2] += 1

labels = sorted(col1)
for lbl in labels:
    print('{}:'.format(lbl))
    print('Column1:')
    for entry in ['0', '1']:
        print("No. of {}'s = {}".format(entry, col1[lbl][entry]))
    print('Column2:')
    for entry in ['0', '1']:
        print("No. of {}'s = {}".format(entry, col2[lbl][entry]))
which returns:
foo:
Column1:
No. of 0's = 3
No. of 1's = 0
Column2:
No. of 0's = 1
No. of 1's = 2
foobar:
Column1:
No. of 0's = 0
No. of 1's = 2
Column2:
No. of 0's = 0
No. of 1's = 2
foobarbar:
Column1:
No. of 0's = 0
No. of 1's = 1
Column2:
No. of 0's = 0
No. of 1's = 1
test:
Column1:
No. of 0's = 1
No. of 1's = 0
Column2:
No. of 0's = 0
No. of 1's = 1
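To try the same counting idea without creating myfile.csv, here is a minimal sketch that feeds the sample rows from the question through io.StringIO. The names SAMPLE and count_by_label are illustrative and not part of the answer above.

```python
import csv
import io
from collections import Counter, defaultdict

# Sample rows from the question; SAMPLE is an illustrative name.
SAMPLE = """0,1,foo
0,0,foo
0,1,foo
1,1,foobar
1,1,foobar
0,1,test
1,1,foobarbar
"""


def count_by_label(fileobj):
    """Return (col1, col2): per-label Counters of the values in each column."""
    col1, col2 = defaultdict(Counter), defaultdict(Counter)
    for c1, c2, label in csv.reader(fileobj):
        col1[label][c1] += 1
        col2[label][c2] += 1
    return col1, col2


col1, col2 = count_by_label(io.StringIO(SAMPLE))
# A Counter returns 0 for a value it has never seen, so no key check is needed.
print(col1['foo']['0'], col1['foo']['1'])  # zeros and ones in column 1 for foo
print(col2['foo']['0'], col2['foo']['1'])  # zeros and ones in column 2 for foo
```

Because count_by_label takes any file-like object, the same function should also work on a file opened with open(..., newline='').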
datastring = """0,1,foo
0,0,foo
0,1,foo
1,1,foobar
1,1,foobar
0,1,test
1,1,foobarbar"""
def count_data(datastring):
    datadict = {}
    for line in datastring.split('\n'):
        col1, col2, col3 = line.split(',')
        for i, colval in enumerate((col1, col2)):  # doing it this way in case there are more cols
            # datadict[label][value] is [count in column 1, count in column 2]
            datadict.setdefault(col3, {}).setdefault(colval, [0, 0])[i] += 1
    return datadict

datadict = count_data(datastring)
Output (key order may vary with the Python version):
{'test': {'1': [0, 1], '0': [1, 0]}, 'foobar': {'1': [2, 2]}, 'foo': {'1': [0, 2], '0': [3, 1]}, 'foobarbar': {'1': [1, 1]}}
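def print_data(datadict):
    for key in datadict:
        print(key)
        # Each inner list is [count in column 1, count in column 2],
        # keyed by the value ('0' or '1'), so index by column last.
        for i in (0, 1):
            print('Column', i + 1, ':')
            for value in ('0', '1'):
                # a value that never occurs for this label has no entry
                count = datadict[key].get(value, [0, 0])[i]
                print("Number of {0}'s:".format(value), count)

print_data(datadict)
Output (Python 3.7+, where dicts keep insertion order):
foo
Column 1 :
Number of 0's: 3
Number of 1's: 0
Column 2 :
Number of 0's: 1
Number of 1's: 2
foobar
Column 1 :
Number of 0's: 0
Number of 1's: 2
Column 2 :
Number of 0's: 0
Number of 1's: 2
test
Column 1 :
Number of 0's: 1
Number of 1's: 0
Column 2 :
Number of 0's: 0
Number of 1's: 1
foobarbar
Column 1 :
Number of 0's: 0
Number of 1's: 1
Column 2 :
Number of 0's: 0
Number of 1's: 1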
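The list comprehension in the code below builds a list of every line in the file whose last column is the string 'foo' and whose current column is the number you are looking for. Printing the length of that list gives you the number of occurrences: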
file.txt:
0,1,foo
0,0,foo
0,1,foo
1,1,foobar
1,1,foobar
0,1,test
1,1,foobarbar
Code:
search_string = 'foo'
with open('file.txt', 'r') as f:
    # strip the newline so the last column matches even on a final line without one
    lines = [line.rstrip('\n') for line in f]
for column in [0, 1]:  # Let's count columns from 0
    print("Column %d:" % column)
    for number in ['0', '1']:  # Strings, because the file is read as text
        matches = [line for line in lines if
                   (line.split(',')[-1] == search_string and
                    line.split(',')[column] == number)]
        print("Number of %s's = %d" % (number, len(matches)))
Output:
Column 0:
Number of 0's = 3
Number of 1's = 0
Column 1:
Number of 0's = 1
Number of 1's = 2
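The list-comprehension approach rescans every line once per (column, number) pair and builds a list only to measure its length. As a rough sketch of two alternatives, the code below counts with sum() over a generator, and separately gathers all (column, value) tallies for one label in a single pass with collections.Counter. The helper name count_matches and the inline lines list are illustrative assumptions. The rows are the question's sample data.

```python
from collections import Counter

# Sample rows from the question, already stripped of newlines.
lines = ["0,1,foo", "0,0,foo", "0,1,foo", "1,1,foobar",
         "1,1,foobar", "0,1,test", "1,1,foobarbar"]


def count_matches(lines, label, column, value):
    """Count rows whose last field is `label` and whose `column` field is `value`."""
    return sum(1 for fields in (line.split(',') for line in lines)
               if fields[-1] == label and fields[column] == value)


# One pass: tally every (column index, value) pair seen on rows labelled 'foo'.
tally = Counter((column, fields[column])
                for fields in (line.split(',') for line in lines)
                if fields[-1] == 'foo'
                for column in (0, 1))

print(count_matches(lines, 'foo', 0, '0'))
print(tally[(0, '0')], tally[(0, '1')], tally[(1, '0')], tally[(1, '1')])
```

The Counter version answers the "how do I maintain a counter" part of the question directly: missing keys read as 0, so no initialisation per value is needed.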
filename = "a.csv"
search = "foo"
with open(filename) as f:
    lines = f.readlines()
firstcol_zero = firstcol_one = secondcol_zero = secondcol_one = 0
for line in lines:
    line = line.strip()
    if not line:
        continue  # skip blank lines
    fields = line.split(',')
    if fields[2] != search:
        continue  # only count rows with the label we are looking for
    if int(fields[0]) == 0:
        firstcol_zero += 1
    elif int(fields[0]) == 1:
        firstcol_one += 1
    if int(fields[1]) == 0:
        secondcol_zero += 1
    elif int(fields[1]) == 1:
        secondcol_one += 1
print(firstcol_zero)
print(firstcol_one)
print(secondcol_zero)
print(secondcol_one)
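With roughly 10,000 rows, a stray malformed line is plausible. The loop above would raise IndexError on a row with fewer than three fields, or ValueError if a field is not an integer. The following is a hedged sketch of the same per-label count that skips such rows instead. The function name count_for_label and the rows list are hypothetical, and it compares the fields as the strings '0' and '1' rather than converting them with int(), which is a slightly stricter match than the code above.

```python
def count_for_label(lines, search):
    """Return [[zeros, ones] for column 1, [zeros, ones] for column 2]."""
    counts = [[0, 0], [0, 0]]
    for line in lines:
        fields = line.strip().split(',')
        if len(fields) != 3 or fields[2] != search:
            continue  # blank line, malformed row, or a different label
        for col in (0, 1):
            if fields[col] in ('0', '1'):
                # '0' bumps index 0 (zeros), '1' bumps index 1 (ones)
                counts[col][int(fields[col])] += 1
    return counts


# Illustrative input: three foo rows plus a blank line and a malformed row.
rows = ["0,1,foo\n", "0,0,foo\n", "\n", "bad row\n", "0,1,foo\n", "1,1,foobar\n"]
print(count_for_label(rows, "foo"))
```

Keeping the counts in a nested list rather than four separate variables also means a third column would only need another [0, 0] entry and a wider range in the inner loop.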