Reading a CSV file and comparing objects to a list
I have a .txt file, the primary list, with strings like this:
f
r
y
h
g
j
and I have a .csv file, the recipes list, with rows like this:
d,g,r,e,w,s
j,f,o,b,x,q,h
y,n,b,w,q,j
My program goes through each row and counts the number of objects that belong to the primary list. For example, in this case the outcome should be: 2 3 2. I always get 0. The mistake must be silly, but I can't figure it out:
from __future__ import print_function
import csv

primary_data = open('test_list.txt','r')
primary_list = []
for line in primary_data.readlines():
    line.strip('\n')
    primary_list.append(line)

recipes_reader = csv.reader(open('test.csv','r'), delimiter =',')
for row in recipes_reader:
    primary_count = 0
    for i in row:
        if i in primary_list:
            primary_count += 1
    print (primary_count)
Reading the file into primary_list leaves a trailing \n on each string, so nothing in the CSV rows ever matches. You should remove it.
When appending to primary_list, do:
for line in primary_data:
    primary_list.append(line.strip())
Note the strip call. Also, as you can see, you don't really need readlines, since iterating with for line in primary_data already does what you need when primary_data is a file object.
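The reason the original loop keeps the newline is that Python strings are immutable: strip returns a new string and leaves the original untouched, so its result has to be assigned or appended. A minimal sketch of that behaviour, using made-up sample values rather than the actual file:

```python
line = "f\n"               # hypothetical line as read from the text file

line.strip("\n")           # the returned string is discarded here
unchanged = line           # line still ends with the newline

stripped = line.strip()    # keep the returned string instead

primary_list = ["f\n", "r\n"]    # what the original loop ends up building
found_raw = "f" in primary_list  # "f" is not equal to "f\n"
found_clean = "f" in [s.strip() for s in primary_list]
```

Here found_raw is False and found_clean is True, which is why the original program always counts 0.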
Now, as a general comment: since you're using the primary list for lookup, I suggest replacing the list with a set. This will make things much faster if the list is large. Python sets are very efficient for key-based lookup, whereas lists are not designed for that purpose.
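As a rough illustration of the set suggestion, the sketch below hard-codes the sample values from the question instead of reading the files. The names primary_set, rows and counts are only for this example:

```python
# Sample values copied from the question; in the real program they come from the files.
primary_set = {"f", "r", "y", "h", "g", "j"}

rows = [
    ["d", "g", "r", "e", "w", "s"],
    ["j", "f", "o", "b", "x", "q", "h"],
    ["y", "n", "b", "w", "q", "j"],
]

# Membership tests against a set are hash lookups rather than linear scans.
counts = [sum(1 for item in row if item in primary_set) for row in rows]
print(counts)
```

With this sample data the counts come out as 2, 3 and 2, matching the outcome expected in the question.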
The following code would solve the problem.
from __future__ import print_function
import csv

primary_data = open('test_list.txt','r')
primary_list = [line.rstrip() for line in primary_data]

recipies_reader = csv.reader(open('recipies.csv','r'), delimiter=',')
for row in recipies_reader:
    count = 0
    for i in row:
        if i in primary_list:
            count += 1
    print(count)
Output:
2
3
2
Here's the bare-essentials, pedal-to-the-metal version:
from __future__ import print_function
import csv

with open('test_list.txt', 'r') as f:  # with statement ensures your file is closed
    primary_set = set(line.strip() for line in f)

with open('test.csv', 'rb') as f:  #### see note below ###
    for row in csv.reader(f):  # delimiter=',' is the default
        print(sum(i in primary_set for i in row))  # i in primary_set has int value 0 or 1
Note: In Python 2.x, always open csv files in binary mode. In Python 3.x, always open csv files with newline=''.
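For Python 3, the same idea might look like the sketch below. The function name count_matches and its parameters are hypothetical, introduced only for this example; the file handling follows the note above by passing newline='' when opening the CSV file.

```python
import csv


def count_matches(primary_path, recipes_path):
    """Return, for each CSV row, how many fields appear in the primary list."""
    with open(primary_path, "r") as f:
        # Strip the trailing newline and skip blank lines.
        primary_set = {line.strip() for line in f if line.strip()}

    with open(recipes_path, "r", newline="") as f:
        # Each membership test is a bool, which sum() treats as 0 or 1.
        return [sum(field in primary_set for field in row)
                for row in csv.reader(f)]
```

Calling count_matches('test_list.txt', 'test.csv') with the files from the question should give one count per row, which can then be printed in a loop.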