How can I read a csv file, assigning the last column as the second value of a tuple?
I have a csv file with three columns, where each row is in the format:
"abcdef" "uvwxyz" 0
I want to generate a list of tuples, where the first element of each tuple is a dictionary of features extracted from the first two columns, and the second element is simply the third column (a 0 or 1 value), which is the label for those features.
I tried the following, but it throws an error saying i is undefined in the last line:
dataframe = pd.read_csv(csv_file, header = None, delimiter = "\t")
a = dataframe[0]
b = dataframe[1]
label = dataframe[2]
feature = [(findFeature(x,y), labels) for x,y in i for i, labels in zip(zip(a,b), label)]
Where am I wrong?
It looks like you need:
feature = [(findFeature(x,y), label) for x,y, label in zip(a,b,label)]
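To see why the original comprehension fails: the `for` clauses of a comprehension run left to right, so `for x,y in i` is evaluated before `i` has been bound by the second clause. Here is a minimal sketch, assuming small in-memory lists in place of the DataFrame columns and a hypothetical stand-in for findFeature (the real one is not shown in the question):

```python
# Hypothetical stand-in for the asker's findFeature; the real one is not shown.
def findFeature(x, y):
    return {"len_x": len(x), "len_y": len(y)}

# In-memory stand-ins for dataframe[0], dataframe[1], dataframe[2].
a = ["abcdef", "ghijkl"]
b = ["uvwxyz", "mnopqr"]
label = [0, 1]

# Clauses run left to right, so `i` is iterated before it is ever assigned.
error_message = None
try:
    broken = [(findFeature(x, y), labels)
              for x, y in i
              for i, labels in zip(zip(a, b), label)]
except NameError as exc:
    error_message = str(exc)

# A single zip over the three columns avoids the nesting entirely.
feature = [(findFeature(x, y), lab) for x, y, lab in zip(a, b, label)]
```

Using a separate loop name (`lab` here) also avoids reusing the name of the column being iterated.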
If you don't need any further transformations, you can use the csv library instead of pandas:
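import csv
with open(csv_file, newline="") as f:
    # The file is tab-separated, matching the delimiter used in the question
    reader = csv.reader(f, delimiter="\t")
    # Note: z is read as a string such as "0", not an integer
    feature = [(findFeature(x, y), z) for x, y, z in reader]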
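A self-contained sketch of the csv approach, assuming the file is tab-separated (as the question's read_csv call suggests) and that the label should end up as an integer rather than the string csv.reader yields. Here findFeature is a hypothetical stand-in, and io.StringIO takes the place of the real file:

```python
import csv
import io

# Hypothetical feature extractor; the real findFeature is not shown.
def findFeature(x, y):
    return {"first": x[0], "last": y[-1]}

# Stand-in for the real file: two tab-separated rows with quoted strings.
sample = io.StringIO('"abcdef"\t"uvwxyz"\t0\n"ghijkl"\t"mnopqr"\t1\n')

# csv.reader removes the surrounding double quotes by default.
reader = csv.reader(sample, delimiter="\t")

# Convert the third column to int so the label is 0 or 1, not "0" or "1".
feature = [(findFeature(x, y), int(z)) for x, y, z in reader]
```

With a real file, replace `sample` with the handle returned by `open(csv_file, newline="")`.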
You can find an example of csv package usage here.
I'm guessing you need to transform ("abcdef", "uvwxyz", 0) into ("abcdef", 0, "uvwxyz"):
with open(csv_file, "r") as f:
    # Strip the trailing newline so it does not stay attached to the last field
    dataframe = [(a, c, b) for a, b, c in map(lambda x: x.rstrip("\n").split("\t"), f)]
Each line is split and unpacked into a, b, c, then repacked as (a, c, b).
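One detail worth checking when splitting lines by hand: each line read from a file keeps its trailing newline, which ends up in the last field unless it is stripped. A small sketch of the difference, assuming unquoted tab-separated rows and using io.StringIO in place of the real file:

```python
import io

text = "abcdef\tuvwxyz\t0\nghijkl\tmnopqr\t1\n"

# Plain split: the newline stays attached to the third field.
naive = [(a, c, b) for a, b, c in (line.split("\t") for line in io.StringIO(text))]

# Stripping the newline first gives clean fields.
rows = [(a, c, b)
        for a, b, c in (line.rstrip("\n").split("\t") for line in io.StringIO(text))]
```

In both cases the fields are strings, so the label would still need `int(c)` if a numeric 0 or 1 is wanted.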