
Converting a CSV file into a list of tuples with Python

I need to take a CSV with 4 columns: brand, price, weight, and type.

The types are orange, apple, pear, plum.

Parameters: I need to select the greatest possible total weight by choosing 1 orange, 2 pears, 3 apples, and 1 plum without exceeding a $20 budget. I cannot repeat brands of the same fruit (for example, selecting the same brand of apple 3 times).

I can open and read the CSV file through Python, but I'm not sure how to create a dictionary or list of tuples from it.

For more clarity, here's an idea of the data.

Brand, Price, Weight, Type
brand1, 6.05, 3.2, orange
brand2, 8.05, 5.2, orange
brand3, 6.54, 4.2, orange
brand1, 6.05, 3.2, pear
brand2, 7.05, 3.6, pear
brand3, 7.45, 3.9, pear
brand1, 5.45, 2.7, apple
brand2, 6.05, 3.2, apple
brand3, 6.43, 3.5, apple
brand4, 7.05, 3.9, apple
brand1, 8.05, 4.2, plum
brand2, 3.05, 2.2, plum

Here's all I have right now:

import csv
test_file = 'testallpos.csv'
csv_file = csv.DictReader(open(test_file, 'rb'), ["brand"], ["price"], ["weight"], ["type"])

You can ponder this:

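import csv

def fitem(item):
    # Strip surrounding whitespace, then convert to float when possible.
    item = item.strip()
    try:
        item = float(item)
    except ValueError:
        pass  # Not numeric (e.g. a brand name); keep the stripped string.
    return item

with open('/tmp/test.csv', 'r') as csvin:
    reader = csv.DictReader(csvin)
    # Seed each column list from the first data row.
    # next(reader) works in both Python 2.6+ and Python 3.
    data = {k.strip(): [fitem(v)] for k, v in next(reader).items()}
    for line in reader:
        for k, v in line.items():
            k = k.strip()
            data[k].append(fitem(v))

print(data)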

Prints:

{'Price': [6.05, 8.05, 6.54, 6.05, 7.05, 7.45, 5.45, 6.05, 6.43, 7.05, 8.05, 3.05],
 'Type': ['orange', 'orange', 'orange', 'pear', 'pear', 'pear', 'apple', 'apple', 'apple', 'apple', 'plum', 'plum'], 
 'Brand': ['brand1', 'brand2', 'brand3', 'brand1', 'brand2', 'brand3', 'brand1', 'brand2', 'brand3', 'brand4', 'brand1', 'brand2'], 
 'Weight': [3.2, 5.2, 4.2, 3.2, 3.6, 3.9, 2.7, 3.2, 3.5, 3.9, 4.2, 2.2]}
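Since the question asks for a list of tuples, the column dictionary above can also be turned back into rows by zipping the columns together. The following is only a sketch: the shortened `data` dictionary, the `COLUMN_ORDER` tuple, and the helper name `columns_to_rows` are assumptions made for illustration.

```python
# A shortened copy of the column dictionary printed above (first three rows only).
data = {
    'Brand': ['brand1', 'brand2', 'brand3'],
    'Price': [6.05, 8.05, 6.54],
    'Weight': [3.2, 5.2, 4.2],
    'Type': ['orange', 'orange', 'orange'],
}

# The order in which the columns should appear in each tuple (an assumption).
COLUMN_ORDER = ('Brand', 'Price', 'Weight', 'Type')

def columns_to_rows(columns, order):
    """Zip the column lists together so each tuple holds one CSV row."""
    return list(zip(*(columns[name] for name in order)))

rows = columns_to_rows(data, COLUMN_ORDER)
print(rows)
# [('brand1', 6.05, 3.2, 'orange'), ('brand2', 8.05, 5.2, 'orange'), ('brand3', 6.54, 4.2, 'orange')]
```

Passing the column order explicitly avoids relying on the dictionary's own key order, which differs between Python versions.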

If you want the CSV file literally as tuples by rows:

import csv
with open('/tmp/test.csv') as f:
    data=[tuple(line) for line in csv.reader(f)]

print(data)
# [('Brand', ' Price', ' Weight', ' Type'), ('brand1', ' 6.05', ' 3.2', ' orange'), ('brand2', ' 8.05', ' 5.2', ' orange'), ('brand3', ' 6.54', ' 4.2', ' orange'), ('brand1', ' 6.05', ' 3.2', ' pear'), ('brand2', ' 7.05', ' 3.6', ' pear'), ('brand3', ' 7.45', ' 3.9', ' pear'), ('brand1', ' 5.45', ' 2.7', ' apple'), ('brand2', ' 6.05', ' 3.2', ' apple'), ('brand3', ' 6.43', ' 3.5', ' apple'), ('brand4', ' 7.05', ' 3.9', ' apple'), ('brand1', ' 8.05', ' 4.2', ' plum'), ('brand2', ' 3.05', ' 2.2', ' plum')]

import csv
with open("some.csv") as f:
    r = csv.reader(f)
    # filter(None, ...) drops empty rows (blank lines in the file).
    # list() is needed in Python 3, where filter returns a lazy iterator.
    print(list(filter(None, r)))

Or with a list comprehension:

import csv
with open("some.csv") as f:
    r = csv.reader(f)
    # Keep only rows that are not empty lists.
    print([row for row in r if row])

For comparison (timed under Python 2, where filter returns a list):

In [3]: N = 100000

In [4]: the_list = [randint(0,3) for _ in range(N)]

In [5]: %timeit filter(None,the_list)
1000 loops, best of 3: 1.91 ms per loop

In [6]: %timeit [i for i in the_list if i]
100 loops, best of 3: 4.01 ms per loop
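The timings above come from Python 2, where `filter` builds a list immediately. In Python 3 `filter` returns a lazy iterator, so it has to be wrapped in `list()` before printing or timing. The sketch below shows both forms dropping blank lines; the sample text and the helper name `non_blank_rows` are made up for illustration.

```python
import csv
import io

# Hypothetical sample with blank lines between the records.
SAMPLE = "brand1,6.05,3.2,orange\n\nbrand2,8.05,5.2,orange\n\n"

def non_blank_rows(handle):
    """Return every non-empty row; csv.reader yields [] for a blank line."""
    return [row for row in csv.reader(handle) if row]

# List comprehension form.
rows = non_blank_rows(io.StringIO(SAMPLE))

# filter form; list() forces the lazy iterator that Python 3 returns.
filtered = list(filter(None, csv.reader(io.StringIO(SAMPLE))))

print(rows)
print(filtered == rows)
```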

[edit] Since your actual output does not have blanks, you do not need the list comprehension or the filter; you can just say list(r).

Final answer without blank lines:

import csv
with open("some.csv") as f:
    # Each row comes back as a list of strings; the header is the first row.
    print(list(csv.reader(f)))
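The rows produced this way are lists of strings, with the header as the first row and, for data laid out like the sample in the question, a leading space in every field after the first. If tuples with numeric prices and weights are wanted, something along these lines may help. It is a sketch for Python 3 only: the in-memory sample, the function name `read_fruit_rows`, and the use of `skipinitialspace=True` are assumptions, not part of the answer above.

```python
import csv
import io

# A few rows copied from the question's sample data.
SAMPLE = """Brand, Price, Weight, Type
brand1, 6.05, 3.2, orange
brand2, 8.05, 5.2, orange
brand1, 6.05, 3.2, pear
"""

def read_fruit_rows(handle):
    """Return (header, rows); each row is a (brand, price, weight, type) tuple."""
    # skipinitialspace=True drops the space that follows each comma.
    reader = csv.reader(handle, skipinitialspace=True)
    header = tuple(next(reader))
    rows = [
        (brand, float(price), float(weight), kind)
        for brand, price, weight, kind in (r for r in reader if r)
    ]
    return header, rows

header, rows = read_fruit_rows(io.StringIO(SAMPLE))
print(header)
for row in rows:
    print(row)
```

With a real file, the same function could be called as `with open("some.csv", newline="") as f: header, rows = read_fruit_rows(f)`.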

If you want dicts, you can do:

import csv
with open("some.csv") as f:
    rows = list(csv.reader(f))
    header = rows[0]
    # Skip the header row itself so it is not turned into a dict.
    print([dict(zip(header, x)) for x in rows[1:]])
    # or
    print(list(map(lambda x: dict(zip(header, x)), rows[1:])))
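Once the rows are available as dicts, the selection described in the question (1 orange, 2 pears, 3 apples, 1 plum, no repeated brand within a fruit, maximum weight within a budget) could be brute-forced for a file this small. This is a rough sketch for Python 3 rather than a definitive solution: the names `load_items`, `best_selection`, and `WANTED` are hypothetical, and it assumes one row per brand and fruit type.

```python
import csv
import io
from itertools import combinations, product

# The sample data from the question.
SAMPLE = """Brand, Price, Weight, Type
brand1, 6.05, 3.2, orange
brand2, 8.05, 5.2, orange
brand3, 6.54, 4.2, orange
brand1, 6.05, 3.2, pear
brand2, 7.05, 3.6, pear
brand3, 7.45, 3.9, pear
brand1, 5.45, 2.7, apple
brand2, 6.05, 3.2, apple
brand3, 6.43, 3.5, apple
brand4, 7.05, 3.9, apple
brand1, 8.05, 4.2, plum
brand2, 3.05, 2.2, plum
"""

# How many distinct brands of each fruit the question asks for.
WANTED = {"orange": 1, "pear": 2, "apple": 3, "plum": 1}

def load_items(handle):
    """Read rows as dicts and convert price and weight to floats."""
    items = []
    # skipinitialspace=True drops the space that follows each comma.
    for row in csv.DictReader(handle, skipinitialspace=True):
        items.append({
            "brand": row["Brand"],
            "price": float(row["Price"]),
            "weight": float(row["Weight"]),
            "type": row["Type"],
        })
    return items

def best_selection(items, wanted, budget):
    """Try every valid combination; return (weight, cost, picks) or None."""
    by_type = {}
    for item in items:
        by_type.setdefault(item["type"], []).append(item)
    # combinations() never reuses an element, so a brand is not repeated
    # within one fruit type (assuming one row per brand and type).
    choices = [
        list(combinations(by_type.get(kind, []), count))
        for kind, count in wanted.items()
    ]
    best = None
    for combo in product(*choices):
        picks = [item for group in combo for item in group]
        cost = sum(item["price"] for item in picks)
        weight = sum(item["weight"] for item in picks)
        if cost <= budget and (best is None or weight > best[0]):
            best = (weight, cost, picks)
    return best

items = load_items(io.StringIO(SAMPLE))
print(best_selection(items, WANTED, budget=20))
```

With the sample rows from the question, even the cheapest valid combination costs about $40, so a $20 budget yields `None`; the sample is only an idea of the data, and a larger budget or cheaper rows would return a selection.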
