How to read a list of integers from a csv file into Python without converting them to strings?

I have a list of integers saved in a csv sheet; the rows are not all the same length. Like the following example:

22,-14,-24,2,-26,18,20,-4,12,16,8,-6,-10
20,12,-16,18,28,24,4,-22,26,8,-10,-14,2,6
10,-26,-20,30,24,-22,18,-28,12,14,-6,-2,8,-16,-4
16, 22, 30, -18, -26, -28, 24, -8, 32, -14, 12, 4, 20, -10, 2, 6
32, 10, -14, 20, -22, 24, -4, -26, 34, 28, -30, 2, 12, 18, 6, -8, 16
8, -20, 34, 18, 30, 24, -4, 6, 28, -32, -12, -36, 10, 16, -38, 2, 14, -22, -26

I need to call a function where the input is an array consisting of one such row. So I need exactly the following.

input = [22,-14,-24,2,-26,18,20,-4,12,16,8,-6,-10]

Using the standard approach

import csv
with open('file.csv', 'r') as f:
    reader = csv.reader(f)
    for line in reader:
        print(line)

yields the output

['22', '-14', '-24', '2', '-26', '18', '20', '-4', '12', '16', '8', '-6', '-10']

which I can't use since the elements are not integers. I have tried to use different formatting parameters, like csv.QUOTE_NONE, but nothing works. This makes sense as far as I know, since csv files do not know integer data types.

My files have between 100'000 and 1'000'000 rows, so any solution must be efficient. Since the number of columns is not fixed, I also was not able to cast manually: I couldn't figure out how to loop through the columns of one row. Does anyone have an idea how I could solve this problem? I don't know if it could help, but I am not bound to csv files; I could probably use something else.

You can just convert them to int:

elems = ['22', '-14', '-24', '2', '-26', '18', '20', '-4', '12', '16', '8', '-6', '-10']
elems = [int(i) for i in elems]

Output: [22, -14, -24, 2, -26, 18, 20, -4, 12, 16, 8, -6, -10]
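To connect this with the csv.reader loop from the question, here is a minimal sketch that applies the same int conversion to every row while reading. The function name read_int_rows is made up for illustration, and skipinitialspace=True is only there because some of the sample rows have a space after each comma.

```python
import csv

def read_int_rows(path):
    """Yield each CSV row as a list of ints (rows may differ in length)."""
    with open(path, newline='') as f:
        # skipinitialspace drops the blank after each comma, e.g. "16, 22"
        for row in csv.reader(f, skipinitialspace=True):
            if row:  # csv.reader returns [] for completely empty lines
                yield [int(cell) for cell in row]
```

Because it is a generator, only one row is held in memory at a time, which should matter for files with up to a million rows.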

To better handle the csv, you could also use pandas:

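import pandas as pd

# sep=';' assumes a semicolon-delimited file; use sep=',' for comma-separated data
df = pd.read_csv('line.csv', header=None, sep=';')
df = df.T  # transpose so that each original row becomes a column
# DataFrame.iteritems() was removed in pandas 2.0; items() is the replacement
for row, col in df.items():
    line = list(col.dropna())  # drop the NaN padding of shorter rows
    print(line)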

and the output is:

[22.0, -14.0, -24.0, 2.0, -26.0, 18.0, 20.0, -4.0, 12.0, 16.0, 8.0, -6.0, -10.0]
[20.0, 12.0, -16.0, 18.0, 28.0, 24.0, 4.0, -22.0, 26.0, 8.0, -10.0, -14.0, 2.0, 6.0]
[10.0, -26.0, -20.0, 30.0, 24.0, -22.0, 18.0, -28.0, 12.0, 14.0, -6.0, -2.0, 8.0, -16.0, -4.0]
[16.0, 22.0, 30.0, -18.0, -26.0, -28.0, 24.0, -8.0, 32.0, -14.0, 12.0, 4.0, 20.0, -10.0, 2.0, 6.0]
[32.0, 10.0, -14.0, 20.0, -22.0, 24.0, -4.0, -26.0, 34.0, 28.0, -30.0, 2.0, 12.0, 18.0, 6.0, -8.0, 16.0]
[8.0, -20.0, 34.0, 18.0, 30.0, 24.0, -4.0, 6.0, 28.0, -32.0, -12.0, -36.0, 10.0, 16.0, -38.0, 2.0, 14.0, -22.0, -26.0]
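The values above are floats, most likely because pandas pads the shorter rows with NaN, which forces a float column type. Since the question asks for integers, a cast is still needed. A minimal sketch in plain Python follows; the helper name to_int_list is hypothetical and the sample row is made up.

```python
import math

def to_int_list(values):
    """Drop NaN padding and cast the remaining floats to int."""
    return [int(v) for v in values if not math.isnan(v)]

row = [22.0, -14.0, -24.0, float('nan'), float('nan')]
print(to_int_list(row))  # [22, -14, -24]
```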

As your CSV doesn't have any column names, you don't really need the csv module (let alone pandas). You could just do this:

FILENAME = 'file.csv'

def parse(filename):
    with open(filename) as data:
        for line in data:
            yield list(map(int, line.split(',')))

for line in parse(FILENAME):
    print(line)

Output:

[22, -14, -24, 2, -26, 18, 20, -4, 12, 16, 8, -6, -10]
[20, 12, -16, 18, 28, 24, 4, -22, 26, 8, -10, -14, 2, 6]
[10, -26, -20, 30, 24, -22, 18, -28, 12, 14, -6, -2, 8, -16, -4]
[16, 22, 30, -18, -26, -28, 24, -8, 32, -14, 12, 4, 20, -10, 2, 6]
[32, 10, -14, 20, -22, 24, -4, -26, 34, 28, -30, 2, 12, 18, 6, -8, 16]
[8, -20, 34, 18, 30, 24, -4, 6, 28, -32, -12, -36, 10, 16, -38, 2, 14, -22, -26]
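One caveat with the generator above: int('') raises ValueError, so a blank line or a trailing comma in the file would stop the loop. If the data might contain either, a slightly more tolerant variant could look like the sketch below. The name parse_lines and the sample input are assumptions for illustration only.

```python
def parse_lines(lines):
    """Yield one list of ints per non-blank line of comma-separated text."""
    for line in lines:
        # keep only fields that contain something besides whitespace
        fields = [field for field in line.split(',') if field.strip()]
        if fields:
            yield [int(field) for field in fields]

sample = ["22,-14,-24\n", "\n", "16, 22, 30,\n"]
print(list(parse_lines(sample)))  # [[22, -14, -24], [16, 22, 30]]
```

It accepts any iterable of lines, so an open file object can be passed in directly.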
