Converting strings into integers in a list comprehension

I am trying to write a function that takes a file, splits it on newlines, and then splits each line on the comma delimiter (,). After that I want to convert each string inside that list to an integer, using only a list comprehension.

# My code but it's not converting the splitted list into integers.
def read_csv(filename):
    string_list = open(filename, "r").read().split('\n')
    string_list = string_list[1:len(string_list)]
    splitted = [i.split(",") for i in string_list]
    final_list = [int(i) for i in splitted]
    return final_list

read_csv("US_births_1994-2003_CDC_NCHS.csv")

Output:
TypeError: int() argument must be a string, a bytes-like object or a number, not 'list' 

How the data looks after splitting on the comma delimiter (,):

us = open("US_births_1994-2003_CDC_NCHS.csv", "r").read().split('\n')
splitted = [i.split(",") for i in us]
print(splitted)

Output:
 [['year', 'month', 'date_of_month', 'day_of_week', 'births'],
 ['1994', '1', '1', '6', '8096'],
 ['1994', '1', '2', '7', '7772'],
 ['1994', '1', '3', '1', '10142'],
 ['1994', '1', '4', '2', '11248'],
 ['1994', '1', '5', '3', '11053'],
 ['1994', '1', '6', '4', '11406'],
 ['1994', '1', '7', '5', '11251'],
 ['1994', '1', '8', '6', '8653'],
 ['1994', '1', '9', '7', '7910'],
 ['1994', '1', '10', '1', '10498']]

How do I convert each string inside this output to an integer and assign the result to a single list using a list comprehension?

str.split() produces a new list, so splitted is a list of lists. Each i in your comprehension is therefore a whole row (a list), and passing a list to int() is what raises the TypeError. You'd want to convert the contents of each contained list:

[[int(v) for v in row] for row in splitted]

Demo:

>>> csvdata = '''\
... year,month,date_of_month,day_of_week,births
... 1994,1,1,6,8096
... 1994,1,2,7,7772
... '''
>>> string_list = csvdata.splitlines()  # better way to split lines
>>> string_list = string_list[1:]  # you don't have to specify the second value
>>> splitted = [i.split(",") for i in string_list]
>>> splitted
[['1994', '1', '1', '6', '8096'], ['1994', '1', '2', '7', '7772']]
>>> splitted[0]
['1994', '1', '1', '6', '8096']
>>> final_list = [[int(v) for v in row] for row in splitted]
>>> final_list
[[1994, 1, 1, 6, 8096], [1994, 1, 2, 7, 7772]]
>>> final_list[0]
[1994, 1, 1, 6, 8096]
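
If "a single list" in the question means one flat list of numbers rather than one list per row, a comprehension with two for clauses should do that. A minimal sketch, assuming the same splitted value as in the demo above (the names rows and flat are only for illustration):

```python
# Same intermediate value as in the demo above.
splitted = [['1994', '1', '1', '6', '8096'], ['1994', '1', '2', '7', '7772']]

# Nested comprehension: one list of ints per row.
rows = [[int(v) for v in row] for row in splitted]

# Two for clauses in a single comprehension: every value in one flat list.
# The clauses read left to right, like nested for loops.
flat = [int(v) for row in splitted for v in row]

print(rows)
print(flat)
```

The nested form keeps the row structure, which is usually what you want for CSV data; the flat form loses track of which value belonged to which row.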

Note that you could just loop directly over the file to get separate lines too:

string_list = [line.strip().split(',') for line in openfileobject]

and skipping an entry in such an object (the header line, for example) could be done with next(iterableobject, None).
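
Putting those two points together might look like the sketch below. It uses io.StringIO as a stand-in for the open file so that it runs without the data file; that substitution, and the sample data, are assumptions made for illustration only.

```python
import io

# io.StringIO stands in for an open file object here, purely for illustration.
openfileobject = io.StringIO(
    "year,month,date_of_month,day_of_week,births\n"
    "1994,1,1,6,8096\n"
    "1994,1,2,7,7772\n"
)

# Consume the header line; the None default avoids StopIteration on an empty file.
next(openfileobject, None)

# Iterate over the remaining lines, split each one, and convert every field.
final_list = [[int(v) for v in line.strip().split(',')] for line in openfileobject]
print(final_list)
```

With a real file you would replace the StringIO object with the object returned by open(filename), ideally inside a with statement so the file is closed afterwards.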

Rather than read the whole file into memory and manually split the data, you could just use the csv module:

import csv

def read_csv(filename):
    with open(filename, 'r', newline='') as csvfile:
        reader = csv.reader(csvfile)
        next(reader, None)  # skip first row
        for row in reader:
            yield [int(c) for c in row]

The above is a generator function, producing one row at a time as you loop over it:

for row in read_csv("US_births_1994-2003_CDC_NCHS.csv"):
    print(row)

You can still get a list with all rows with list(read_csv("US_births_1994-2003_CDC_NCHS.csv")).
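
To try the generator approach without the data file, one option is a variant that accepts an already-open file object. The sketch below is an illustration, not part of the original answer: read_csv_rows is a hypothetical name, io.StringIO stands in for the file, and the blank-line check is an added assumption (csv.reader yields an empty list for a blank line).

```python
import csv
import io

def read_csv_rows(csvfile):
    """Hypothetical variant of read_csv() that takes an already-open file."""
    reader = csv.reader(csvfile)
    next(reader, None)  # skip the header row
    for row in reader:
        if row:  # csv.reader yields [] for blank lines; skip those
            yield [int(c) for c in row]

# Sample data with a blank line in the middle, for illustration only.
sample = io.StringIO(
    "year,month,date_of_month,day_of_week,births\n"
    "1994,1,1,6,8096\n"
    "\n"
    "1994,1,2,7,7772\n"
)

# list() drains the generator and collects every converted row.
all_rows = list(read_csv_rows(sample))
print(all_rows)
```

Note that int() will still raise ValueError for any field that is not a valid integer, so this only suits files where every column after the header is numeric.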
