
In Python, how do you convert a list of strings within a dictionary to a list of integers?

I have a function (main) that takes data from a CSV file and converts it into a dictionary. The keys are the entries in the first column, and each value is a list of all the other entries in that row. For example, for the row 2020-12-20,0,0,0,0,206 the key is 2020-12-20 and the remaining entries are strings in a list: ['0', '0', '0', '0', '206']:

def main():
    import csv
    # doses_data_mar_20.csv
    dict_doses_by_date = {}

    filename_input = str(input("Please enter a .csv file to read: "))
    with open(filename_input, "r") as inp, open('doses.csv', 'w') as out:
        header = inp.readline()
        reader = csv.reader(inp, delimiter=",", quotechar='"')
        for line in reader:
            dict_doses_by_date[line[0]] = line[1:6]
    return dict_doses_by_date

def count_doses_by_date(dict_dose_by_date):

Now I need to define a new function, count_doses_by_date, that takes each list of strings as input, converts it into a list of integers, and adds the integers to get a total for each date. It should then write the results to another CSV file.

I tried doing this:

def count_doses_by_date(dict_dose_by_date):
    import csv
    # doses_data_mar_20.csv
    dict_doses_by_date = {}
    filename_input = str(input("Please enter a .csv file to read: "))
    with open(filename_input, "r") as inp, open('doses.csv', 'w') as out:
        header = inp.readline()
        reader = csv.reader(inp, delimiter=",", quotechar='"')
        for line in reader:
            dict_doses_by_date[line[0]] = line[1:6]
        for k in dict_doses_by_date:
            list_integers = [int(x) for x in dict_doses_by_date[k]]
            sum_integers = sum(list_integers)
            print_value = "{}, {} \n".format(k, sum_integers)
    return out.write(print_value)

But I'm getting errors because some of the lists contain strings like '1,800', whose commas prevent them from being converted to integers. I don't know how to get rid of these thousands separators without disrupting the commas that separate the CSV values.

I'm stuck. How would this be done?

If your string is something like "1234", you can call:

int(number, base=base)

and you will obtain an integer. For example:

print(int("1234"))

will print the number 1234.

Please check the rest of the documentation here: https://docs.python.org/3/library/functions.html#int

Then, to actually achieve what you want, you can proceed as suggested in the other comments or in any way you like: loop through the list of elements, keep adding them to a running total (a += int("1234")), then return the total and write it to the file.

Of course, if your strings contain unexpected symbols such as thousands separators, you need to normalize them before calling int(), by removing the character with replace() or by other means.
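The normalize-then-sum approach described above can be sketched as follows. This is only an illustration: the helper names to_int and total are made up for this example, and it assumes the comma is the only non-digit character that needs removing.

```python
def to_int(text):
    """Convert a string such as '1,800' to the integer 1800."""
    # Remove thousands separators before handing the string to int().
    return int(text.replace(",", ""))


def total(values):
    """Add up a list of numeric strings with a running total."""
    result = 0
    for value in values:
        result += to_int(value)
    return result


# Hypothetical row: the values that follow the date in one CSV line.
row = ["0", "0", "0", "1,800", "206"]
print(total(row))  # 2006
```

If the data could contain other symbols (spaces, currency signs), the replace() call would need to be extended accordingly.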

Would you try this? Use str.isdigit() to determine whether a string is a number or not:

line = ['2020-12-20', '0', '0', '0', '0', '206']
filtered_line = [int(e) if e.isdigit() else '' for e in line[1:6]]
print([x for x in filtered_line if x != ''])

Output

[0, 0, 0, 0, 206]
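One caveat with the isdigit() filter is worth a quick check: a value such as '1,800' is not made up of digits only, so it is skipped silently rather than converted. The row below is a made-up example, not taken from the asker's file.

```python
# Hypothetical row that includes a value with a thousands separator.
row = ["0", "0", "0", "1,800", "206"]

# '1,800' contains a comma, so str.isdigit() returns False and the value is dropped.
kept = [int(e) for e in row if e.isdigit()]

print(kept)       # [0, 0, 0, 206]
print(sum(kept))  # 206, not the intended 2006
```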

Edit: I missed the part about the thousands separator. In your use case, the code could be this:

dict_doses_by_date = {}
reader = [['2020-12-20', '0', '0', '0', '10', '206'], ['2020-12-21', '0', '0', '0', '20', '316'], ['2020-12-22', '0', '0', '0', '30', '1,426']]

for line in reader:
    list_integers = [int(x.replace(',', '')) for x in line[1:6]]
    dict_doses_by_date[line[0]] = list_integers
    print_value = "{}, {} \n".format(line[0], sum(list_integers))
    print(print_value)

print(dict_doses_by_date)

Output

2020-12-20, 216

2020-12-21, 336

2020-12-22, 1456

{'2020-12-20': [0, 0, 0, 10, 206], '2020-12-21': [0, 0, 0, 20, 316], '2020-12-22': [0, 0, 0, 30, 1426]}
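Putting the pieces together, one possible shape for count_doses_by_date is sketched below. It rests on a few assumptions that are not confirmed by the question: values with thousands separators are quoted in the file (for example "1,426"), so csv.reader keeps them as a single field; the header names are invented; and io.StringIO stands in for the real input and output files so the sketch runs on its own.

```python
import csv
import io


def count_doses_by_date(dict_doses_by_date):
    """Return {date: total} given {date: list of numeric strings}."""
    totals = {}
    for date, values in dict_doses_by_date.items():
        # Strip thousands separators, convert to int, and add up the row.
        totals[date] = sum(int(v.replace(",", "")) for v in values)
    return totals


# Hypothetical sample data standing in for the input CSV file.
sample = io.StringIO(
    'date,a,b,c,d,e\n'
    '2020-12-20,0,0,0,0,206\n'
    '2020-12-22,0,0,0,30,"1,426"\n'
)
reader = csv.reader(sample)
next(reader)  # skip the header row
doses = {line[0]: line[1:6] for line in reader}

totals = count_doses_by_date(doses)

# Write one "date,total" row per date; StringIO stands in for the output file.
out = io.StringIO()
writer = csv.writer(out)
for date, day_total in totals.items():
    writer.writerow([date, day_total])

print(out.getvalue())
```

With a real file, the StringIO objects would be replaced by open(filename, newline="") for reading and open("doses.csv", "w", newline="") for writing.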

You should use the pandas library. You can use pd.read_csv to get a DataFrame directly from the file, and you can set the first column as the index column. You can use df.applymap(lambda x: int(x.replace(',', ''))) to get rid of the commas and convert to int, then call df.sum(axis=1) to get a row-by-row sum. This assumes the values were read as strings (for example with dtype=str); in pandas 2.1 and later, applymap is deprecated in favor of DataFrame.map.
