
Reading and splitting a .csv file that contains strings with commas in them

I have a .csv file that looks like this:

1,2,"a,b",3
4,"c,d",5,6

I am reading it and storing it in an array like this:

with open(filename, 'r') as f:
    data = f.readlines()
data = [line.split(',') for line in data]

This results in an array like this:

[['1','2','"a','b"','3'],['4','"c','d"','5','6']]

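However, I would like to keep items in double quotes, such as "a,b", as a single element of the data array (which is how they appear when opened in Excel), like this:

[[1,2,'a,b',3],[4,'c,d',5,6]]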

Is there a simple way to achieve this in Python?

Edit: preferably without using the csv module, if possible?

You should use the csv module:

import csv

with open('test.csv') as f:
    reader = csv.reader(f)
    
    for row in reader:
        print(row)

Output:

['1', '2', 'a,b', '3']
['4', 'c,d', '5', '6']

Or, if you don't want to read the rows lazily and want all of them in a single list, as in your question, you can simply do the following:

with open('test.csv') as f:
    reader = csv.reader(f)
    data = list(reader)

print(data)        
# [['1', '2', 'a,b', '3'], ['4', 'c,d', '5', '6']]   
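
Note that csv.reader returns every field as a string, while the question asks for integers where possible. A minimal sketch of that conversion follows. The helper names to_int_if_possible and read_rows are hypothetical, and the sample data is read from an in-memory string instead of test.csv so that the snippet runs on its own.

```python
import csv
import io


def to_int_if_possible(field):
    """Return int(field) when the field is a whole number, else the field unchanged."""
    try:
        return int(field)
    except ValueError:
        return field


def read_rows(lines):
    """Parse an iterable of CSV lines and convert numeric fields to int."""
    return [[to_int_if_possible(field) for field in row]
            for row in csv.reader(lines)]


# In-memory stand-in for the file from the question (an assumption for this demo).
sample = io.StringIO('1,2,"a,b",3\n4,"c,d",5,6\n')
data = read_rows(sample)
print(data)
# [[1, 2, 'a,b', 3], [4, 'c,d', 5, 6]]
```

With a real file, the same helper can be called as read_rows(f) inside a with open('test.csv', newline='') as f: block.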

Use the csv module:

import csv

with open('test.csv') as file:
    reader = csv.reader(file)
    # Build the list while the file is still open; the reader is lazy.
    data = [row for row in reader]

If you don't want to use the csv module, this function will return the output you want:

def function(file_name):
    with open(file_name, 'r') as file:
        file_read = file.readlines()
        raw_data = [line.split(',') for line in file_read]

        file_data = list()
        place_0 = 0
        place_1 = 0
        ext_item = str()
        added = list()
        pre_final_list = list()
        pre_pure_list = list()
        pure_data = str()
        final_list = list()

        for List in raw_data:
            for k, v in enumerate(List):
                List[k] = v.rstrip()
        
        for line in raw_data:
            if line == ['']:
                continue
            file_data.append(line)

        for line in file_data:
            for key, value in enumerate(line):
                if '"' in value[0] and '"' in value[-1]:
                    continue
                if '"' in value[0]:
                    place_0 = key
                if '"' in value[-1]:
                    place_1 = key
                if place_1 != 0:
                    for ind in range(place_0, place_1+1):
                        added.append(line[ind])
                    for e_item in added:
                        if e_item == added[-1]:
                            ext_item += e_item
                        else:
                            ext_item += e_item + ','
                    line[place_0] = ext_item
                    for r_item_index in range(place_0+1, place_1+1):
                        line[r_item_index] = None
                    place_0 = 0
                    place_1 = 0
                    ext_item = str()
                    added = list()

        for line in file_data:
            for value in line:
                try:
                    value = int(value)
                except (ValueError, TypeError):
                    # Not a whole number (or a None placeholder): keep as-is.
                    pass
                if value == '\n':
                    continue
                if value is not None:
                    pre_pure_list.append(value)
            pre_final_list.append(pre_pure_list)
            pre_pure_list = list()
        

        for List in pre_final_list:
            for key, item in enumerate(List):
                if type(item) is int or '"' not in item:
                    continue
                for string in item:
                    if string == '"':
                        continue
                    pure_data += string
                List[key] = pure_data
                pure_data = str()
            final_list.append(List)

        return final_list
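
The function above splits on every comma first and then glues the quoted pieces back together. A shorter hand-rolled alternative is to walk each line character by character and only split on commas that are outside quotes. The sketch below is one way to do that, not a replacement for the csv module: the names split_csv_line and parse_lines are hypothetical, and it assumes quoted fields never contain escaped quotes ("") or line breaks.

```python
def split_csv_line(line, delimiter=',', quote='"'):
    """Split one line on the delimiter, ignoring delimiters inside quotes."""
    fields = []
    current = []
    in_quotes = False
    for char in line.rstrip('\r\n'):
        if char == quote:
            # Toggle the quoted state; the quote character itself is dropped.
            in_quotes = not in_quotes
        elif char == delimiter and not in_quotes:
            fields.append(''.join(current))
            current = []
        else:
            current.append(char)
    fields.append(''.join(current))
    return fields


def parse_lines(lines):
    """Split every non-blank line and convert whole-number fields to int."""
    rows = []
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        row = []
        for field in split_csv_line(line):
            try:
                row.append(int(field))
            except ValueError:
                row.append(field)
        rows.append(row)
    return rows


data = parse_lines(['1,2,"a,b",3\n', '4,"c,d",5,6\n'])
print(data)
# [[1, 2, 'a,b', 3], [4, 'c,d', 5, 6]]
```

To use it on a file, pass the open file object (or the result of f.readlines()) to parse_lines. Unlike the longer function, this version also tolerates empty fields such as 1,,2.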

