
How to transform a csv file into a multi-dimensional list using Python?

I start with a 4D list, such as:

tokens = [[[["a"], ["b"], ["c"]], [["d"]]], [[["e"], ["f"], ["g"]],[["h"], ["i"], ["j"], ["k"], ["l"]]]]

I used the following code to convert it to a CSV file:

import csv

def export_to_csv(tokens):
    # header row; "word" has to be a string literal, not a bare name
    csv_list = [["A", "B", "C", "word"]]
    for h_index, h in enumerate(tokens):
        for i_index, i in enumerate(h):
            for j_index, j in enumerate(i):
                csv_list.append([h_index, i_index, j_index, j])

    # newline='' is recommended by the csv docs for files passed to csv.writer
    with open('TEST.csv', 'w', newline='') as f:
        # using csv.writer method from CSV package
        write = csv.writer(f)
        write.writerows(csv_list)

But now I want to do the reverse: take a CSV file in this format and convert it back into the list format shown above.

Assuming you want your CSV file to look like this (the posted code has a few typos):

A,B,C,word                                                                          
0,0,0,a                                                                             
0,0,1,b                                                                             
0,0,2,c
...
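
For reference, here is a minimal sketch of an exporter that would produce the layout above. It assumes every innermost list holds exactly one word, so `j[0]` unwraps `["a"]` to `a`. The helper name `to_rows` is made up for illustration, and the output goes to an in-memory buffer rather than to TEST.csv.

```python
import csv
import io

tokens = [[[["a"], ["b"], ["c"]], [["d"]]],
          [[["e"], ["f"], ["g"]], [["h"], ["i"], ["j"], ["k"], ["l"]]]]

def to_rows(tokens):
    """Hypothetical helper: one row per word, three indices then the bare word."""
    rows = [["A", "B", "C", "word"]]
    for h_index, h in enumerate(tokens):
        for i_index, i in enumerate(h):
            for j_index, j in enumerate(i):
                # Assumption: j is a one-element list such as ["a"]
                rows.append([h_index, i_index, j_index, j[0]])
    return rows

# Write to an in-memory buffer so the sketch has no side effects on disk
buffer = io.StringIO()
csv.writer(buffer, lineterminator="\n").writerows(to_rows(tokens))
csv_text = buffer.getvalue()
print(csv_text)
```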

Here is one solution:

import csv

def import_from_csv(filename):
    retval = []
    with open(filename, newline='') as fh:
        reader = csv.reader(fh)
        # discard header row
        next(reader)
        # process data rows
        for (x, y, z, word) in reader:
            x = int(x)
            y = int(y)
            z = int(z)
            # pad each level with freshly created lists, so that no two
            # slots end up sharing the same list object
            retval.extend([[]] for _ in range(x + 1 - len(retval)))
            retval[x].extend([] for _ in range(y + 1 - len(retval[x])))
            # placeholder values, overwritten just below
            retval[x][y].extend([None] * (z + 1 - len(retval[x][y])))

            retval[x][y][z] = [word]

    return retval
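
A side note on padding nested lists: it matters whether each new slot is a distinct list object. The short sketch below (variable names are illustrative only) shows that multiplying a list of lists repeats a reference to one inner list, while a comprehension creates a fresh list per slot. This only becomes visible when more than one slot is added at once, for example if the indices in the file skip a value.

```python
# List multiplication repeats ONE inner list object three times
shared = [[]] * 3
shared[0].append("a")

# A comprehension builds a separate inner list for every slot
fresh = [[] for _ in range(3)]
fresh[0].append("a")

print(shared)  # [['a'], ['a'], ['a']]
print(fresh)   # [['a'], [], []]
```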

Another version, which parses the word column with ast.literal_eval:

import ast
import csv

def import_from_csv(file):
    data = []
    # Read the CSV file
    with open(file, newline='') as fp:
        reader = csv.reader(fp)
        # Skip the first line, which contains the headers
        next(reader)

        for line in reader:
            # Read the first 3 elements of the line
            a, b, c = [int(i) for i in line[:3]]
            # When we read it back, everything comes in as strings. Use
            # `literal_eval` to convert it to a Python list
            value = ast.literal_eval(line[3])

            # Extend the list to accommodate the new element
            if len(data) < a + 1:
                data.append([[[]]])
            if len(data[a]) < b + 1:
                data[a].append([[]])
            if len(data[a][b]) < c + 1:
                data[a][b].append([])

            data[a][b][c] = value
    return data

# Test
assert import_from_csv("TEST.csv") == tokens
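
To see why `ast.literal_eval` is needed here, the following sketch writes a single row whose last cell is a list and reads it back in memory. `csv.writer` stores non-string values via `str()`, so the cell comes back as the text `['a']` rather than as a list. The variable names are illustrative only.

```python
import ast
import csv
import io

buffer = io.StringIO()
# The list ["a"] is written as its string representation
csv.writer(buffer).writerow([0, 0, 0, ["a"]])

buffer.seek(0)
row = next(csv.reader(buffer))

raw = row[3]                     # still a string: "['a']"
parsed = ast.literal_eval(raw)   # back to a real list: ['a']
print(repr(raw), parsed)
```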

First, I would write this structure to CSV in a format that does not depend on the number of dimensions:

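import csv

def deep_iter(seq):
    # yield one tuple per leaf value: the index at every level, then the value
    for i, val in enumerate(seq):
        if isinstance(val, list):
            for others in deep_iter(val):
                yield (i, *others)
        else:
            yield (i, val)

# newline='' is recommended by the csv docs for files passed to csv.writer
with open('TEST.csv', 'w', newline='') as f:
    csv.writer(f).writerows(deep_iter(tokens))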

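Next, we can use the lexicographic order of the indices to recreate the structure. All we have to do is descend into the output list step by step, following a word's index. We stop at the second-to-last index to reach the final list, because the last index only gives the word's position within that list, and thanks to the natural ordering of the rows it does not matter: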

with open('TEST.csv', 'r') as f:
    rows = [*csv.reader(f)]

res = []
for r in rows:
    index = r[:-2]   # skip the last index and word
    e = res
    while index:
        i = int(index.pop(0))    # get next part of a current index
        if i < len(e):
            e = e[i]
        else:
            e.append([])   # add new record at this level
            e = e[-1]
    e.append(r[-1])   # append the word to the corresponding list
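
The write and read steps above can be checked together as an in-memory round trip. This is only a sketch: `rebuild` is a hypothetical wrapper name for the loop above, the CSV goes to a `StringIO` buffer instead of TEST.csv, and it assumes the rows arrive in the order they were written and that the structure contains no empty lists (those produce no rows and would be lost).

```python
import csv
import io

def deep_iter(seq):
    """Yield (index, ..., index, value) for every leaf of a nested list."""
    for i, val in enumerate(seq):
        if isinstance(val, list):
            for rest in deep_iter(val):
                yield (i, *rest)
        else:
            yield (i, val)

def rebuild(rows):
    """Hypothetical wrapper: rebuild the nested list from ordered index rows."""
    res = []
    for r in rows:
        e = res
        # Skip the last index and the word; descend through the other levels
        for i in map(int, r[:-2]):
            if i < len(e):
                e = e[i]
            else:
                e.append([])   # first time this index is seen at this level
                e = e[-1]
        e.append(r[-1])        # the word goes into the innermost list
    return res

tokens = [[[["a"], ["b"], ["c"]], [["d"]]],
          [[["e"], ["f"], ["g"]], [["h"], ["i"], ["j"], ["k"], ["l"]]]]

buffer = io.StringIO()
csv.writer(buffer).writerows(deep_iter(tokens))
buffer.seek(0)
restored = rebuild(csv.reader(buffer))
print(restored == tokens)
```

Because the number of index columns is read from each row, the same two functions should work for other nesting depths as long as every word sits at the same depth convention (indices first, word last).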
