How can I get a list of lists out of a folder of files in Python?

I have a folder of CSV files. I want to get a list in which each element is a list of all the lines of one file (so, a list of lists).

I have tried the following:

from os import listdir
folder = listdir("folder") 
for files in folder:
    individualFiles = open(f"folder/{files}", "r")
    alist = individualFiles.readlines() 

But this only returns a huge list with all the data instead of a list of lists.

What could I do?

I am a beginner and I am trying to learn the logic of programming, so I would appreciate it if the solution relied on plain logic rather than fancy functions.

The problem in your code is just that you assign to alist:

alist = individualFiles.readlines()

instead of appending to it:

alist.append(individualFiles.readlines())

You will also have to create the list before the loop: alist = list()
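To see the difference without touching any files, here is a minimal sketch in which two hard-coded lists stand in for what readlines() would return. The names fake_files, assigned and appended are illustrative only and do not appear in the original code.

```python
# Two hard-coded lists stand in for what readlines() would return per file.
fake_files = [["a,1\n", "b,2\n"], ["c,3\n"]]

# Assigning: each pass through the loop replaces the previous value.
assigned = None
for lines in fake_files:
    assigned = lines

# Appending: each pass adds one more inner list to the outer list.
appended = []
for lines in fake_files:
    appended.append(lines)

print(assigned)  # ['c,3\n']
print(appended)  # [['a,1\n', 'b,2\n'], ['c,3\n']]
```

After the first loop only the lines of the last file are left, while the second loop keeps one inner list per file.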

This is your code, modified a little, to explain the logic:

from os import listdir

# Name of the folder containing the files
folder_path = "textfiles"

# Get a list of filenames
filenames = listdir(folder_path)

# List to store the content of the files
files_content = list()

# For each file
for filename in filenames:
    # Create the filepath
    file_path = f"{folder_path}/{filename}"

    # Open the file (using "with" for file opening will autoclose the file at the end. It's a good practice)
    with open(file_path, "r") as f:
        # Get the file content
        file_content = f.readlines()
        # Append the content to the list
        files_content.append(file_content)

print(files_content)
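Two things may be worth handling on top of this, sketched here under assumptions: listdir returns the names in arbitrary order, and it also lists subfolders, which open cannot read. The helper name read_folder and the throwaway demo folder are hypothetical; the only additions to the logic above are sorted, os.path.join and os.path.isfile.

```python
import os
import tempfile


def read_folder(folder_path):
    """Return one list of lines per regular file, in sorted filename order."""
    files_content = []
    for filename in sorted(os.listdir(folder_path)):
        file_path = os.path.join(folder_path, filename)
        # Skip subfolders: open() would raise an error on a directory
        if not os.path.isfile(file_path):
            continue
        with open(file_path, "r") as f:
            files_content.append(f.readlines())
    return files_content


# Demo on a throwaway folder (made-up file names and content)
with tempfile.TemporaryDirectory() as demo_folder:
    for name, text in [("b.csv", "3,4\n"), ("a.csv", "1,2\n")]:
        with open(os.path.join(demo_folder, name), "w") as f:
            f.write(text)
    os.mkdir(os.path.join(demo_folder, "subfolder"))
    demo_result = read_folder(demo_folder)

print(demo_result)  # [['1,2\n'], ['3,4\n']]
```

With sorted filenames, files_content[0] always refers to the same file from one run to the next.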

One very simple approach is shown in the example below.

Steps / logic:

  • Collect the files. I'm using the glob library to do this because glob returns the full file path, so there's no need to build it yourself.
  • Iterate through each file in the list.
  • Read the file using the with statement. This is important to ensure the file handle is closed automatically.
  • Append a list of the file's content to the container list.

Example code:

from glob import glob

files = glob('/home/user/Desktop/list_of_files/*.csv')
container = []

for file in files:
    with open(file, 'r') as f:
        container.append(list(f))

Files:

['/home/user/Desktop/list_of_files/5.csv',
 '/home/user/Desktop/list_of_files/1.csv',
 '/home/user/Desktop/list_of_files/3.csv',
 '/home/user/Desktop/list_of_files/2.csv',
 '/home/user/Desktop/list_of_files/4.csv']

Output:

[['File content,5\n'],
 ['File content,1\n'],
 ['File content,3\n'],
 ['File content,2\n'],
 ['File content,4\n']]
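As the Files listing shows, glob does not return the paths in any particular order, and every line keeps its trailing \n. If that matters, one possible variation (a sketch, not part of the original answer) is to sort the paths and strip the line endings. The temporary folder and its three files are made up for the demo.

```python
import os
import tempfile
from glob import glob

with tempfile.TemporaryDirectory() as demo_folder:
    # Create three small made-up CSV files
    for number in (2, 1, 3):
        with open(os.path.join(demo_folder, f"{number}.csv"), "w") as f:
            f.write(f"File content,{number}\n")

    # sorted() puts the paths in alphabetical order
    files = sorted(glob(os.path.join(demo_folder, "*.csv")))

    container = []
    for file in files:
        with open(file, "r") as f:
            lines = []
            for line in f:
                # rstrip("\n") drops the newline at the end of the line
                lines.append(line.rstrip("\n"))
            container.append(lines)

print(container)
# [['File content,1'], ['File content,2'], ['File content,3']]
```

Note that the sort is alphabetical, so a file named 10.csv would come before 2.csv.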

You can use the csv module. This returns, for each CSV file in the folder, a list of its rows (each row being its own list), skipping the header:

# folder/file1.csv

A,B,C,D
1,foo,bar,null
42,foo,baz,null
7,bar,for,x

from os import listdir
import csv

folder = 'data_folder'
files = listdir(folder)
files = [folder + '/' + file for file in files]

files_list = []

for file in files:
    # newline='' is what the csv module documentation recommends when opening files
    with open(file, newline='') as csv_file:
        csv_reader = csv.reader(csv_file, delimiter=',')
        
        # skip header
        next(csv_reader, None)
        
        row_list = []
        for row in csv_reader:
            row_list.append(row)
        files_list.append(row_list)

Output:

[[['1', 'foo', 'bar', 'null'], # file1.csv
  ['42', 'foo', 'baz', 'null'],
  ['7', 'bar', 'for', 'x']],
 [['2', 'foo', 'bar', 'null'], # file2.csv
  ['42', 'foo', 'baz', 'null'],
  ['7', 'bar', 'for', 'x']],
 [['3', 'foo', 'bar', 'null'], # file3.csv
  ['42', 'foo', 'baz', 'null'],
  ['7', 'bar', 'for', 'x']]]
...

To access the rows in file1.csv:

file1 = files_list[0]
print(file1)

Output:

[['1', 'foo', 'bar', 'null'],
 ['42', 'foo', 'baz', 'null'],
 ['7', 'bar', 'for', 'x']]
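Going one level deeper works the same way: the first index picks the file, the second the row, the third the column. Here is a small sketch using a hard-coded copy of the file1.csv rows shown above (the names second_row and cell are illustrative):

```python
# Hard-coded stand-in for files_list, holding only the rows of file1.csv
files_list = [
    [["1", "foo", "bar", "null"],
     ["42", "foo", "baz", "null"],
     ["7", "bar", "for", "x"]],
]

second_row = files_list[0][1]  # file 0, row 1
cell = files_list[0][1][2]     # file 0, row 1, column 2

print(second_row)  # ['42', 'foo', 'baz', 'null']
print(cell)        # baz

# csv.reader yields every value as a string; convert when a number is needed
print(int(files_list[0][0][0]) + 1)  # 2
```

Indexes start at 0, so files_list[0] is the first file and files_list[0][1] is its second data row (the header was skipped).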
