Python - Create multiple folders from CSV file

I want to create multiple folders/directories (if they don't already exist) using information from a CSV file.

The CSV contains the following information:

    Column0 Column1 Column2 Column3
    51  TestName1   0   https://siteAdress//completed/file.txt
    53  TestName2   0   https://siteAdress//completed/file.txt
    67  TestName1   2   https://siteAdress//uploads/file.txt
    68  TestName1   2   https://siteAdress//uploads/file.txt

I want to iterate over Column3. If it contains 'uploads', the script should create a folder named after the corresponding job name in Column1, then an 'input' folder inside it, and place the respective file.txt there. If Column3 contains 'completed', it should create an 'output' folder (inside the same job name folder, next to the input folder) and place file.txt there. This should be done for every job listed in Column1.

Something like this:

TestName1/input/file.txt
TestName1/output/file.txt
TestName1/output2/file.txt

TestName2/input/file.txt
TestName2/output/file.txt

Note: Most of the data will contain multiple output folders for each job name. In that case, it should create as many output folders as are listed in the CSV file.

So far, I have done this:

import csv, os
#reads from csv file
with open('limitedresult.csv') as csvfile:
    readCSV = csv.reader(csvfile, delimiter = ',')
    for row in readCSV:
        print(row)

Your help would be highly appreciated. Please let me know if the question is still confusing, and I will try to explain in more detail.

The following approach should help get you started:

  1. Open the CSV file and skip the header row.
  2. Read a row, splitting it into named columns.
  3. If file_url contains uploads, use a sub folder of input; if it contains completed, use output.
  4. Build a folder path from output_root, the job name and the sub folder name.
  5. Use a Python Counter to keep track of how many times each sub folder is used.
  6. Add the current sub folder count to the folder name and create any necessary output folders.
  7. Use the Python requests library to download the text file from the website.
  8. Extract the filename from the URL and use it to write the file contents.
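
Steps 1 to 3 can be tried on their own, without touching the disk or the network. The sketch below is only an illustration: it reads the sample rows from the question out of a string, assumes the file is comma-delimited, and uses two hypothetical helpers, classify and read_jobs, that are not part of the original script.

```python
import csv
import io

# Sample rows copied from the question; the comma delimiter is an assumption.
SAMPLE_CSV = """Column0,Column1,Column2,Column3
51,TestName1,0,https://siteAdress//completed/file.txt
53,TestName2,0,https://siteAdress//completed/file.txt
67,TestName1,2,https://siteAdress//uploads/file.txt
68,TestName1,2,https://siteAdress//uploads/file.txt
"""

def classify(file_url):
    """Return 'input', 'output', or None for a URL (hypothetical helper)."""
    if 'uploads' in file_url:
        return 'input'
    if 'completed' in file_url:
        return 'output'
    return None

def read_jobs(csvfile):
    """Yield (job_name, sub_folder, file_url) for every recognised row."""
    reader = csv.reader(csvfile)
    next(reader)  # skip the header row
    for number, test, col2, file_url in reader:
        sub_folder = classify(file_url)
        if sub_folder:
            yield test, sub_folder, file_url

jobs = list(read_jobs(io.StringIO(SAMPLE_CSV)))
for job in jobs:
    print(job)
```

Keeping the classification in its own function makes it easy to add further URL patterns later without changing the loop.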

The script is as follows:

from collections import Counter
import requests
import csv
import os

output_root = r'/myroot'
output_counter = Counter()

with open('limitedresult.csv', newline='') as csvfile:
    readCSV = csv.reader(csvfile)
    header = next(readCSV)

    for number, test, col2, file_url in readCSV:
        if 'uploads' in file_url:
            sub_folder = 'input'
        elif 'completed' in file_url:
            sub_folder = 'output'
        else:
            sub_folder = None
            print('Invalid URL -', file_url)

        if sub_folder:
            output_folder = os.path.join(output_root, test, sub_folder)
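            output_counter.update([output_folder])
            count = output_counter[output_folder]
            # The first folder keeps the plain name; later ones become output2, output3, ...
            if count > 1:
                output_folder += str(count)
            os.makedirs(output_folder, exist_ok=True)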
            data = requests.get(file_url)
            file_name = os.path.split(file_url)[1]

            with open(os.path.join(output_folder, file_name), 'w') as f_output:
                f_output.write(data.text)
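
To check the folder numbering without downloading anything, the Counter and os.makedirs steps can be run against a temporary directory. This is a rough sketch rather than a drop-in replacement: make_folder is a hypothetical helper, and the rule that the first folder keeps the plain name while later ones get a numeric suffix is an assumption taken from the layout shown in the question (output, output2).

```python
import os
import tempfile
from collections import Counter

def make_folder(root, job_name, sub_folder, counter):
    """Create the next folder for job_name/sub_folder and return its path."""
    base = os.path.join(root, job_name, sub_folder)
    counter.update([base])
    count = counter[base]
    # Assumption: the first folder keeps the plain name, later ones get a suffix.
    folder = base if count == 1 else base + str(count)
    os.makedirs(folder, exist_ok=True)  # no error if the folder already exists
    return folder

counter = Counter()
with tempfile.TemporaryDirectory() as root:
    created = [
        make_folder(root, 'TestName1', 'input', counter),
        make_folder(root, 'TestName1', 'output', counter),
        make_folder(root, 'TestName1', 'output', counter),
    ]
    # Record results while the temporary directory still exists.
    relative = [os.path.relpath(path, root).replace(os.sep, '/') for path in created]
    all_exist = all(os.path.isdir(path) for path in created)

print(relative)
```

The temporary directory is removed when the with block exits, so the sketch leaves nothing behind.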

Note: you may need to install requests; this can usually be done using pip install requests.
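
For step 8, splitting the URL as if it were a file path works for the sample URLs, but it would keep any query string as part of the filename. A possible alternative, assuming the real URLs might carry query strings or fragments, is to parse the URL first. The helper name filename_from_url is hypothetical.

```python
import posixpath
from urllib.parse import urlsplit

def filename_from_url(file_url):
    """Return the last path segment of a URL, ignoring query string and fragment."""
    path = urlsplit(file_url).path
    return posixpath.basename(path)

plain = filename_from_url('https://siteAdress//uploads/file.txt')
with_query = filename_from_url('https://siteAdress//uploads/file.txt?download=1')
print(plain, with_query)
```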
