
Creating arrays based on folder name

I have data that has been collected and organized in multiple folders.
Each folder contains similar data collected under different conditions. Within a folder there can be multiple similar runs, e.g. data collected under the same conditions at different times, and the filenames contain a number that increments. For example, I can have a folder named idle containing files named idle_1.csv, idle_2.csv, idle_3.csv, etc. Then I can have another folder named pos1 containing, similarly, pos1_1.csv, pos1_2.csv, etc.

In order to keep track of which folder and which file the data in the arrays came from, I want to use the folder name ("idle", "pos1", etc.) as the array name. Then each file within that folder (or rather, the data resulting from processing each file in that folder) becomes another row in that array.

For example, if the name of the folder is stored in the variable arrname and the file index is stored in the variable arrndx, I want to write the value into that array:

arrname[arrndx]=value

This doesn't work, because arrname holds a string rather than an array, and gives the following error:

TypeError: 'str' object does not support item assignment

Then I thought about using a dictionary, but I think I would still run into the same issue. If I use one dictionary per folder, each dictionary's name has to be derived from the folder name, which is the same problem. If I instead try to use the folder name as a key in a dictionary, the entries get overwritten with data from every file in the same folder, since the name is the same:

    arrays['name']=arrname
    arrays['index']=int(arrndx)
    arrays['val']=value

    arrays['name': arrname, 'index':arrndx, 'val':value]

I can't use 'index' as the key either, since it is not unique across the different folders.

So, I'm stumped. I guess I could predefine all the arrays and then write to the correct one based on the variable name, but that could result in a large case statement (is there such a thing in Python?) or a big if statement. Maybe there is no avoiding this in my case, but I'm thinking there has to be a more elegant way...

EDIT

I was able to work around my issue using globals():

globals()[arrname].insert(int(arrndx),value)

However, I believe this is not the "correct" solution, although I don't understand why doing this is frowned upon.

Use a nested dictionary with the folder names at the first level and the file indices (or names) at the second.

from pathlib import Path

data = {}
base_dir = 'base'
for folder in Path(base_dir).resolve().glob('*'):
    if not folder.is_dir():
        continue
    data[folder.name] = {}
    for csv in folder.glob('*.csv'):
        file_id = csv.stem.split('_')[1]
        data[folder.name][file_id] = csv

The above example just saves the file path in the structure, but you could alternatively load the file's data (e.g. using Pandas) and save that to the dictionary. It all depends on what you want to do with it afterwards.
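As a minimal sketch of that second option, the function below stores each file's parsed rows instead of its path. It uses the standard-library csv module rather than Pandas only to stay dependency-free. The name load_runs and the conversion of the file index to int are assumptions made for illustration, and the code assumes every file is named <folder>_<number>.csv.

```python
import csv
from pathlib import Path


def load_runs(base_dir):
    """Return {folder_name: {file_index: rows}} for every CSV under base_dir.

    load_runs is a hypothetical helper name, not part of any library.
    """
    data = {}
    for folder in Path(base_dir).iterdir():
        if not folder.is_dir():
            continue
        runs = {}
        for csv_path in folder.glob('*.csv'):
            # 'idle_3' -> 3; rsplit tolerates underscores in the folder name.
            # Assumes the stem always ends in '_<number>'.
            file_id = int(csv_path.stem.rsplit('_', 1)[1])
            with csv_path.open(newline='') as fh:
                runs[file_id] = list(csv.reader(fh))
        data[folder.name] = runs
    return data
```

With this, data = load_runs('base') followed by data['idle'][2] would give the rows of idle_2.csv, so data[arrname][arrndx] plays the role of the arrname[arrndx] lookup from the question without creating variables by name.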

What about:

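foldername = 'idle'  # say your folder name is "idle", for example
number_of_files_inside_folder = 3  # example value; use the real file count
files = {}
# Map the folder name to its file names: idle_1.csv, idle_2.csv, ...
files[foldername] = [foldername + "_" + str(i) + ".csv" for i in range(1, number_of_files_inside_folder + 1)]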

Does that solve your problem?
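The snippet above assumes the number of files in the folder is already known. A variation, sketched here under the assumption that the files follow the <folder>_<number>.csv naming from the question, builds the same folder-name-to-file-list dictionary from what is actually on disk. The function name list_runs is made up for this example.

```python
from pathlib import Path


def list_runs(base_dir):
    """Return {folder_name: [csv file names ordered by numeric suffix]}.

    list_runs is a hypothetical helper name, not part of any library.
    """
    files = {}
    for folder in Path(base_dir).iterdir():
        if folder.is_dir():
            names = [p.name for p in folder.glob('*.csv')]
            # Sort numerically so idle_10.csv follows idle_9.csv
            # instead of landing right after idle_1.csv.
            names.sort(key=lambda n: int(Path(n).stem.rsplit('_', 1)[1]))
            files[folder.name] = names
    return files
```

Sorting on the integer suffix rather than on the raw name keeps the list order aligned with the run order once a folder holds ten or more files.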
