Reading text files from subfolders and folders and creating a dataframe in pandas for each file text as one observation

Question

I have the following architecture of the text files in the folders and subfolders.

I want to read them all and create a df. I am using this code, but it doesn't work well for me: the text is not what I expect, and the number of files read does not match my count.

l = [pd.read_csv(filename,header=None, encoding='iso-8859-1') for filename in glob.glob("2018_01_01/*.txt")]
main_df = pd.concat(l, axis=1)
main_df = main_df.T
for i in range(2):
    l = [pd.read_csv(filename, header=None, encoding='iso-8859-1',quoting=csv.QUOTE_NONE) for filename in glob.glob(str(foldernames[i+1])+ '/' + '*.txt')]
    df = pd.concat(l, axis=1)
    df = df.T
    main_df = pd.merge(main_df, df)

Answer 1

Assuming those directories contain txt files that all have the same structure:

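import os
import pandas as pd

path = '/path/to/directory/of/directories/'

# Collect the text of each file in a list and build the DataFrame once at the end.
# (DataFrame.append was removed in pandas 2.0, and appending row by row is slow.)
observations = []

for directory in sorted(os.listdir(path)):
    # os.listdir returns bare names, so join them with path before using them
    dir_path = os.path.join(path, directory)
    if os.path.isdir(dir_path):
        for filename in sorted(os.listdir(dir_path)):
            # encoding taken from the question; change it if your files differ
            with open(os.path.join(dir_path, filename), encoding='iso-8859-1') as f:
                observations.append(f.read())

# One row per file, with the whole file text as the observation
df = pd.DataFrame({'observation': observations})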

Once all your files have been read, df is the DataFrame containing the text of each txt file as one row.
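
Since the question mentions that the number of files read did not match the expected count, it can help to record where each observation came from. The following is only a sketch of the same idea using pathlib: load_observations is a hypothetical helper name, the folder and filename columns are additions for illustration, and the iso-8859-1 default is taken from the question's code.

```python
from pathlib import Path

import pandas as pd


def load_observations(root, pattern="*.txt", encoding="iso-8859-1"):
    """Return one row per text file found anywhere under root (hypothetical helper)."""
    records = []
    # rglob walks root and every subfolder; sorting makes the row order repeatable
    for file_path in sorted(Path(root).rglob(pattern)):
        if not file_path.is_file():
            continue  # skip directories whose names happen to match the pattern
        records.append({
            "folder": file_path.parent.name,
            "filename": file_path.name,
            "observation": file_path.read_text(encoding=encoding),
        })
    # Passing columns explicitly keeps the layout stable even if no file matched
    return pd.DataFrame(records, columns=["folder", "filename", "observation"])
```

Calling df = load_observations('/path/to/directory/of/directories/') and then df.groupby('folder').size() gives the number of files read per folder, which can be compared against your own count.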

Answer 2

You can do that using a for loop. But before that, you need to give sequenced names to all the files, such as 'fil_0' within 'fol_0', 'fil_1' within 'fol_1', 'fil_2' within 'fol_2' and so on. That makes it easy to build each path inside the loop:

import pandas as pd

dataframes = []
for var in range(1000):  # adjust 1000 to the number of folders you have
    name = "fol_" + str(var) + "/fil_" + str(var) + ".txt"
    # Option 1: keep every DataFrame if you need to use all the files at once
    dataframes.append(pd.read_csv(name))
    # Option 2: otherwise, work with the files one by one
    # df = pd.read_csv(name)

It will create one DataFrame for each file.
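
If renaming the files is not practical, a wildcard pattern can discover the existing names instead, which also avoids guessing how many folders there are. This is a minimal sketch, not part of the answer above: read_existing_files is a hypothetical helper, and it assumes the txt files sit exactly one folder level below root.

```python
import glob
import os

import pandas as pd


def read_existing_files(root):
    """Read every root/<folder>/<name>.txt into its own DataFrame (hypothetical helper)."""
    pattern = os.path.join(root, "*", "*.txt")
    frames = {}
    # glob only returns paths that exist, so no renaming or fixed range is needed
    for name in sorted(glob.glob(pattern)):
        frames[name] = pd.read_csv(name)
    return frames
```

The result maps each file path to its DataFrame, so len(frames) shows how many files were actually found.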

2 answers

solution1
4 ACCEPTED 2018-07-24 06:54:36

solution2
1 2018-07-24 06:42:36
