
Creating a DataFrame from the contents of multiple files in a folder

We need to write a program that reads every file in a given folder. Each file contains a single-line string. We need to store each file's name and content in a DataFrame and write the result to a CSV file. How can this be done?

You didn't say what type of file you want to open, so I assumed .txt files. You can use os.listdir(path) to get the names of all entries stored at a given path. Then read each text file and append its content and filename to a list. Finally, create a DataFrame from the list and save it as a CSV file.

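import os
import pandas as pd

# set the path to the folder that holds the text files
path = r'path\to\Text'
# create an empty list to collect (text, filename) pairs
list_of_text = []

# loop over the entries in the folder
for file in os.listdir(path):
    # skip anything that is not a .txt file (e.g. a CSV written by an earlier run)
    if not file.endswith('.txt'):
        continue
    # open the file and read its single line, dropping the trailing newline
    with open(os.path.join(path, file)) as f:
        text = f.read().strip()
    # append the text and filename
    list_of_text.append((text, file))

# create a dataframe and save it without the numeric index column
df = pd.DataFrame(list_of_text, columns=['Text', 'Filename'])
df.to_csv(os.path.join(path, 'new_csv_file.csv'), index=False)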
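
If you need the DataFrame returned from a function rather than built at script level, the same idea can be sketched with pathlib. This is only a minimal sketch under assumptions: the function name folder_to_dataframe and its pattern parameter are made up for illustration, and it assumes the files are UTF-8 encoded .txt files.

```python
from pathlib import Path

import pandas as pd


def folder_to_dataframe(folder, pattern="*.txt"):
    """Return one row per matching file: its name and its stripped text.

    `folder_to_dataframe` and `pattern` are hypothetical names for this sketch.
    """
    rows = []
    # sorted() keeps the row order deterministic across platforms
    for file_path in sorted(Path(folder).glob(pattern)):
        # glob can also match directories, so keep regular files only
        if file_path.is_file():
            # assumption: files are UTF-8; strip() drops the trailing newline
            text = file_path.read_text(encoding="utf-8").strip()
            rows.append({"Filename": file_path.name, "Text": text})
    # passing columns keeps the header even when no file matched
    return pd.DataFrame(rows, columns=["Filename", "Text"])
```

You could then call df = folder_to_dataframe(path) and write the result with df.to_csv('new_csv_file.csv', index=False). Writing the CSV outside the scanned folder, or keeping the *.txt filter, avoids picking up the output file on a later run.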

