
How to write one particular column present in multiple dataframes to a list using Python?

I have 4 CSV files in a folder, and I load each of them as a dataframe in Python. I process each dataframe to get only the unique 'file name' values as a list and write that list to a new CSV file.

Now I want to write the file names from all of the dataframes into one output file.

import csv

file_list = []
for fileno in data.groupby(['date','age'])['File_No']:
    file_list.append(fileno)
with open(r'D:\Data\core_data\file1.csv', "w") as csvFile:
    writer = csv.writer(csvFile)
    writer.writerows(file_list)

Here data is one dataframe. This gives me the list of file names present in this dataframe, as follows:

[((Timestamp('2018-01-15 00:00:00'), '1', 1), 0      1011
  1      1012
  2      1013
  3      1014...]

So I need two things:

  1. I don't want the '((Timestamp('2018-01-15 00:00:00'), '1', 1)' part in the list output.

  2. The lists from all the dataframes should be written to one list of lists, like this:

[[list of file_1 file names],[list of file_2 file names],[list of file_3 file names]]

You want a list of lists of the file names present in your 4 CSV files, correct?
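On your first point: iterating over a groupby object yields (key, group) pairs, which is where the Timestamp tuple in your output comes from. A small sketch with made-up data (the values below are illustrative, not from your files; only the column names follow your question) that unpacks each pair and keeps just the values:

```python
import pandas as pd

# Illustrative data only; column names follow the question.
data = pd.DataFrame({
    'date': pd.to_datetime(['2018-01-15', '2018-01-15', '2018-01-16']),
    'age': ['1', '1', '2'],
    'File_No': [1011, 1012, 1013],
})

file_list = []
for key, group in data.groupby(['date', 'age'])['File_No']:
    # key is the (date, age) tuple; group is the Series of File_No values.
    file_list.append(group.tolist())

print(file_list)  # [[1011, 1012], [1013]]
```

Appending only group.tolist() drops the key tuple, so each entry holds just the File_No values of one group.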

In that case, why not loop over the CSV files and build the expected list as follows:

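import pandas as pd

files = ['file1.csv', 'file2.csv', 'file3.csv', 'file4.csv']

output = []
for file in files:
    temp_df = pd.read_csv(file)
    # .tolist() converts NumPy scalars to plain Python values; without it,
    # type(x) == int is False for numpy.int64 and every list comes back empty.
    unique_values = temp_df['File_No'].unique().tolist()
    # Keep integer file numbers only (drops NaN and other non-int entries).
    output.append([x for x in unique_values if isinstance(x, int)])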

#write output to csv...
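For the last step, one possible way to write the nested list is with csv.writer, one row per source file. This is only a sketch: the sample `output` values and the `file_names.csv` path are placeholders I made up, so substitute your own.

```python
import csv

# Hypothetical nested list, standing in for `output` built by the loop.
output = [[1011, 1012, 1013], [2011, 2012], [3011]]

# Hypothetical output path; newline='' avoids blank rows on Windows.
with open('file_names.csv', 'w', newline='') as csv_file:
    writer = csv.writer(csv_file)
    # Each inner list becomes one row, so rows map to the source files.
    writer.writerows(output)
```

Rows can have different lengths here, since each file may contain a different number of unique file numbers.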


 