How to copy rows from multiple csv files to new files in pandas?

Question

I have 10 csv files. I want to copy first row from all csv files and save as new csv file, then copy second row from all csv files and save as second csv file and etc. My code in the following done only for first row and other rwos display NaN . Where is my error?

Code

import pandas as pd
import datetime
import glob

path = r'/Jupyter_Works/new_csv'
all_files = glob.glob(path + "/*.csv")

date_time = datetime.datetime(2018, 1, 1)
index = pd.date_range(start='1/1/2018', periods= 8760, freq='H')

columns = ['Lat','Lon','Alt','Temperature','Relative Humidity','Wind speed','Wind direction','Short-wave irradiation']
dfcsv = pd.DataFrame(index=index, columns=columns)

for filename in all_files:
    df = pd.read_csv(filename, index_col='time', header=0)
    dfcsv.iloc[0] = df.iloc[0]

dfcsv

Result

Lat Lon Alt Temperature Relative Humidity   Wind speed  Wind direction  Short-wave irradiation
2018-01-01 00:00:00 31.03   49.36   99  285.56  52.82   2.95    128.5   0
2018-01-01 01:00:00 NaN NaN NaN NaN NaN NaN NaN NaN
2018-01-01 02:00:00 NaN NaN NaN NaN NaN NaN NaN NaN
2018-01-01 03:00:00 NaN NaN NaN NaN NaN NaN NaN NaN
2018-01-01 04:00:00 NaN NaN NaN NaN NaN NaN NaN NaN

Answer 1

First create one big DataFrame with list comprehension and concat , loop by unique values for select by loc and write to files by DataFrame.to_csv . It working, because each DataFrame has unique index, so if select by unique values then select rows with same position in all files.

path = r'/home/nickan/Jupyter_Works/new_csv'
all_files = glob.glob(path + "/*.csv")

dfs = [pd.read_csv(fp, index_col='time', parse_dates=['time']) for fp in all_files]
df = pd.concat(dfs)

for x in df.index.unique():
    #removed duplicated index by index=False
    df.loc[x].to_csv(f'csv/file_{x.strftime("%Y-%m-%d_%H")}.csv', index=False)

EDIT:

Because memory problems is possible use alternative solution with loop by each row in dataFrames and write in append mode:

for i, fp in enumerate(all_files):
    df = pd.read_csv(fp, index_col='time', parse_dates=['time']) 
    for x in df.index:
        f = f'out/file_{x.strftime("%Y-%m-%d_%H")}.csv'
        if i == 0:
            df.loc[[x]].to_csv(f, index=False)

        else:
            df.loc[[x]].to_csv(f, index=False,header=None, mode='a')

How to copy rows from multiple csv files to new files in pandas?

Question

1 answers

solution1
3 ACCPTED 2019-05-19 12:12:08

How to copy rows from multiple csv files to new files in pandas?

Question

1 answers

solution1 3 ACCPTED 2019-05-19 12:12:08

solution1
3 ACCPTED 2019-05-19 12:12:08