I am trying to load a number of csv files from a folder, use a function that calculates missing values on each file, then saves new csv files containing the output. When I edit the script to print the output, I get the expected result. However, the loop only ever saves the last file to the directory. The code I am using is:
from pathlib import Path
import pandas as pd
import os
from glob import glob

files = glob("C:/Users/61437/Desktop/test_folder/*.csv")  # get all csv's from folder
n = 0
for file in files:
    print(file)
    df = pd.read_csv(file, index_col=False)
    d = calc_missing_prices(df)  # calc_missing_prices is a user defined function
    print(d)
    d.to_csv(r'C:\Users\61437\Desktop\test_folder\derived_files\derived_{}.csv'.format(n+1), index=False)
The print() command returns the expected output, which for my data is:
C:/Users/61437/Desktop/test_folder\file1.csv
V_150 V_200 V_300 V_375 V_500 V_750 V_1000
0 3.00 2.75 4.50 6.03 8.35 12.07 15.00
1 2.32 3.09 4.63 5.00 9.75 12.50 12.25
2 1.85 2.47 3.70 4.62 6.17 9.25 12.33
3 1.75 2.00 4.06 6.50 6.78 10.16 15.20
C:/Users/61437/Desktop/test_folder\file2.csv
V_300 V_375 V_500 V_750 V_1000
0 4.00 4.50 6.06 9.08 11.00
1 3.77 5.00 6.50 8.50 12.56
2 3.00 3.66 4.88 7.31 9.50
C:/Users/61437/Desktop/test_folder\file3.csv
V_500 V_750 V_1000
0 5.50 8.25 11.00
1 6.50 8.50 12.17
2 4.75 7.12 9.50
However, the only saved csv file is 'derived_1.csv', which contains the output from file3.csv.
What am I doing that is preventing all three files from being created?
You are not incrementing n inside the loop. Your data gets stored in the file derived_1.csv, which is overwritten on every iteration. Once the for loop finishes executing, only the last csv will be saved.
Include the line n += 1 inside the for loop to increment it by 1 on every iteration.
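As an alternative to a manual counter, the same fix can be sketched with enumerate, which hands out a new number on every iteration so each output gets its own file name. This is only a sketch under assumptions: derive_all is a hypothetical helper name, the calc_missing_prices below is a placeholder that returns the data unchanged (the real function is not shown in the question), and the files are sorted so that the numbering is reproducible.

```python
import tempfile
from pathlib import Path

import pandas as pd


def calc_missing_prices(df):
    # Placeholder (assumption): stands in for the user-defined function
    # from the question and simply returns a copy of the input.
    return df.copy()


def derive_all(src_dir, out_dir):
    """Process every csv in src_dir and write derived_<n>.csv into out_dir."""
    src_dir, out_dir = Path(src_dir), Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    # enumerate(..., start=1) advances n on every pass through the loop,
    # so no output file is overwritten by the next one.
    for n, file in enumerate(sorted(src_dir.glob("*.csv")), start=1):
        df = pd.read_csv(file, index_col=False)
        d = calc_missing_prices(df)
        target = out_dir / f"derived_{n}.csv"
        d.to_csv(target, index=False)
        written.append(target.name)
    return written


if __name__ == "__main__":
    # Small demo on throwaway files instead of the real folder.
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp)
        for name in ("file1.csv", "file2.csv", "file3.csv"):
            (src / name).write_text("V_500,V_750,V_1000\n5.5,8.25,11.0\n")
        print(derive_all(src, src / "derived_files"))
```

If the output name should follow the input name rather than a counter, the loop could use file.stem instead, for example f"derived_{file.stem}.csv".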