My dataframe looks like this:
007538839
0 105586.180
1 105582.910
2 105585.230
3 105576.445
4 105580.016
df1.shape = (69302, 1)
It has only one column, named "007538839". I have several other dataframes that also have a single column each, but with different column names and different numbers of rows. For example:
007543167
0 39886.620
1 39908.777
2 39886.574
3 39884.340
4 39871.098
df2.shape = (69778, 1)
I want to merge all of them together in a loop that looks like this:
import os

base_dir = ''
for root, dirs, files in os.walk(base_dir, topdown=False):
    for name in files:
        if root.count(os.sep) == 3 and name.endswith(".csv"):
            file_path = os.path.join(root, name)
            # merge all files
My goal is to keep every row; wherever a dataframe has no value for a row, NaN should be assigned. For example, merging df1 and df2 should give a result with 69778 rows.
First build a list of DataFrames by appending each one inside the loop, then call concat with axis=1 once after the loop:
import os
import pandas as pd

dfs = []
base_dir = ''
for root, dirs, files in os.walk(base_dir, topdown=False):
    for name in files:
        if root.count(os.sep) == 3 and name.endswith(".csv"):
            file_path = os.path.join(root, name)
            # read each CSV and collect it in the list
            df = pd.read_csv(file_path)
            dfs.append(df)

# align on the row index; shorter columns are padded with NaN
df = pd.concat(dfs, axis=1)
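To see why this keeps every row, here is a minimal sketch with two small made-up frames standing in for df1 and df2 (the values are copied from the samples in the question, but the lengths 3 and 5 are only assumptions for illustration). With axis=1, concat aligns on the row index and uses an outer join by default, so the shorter column is filled with NaN:

```python
import pandas as pd

# Hypothetical stand-ins for df1 and df2: one column each, different lengths
df1 = pd.DataFrame({"007538839": [105586.180, 105582.910, 105585.230]})
df2 = pd.DataFrame(
    {"007543167": [39886.620, 39908.777, 39886.574, 39884.340, 39871.098]}
)

# axis=1 places the frames side by side; the default join="outer"
# keeps the union of both row indexes
merged = pd.concat([df1, df2], axis=1)

print(merged.shape)  # (5, 2)
print(merged)        # rows 3 and 4 of "007538839" are NaN
```

The result has as many rows as the longest input, which matches the 69778 rows expected for the real df1 and df2, provided each frame keeps its default 0..n-1 index.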