[英]How to fill dataframe with another dataframe by condition and sum when it on same cell?
如下所示,有 2 个数据帧 df1 和 df2。
df1 说明了每列名称的数量数据。 所以每列将包含从地板到下的数量。
Column name Floor Under ConcType ConcTotal RebarTotal SteelTotal
0 CB1 ACP GF C80 21.6 7954.8 2265.0
1 CB1A B4F LGF C60 1.2 318.8 0.0
...
201 CT12B GF 8F C60 4.5 757.6 840.0
202 CT13A GF 8F C80 1.4 429.3 420.0
而 df2 是一个用于填充数量数据的单独楼层的表格。但是 df2 在 Fl.Name_h(列)上是不同的,意思是将包含所有列表格 df1 的数量来承载那个楼层。
H 表示承载该楼层的柱子高度。 它是通过高程楼层 - 高程前一层计算的。
例如F8层的H是48400 - 42800 = 5600
Fl.Name_Elev Elevation Fl.Name_h H C60_Conc C80_Conc Rebar Steel
0 8F 48400 8F 5600.0 NaN NaN NaN NaN
1 7F 42800 7F 6000.0 NaN NaN NaN NaN
2 6F 36800 6F 6000.0 NaN NaN NaN NaN
3 5F 30800 5F 5600.0 NaN NaN NaN NaN
4 4F 25200 4F 5600.0 NaN NaN NaN NaN
5 3F 19600 3F 5600.0 NaN NaN NaN NaN
6 2F 14000 2F 6000.0 NaN NaN NaN NaN
7 1F 8000 1F 6000.0 NaN NaN NaN NaN
8 GF 2000 GF 3300.0 NaN NaN NaN NaN
9 LGF MZ -1300 LGF MZ 3150.0 NaN NaN NaN NaN
10 LGF -4450 LGF 4650.0 NaN NaN NaN NaN
11 B1F -9100 B1F 3000.0 NaN NaN NaN NaN
12 B2F -12100 B2F 3000.0 NaN NaN NaN NaN
13 B3F -15100 B3F 4650.0 NaN NaN NaN NaN
14 B4F -19750 B4F 8250.0 NaN NaN NaN NaN
15 ACP -28000 NaN NaN NaN NaN NaN NaN
我想从 df1 填充到 df2 中“C60_Conc”、“C80_Conc”、“Rebar”、“Steel”的 NaN。
结果将如下所示:
Fl.Name_Elev Elevation Fl.Name_h H C60_Conc C80_Conc Rebar Steel
0 8F 48400 8F 5600.0 4.5 1.4 1186.9 1260
1 7F 42800 7F 6000.0 4.5 1.4 1186.9 1260
2 6F 36800 6F 6000.0 4.5 1.4 1186.9 1260
3 5F 30800 5F 5600.0 4.5 1.4 1186.9 1260
4 4F 25200 4F 5600.0 4.5 1.4 1186.9 1260
5 3F 19600 3F 5600.0 4.5 1.4 1186.9 1260
6 2F 14000 2F 6000.0 4.5 1.4 1186.9 1260
7 1F 8000 1F 6000.0 4.5 1.4 1186.9 1260
8 GF 2000 GF 3300.0 0 21.6 7954.8 2265
9 LGF MZ -1300 LGF MZ 3150.0 0 21.6 7954.8 2265
10 LGF -4450 LGF 4650.0 1.2 21.6 8276.6 2265
11 B1F -9100 B1F 3000.0 1.2 21.6 8276.6 2265
12 B2F -12100 B2F 3000.0 1.2 21.6 8276.6 2265
13 B3F -15100 B3F 4650.0 1.2 21.6 8276.6 2265
14 B4F -19750 B4F 8250.0 0 21.6 7954.8 2265
15 ACP -28000 NaN NaN NaN NaN NaN NaN
PS对不起我的沟通。 英语不是我的母语。
floors = df2["Fl.Name_Elev"].tolist()[::-1]
df1["Floors"] = df1.apply(lambda x: floors[floors.index(x["Floor"]):floors.index(x["Under"])+1], axis=1)
df1 = df1.explode(column=["Floors"])
mapper = df1.pivot(index=df1.columns.difference(["ConcType", "ConcTotal"]),
columns="ConcType",
values="ConcTotal").reset_index().groupby("Floors").sum()
output = df2.merge(mapper, left_on="Fl.Name_Elev", right_index=True, how="left")
output = output.rename(columns={"C60": "C60_Conc", "C80": "C80_Conc"})
>>> output
Fl.Name_Elev Elevation Fl.Name_h ... SteelTotal C60_Conc C80_Conc
0 8F 48400 8F ... 1260 4.5 1.4
1 7F 42800 7F ... 1260 4.5 1.4
2 6F 36800 6F ... 1260 4.5 1.4
3 5F 30800 5F ... 1260 4.5 1.4
4 4F 25200 4F ... 1260 4.5 1.4
5 3F 19600 3F ... 1260 4.5 1.4
6 2F 14000 2F ... 1260 4.5 1.4
7 1F 8000 1F ... 1260 4.5 1.4
8 GF 2000 GF ... 3525 4.5 23.0
9 LGF MZ -1300 LGF MZ ... 2265 0.0 21.6
10 LGF -4450 LGF ... 2265 1.2 21.6
11 B1F -9100 B1F ... 2265 1.2 21.6
12 B2F -12100 B2F ... 2265 1.2 21.6
13 B3F -15100 B3F ... 2265 1.2 21.6
14 B4F -19750 B4F ... 2265 1.2 21.6
15 ACP -28000 NaN ... 2265 0.0 21.6
首先,我们使用pandas.explode
包含 df1 中从 Floor 到 Under 的所有楼层:
floors = df2["Fl.Name_Elev"].tolist()[::-1]
df1["Floors"] = df1.apply(lambda x: floors[floors.index(x["Floor"]):floors.index(x["Under"])+1], axis=1)
df1 = df1.explode(column=["Floors"])
>>> df1.head()
Column name Floor Under ConcType ConcTotal RebarTotal SteelTotal Floors
0 CB1 ACP GF C80 21.6 7954.8 2265 ACP
0 CB1 ACP GF C80 21.6 7954.8 2265 B4F
0 CB1 ACP GF C80 21.6 7954.8 2265 B3F
0 CB1 ACP GF C80 21.6 7954.8 2265 B2F
0 CB1 ACP GF C80 21.6 7954.8 2265 B1F
0 CB1 ACP GF C80 21.6 7954.8 2265 LGF
0 CB1 ACP GF C80 21.6 7954.8 2265 LGF MZ
0 CB1 ACP GF C80 21.6 7954.8 2265 GF
1 CB1A B4F LGF C60 1.2 318.8 0 B4F
1 CB1A B4F LGF C60 1.2 318.8 0 B3F
然后,我们通过使用pandas.pivot
创建一个映射器来获取 C60 和 C80 的单独列,并使用pandas.groupby
和sum
来获取每个楼层的总数:
mapper = df1.pivot(index=df1.columns.difference(["ConcType", "ConcTotal"]),
columns="ConcType",
values="ConcTotal") \
.reset_index() \
.groupby("Floor").sum()
>>> mapper
ConcType RebarTotal SteelTotal C60 C80
Floors
1F 1186.9 1260 4.5 1.4
2F 1186.9 1260 4.5 1.4
3F 1186.9 1260 4.5 1.4
4F 1186.9 1260 4.5 1.4
5F 1186.9 1260 4.5 1.4
6F 1186.9 1260 4.5 1.4
7F 1186.9 1260 4.5 1.4
8F 1186.9 1260 4.5 1.4
ACP 7954.8 2265 0.0 21.6
B1F 8273.6 2265 1.2 21.6
B2F 8273.6 2265 1.2 21.6
B3F 8273.6 2265 1.2 21.6
B4F 8273.6 2265 1.2 21.6
GF 9141.7 3525 4.5 23.0
LGF 8273.6 2265 1.2 21.6
LGF MZ 7954.8 2265 0.0 21.6
最后,我们在 df2 和 mapper 上使用pandas.merge
来获取输出:
output = df2.merge(mapper, left_on="Fl.Name_Elev", right_index=True, how="left")
output = output.rename(columns={"C60": "C60_Conc", "C80": "C80_Conc"})
声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.