如何按条件用另一个数据框填充数据框并在同一单元格上求和？

Question

如下所示，有 2 个数据帧 df1 和 df2。

df1 说明了每列名称的数量数据。 所以每列将包含从地板到下的数量。

    Column name    Floor    Under  ConcType   ConcTotal     RebarTotal     SteelTotal
0   CB1            ACP      GF     C80        21.6          7954.8         2265.0
1   CB1A           B4F      LGF    C60        1.2           318.8          0.0
...   
201 CT12B          GF       8F     C60        4.5           757.6         840.0
202 CT13A          GF       8F     C80        1.4           429.3         420.0

而 df2 是一个用于填充数量数据的单独楼层的表格。但是 df2 在 Fl.Name_h（列）上是不同的，意思是将包含所有列表格 df1 的数量来承载那个楼层。

H 表示承载该楼层的柱子高度。 它是通过高程楼层 - 高程前一层计算的。

例如F8层的H是48400 - 42800 = 5600

    Fl.Name_Elev    Elevation    Fl.Name_h   H         C60_Conc  C80_Conc   Rebar   Steel
0   8F              48400        8F          5600.0    NaN       NaN        NaN     NaN 
1   7F              42800        7F          6000.0    NaN       NaN        NaN     NaN
2   6F              36800        6F          6000.0    NaN       NaN        NaN     NaN
3   5F              30800        5F          5600.0    NaN       NaN        NaN     NaN
4   4F              25200        4F          5600.0    NaN       NaN        NaN     NaN
5   3F              19600        3F          5600.0    NaN       NaN        NaN     NaN
6   2F              14000        2F          6000.0    NaN       NaN        NaN     NaN
7   1F              8000         1F          6000.0    NaN       NaN        NaN     NaN
8   GF              2000         GF          3300.0    NaN       NaN        NaN     NaN
9   LGF MZ          -1300        LGF MZ      3150.0    NaN       NaN        NaN     NaN 
10  LGF             -4450        LGF         4650.0    NaN       NaN        NaN     NaN
11  B1F             -9100        B1F         3000.0    NaN       NaN        NaN     NaN
12  B2F             -12100       B2F         3000.0    NaN       NaN        NaN     NaN
13  B3F             -15100       B3F         4650.0    NaN       NaN        NaN     NaN 
14  B4F             -19750       B4F         8250.0    NaN       NaN        NaN     NaN
15  ACP             -28000       NaN         NaN       NaN       NaN        NaN     NaN

我想从 df1 填充到 df2 中“C60_Conc”、“C80_Conc”、“Rebar”、“Steel”的 NaN。

结果将如下所示：

Fl.Name_Elev    Elevation    Fl.Name_h       H         C60_Conc C80_Conc    Rebar    Steel
0   8F              48400        8F          5600.0    4.5      1.4         1186.9   1260   
1   7F              42800        7F          6000.0    4.5      1.4         1186.9   1260
2   6F              36800        6F          6000.0    4.5      1.4         1186.9   1260
3   5F              30800        5F          5600.0    4.5      1.4         1186.9   1260
4   4F              25200        4F          5600.0    4.5      1.4         1186.9   1260
5   3F              19600        3F          5600.0    4.5      1.4         1186.9   1260
6   2F              14000        2F          6000.0    4.5      1.4         1186.9   1260
7   1F              8000         1F          6000.0    4.5      1.4         1186.9   1260
8   GF              2000         GF          3300.0    0        21.6        7954.8   2265
9   LGF MZ          -1300        LGF MZ      3150.0    0        21.6        7954.8   2265   
10  LGF             -4450        LGF         4650.0    1.2      21.6        8276.6   2265
11  B1F             -9100        B1F         3000.0    1.2      21.6        8276.6   2265
12  B2F             -12100       B2F         3000.0    1.2      21.6        8276.6   2265
13  B3F             -15100       B3F         4650.0    1.2      21.6        8276.6   2265   
14  B4F             -19750       B4F         8250.0    0        21.6        7954.8   2265
15  ACP             -28000       NaN         NaN       NaN      NaN         NaN      NaN

PS对不起我的沟通。 英语不是我的母语。

Answer 1

解决方案：

floors = df2["Fl.Name_Elev"].tolist()[::-1]
df1["Floors"] = df1.apply(lambda x: floors[floors.index(x["Floor"]):floors.index(x["Under"])+1], axis=1)
df1 = df1.explode(column=["Floors"])
mapper = df1.pivot(index=df1.columns.difference(["ConcType", "ConcTotal"]), 
                   columns="ConcType", 
                   values="ConcTotal").reset_index().groupby("Floors").sum()
output = df2.merge(mapper, left_on="Fl.Name_Elev", right_index=True, how="left")
output = output.rename(columns={"C60": "C60_Conc", "C80": "C80_Conc"})
>>> output
   Fl.Name_Elev  Elevation Fl.Name_h  ...  SteelTotal  C60_Conc  C80_Conc
0            8F      48400        8F  ...        1260       4.5       1.4
1            7F      42800        7F  ...        1260       4.5       1.4
2            6F      36800        6F  ...        1260       4.5       1.4
3            5F      30800        5F  ...        1260       4.5       1.4
4            4F      25200        4F  ...        1260       4.5       1.4
5            3F      19600        3F  ...        1260       4.5       1.4
6            2F      14000        2F  ...        1260       4.5       1.4
7            1F       8000        1F  ...        1260       4.5       1.4
8            GF       2000        GF  ...        3525       4.5      23.0
9        LGF MZ      -1300    LGF MZ  ...        2265       0.0      21.6
10          LGF      -4450       LGF  ...        2265       1.2      21.6
11          B1F      -9100       B1F  ...        2265       1.2      21.6
12          B2F     -12100       B2F  ...        2265       1.2      21.6
13          B3F     -15100       B3F  ...        2265       1.2      21.6
14          B4F     -19750       B4F  ...        2265       1.2      21.6
15          ACP     -28000       NaN  ...        2265       0.0      21.6

解释

首先，我们使用pandas.explode包含 df1 中从 Floor 到 Under 的所有楼层：

floors = df2["Fl.Name_Elev"].tolist()[::-1]
df1["Floors"] = df1.apply(lambda x: floors[floors.index(x["Floor"]):floors.index(x["Under"])+1], axis=1)
df1 = df1.explode(column=["Floors"])
>>> df1.head()
  Column name Floor Under ConcType  ConcTotal  RebarTotal  SteelTotal  Floors
0         CB1   ACP    GF      C80       21.6      7954.8        2265     ACP
0         CB1   ACP    GF      C80       21.6      7954.8        2265     B4F
0         CB1   ACP    GF      C80       21.6      7954.8        2265     B3F
0         CB1   ACP    GF      C80       21.6      7954.8        2265     B2F
0         CB1   ACP    GF      C80       21.6      7954.8        2265     B1F
0         CB1   ACP    GF      C80       21.6      7954.8        2265     LGF
0         CB1   ACP    GF      C80       21.6      7954.8        2265  LGF MZ
0         CB1   ACP    GF      C80       21.6      7954.8        2265      GF
1        CB1A   B4F   LGF      C60        1.2       318.8           0     B4F
1        CB1A   B4F   LGF      C60        1.2       318.8           0     B3F

然后，我们通过使用pandas.pivot创建一个映射器来获取 C60 和 C80 的单独列，并使用pandas.groupby和sum来获取每个楼层的总数：

mapper = df1.pivot(index=df1.columns.difference(["ConcType", "ConcTotal"]),
                   columns="ConcType",
                   values="ConcTotal") \
            .reset_index() \
            .groupby("Floor").sum()
>>> mapper
ConcType  RebarTotal  SteelTotal  C60   C80
Floors                                     
1F            1186.9        1260  4.5   1.4
2F            1186.9        1260  4.5   1.4
3F            1186.9        1260  4.5   1.4
4F            1186.9        1260  4.5   1.4
5F            1186.9        1260  4.5   1.4
6F            1186.9        1260  4.5   1.4
7F            1186.9        1260  4.5   1.4
8F            1186.9        1260  4.5   1.4
ACP           7954.8        2265  0.0  21.6
B1F           8273.6        2265  1.2  21.6
B2F           8273.6        2265  1.2  21.6
B3F           8273.6        2265  1.2  21.6
B4F           8273.6        2265  1.2  21.6
GF            9141.7        3525  4.5  23.0
LGF           8273.6        2265  1.2  21.6
LGF MZ        7954.8        2265  0.0  21.6

最后，我们在 df2 和 mapper 上使用pandas.merge来获取输出：

output = df2.merge(mapper, left_on="Fl.Name_Elev", right_index=True, how="left")
output = output.rename(columns={"C60": "C60_Conc", "C80": "C80_Conc"})

如何按条件用另一个数据框填充数据框并在同一单元格上求和？

问题描述

1 个解决方案

解决方案1
0 已采纳 2021-07-23 22:16:54

解决方案：

解释

如何按条件用另一个数据框填充数据框并在同一单元格上求和？

问题描述

1 个解决方案

解决方案1 0 已采纳 2021-07-23 22:16:54

解决方案：

解释

解决方案1
0 已采纳 2021-07-23 22:16:54