
How to multiply different columns in different dataframes using Pandas

I have 2 dataframes that I want to multiply. I want to multiply several columns of dataframe 1 by a single column of dataframe 2.

raw_material_LCI = dataframe1[["climate change","ozone depletion",
              "ionising radiation, hh","photochemical ozone formation, hh",
              "particulate matter","human toxicity, non-cancer",
              "human toxicity, cancer","acidification",
              "eutrophication, freshwater","eutrophication, marine",
              "eutrophication, terrestrial","ecotoxicity, freshwater",
              "land use", "resource use, fossils","resource use, minerals and metals",
              "water scarcity"]] * dataframe2["mass_frac"]

The code above returns a dataframe in which every value is NaN. All of the named columns are fields that contain numeric values.

I decided to try multiplying dataframe1 by a single value to see whether that works, as in the example below:

raw_material_LCI = dataframe1[["climate change","ozone depletion",
              "ionising radiation, hh","photochemical ozone formation, hh",
              "particulate matter","human toxicity, non-cancer",
              "human toxicity, cancer","acidification",
              "eutrophication, freshwater","eutrophication, marine",
              "eutrophication, terrestrial","ecotoxicity, freshwater",
              "land use", "resource use, fossils","resource use, minerals and metals",
              "water scarcity"]] * 0.7

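The single-value example returns a dataframe with numbers, so that works. Does anyone know why the first multiplication does not work? I have read several posts about multiplying columns from different dataframes in Python, but could not find a solution.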

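When you multiply two DataFrames, both the row and column indexes must align. When you multiply a DataFrame by a Series with `*`, the Series index is matched against the DataFrame's column labels, so to multiply row by row you have to align on the row index instead: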

>>> df
          A         B         C         D         E
0  0.787081  0.350508  0.058542  0.492340  0.489379
1  0.512436  0.501375  0.108115  0.960808  0.841969
2  0.055247  0.305830  0.976043  0.016188  0.006424
3  0.303570  0.914876  0.157100  0.767454  0.340381
4  0.446077  0.595001  0.307799  0.115410  0.568281
5  0.226516  0.636902  0.086790  0.079260  0.402414
6  0.451920  0.526025  0.012470  0.931610  0.267155
7  0.472778  0.137005  0.227569  0.941355  0.584782
8  0.944396  0.769115  0.497214  0.531419  0.570797
9  0.788023  0.310288  0.336480  0.585466  0.432246

>>> sr
0    0.920878
1    0.445332
2    0.894407
3    0.613317
4    0.242270
5    0.299121
6    0.843052
7    0.279014
8    0.526778
9    0.249538
dtype: float64

So this produces NaN values:

>>> df * sr  # sr's index (0..9) is matched against df's columns (A..E)
(every value in the result is NaN, because no column label of df matches an index label of sr)
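
As a minimal sketch of why the plain `*` operator fails here, the snippet below uses a tiny made-up frame and Series (not the data shown above) to confirm that the Series index is compared with the column labels:

```python
import pandas as pd

# Hypothetical 3x2 frame and a Series of the same length, for illustration only.
df = pd.DataFrame({"A": [1.0, 2.0, 3.0], "B": [4.0, 5.0, 6.0]})
sr = pd.Series([10.0, 20.0, 30.0])

# `*` aligns sr's index (0, 1, 2) with df's columns ("A", "B").
# No label is shared, so every cell of the result is NaN.
product = df * sr
all_nan = bool(product.isna().all().all())

# The result's columns are the union of both label sets: A, B, 0, 1, 2.
n_columns = product.shape[1]
```

The row count is unchanged; only the column axis takes part in the alignment.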

But using mul along the index axis works as expected:

>>> df.mul(sr, axis=0)  # but not df.mul(sr) (same as df*sr)
          A         B         C         D         E
0  0.724805  0.322775  0.053910  0.453385  0.450658
1  0.228204  0.223279  0.048147  0.427878  0.374956
2  0.049413  0.273536  0.872980  0.014479  0.005745
3  0.186185  0.561109  0.096352  0.470693  0.208762
4  0.108071  0.144151  0.074571  0.027961  0.137678
5  0.067756  0.190511  0.025961  0.023708  0.120371
6  0.380992  0.443466  0.010513  0.785396  0.225226
7  0.131912  0.038226  0.063495  0.262651  0.163162
8  0.497487  0.405153  0.261921  0.279940  0.300683
9  0.196642  0.077429  0.083965  0.146096  0.107862
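
Applied to the question, the same call might look like the sketch below. The two frames are hypothetical stand-ins with invented values and only two of the impact columns; the real `dataframe1` and `dataframe2` are assumed to share the same row index.

```python
import pandas as pd

# Hypothetical stand-ins for the question's frames (invented values).
dataframe1 = pd.DataFrame({
    "climate change": [2.0, 4.0, 6.0],
    "ozone depletion": [0.1, 0.2, 0.3],
})
dataframe2 = pd.DataFrame({"mass_frac": [0.5, 0.25, 1.0]})

# Only two of the question's columns are listed to keep the sketch short.
impact_columns = ["climate change", "ozone depletion"]

# axis=0 matches the index of mass_frac against the rows of dataframe1,
# so each row is scaled by its own mass fraction.
raw_material_LCI = dataframe1[impact_columns].mul(dataframe2["mass_frac"], axis=0)
```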

If your Series and DataFrame have different lengths, you get a partial result:

>>> df.mul(sr.iloc[:5], axis=0)
          A         B         C         D         E
0  0.724805  0.322775  0.053910  0.453385  0.450658
1  0.228204  0.223279  0.048147  0.427878  0.374956
2  0.049413  0.273536  0.872980  0.014479  0.005745
3  0.186185  0.561109  0.096352  0.470693  0.208762
4  0.108071  0.144151  0.074571  0.027961  0.137678
5       NaN       NaN       NaN       NaN       NaN
6       NaN       NaN       NaN       NaN       NaN
7       NaN       NaN       NaN       NaN       NaN
8       NaN       NaN       NaN       NaN       NaN
9       NaN       NaN       NaN       NaN       NaN

>>> df.mul(sr.iloc[5:], axis=0)
          A         B         C         D         E
0       NaN       NaN       NaN       NaN       NaN
1       NaN       NaN       NaN       NaN       NaN
2       NaN       NaN       NaN       NaN       NaN
3       NaN       NaN       NaN       NaN       NaN
4       NaN       NaN       NaN       NaN       NaN
5  0.067756  0.190511  0.025961  0.023708  0.120371
6  0.380992  0.443466  0.010513  0.785396  0.225226
7  0.131912  0.038226  0.063495  0.262651  0.163162
8  0.497487  0.405153  0.261921  0.279940  0.300683
9  0.196642  0.077429  0.083965  0.146096  0.107862

Make sure the Series and the DataFrame share the same index labels.
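
If the two objects have the same number of rows but different index labels, one possible workaround is to drop the labels and multiply by position. This is a sketch with made-up data, and it assumes the rows really do correspond one to one in order:

```python
import pandas as pd

# Hypothetical data: same length, but the index labels do not match.
df = pd.DataFrame({"A": [1.0, 2.0, 3.0]}, index=["x", "y", "z"])
sr = pd.Series([10.0, 20.0, 30.0])  # default index 0, 1, 2

# Label-based alignment finds no common labels, so everything is NaN.
misaligned = df.mul(sr, axis=0)

# Converting the Series to a NumPy array discards its labels, so the
# values are applied to the rows in order.
by_position = df.mul(sr.to_numpy(), axis=0)
```

Multiplying by position silently gives wrong numbers if the two objects are sorted differently, so aligning on a shared key is usually the safer choice when one exists.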
