[英]Replacing pandas DataFrame column with Series
Consider the following DataFrame and Series:考虑以下 DataFrame 和系列:
>>> base_data = pd.DataFrame({'out_temp': {Timestamp('2021-09-01 05:40:00+0000', tz='UTC'): 15.0, Timestamp('2021-09-01 05:50:00+0000', tz='UTC'): 10.0, Timestamp('2021-09-01 06:00:00+0000', tz='UTC'): 18.0, Timestamp('2021-09-01 06:10:00+0000', tz='UTC'): 16.0, Timestamp('2021-09-01 06:20:00+0000', tz='UTC'): 9.0, Timestamp('2021-09-01 06:30:00+0000', tz='UTC'): 8.0, Timestamp('2021-09-01 06:40:00+0000', tz='UTC'): 14.0, Timestamp('2021-09-01 06:50:00+0000', tz='UTC'): 10.0, Timestamp('2021-09-01 07:00:00+0000', tz='UTC'): 13.0, Timestamp('2021-09-01 07:10:00+0000', tz='UTC'): 17.0}, 'out_humidity': {Timestamp('2021-09-01 05:40:00+0000', tz='UTC'): 95.0, Timestamp('2021-09-01 05:50:00+0000', tz='UTC'): 95.0, Timestamp('2021-09-01 06:00:00+0000', tz='UTC'): 93.0, Timestamp('2021-09-01 06:10:00+0000', tz='UTC'): 92.0, Timestamp('2021-09-01 06:20:00+0000', tz='UTC'): 91.0, Timestamp('2021-09-01 06:30:00+0000', tz='UTC'): 92.0, Timestamp('2021-09-01 06:40:00+0000', tz='UTC'): 93.0, Timestamp('2021-09-01 06:50:00+0000', tz='UTC'): 93.0, Timestamp('2021-09-01 07:00:00+0000', tz='UTC'): 89.0, Timestamp('2021-09-01 07:10:00+0000', tz='UTC'): 90.0}})
>>> base_data
out_temp out_humidity
datetime
2021-09-01 05:40:00+00:00 15.0 95.0
2021-09-01 05:50:00+00:00 10.0 95.0
2021-09-01 06:00:00+00:00 18.0 93.0
2021-09-01 06:10:00+00:00 16.0 92.0
2021-09-01 06:20:00+00:00 9.0 91.0
2021-09-01 06:30:00+00:00 8.0 92.0
2021-09-01 06:40:00+00:00 14.0 93.0
2021-09-01 06:50:00+00:00 10.0 93.0
2021-09-01 07:00:00+00:00 13.0 89.0
2021-09-01 07:10:00+00:00 17.0 90.0
>>> updated = pd.Series({Timestamp('2021-09-01 06:30:00+0000', tz='UTC'): 8.0, Timestamp('2021-09-01 06:40:00+0000', tz='UTC'): 14.0, Timestamp('2021-09-01 06:50:00+0000', tz='UTC'): nan, Timestamp('2021-09-01 07:00:00+0000', tz='UTC'): 13.0, Timestamp('2021-09-01 07:10:00+0000', tz='UTC'): nan})
>>> updated
datetime
2021-09-01 06:30:00+00:00 8.0
2021-09-01 06:40:00+00:00 14.0
2021-09-01 06:50:00+00:00 NaN
2021-09-01 07:00:00+00:00 13.0
2021-09-01 07:10:00+00:00 NaN
Name: out_temp, dtype: float64
The DataFrame contains column, out_temp, which is the same as the name of the Series. DataFrame 包含列 out_temp,与系列名称相同。 The index is the same, however, the Series is missing some index values.索引是相同的,但是,系列缺少一些索引值。 I want to combine these two together, replacing all values in the DataFrame by the values that are different in the Series.我想将这两者结合在一起,用系列中不同的值替换 DataFrame 中的所有值。 So in this case, I want the following result:所以在这种情况下,我想要以下结果:
out_temp out_humidity
datetime
2021-09-01 05:40:00+00:00 15.0 95.0
2021-09-01 05:50:00+00:00 10.0 95.0
2021-09-01 06:00:00+00:00 18.0 93.0
2021-09-01 06:10:00+00:00 16.0 92.0
2021-09-01 06:20:00+00:00 9.0 91.0
2021-09-01 06:30:00+00:00 8.0 92.0
2021-09-01 06:40:00+00:00 14.0 93.0
2021-09-01 06:50:00+00:00 NaN 93.0
2021-09-01 07:00:00+00:00 13.0 89.0
2021-09-01 07:10:00+00:00 NaN 90.0
The temperature got replaced by the NaN values in the Series.温度被系列中的 NaN 值取代。
Because you have nan
, you have to subsetting your original dataframe to update values:因为您有nan
,所以您必须对原始 dataframe 进行子集化以更新值:
base_data.loc[updated.index, 'out_temp'] = updated
print(base_data)
# Output
out_temp out_humidity
2021-09-01 05:40:00+00:00 15.0 95.0
2021-09-01 05:50:00+00:00 10.0 95.0
2021-09-01 06:00:00+00:00 18.0 93.0
2021-09-01 06:10:00+00:00 16.0 92.0
2021-09-01 06:20:00+00:00 9.0 91.0
2021-09-01 06:30:00+00:00 8.0 92.0
2021-09-01 06:40:00+00:00 14.0 93.0
2021-09-01 06:50:00+00:00 NaN 93.0
2021-09-01 07:00:00+00:00 13.0 89.0
2021-09-01 07:10:00+00:00 NaN 90.0
声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.