How to fill a pandas dataframe column using a value from another dataframe column
First, import some packages that might be useful.
import pandas as pd
import datetime
Say I have a dataframe with date, name and age columns.
df1 = pd.DataFrame({'date': ['10-04-2020', '04-07-2019', '12-05-2015' ], 'name': ['john', 'tim', 'sam'], 'age':[20, 22, 27]})
Now say I have another dataframe with some arbitrary columns.
df2 = pd.DataFrame({'a': [1,2,3], 'b': [4,5,6]})
Question:
How can I take the age value in df1, filtered on the date (I can select this value), and populate a whole new column in df2 with it? Ideally this method should generalise to any number of rows in the dataframe.
Tried
The following is what I have tried (on a similar example), but it doesn't work: most entries in the new column are NaN, and only a few rows get populated, seemingly at random.
y = datetime.datetime(2015, 5, 12)
df2['new'] = df1[(df1['date'] == y)].age
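A likely explanation for the NaN values, sketched here under the assumption that the date column holds plain strings as in the example above. A string never compares equal to a datetime object, so the filter matches nothing. Even when the filter does match, the result is a Series that keeps the index label from df1, and assigning a Series to df2 aligns on that label instead of repeating the value. The string '12-05-2015' used for the mask is chosen only for illustration.

```python
import datetime
import pandas as pd

df1 = pd.DataFrame({'date': ['10-04-2020', '04-07-2019', '12-05-2015'],
                    'name': ['john', 'tim', 'sam'],
                    'age': [20, 22, 27]})
df2 = pd.DataFrame({'a': [1, 2, 3], 'b': [4, 5, 6]})

y = datetime.datetime(2015, 5, 12)

# Problem 1: the 'date' column holds strings, and a string is never
# equal to a datetime object, even when both describe the same day.
cell = df1['date'].iloc[2]
cell_is_str = isinstance(cell, str)
cell_equals_y = (cell == y)

# Problem 2: with a matching string the filter works, but the result is
# a one-element Series that still carries df1's index label (2).
matched = df1.loc[df1['date'] == '12-05-2015', 'age']

# Assigning a Series aligns on index labels, so only row 2 of df2 is
# filled and the other rows become NaN.
df2['new'] = matched
print(df2)
```

Pulling a single scalar out of the filtered Series before assigning avoids the alignment step, because a scalar is broadcast to every row.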
Expected Output
Since I filtered above on the date that corresponds to sam's row, I would like the new column to be added to df2 with his age as every entry (in this case 27 repeated 3 times).
df2 = pd.DataFrame({'a': [1,2,3], 'b': [4,5,6], 'new': [27, 27, 27]})
Try:
y = datetime.datetime(2015, 5, 12).strftime('%d-%m-%Y')
df2.loc[:, 'new'] = df1.loc[df1['date'] == y, "age"].item()
# Output
   a  b  new
0  1  4   27
1  2  5   27
2  3  6   27
Convert y to a string in the same format as the date column, then use the df.loc method:
y = datetime.datetime(2015, 5, 12)
y=y.strftime('%d-%m-%Y')
df2['new'] = int(df1.loc[df1['date'] == y, 'age'].values[0])  # take the first match; int() on a whole array is deprecated in recent NumPy
df2
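Both string-based answers assume that exactly one row of df1 matches the date. As a hedged sketch of how that assumption could be checked, the helper below (value_for_date is a hypothetical name, not part of pandas) raises a clear error when zero or several rows match, and otherwise returns the scalar so it can be broadcast into df2.

```python
import pandas as pd


def value_for_date(df, date_str, column='age'):
    """Return the single `column` value from the row whose 'date' equals date_str.

    Hypothetical helper for illustration; assumes 'date' holds strings.
    """
    matches = df.loc[df['date'] == date_str, column]
    if len(matches) != 1:
        raise ValueError(
            f"expected exactly one row for {date_str!r}, found {len(matches)}"
        )
    return matches.iloc[0]


df1 = pd.DataFrame({'date': ['10-04-2020', '04-07-2019', '12-05-2015'],
                    'name': ['john', 'tim', 'sam'],
                    'age': [20, 22, 27]})
df2 = pd.DataFrame({'a': [1, 2, 3], 'b': [4, 5, 6]})

# A scalar on the right-hand side is repeated for every row of df2.
df2['new'] = value_for_date(df1, '12-05-2015')
print(df2)
```

Whether to raise, return a default, or take the first match depends on the data; raising is chosen here only because it makes a violated assumption visible.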
Convert the df1 date column to datetime type
df1['date'] = pd.to_datetime(df1.date, format='%d-%m-%Y')
Filter the dataframe and get the age
req_date = '2015-05-12'
age_for_date = df1.query('date == @req_date').age.iloc[0]
NOTE: This assumes that there is only one age per date (as explained by the OP in the comments).
Create a new column
df2 = df2.assign(new=age_for_date)
Output
   a  b  new
0  1  4   27
1  2  5   27
2  3  6   27
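On the request that the method generalise to any number of rows: because the value being assigned is a scalar, pandas repeats it for however many rows df2 has. The sketch below puts the datetime-based steps together and uses a five-row df2, which is made up here for illustration. It also swaps query for a plain boolean mask with pd.Timestamp, which is an alternative spelling rather than the answer's exact code.

```python
import pandas as pd

df1 = pd.DataFrame({'date': ['10-04-2020', '04-07-2019', '12-05-2015'],
                    'name': ['john', 'tim', 'sam'],
                    'age': [20, 22, 27]})

# Parse the day-month-year strings into real datetime values.
df1['date'] = pd.to_datetime(df1['date'], format='%d-%m-%Y')

# Boolean mask instead of query(); assumes one age per date, as noted above.
mask = df1['date'] == pd.Timestamp('2015-05-12')
age_for_date = df1.loc[mask, 'age'].iloc[0]

# Illustrative df2 with a different number of rows than df1.
df2 = pd.DataFrame({'a': [1, 2, 3, 4, 5], 'b': [6, 7, 8, 9, 10]})

# The scalar is broadcast to all five rows.
df2 = df2.assign(new=age_for_date)
print(df2)
```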