Selecting rows in one dataframe based on data in another dataframe in Python Pandas
I have two dataframes created with Pandas. The first one has co-occurrences of items happening in certain years:
   Date Item1 Item2
0  1975     a     b
1  1976     b     c
2  1977     b     a
3  1977     a     b
4  1978     c     d
5  1979     e     f
6  1980     a     f
The second one has birthdates of the items:
Birthdate Item
     1975    a
     1975    b
     1976    c
     1978    d
     1979    f
     1979    e
Now, I want to set an age variable, for example:
age = 2
I then want to populate a third dataframe (or, alternatively, transform the first one) so that I get a version of the first one that keeps only the co-occurrence rows that happened while Item1 was below the defined 'age'.
You could merge the DataFrames; this is similar to a join in SQL.
import pandas

data = [
    [1975, 'a', 'b'],
    [1976, 'b', 'c'],
    [1977, 'b', 'a'],
    [1977, 'a', 'b'],
    [1978, 'c', 'd'],
    [1979, 'e', 'f'],
    [1980, 'a', 'f'],
]

birthdate = [
    [1975, 'a'],
    [1975, 'b'],
    [1976, 'c'],
    [1978, 'd'],
    [1979, 'f'],
    [1979, 'e'],
]

df1 = pandas.DataFrame(data, columns=['Date', 'Item1', 'Item2'])
df2 = pandas.DataFrame(birthdate, columns=['Birthdate', 'Item'])

# print(df1)
# print(df2)

# Attach to each co-occurrence row the birthdate of its Item1
newdf = pandas.merge(left=df1, right=df2, left_on='Item1', right_on='Item')

print(newdf)

# Example of filtering the merged frame on the joined column
print(newdf[newdf['Birthdate'] > 1975])
Output:
   Date Item1 Item2  Birthdate Item
0  1975     a     b       1975    a
1  1977     a     b       1975    a
2  1980     a     f       1975    a
3  1976     b     c       1975    b
4  1977     b     a       1975    b
5  1978     c     d       1976    c
6  1979     e     f       1979    e

   Date Item1 Item2  Birthdate Item
5  1978     c     d       1976    c
6  1979     e     f       1979    e
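
The filter above compares Birthdate against a fixed year, whereas the question asks for rows where Item1 was below a given age. A minimal sketch of that last step follows, assuming "age" means Date minus Birthdate and that "below" means strictly less than; the names merged, item1_age and result are illustrative and not part of the original answer.

```python
import pandas as pd

df1 = pd.DataFrame(
    [[1975, 'a', 'b'], [1976, 'b', 'c'], [1977, 'b', 'a'], [1977, 'a', 'b'],
     [1978, 'c', 'd'], [1979, 'e', 'f'], [1980, 'a', 'f']],
    columns=['Date', 'Item1', 'Item2'],
)
df2 = pd.DataFrame(
    [[1975, 'a'], [1975, 'b'], [1976, 'c'], [1978, 'd'], [1979, 'f'], [1979, 'e']],
    columns=['Birthdate', 'Item'],
)

age = 2

# A left merge keeps every row of df1 and attaches the birthdate of its Item1.
merged = df1.merge(df2, how='left', left_on='Item1', right_on='Item')

# Assumption: the age of Item1 at a co-occurrence is Date - Birthdate.
item1_age = merged['Date'] - merged['Birthdate']

# Keep rows where Item1 was younger than `age`, then drop the helper columns.
# Rows whose Item1 has no known birthdate compare as False and are dropped.
result = merged.loc[item1_age < age, ['Date', 'Item1', 'Item2']].reset_index(drop=True)

print(result)
```

With the sample data this keeps the rows for 1975 (a, b), 1976 (b, c) and 1979 (e, f). Use `<=` instead of `<` if an item that is exactly `age` years old should still count.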