How to compare one row from df1 with the rows from df2 based on some condition in pandas?
I have two files (some rows could be the same and some could be different) which have data like this:
PID, STARTED,%CPU,%MEM,COMMAND
1,Wed Sep 12 10:10:21 2018, 0.0, 0.0,init
2,Wed Sep 12 10:10:21 2018, 0.0, 0.0,kthreadd
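For reference, a sample in this layout could be loaded into a dataframe roughly as follows. This is only a sketch: the variable names `raw` and `procs` are made up for illustration, the text is read from an in-memory string rather than from the actual files, and the date format string is an assumption based on the two rows shown.

```python
import io

import pandas as pd

# Sample rows copied from the question; a real run would pass a file path instead.
raw = """PID, STARTED,%CPU,%MEM,COMMAND
1,Wed Sep 12 10:10:21 2018, 0.0, 0.0,init
2,Wed Sep 12 10:10:21 2018, 0.0, 0.0,kthreadd
"""

# skipinitialspace drops the blank that follows some of the commas.
procs = pd.read_csv(io.StringIO(raw), skipinitialspace=True)

# Strip any whitespace left around the header names.
procs.columns = procs.columns.str.strip()

# Parse the start time so it compares as a timestamp rather than as a string.
procs['STARTED'] = pd.to_datetime(procs['STARTED'], format='%a %b %d %H:%M:%S %Y')

print(procs)
```

Parsing STARTED up front means two files that describe the same moment compare equal even if the text was padded differently.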
Now, I want to perform the following operations on these dataframes:
Since each file has 10000 rows, I am implementing this with Python pandas, but I have not found the right approach. Any help would be appreciated.
Raw data
First dataframe:
import numpy as np
import pandas as pd

df = pd.DataFrame({
'Started': [*np.repeat(pd.Timestamp(2018, 9, 12, 12, 12, 21), 2)],
'%CPI': [0.0, 0.0],
'%MEM': [0.0, 0.0],
'COMMAND': ['init', 'kthreadd']
})
Output:
Started %CPI %MEM COMMAND
0 2018-09-12 12:12:21 0.0 0.0 init
1 2018-09-12 12:12:21 0.0 0.0 kthreadd
Second dataframe:
df2 = pd.DataFrame({
'Started': [pd.Timestamp(2018, 9, 12, 12, 12, 21), pd.Timestamp(2020, 9, 12, 12, 12, 21)],
'%CPI': [0.0, 1.0],
'%MEM': [0.0, 1.0],
'COMMAND': ['init', 'different']
})
Output (row 0 is the same, row 1 is different):
Started %CPI %MEM COMMAND
0 2018-09-12 12:12:21 0.0 0.0 init
1 2020-09-12 12:12:21 1.0 1.0 different
Answer
Create a new dataframe with only the matching rows:
# Inner-join on every column, so only rows identical in both dataframes remain.
columns = df.columns.tolist()
matches = pd.merge(df, df2, on=columns)
Output:
Started %CPI %MEM COMMAND
0 2018-09-12 12:12:21 0.0 0.0 init
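If the rows that differ are needed as well, one possible extension (not part of the original answer) is an outer merge with `indicator=True`, which labels each row by the dataframe it came from. The sketch below rebuilds the two sample dataframes so it runs on its own; the names `merged`, `both`, `only_in_df` and `only_in_df2` are illustrative.

```python
import pandas as pd

# Sample dataframes rebuilt from the question so the snippet is self-contained.
df = pd.DataFrame({
    'Started': [pd.Timestamp(2018, 9, 12, 12, 12, 21)] * 2,
    '%CPI': [0.0, 0.0],
    '%MEM': [0.0, 0.0],
    'COMMAND': ['init', 'kthreadd'],
})
df2 = pd.DataFrame({
    'Started': [pd.Timestamp(2018, 9, 12, 12, 12, 21),
                pd.Timestamp(2020, 9, 12, 12, 12, 21)],
    '%CPI': [0.0, 1.0],
    '%MEM': [0.0, 1.0],
    'COMMAND': ['init', 'different'],
})

# With no `on` argument, merge joins on all columns the two frames share.
# indicator=True adds a `_merge` column: 'both', 'left_only' or 'right_only'.
merged = df.merge(df2, how='outer', indicator=True)

both = merged[merged['_merge'] == 'both'].drop(columns='_merge')
only_in_df = merged[merged['_merge'] == 'left_only'].drop(columns='_merge')
only_in_df2 = merged[merged['_merge'] == 'right_only'].drop(columns='_merge')

print(only_in_df)
print(only_in_df2)
```

A single merge call scales comfortably to files of 10000 rows, and it avoids looping over rows in Python.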