Iterating through rows in Pandas
I am having an issue applying some maths to my dataframe.
Current df:
| name | lastConnected | check |
|---|---|---|
| test1 | 1647609274746 | Connection |
| test2 | 1647609274785 | Connection |
| test3 | 1647000000000 | Connection |
| test4 | 1647609274756 | Connection |
Desired df: I now want to create a new column and check whether the server is still online.
| name | lastConnected | check |
|---|---|---|
| test1 | 1647609274746 | Connection |
| test2 | 1647609274785 | Connection |
| test3 | 1647000000000 | No connection |
| test4 | 1647609274756 | Connection |
Current code:
def checkServer():
    for index, row in df.iterrows():
        timeNow = int((time.time_ns) // 1000000)
        lastSeenTime = row['lastConnected']
        timeDifference = currentTime - lastSeenTime
        if timeDifference > 5000:
            df['check'] = "No connection"
        else:
            df['check'] = "Connection"
    return df
My issue:
As you can see in my current dataframe, every row gets Connection even though test3 should have No connection. While troubleshooting, I printed timeDifference for each row and got the same time difference every time, even though the rows all have different times. As a result, I think my for loop might be the issue.
Use this site to get the current time in milliseconds: currentmillis.com
Where am I going wrong?
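For reference, three things in the loop above stand in the way: `time.time_ns` is never called (no parentheses), the result is stored in `timeNow` while the subtraction reads `currentTime`, and `df['check'] = ...` assigns one value to the entire column on every iteration, so all rows end up with the same label. Below is a minimal sketch of the same loop with those points addressed. The function name `check_server`, the `now_ms` parameter and the fixed reference time are my own assumptions, added only so the example is reproducible.

```python
import time

import pandas as pd

THRESHOLD_MS = 5000  # same 5000 ms cutoff as in the question


def check_server(df, now_ms=None):
    """Label each row by comparing lastConnected with a reference time in ms."""
    if now_ms is None:
        # Call time_ns() (with parentheses), then convert nanoseconds to milliseconds.
        now_ms = time.time_ns() // 1_000_000
    df = df.copy()
    df["check"] = "Connection"  # default label for every row
    for index, row in df.iterrows():
        time_difference = now_ms - row["lastConnected"]
        if time_difference > THRESHOLD_MS:
            # df.at writes one cell; df["check"] = ... would overwrite the whole column.
            df.at[index, "check"] = "No connection"
    return df


df = pd.DataFrame({
    "name": ["test1", "test2", "test3", "test4"],
    "lastConnected": [1647609274746, 1647609274785, 1647000000000, 1647609274756],
})

# Hypothetical reference time just after the newest timestamp in the sample data.
result = check_server(df, now_ms=1647609275000)
print(result)
```

This only shows the loop doing what was intended; for large frames a vectorized comparison is usually the better fit.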
IIUC, don't use iterrows but a vectorized function:
N = 5000
df.loc[df['lastConnected'].diff().lt(-N), 'check'] = 'No connection'
Or to create the column from scratch:
import numpy as np

N = 5000
df['check'] = np.where(df['lastConnected'].diff().lt(-N),
                       'No connection', 'Connection')
Output:
    name  lastConnected          check
0  test1        1647609     Connection
1  test2        1647579     Connection
2  test3        1640009  No connection
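To see what the diff()-based mask actually compares, here is a small sketch on the four rows from the question (the DataFrame construction and the variable names `step` and `went_backwards` are my own; N = 5000 is from the answer). diff() subtracts the previous row's value, so the mask flags a row whose timestamp is more than N ms older than the row above it, not older than the current time.

```python
import numpy as np
import pandas as pd

N = 5000

df = pd.DataFrame({
    "name": ["test1", "test2", "test3", "test4"],
    "lastConnected": [1647609274746, 1647609274785, 1647000000000, 1647609274756],
})

# diff() is "this row minus the previous row"; the first row has no predecessor (NaN).
step = df["lastConnected"].diff()

# A comparison against NaN is False, so the first row is never flagged.
went_backwards = step.lt(-N)

df["check"] = np.where(went_backwards, "No connection", "Connection")
print(df)
```

This depends on row order: a stale server in the first row, or one that follows another stale server with a similar timestamp, would not be flagged.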
Maybe you expect:
df['check'] = np.where(pd.Timestamp.today().timestamp() * 1000 - df['lastConnected'] > 5000,
'No connection', 'Connection')
print(df)
# Output
    name  lastConnected          check
0  test1  1647609274746     Connection
1  test2  1647609274785     Connection
2  test3  1647000000000  No connection
3  test4  1647609274756     Connection
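A self-contained sketch of this comparison, assuming the four rows from the question. The helper name `label_connections` and its `now_ms` parameter are hypothetical, and the current time is taken from `time.time_ns()` as in the question rather than from `pd.Timestamp.today()`. Passing a fixed `now_ms` makes the result reproducible.

```python
import time

import numpy as np
import pandas as pd

THRESHOLD_MS = 5000  # same cutoff as in the answer above


def label_connections(frame, now_ms=None):
    """Return one label per row based on the age of lastConnected in ms."""
    if now_ms is None:
        now_ms = time.time_ns() // 1_000_000  # current epoch time in milliseconds
    age_ms = now_ms - frame["lastConnected"]  # vectorized: one age per row
    return np.where(age_ms > THRESHOLD_MS, "No connection", "Connection")


df = pd.DataFrame({
    "name": ["test1", "test2", "test3", "test4"],
    "lastConnected": [1647609274746, 1647609274785, 1647000000000, 1647609274756],
})

# Hypothetical reference time just after the newest timestamp in the sample data.
df["check"] = label_connections(df, now_ms=1647609275000)
print(df)
```

Without `now_ms`, the real clock is used, so with these sample timestamps (from March 2022) every row would come back as No connection.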
Old answer
Use np.where:
df['check'] = np.where(df['lastConnected'].diff().abs().gt(5000),
'No connection', 'Connection')
print(df)
# Output
    name  lastConnected          check
0  test1        1647609     Connection
1  test2        1647579     Connection
2  test3        1640009  No connection