
Iterating through rows in Pandas

I am having an issue applying some maths to my dataframe.

Current df:

name   lastConnected  check
test1  1647609274746  Connection
test2  1647609274785  Connection
test3  1647000000000  Connection
test4  1647609274756  Connection

Desired df: I now want to fill the check column according to whether each server is still online.

name   lastConnected  check
test1  1647609274746  Connection
test2  1647609274785  Connection
test3  1647000000000  No connection
test4  1647609274756  Connection

Current code:

import time

def checkServer():
    for index, row in df.iterrows():
        # current epoch time in milliseconds
        timeNow = time.time_ns() // 1000000
        lastSeenTime = row['lastConnected']
        timeDifference = timeNow - lastSeenTime
        if timeDifference > 5000:
            df['check'] = "No connection"
        else:
            df['check'] = "Connection"
    return df

My issue:

As you can see in my current dataframe, every row gets Connection even though test3 should have No connection. While troubleshooting, I printed timeDifference for each row and got the same time difference every time, even though the rows all have different times. As a result, I think my for loop might be the issue.

Use this site to get the current time in milliseconds: currentmillis.com

Where am I going wrong?

IIUC, don't use iterrows; use a vectorized operation instead:

N = 5000
df.loc[df['lastConnected'].diff().lt(-N), 'check'] = 'No connection'

Or, to create the column from scratch:

import numpy as np

N = 5000
df['check'] = np.where(df['lastConnected'].diff().lt(-N),
                       'No connection', 'Connection')

Output:

    name  lastConnected          check
0  test1        1647609     Connection
1  test2        1647579     Connection
2  test3        1640009  No connection
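
To make it clearer what the diff()-based snippet above computes, here is a minimal sketch that rebuilds the sample frame from the output shown (the variable name step is only for illustration). Note that diff() compares each row with the previous row, not with the current clock time, so this only approximates an "is it still online" check when the rows are ordered and the first row is recent.

```python
import numpy as np
import pandas as pd

# Sample values copied from the output shown above.
df = pd.DataFrame({
    "name": ["test1", "test2", "test3"],
    "lastConnected": [1647609, 1647579, 1640009],
})

N = 5000

# diff() subtracts the previous row from the current one.
# The first row has no predecessor, so its difference is NaN.
step = df["lastConnected"].diff()
print(step.tolist())  # [nan, -30.0, -7570.0]

# NaN compares as False, so the first row is never flagged.
df["check"] = np.where(step.lt(-N), "No connection", "Connection")
print(df["check"].tolist())  # ['Connection', 'Connection', 'No connection']
```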

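Maybe you expect:

# Current epoch time in milliseconds; a tz-aware timestamp avoids a local-time offset
now_ms = pd.Timestamp.now(tz='UTC').timestamp() * 1000
df['check'] = np.where(now_ms - df['lastConnected'] > 5000,
                       'No connection', 'Connection')
print(df)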

# Output
    name  lastConnected          check
0  test1  1647609274746     Connection
1  test2  1647609274785     Connection
2  test3  1647000000000  No connection
3  test4  1647609274756     Connection
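
For a reproducible check, the same comparison can be sketched with the reference time passed in explicitly. The function name flag_stale, its parameters, and the fixed now_ms value below are assumptions made for illustration; they do not come from the question. As a side note, the loop in the question assigns df['check'] = ..., which overwrites the whole column on every iteration, so only the result for the last row survives; computing the whole column in one step avoids that.

```python
import time

import numpy as np
import pandas as pd


def flag_stale(df, now_ms=None, threshold_ms=5000):
    """Return a copy of df with a 'check' column based on the age of lastConnected."""
    if now_ms is None:
        # Current epoch time in milliseconds.
        now_ms = time.time_ns() // 1_000_000
    out = df.copy()
    # One subtraction for the whole column; every row keeps its own age.
    age_ms = now_ms - out["lastConnected"]
    out["check"] = np.where(age_ms > threshold_ms, "No connection", "Connection")
    return out


df = pd.DataFrame({
    "name": ["test1", "test2", "test3", "test4"],
    "lastConnected": [1647609274746, 1647609274785, 1647000000000, 1647609274756],
})

# Hypothetical "current" time: one second after the newest timestamp.
result = flag_stale(df, now_ms=1647609275785)
print(result["check"].tolist())
# ['Connection', 'Connection', 'No connection', 'Connection']
```

Calling flag_stale(df) without now_ms uses the real clock, in which case timestamps as old as the ones in the question would all be flagged.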

Old answer

Use np.where:

df['check'] = np.where(df['lastConnected'].diff().abs().gt(5000),
                       'No connection', 'Connection')
print(df)

# Output
    name  lastConnected          check
0  test1        1647609     Connection
1  test2        1647579     Connection
2  test3        1640009  No connection
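
One caveat worth checking before relying on the absolute-difference variant: with the four rows from the question, the row that follows a stale one jumps back up by more than the threshold, so it is flagged as well. The sketch below compares the signed and absolute tests on that data (the names signed and absolute are illustrative only).

```python
import pandas as pd

# The four lastConnected values from the question.
last_connected = pd.Series(
    [1647609274746, 1647609274785, 1647000000000, 1647609274756],
    index=["test1", "test2", "test3", "test4"],
)

N = 5000
step = last_connected.diff()

signed = step.lt(-N)         # flags only drops larger than N
absolute = step.abs().gt(N)  # flags jumps in either direction

print(signed.tolist())    # [False, False, True, False]
print(absolute.tolist())  # [False, False, True, True]
```

Here test4 is recent, yet the absolute test marks it because it differs from test3 by far more than N.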
