How to compare a given value against Pandas dataframe values without using a for loop

Question

I have the following sample dataframe:

import pandas as pd

d = {'target': [1, 2, 4, 3, 6, 5]}
df = pd.DataFrame(data=d)
df

Output:

   target
0       1
1       2
2       4
3       3
4       6
5       5

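I need a function that does the following:

Let the function be named find_index_of_first_hit(value).

The function:

compares the input value against the elements of the target column;
searches for the first element greater than or equal to value;
returns the dataframe row index of that first match.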

Example:

find_index_of_first_hit(3)

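should return 2. That is the index of the value 4 in the target column, the first position in that column where the value (here 4) is >= the function input 3. Its index is 2, so 2 is the expected return value.

If no column value is >= the function input, the function should return -1.

The original dataframe is fairly large, and I would like to know how to write this without using a for loop.

The function has to be written in Python and it has to be fast, which is why I want to avoid a for loop. Performance matters here.

How can I write a Python function that does this?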

Answer 1

Use Series.idxmax, combined with an if-else on Series.any to test whether any matching value exists:

def find_index_of_first_hit(val):
    a = df['target'].ge(val)
    return a.idxmax() if a.any() else -1

print(find_index_of_first_hit(3))
2
print(find_index_of_first_hit(30))
-1
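
The a.any() guard matters because idxmax on an all-False boolean Series returns the first index label instead of signalling that nothing matched. Below is a minimal, self-contained sketch of the same approach that takes the Series as an argument rather than reading a global df. The parameter name series and the variable names in the usage lines are assumptions for illustration, not part of the original answer.

```python
import pandas as pd


def find_index_of_first_hit(series: pd.Series, value) -> int:
    """Return the index label of the first element >= value, or -1 if none."""
    # Boolean Series: True where the element is >= value.
    mask = series.ge(value)
    # idxmax gives the label of the first True. On an all-False mask it
    # would give the first label instead, so any() covers the no-match case.
    return mask.idxmax() if mask.any() else -1


df = pd.DataFrame({'target': [1, 2, 4, 3, 6, 5]})

first_hit = find_index_of_first_hit(df['target'], 3)   # a match exists
no_hit = find_index_of_first_hit(df['target'], 30)     # no value is >= 30

# What idxmax alone reports when nothing matches (hence the guard).
unguarded = df['target'].ge(30).idxmax()
```

Both ge and idxmax are vectorized, so no Python-level loop runs over the rows.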

Answer 2

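Use the greater-than-or-equal comparison .ge together with idxmax

You will find that you rarely need to write your own functions for Pandas (unless you need to package reusable snippets of code), because most of this is already available in the API.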

index = df.ge(3).idxmax()

target    2
dtype: int64
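
Note that this one-liner is applied to the whole dataframe, so it returns a Series with one entry per column, and it has no special case for a value that never matches. The sketch below shows how to pull out the scalar for the target column and how the no-match case could be detected. The variable names are assumptions for illustration.

```python
import pandas as pd

df = pd.DataFrame({'target': [1, 2, 4, 3, 6, 5]})

# One first-hit label per column, returned as a Series.
hits = df.ge(3).idxmax()
scalar_hit = hits['target']

# With no match the mask is all False, and idxmax falls back to the
# first index label, which is indistinguishable from a real hit at row 0.
no_match_label = df.ge(30).idxmax()['target']

# any() tells the two situations apart.
has_match = bool(df['target'].ge(30).any())
result = no_match_label if has_match else -1
```

If the -1 convention from the question is required, a check such as any() is still needed on top of the one-liner.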

2 answers

Answer 1
2 Accepted 2022-02-09 11:50:55

Answer 2
1 2022-02-09 11:48:20