Finding a specific row of data in a Pandas DataFrame inside a while loop

Question

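I am working with a csv file and reading it in as a Pandas DataFrame.
The DataFrame contains four columns of numbers (A through D).
I want to pick one specific row of data from the DataFrame.
Inside a while loop, I want to select a random row from the DataFrame and compare it with the row I picked.
I want the while loop to keep running until the random row matches my chosen row 100%.
Then I want the while loop to break and report how many tries it took to match the random row.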

Here is what I have so far.

Here is a sample of the DataFrame:

   A  B   C   D
1  2  7  12  14
2  4  5  11  23
3  4  6  14  20
4  4  7  13  50
5  9  6  14  35

Here is an example of my attempt:

import time
import pandas as pd

then = time.time()

count = 0

df = pd.read_csv('Get_Numbers.csv')
df.columns = ['A', 'B', 'C', 'D']

while True:
    df_elements = df.sample(n=1)
    random_row = df_elements
    print(random_row)
    find_this_row = df['A','B','C','D' == '4','7','13,'50']
    print(find_this_row)
    if find_this_row != random_row:
        count += 1
    else:
        break

print("You found the correct numbers! And it only took " + str(count) + " tries to get there! Your numbers were: " + str(find_this_row))

now = time.time()

print("It took: ", now-then, " seconds")

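The code above gives an obvious error... but I have tried so many different versions of the find_this_row lookup that I no longer know what to do, so I gave up on that attempt.

What I want to avoid is using a specific index for the row I am looking for; I would rather find the row using only its values.

I am using df_elements = df.sample(n=1) to select a row at random. I did this to avoid random.choice, because I was not sure whether it would work or which approach is more time/memory efficient, but I am open to suggestions that use random.choice as well.

In my head it seems simple: select a random row of data and, if it does not match the row I want, keep selecting random rows until one matches. But I cannot seem to make it work.

Any help is greatly appreciated!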

Answer 1

You can use values, which returns an np.ndarray of shape=(1, 2); use values[0] to get just the 1D array.

Then compare the arrays with any():

import time
import pandas as pd

then = time.time()

df = pd.DataFrame(data={'A': [1, 2, 3],
                        'B': [8, 9, 10]})

find_this_row = [2, 9]
print("Looking for: {}".format(find_this_row))

count = 0
while True:
    random_row = df.sample(n=1).values[0]
    print(random_row)

    if any(find_this_row != random_row):
        count += 1
    else:
        break

print("You found the correct numbers! And it only took " + str(count) + " tries to get there! Your numbers were: " + str(find_this_row))

now = time.time()

print("It took: ", now-then, " seconds")
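To make the shapes concrete, here is a minimal sketch of what values and values[0] return for the two-column example above, and how the any() check works. The random_state argument is only there so the sketch is repeatable, and the variable names two_d, one_d, mismatch and is_match are made up for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(data={'A': [1, 2, 3],
                        'B': [8, 9, 10]})

# random_state is an assumption added for repeatability only
sampled = df.sample(n=1, random_state=0)

two_d = sampled.values      # 2D array: one row, two columns -> shape (1, 2)
one_d = sampled.values[0]   # first (and only) row as a 1D array -> shape (2,)

find_this_row = np.asarray([2, 9])

# Elementwise comparison: one boolean per column
mismatch = find_this_row != one_d

# The rows match only if no column differs
is_match = not mismatch.any()

# np.array_equal gives the same answer in a single call
same_answer = np.array_equal(one_d, find_this_row)
```

Converting the target to an array with np.asarray keeps the comparison elementwise regardless of which operand comes first.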

Answer 2

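How about using values?

values returns the row's values as a NumPy array. You can then easily compare two such arrays.

array1 == array2 compares the elements at corresponding positions and returns an array of True and False values. You can then check whether all of the returned values are True.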
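As a rough sketch of that idea, the snippet below compares two rows against a target. The DataFrame is the sample data from the question retyped by hand as a stand-in for the CSV, and the names target, row_hit and row_miss are hypothetical. Note that two plain Python lists compared with == give a single bool, while NumPy arrays give one bool per element.

```python
import numpy as np
import pandas as pd

# Stand-in for the CSV data from the question
df = pd.DataFrame({'A': [2, 4, 4, 4, 9],
                   'B': [7, 5, 6, 7, 6],
                   'C': [12, 11, 14, 13, 14],
                   'D': [14, 23, 20, 50, 35]})

target = np.array([4, 7, 13, 50])

row_hit = df.iloc[3].values    # 4, 7, 13, 50
row_miss = df.iloc[0].values   # 2, 7, 12, 14

hit_flags = row_hit == target     # one boolean per column, all True
miss_flags = row_miss == target   # only column B matches

hit = hit_flags.all()     # every column equal -> match
miss = miss_flags.all()   # at least one column differs -> no match

# Plain lists behave differently: == yields a single bool
lists_equal = [4, 7, 13, 50] == [4, 7, 13, 50]
```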

Answer 3

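Here is a method that tests one row at a time. We check whether the values of the chosen row equal the values of the sampled DataFrame, and we require that they all match.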

row = df.sample(1)

counter = 0
not_a_match = True

while not_a_match:
    not_a_match = ~(df.sample(n=1).values == row.values).all()
    counter+=1

print(f'It took {counter} tries and the numbers were\n{row}')
#It took 9 tries and the numbers were
#   A  B   C   D
#4  4  7  13  50

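If you want it a bit faster, select one row and then sample the DataFrame many times at once. You can then find the first position at which a sampled row equals the chosen row; that gives the number of "tries" a while loop would have needed, in much less time. The loop guards against the unlikely case that a batch contains no match, which is possible because the sampling is done with replacement.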

row = df.sample(1)

n = 0
none_match = True
k = 10  # Increase to check more matches at once.

while none_match:
    matches = (df.sample(n=len(df)*k, replace=True).values == row.values).all(1)
    none_match = ~matches.any()  # Determine if none still match
    n += k*len(df)*none_match  # Only increment if none match
n = n + matches.argmax() + 1

print(f'It took {n} tries and the numbers were\n{row}')
#It took 3 tries and the numbers were
#   A  B   C   D
#4  4  7  13  50
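Since the question also asks about alternatives to df.sample, here is one possible variation, sketched under the assumption that only the number of tries matters: work out once which rows equal the target, then draw random row positions with NumPy instead of sampling the DataFrame on every pass. This is not the approach from the answer above; the seed and the names target, is_target and tries are chosen for illustration, and the DataFrame is the question's sample data retyped by hand.

```python
import numpy as np
import pandas as pd

# Stand-in for the CSV data from the question
df = pd.DataFrame({'A': [2, 4, 4, 4, 9],
                   'B': [7, 5, 6, 7, 6],
                   'C': [12, 11, 14, 13, 14],
                   'D': [14, 23, 20, 50, 35]})

target = np.array([4, 7, 13, 50])

# True for every row whose four values all equal the target
is_target = (df.values == target).all(axis=1)

# Without this guard the loop below would never end for a missing row
if not is_target.any():
    raise ValueError("target row is not in the DataFrame")

rng = np.random.default_rng(0)  # arbitrary seed, for repeatability only

tries = 0
while True:
    tries += 1
    pos = rng.integers(len(df))  # random row position in [0, len(df))
    if is_target[pos]:
        break

print(f'It took {tries} tries')
```

Each pass of the loop is then a single integer draw and an array lookup, with no DataFrame created per iteration.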

Answer 4

First, a few hints. This line does not work for me:

find_this_row = df['A','B','C','D' == '4','7','13,'50']

There are two reasons:

A closing ' is missing after '13
df is a DataFrame(), so it does not support keys like this:

df['A','B','C','D'...

Using a key returns either a DataFrame():

df[['A','B','C','D']]

or a Series():

df['A']
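The three indexing forms can be checked side by side. A minimal sketch, using the question's sample data retyped by hand; the variable names are made up for illustration:

```python
import pandas as pd

# Stand-in for the CSV data from the question
df = pd.DataFrame({'A': [2, 4, 4, 4, 9],
                   'B': [7, 5, 6, 7, 6],
                   'C': [12, 11, 14, 13, 14],
                   'D': [14, 23, 20, 50, 35]})

as_frame = df[['A', 'B', 'C', 'D']]   # list of labels -> DataFrame
as_series = df['A']                   # single label -> Series

# Comma-separated labels form a tuple key; with ordinary (non-MultiIndex)
# columns there is no column named ('A', 'B', 'C', 'D'), so pandas raises
try:
    df['A', 'B', 'C', 'D']
    tuple_key_failed = False
except KeyError:
    tuple_key_failed = True
```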

Since you need a whole row with multiple columns, do this:

df2.iloc[4].values

array(['4', '7', '13', '50'], dtype=object)

Do the same for the sampled row:

df2.sample(n=1).values

The comparison between the rows has to cover all() elements/columns:

df2.sample(n=1).values == df2.iloc[4].values

array([[True, False, False, False]])

Add .all() like this:

(df2.sample(n=1).values == df2.iloc[4].values).all()

which returns

True/False

All together:

import time
import pandas as pd

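then = time.time()
count = 0
# df2 is the DataFrame used in the steps above (its construction is not shown in this answer)
find_this_row = df2.iloc[4].values  # the target row never changes, so look it up once
while True:
    random_row = df2.sample(n=1).values
    if not (random_row == find_this_row).all():
        count += 1
    else:
        break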

print("You found the correct numbers! And it only took " + str(count) + " tries to get there! Your numbers were: " + str(find_this_row))

now = time.time()

print("It took: ", now-then, " seconds")
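The question asks to identify the row by its values rather than by an index, so here is a sketch of the same loop with the target written out as values instead of df2.iloc[4]. It assumes the columns hold integers; if the CSV is read in as strings (as the dtype=object output above suggests), the target would need to hold strings too. The DataFrame is the question's sample data retyped by hand, and the up-front check is an addition, not part of the original answer.

```python
import numpy as np
import pandas as pd

# Stand-in for the CSV data from the question
df2 = pd.DataFrame({'A': [2, 4, 4, 4, 9],
                    'B': [7, 5, 6, 7, 6],
                    'C': [12, 11, 14, 13, 14],
                    'D': [14, 23, 20, 50, 35]})

# Target given purely by value, no index involved
find_this_row = np.array([4, 7, 13, 50])

# Added safety check: a target that is not in the data would loop forever
if not (df2.values == find_this_row).all(axis=1).any():
    raise ValueError("find_this_row does not occur in df2")

count = 0
while True:
    random_row = df2.sample(n=1).values[0]   # 1D array for the sampled row
    if (random_row == find_this_row).all():
        break
    count += 1   # counts only the failed tries, as in the question

print("Found " + str(find_this_row) + " after " + str(count) + " failed tries")
```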


4 answers

Answer 1 (score 1, accepted): 2018-10-26 02:27:11
Answer 2 (score 0): 2018-10-26 02:20:20
Answer 3 (score 0): 2018-10-26 02:36:08
Answer 4 (score 0): 2018-10-26 09:57:21
