How to add certain rows of a pandas DataFrame to a list based on the values in another column

Question

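I have a CSV file with one column labeled "count", followed by 10 columns labeled 1-10. There are 100 rows in total. For each of the ten columns, I want to add to a list all the values in that column where the "count" value is between 100 and 400. This will produce 10 lists. I have attached a sample of the data, along with the code I have so far. Thanks.

Essentially, for each column, I want a list of all the values whose row has a "count" value between 100 and 400. I want to save all the lists in a dictionary that maps each column header to the list of its values within the desired interval.

So far I have: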

import pandas as pd
dict ={}
data = pd.read_csv('Data.csv') 
headers = data.columns.values
headers = headers[1:]
count = 1
for header in headers:
    for index, row in data.iterrows():
        dict[str(count)] = []
        if 100<=data.loc[index, 'count'] <= 400:
            dict[str(count)].append(data.loc[index, header])
count+=1

But this seems to crash in Jupyter Notebook. Thanks!

Answer 1

Conceptually you are almost there, but you probably just need a built-in pandas method to do this for you: to_dict.

# Get the data which falls into the range of interest
range_data = data[(100<=data['count'])&(data['count']<=400)]

# Convert column names to strings (rather than numbers)
range_data.columns = range_data.columns.astype(str)

# Convert to a dictionary
value_dict = range_data.drop(columns=['count']).to_dict(orient='list')
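
To try the approach without the original file, here is a minimal, self-contained sketch. The sample DataFrame is made up and only stands in for Data.csv; it uses integer column labels so that the string conversion is visible, and it swaps the two explicit comparisons for Series.between, which is inclusive on both ends by default.

```python
import pandas as pd

# Hypothetical sample data standing in for Data.csv (values are made up).
data = pd.DataFrame({
    "count": [50, 100, 250, 400, 900],
    1: [0.1, 0.2, 0.3, 0.4, 0.5],
    2: [10, 20, 30, 40, 50],
})

# Keep only the rows whose 'count' lies in [100, 400].
in_range = data[data["count"].between(100, 400)]

# Drop 'count', turn the remaining headers into strings,
# and build {header: list of values}.
value_dict = (
    in_range.drop(columns=["count"])
    .rename(columns=str)
    .to_dict(orient="list")
)

print(value_dict)
```

Each key of value_dict is a column header and each value is the list of entries from the rows that passed the filter.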

Range comparison

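A chained comparison like this one works in Python only for scalar values, which is what data.loc[index, 'count'] returns here:

100<=data.loc[index, 'count'] <= 400

It is equivalent to writing each comparison separately:

100<=data.loc[index, 'count'] and data.loc[index, 'count'] <= 400

On a whole column such as data['count'], neither form works, because the truth value of a Series is ambiguous. There each comparison has to be wrapped in parentheses and combined with &, as in the filter above.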
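
A quick way to check how these comparisons behave is to run them on a scalar and on a whole column. The sketch below uses a made-up Series named counts as a stand-in for data['count'].

```python
import pandas as pd

# Hypothetical column of counts (values are made up).
counts = pd.Series([50, 100, 250, 400, 900], name="count")

# On a single scalar, the chained form and the explicit `and` form agree.
value = counts.loc[2]
chained = 100 <= value <= 400
explicit = 100 <= value and value <= 400

# On the whole Series, the chained form raises ValueError because it
# implicitly asks for the truth value of a Series.
try:
    100 <= counts <= 400
    series_chained_ok = True
except ValueError:
    series_chained_ok = False

# Element-wise version: parenthesize each comparison and combine with &.
mask = (100 <= counts) & (counts <= 400)
```

The resulting mask is a boolean Series that can be used directly for filtering, for example data[mask].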

Naming

Naming your dictionary dict is a bad idea. The name shadows the built-in dict, so you can no longer call dict to create a new dictionary. It can also be confusing to read.
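
The effect of the shadowing can be seen in a small sketch. The function shadowing_demo is a hypothetical helper, used only to keep the shadowed name local so the rest of the session is unaffected.

```python
def shadowing_demo():
    dict = {"a": 1}  # the local name 'dict' now hides the built-in inside this function
    try:
        dict(b=2)    # tries to call the dictionary object itself
    except TypeError as exc:
        return str(exc)
    return None

error_message = shadowing_demo()
print(error_message)

# With a descriptive name, the built-in stays available.
value_dict = {}
fresh = dict(b=2)
```

In a notebook, where the assignment happens at the top level, the built-in stays hidden until you run del dict or restart the kernel.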
