Extract values from a pandas DataFrame using different column indices for each row

Question

I have a DataFrame with 1,000,000 rows and 100 columns.

   0       1          2         3         4         5         6          ...
0  2.645751  2.828427  3.000000  3.000000  3.000000  3.000000  3.000000   
1  2.645751  2.828427  2.828427  3.000000  3.000000  3.000000  3.000000   
2  2.449490  2.449490  2.645751  2.645751  2.645751  2.645751  2.645751   
3  2.000000  2.236068  2.449490  2.449490  2.449490  2.449490  2.449490   
4  2.449490  2.828427  2.828427  2.828427  2.828427  2.828427  2.828427   
5  1.414214  1.414214  1.414214  1.414214  1.414214  1.414214  1.732051

Reproducible example (pass the dict to pd.DataFrame to convert it to df):

df={0: {0: 2.6457513110645907, 1: 2.6457513110645907},
 1: {0: 2.8284271247461903, 1: 2.8284271247461903},
 2: {0: 3.0, 1: 2.8284271247461903},
 3: {0: 3.0, 1: 3.0},
 4: {0: 3.0, 1: 3.0},
 5: {0: 3.0, 1: 3.0},
 6: {0: 3.0, 1: 3.0},
 7: {0: 3.0, 1: 3.0},
 8: {0: 3.0, 1: 3.0},
 9: {0: 3.0, 1: 3.0},
 10: {0: 3.0, 1: 3.0},
 11: {0: 3.0, 1: 3.0},
 12: {0: 3.0, 1: 3.0},
 13: {0: 3.0, 1: 3.0},
 14: {0: 3.0, 1: 3.0},
 15: {0: 3.0, 1: 3.0},
 16: {0: 3.0, 1: 3.0},
 17: {0: 3.1622776601683795, 1: 3.0},
 18: {0: 3.1622776601683795, 1: 3.0},
 19: {0: 3.1622776601683795, 1: 3.0},
 20: {0: 3.1622776601683795, 1: 3.0},
 21: {0: 3.1622776601683795, 1: 3.0},
 22: {0: 3.1622776601683795, 1: 3.0},
 23: {0: 3.1622776601683795, 1: 3.0},
 24: {0: 3.1622776601683795, 1: 3.0},
 25: {0: 3.1622776601683795, 1: 3.0},
 26: {0: 3.1622776601683795, 1: 3.1622776601683795},
 27: {0: 3.1622776601683795, 1: 3.1622776601683795},
 28: {0: 3.1622776601683795, 1: 3.1622776601683795},
 29: {0: 3.1622776601683795, 1: 3.1622776601683795},
 30: {0: 3.1622776601683795, 1: 3.1622776601683795},
 31: {0: 3.1622776601683795, 1: 3.1622776601683795},
 32: {0: 3.1622776601683795, 1: 3.1622776601683795},
 33: {0: 3.1622776601683795, 1: 3.3166247903554},
 34: {0: 3.1622776601683795, 1: 3.3166247903554},
 35: {0: 3.1622776601683795, 1: 3.3166247903554},
 36: {0: 3.3166247903554, 1: 3.3166247903554},
 37: {0: 3.3166247903554, 1: 3.3166247903554},
 38: {0: 3.3166247903554, 1: 3.3166247903554},
 39: {0: 3.3166247903554, 1: 3.3166247903554},
 40: {0: 3.3166247903554, 1: 3.3166247903554},
 41: {0: 3.3166247903554, 1: 3.3166247903554},
 42: {0: 3.3166247903554, 1: 3.3166247903554},
 43: {0: 3.3166247903554, 1: 3.3166247903554},
 44: {0: 3.3166247903554, 1: 3.3166247903554},
 45: {0: 3.3166247903554, 1: 3.3166247903554},
 46: {0: 3.3166247903554, 1: 3.3166247903554},
 47: {0: 3.3166247903554, 1: 3.3166247903554},
 48: {0: 3.3166247903554, 1: 3.3166247903554},
 49: {0: 3.3166247903554, 1: 3.3166247903554},
 50: {0: 3.3166247903554, 1: 3.3166247903554},
 51: {0: 3.3166247903554, 1: 3.3166247903554},
 52: {0: 3.3166247903554, 1: 3.3166247903554},
 53: {0: 3.3166247903554, 1: 3.3166247903554},
 54: {0: 3.3166247903554, 1: 3.3166247903554},
 55: {0: 3.3166247903554, 1: 3.3166247903554},
 56: {0: 3.3166247903554, 1: 3.3166247903554},
 57: {0: 3.3166247903554, 1: 3.3166247903554},
 58: {0: 3.3166247903554, 1: 3.3166247903554},
 59: {0: 3.3166247903554, 1: 3.3166247903554},
 60: {0: 3.3166247903554, 1: 3.3166247903554},
 61: {0: 3.3166247903554, 1: 3.3166247903554},
 62: {0: 3.3166247903554, 1: 3.3166247903554},
 63: {0: 3.3166247903554, 1: 3.3166247903554},
 64: {0: 3.3166247903554, 1: 3.3166247903554},
 65: {0: 3.3166247903554, 1: 3.3166247903554},
 66: {0: 3.3166247903554, 1: 3.3166247903554},
 67: {0: 3.3166247903554, 1: 3.3166247903554},
 68: {0: 3.3166247903554, 1: 3.3166247903554},
 69: {0: 3.3166247903554, 1: 3.3166247903554},
 70: {0: 3.3166247903554, 1: 3.3166247903554},
 71: {0: 3.3166247903554, 1: 3.3166247903554},
 72: {0: 3.3166247903554, 1: 3.3166247903554},
 73: {0: 3.3166247903554, 1: 3.3166247903554},
 74: {0: 3.3166247903554, 1: 3.3166247903554},
 75: {0: 3.3166247903554, 1: 3.3166247903554},
 76: {0: 3.3166247903554, 1: 3.3166247903554},
 77: {0: 3.3166247903554, 1: 3.3166247903554},
 78: {0: 3.3166247903554, 1: 3.3166247903554},
 79: {0: 3.3166247903554, 1: 3.3166247903554},
 80: {0: 3.3166247903554, 1: 3.3166247903554},
 81: {0: 3.3166247903554, 1: 3.3166247903554},
 82: {0: 3.3166247903554, 1: 3.3166247903554},
 83: {0: 3.3166247903554, 1: 3.3166247903554},
 84: {0: 3.3166247903554, 1: 3.3166247903554},
 85: {0: 3.3166247903554, 1: 3.3166247903554},
 86: {0: 3.3166247903554, 1: 3.3166247903554},
 87: {0: 3.3166247903554, 1: 3.3166247903554},
 88: {0: 3.3166247903554, 1: 3.3166247903554},
 89: {0: 3.3166247903554, 1: 3.3166247903554},
 90: {0: 3.3166247903554, 1: 3.3166247903554},
 91: {0: 3.3166247903554, 1: 3.3166247903554},
 92: {0: 3.3166247903554, 1: 3.3166247903554},
 93: {0: 3.3166247903554, 1: 3.3166247903554},
 94: {0: 3.3166247903554, 1: 3.3166247903554},
 95: {0: 3.3166247903554, 1: 3.3166247903554},
 96: {0: 3.3166247903554, 1: 3.3166247903554},
 97: {0: 3.3166247903554, 1: 3.3166247903554},
 98: {0: 3.3166247903554, 1: 3.3166247903554},
 99: {0: 3.3166247903554, 1: 3.3166247903554}}

I have a list of lists of different lengths, which contains the indices of the columns I need for each row.

list_idx = [[array([ 7, 12, 49])], [array([ 4, 34, 41, 45, 80, 82])]]

The first element of list_idx ([array([ 7, 12, 49])]) holds the columns to extract for the first row. In other words, for the first row I need the values in the columns labelled 7, 12 and 49 of my DataFrame.

Here is the code to do this, but is there a faster way to extract the values?

finalListofList = []
# iterrows() yields (index, Series) tuples; idx[0] is the array of column labels for that row
for (row, idx) in zip(df.iterrows(), list_idx):
    finalListofList.append(list(row[1][idx[0]]))
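For anyone who wants to run the loop without the full 100-column dict, here is a minimal sketch on a made-up 2 x 6 frame. The data, the variable name baseline and the explicit .loc on the row are assumptions for illustration, not part of the original code.

```python
import numpy as np
import pandas as pd

# Hypothetical miniature of the data: 2 rows, 6 columns labelled 0..5.
df = pd.DataFrame([[10.0, 11.0, 12.0, 13.0, 14.0, 15.0],
                   [20.0, 21.0, 22.0, 23.0, 24.0, 25.0]])

# Same nesting as in the question: one single-element list per row,
# each holding an array of column labels.
list_idx = [[np.array([1, 4])], [np.array([0, 2, 5])]]

baseline = []
for (_, row), idx in zip(df.iterrows(), list_idx):
    # row is a Series indexed by column label; idx[0] unwraps the inner list.
    baseline.append(row.loc[idx[0]].tolist())

print(baseline)
```

This gives one list per row, each with as many values as that row has requested columns.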

Answer 1

You can simply use DataFrame.loc:

finalListofList = df.loc[0,list_idx[0][0]].values
# array([3.        , 3.        , 3.31662479])

Note that the extra [0] in list_idx[0][0] is needed because you have a nested list, i.e. list_idx[0] is still a list (containing one array), which is not a valid indexer in this case.
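The snippet above only covers the first row. A minimal sketch of how the same DataFrame.loc lookup could be repeated for every row follows, assuming list_idx has one entry per row in the same order as df.index. The small frame and the names used here are made up for illustration, and this has not been timed against the original loop.

```python
import numpy as np
import pandas as pd

# Hypothetical 2 x 6 frame: row 0 holds 0..5, row 1 holds 6..11.
df = pd.DataFrame(np.arange(12, dtype=float).reshape(2, 6))
list_idx = [[np.array([1, 4])], [np.array([0, 2, 5])]]

# One .loc lookup per row: the row label comes from df.index,
# the column labels from the unwrapped array idx[0].
per_row = [
    df.loc[row_label, idx[0]].tolist()
    for row_label, idx in zip(df.index, list_idx)
]

print(per_row)
```

Each .loc call builds a small Series, so on a million rows this may well cost about as much as iterrows.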

You can read more about this in the pandas user guide section on indexing and selecting data.

Answer 2

Use NumPy indexing in a list comprehension:

finalListofList = [row[idx[0]].tolist() for row, idx in zip(df.values, list_idx)]
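If the Python-level loop over rows is still the bottleneck, one possible variation is to do all lookups in a single NumPy fancy-indexing call and split the result afterwards. This is only a sketch: it assumes the column labels are the positions 0..n-1 (as in the question), the small frame and variable names are invented, and whether it is actually faster on the real data has not been measured here.

```python
import numpy as np
import pandas as pd

# Hypothetical 2 x 6 frame: row 0 holds 0..5, row 1 holds 6..11.
df = pd.DataFrame(np.arange(12, dtype=float).reshape(2, 6))
list_idx = [[np.array([1, 4])], [np.array([0, 2, 5])]]

values = df.to_numpy()
cols = [idx[0] for idx in list_idx]          # unwrap the nested lists
lengths = np.array([len(c) for c in cols])   # number of columns wanted per row

# Build matching row/column position arrays and index once.
row_ids = np.repeat(np.arange(len(cols)), lengths)
col_ids = np.concatenate(cols)
flat = values[row_ids, col_ids]

# Cut the flat result back into one list per row.
result = [chunk.tolist() for chunk in np.split(flat, np.cumsum(lengths)[:-1])]

print(result)
```

If a flat array plus the lengths is enough downstream, the final split can be skipped entirely.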
