Find any cell value == x in a Pandas dataframe and return the cell value, column header, row, and neighbouring cell value

Question

I realise this is quite a lengthy ask, but I have been trying to solve this for days now with no success and wondered if anyone might have some ideas.

Consider a spreadsheet like so:

        apple1  grape1  apple2  grape2  apple3  grape3
1          0       4     -0.2     2       0       4
2          0       4       0      6       0       3
3        -0.1      2       0      4       0       4
4        -0.5      5       0      6     -0.2      5
5        -0.4      4       0      5       0       2
6          0       6     -0.1     5       0       3

I would like to search my dataframe for any cell with a value less than -0.1, and write out the value, column header, row number, and neighbouring value.

At the start, I thought it might be as simple as something along the lines of:

Newlist()

if df >= -0.1:
   Newlist.append(cell.value)
   Newlist.append(row.value)
   Newlist.append(column.value)
   Newlist.append(cell.value.shift(1))

I fully realise the above makes no sense, but I hope it conveys the idea of what I've been trying to do.

Next, I could convert the df to a list and work from there (using an if not >= -0.1 to delete objects?), but this seems inelegant and far from ideal. I am, however, open to this if anyone can get it to work.

I must have looked at every Stack Exchange question ever posted on this without managing anything, so apologies if I've overlooked something very obvious.

Thanks!

Answer 1

First, to filter your dataframe you can use boolean indexing like this:

df[df >= -0.1]

This way, every value that is not greater than or equal to -0.1 is displayed as NaN; you can then use pd.isnull() to identify those cells.
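As a minimal sketch of this masking step, assuming the question's table is held in a DataFrame named df (rebuilt by hand here, with row labels 1 to 6 as an assumption):

```python
import pandas as pd

# Sample data copied from the question; the row labels 1..6 are an assumption
df = pd.DataFrame(
    {
        "apple1": [0, 0, -0.1, -0.5, -0.4, 0],
        "grape1": [4, 4, 2, 5, 4, 6],
        "apple2": [-0.2, 0, 0, 0, 0, -0.1],
        "grape2": [2, 6, 4, 6, 5, 5],
        "apple3": [0, 0, 0, -0.2, 0, 0],
        "grape3": [4, 3, 4, 5, 2, 3],
    },
    index=range(1, 7),
)

# Cells that fail the condition are replaced with NaN
masked = df[df >= -0.1]

# Boolean frame: True where a value was masked out, i.e. was below -0.1
was_masked = pd.isnull(masked)

print(masked)
print(was_masked.sum().sum())  # count of cells below -0.1
```

The mask keeps the frame's shape, so row labels and column headers stay aligned with the original data.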

To get the row and column of each value you want, you could turn your dataframe into an array with df.to_numpy() and iterate over the rows and columns with enumerate, which keeps track of the row/column index you are currently on:

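import pandas as pd

# Values below -0.1 become NaN; everything else keeps its value
my_data = df[df >= -0.1].to_numpy()
for idrow, row in enumerate(my_data):
   for idcol, col in enumerate(row):
       # Skip the cells that the mask turned into NaN
       if not pd.isnull(col):
           print("Value :"+str(col)+" column:"+str(idcol)+" row:"+str(idrow))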

This will result in something like this:

Value :0.0 column:0 row:0
Value :4.0 column:1 row:0
Value :2.0 column:3 row:0

You can get the column name by using this inside the loop:

df.columns[idcol]
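Row labels can be looked up the same way through df.index. The sketch below is one possible variation, not the method above: it uses numpy.argwhere in place of the nested loop, and it tests for values below -0.1 (the condition stated in the question, which is the opposite of the filter shown earlier). The small frame is a hypothetical excerpt of the question's table.

```python
import numpy as np
import pandas as pd

# Hypothetical two-row excerpt of the question's table
df = pd.DataFrame(
    {"apple1": [0.0, -0.5], "grape1": [4, 5], "apple2": [-0.2, 0.0]},
    index=[1, 4],
)

# Integer (row, column) positions of the cells that satisfy the condition
positions = np.argwhere((df < -0.1).to_numpy())

# Translate the positions into row labels and column headers
hits = [(df.index[r], df.columns[c], df.iat[r, c]) for r, c in positions]
print(hits)
```

Each tuple holds the row label, the column header, and the cell value, in row-major order.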

Once you have those indices, you can get the neighbouring values by direct access, i.e.:

my_data[x][y]

Just remember to add a condition so that you do not access values outside the array!
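Putting the pieces together, a rough sketch might look like the following. The helper name find_below is hypothetical, "neighbour" is assumed to mean the next column in the same row, and the condition is the question's (values below the threshold) instead of the >= filter used above.

```python
import pandas as pd


def find_below(df, threshold):
    """Collect cells below `threshold` with labels and right-hand neighbour.

    Hypothetical helper; "neighbour" is assumed to mean the next column
    in the same row.
    """
    data = df.to_numpy()
    n_cols = data.shape[1]
    records = []
    for idrow, row in enumerate(data):
        for idcol, value in enumerate(row):
            if value < threshold:
                # Bounds check: the last column has no right-hand neighbour
                neighbour = row[idcol + 1] if idcol + 1 < n_cols else None
                records.append(
                    {
                        "value": value,
                        "column": df.columns[idcol],
                        "row": df.index[idrow],
                        "neighbour": neighbour,
                    }
                )
    return pd.DataFrame(records)


# Small hypothetical frame: hits in the last column have no neighbour
df = pd.DataFrame(
    {"apple1": [0.0, -0.5], "grape1": [4.0, 5.0], "apple2": [-0.2, -0.3]},
    index=[1, 4],
)
result = find_below(df, -0.1)
print(result)
```

Returning a DataFrame instead of printing makes it easy to write the matches out afterwards, for example with result.to_csv(...).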

Answer 1 | 0 | Accepted | 2019-03-22 12:52:59
