
How to get row and column number in dataframe in Pandas?

How can I get the row and the column of the cell in a dataframe that contains a certain value using Pandas? For example, I have the following dataframe:

[image: dataframe]

For example, I need to know the row and column of "Smith" (row 1, column LastName).

Maybe this is a solution or a first step to a solution.

If you filter for the value you are looking for, all items that are not the value you want are replaced with NaN. Now you can drop all rows and columns where all values are NaN. This leaves a DataFrame with your item and its indices. Then you can ask for the index and the column name.

import numpy as np
import pandas as pd
df = pd.DataFrame({'LastName':['a', 'Smith', 'b'], 'other':[1,2,3]})

value = df[df=='Smith'].dropna(axis=0, how='all').dropna(axis=1, how='all')
print(value.index.values)
print(value.columns.values)

But I think this can be improved.
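One possible refinement, offered as a sketch and not as the original answer's method: the two arrays printed above list the matching rows and the matching columns separately, so with several matches it is not obvious which row belongs to which column. Stacking the boolean mask keeps each (row, column) pair together. The names mask, stacked and positions are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'LastName': ['a', 'Smith', 'b'], 'other': [1, 2, 3]})

# Boolean frame: True wherever a cell equals the value we are looking for
mask = df.eq('Smith')

# stack() turns the frame into a Series indexed by (row label, column name)
stacked = mask.stack()

# Keep only the True entries; their index holds the (row, column) pairs
positions = stacked[stacked].index.tolist()
print(positions)
```

Each tuple in positions pairs a row label with a column name, so multiple hits stay unambiguous.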

Here's a one-liner that gets the row and column of the first matching value and stops as soon as it is found:

df = pd.DataFrame({"ClientID": [34, 67, 53], "LastName": ["Johnson", "Smith", "Brows"] })
result = next(x[1:] for x in ((v, i, j) for i, row_tup in enumerate(df.itertuples(index=False)) for j, v in zip(df.columns, row_tup)) if x[0] == "Smith")
print(result)

Output

(1, 'LastName')

Unpacking that one-liner:

# This is a generator that unpacks the dataframe and gets the value, row number (i) and column name (j) for every value in the dataframe
item_generator = ((v, i, j) for i, row_tup in enumerate(df.itertuples(index=False)) for j, v in zip(df.columns, row_tup))
# This iterates through the generator until it finds a match
# It outputs just the row number and column name by leaving off the first item in the tuple
next(x[1:] for x in item_generator if x[0] == "Smith")

Props to this answer for the second half of the solution.
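One caveat worth illustrating: next() raises StopIteration when the generator is exhausted, so the one-liner fails if the value is absent. A minimal sketch, assuming you would prefer None in that case, passes a default to next(). The helper name find_first is hypothetical and not part of the original answer.

```python
import pandas as pd


def find_first(df, target):
    """Return (row number, column name) of the first match, or None."""
    cells = (
        (i, col)
        for i, row_tup in enumerate(df.itertuples(index=False))
        for col, v in zip(df.columns, row_tup)
        if v == target
    )
    # The second argument to next() is returned when nothing matches
    return next(cells, None)


df = pd.DataFrame({"ClientID": [34, 67, 53],
                   "LastName": ["Johnson", "Smith", "Brows"]})
found = find_first(df, "Smith")
missing = find_first(df, "Nobody")
print(found, missing)
```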

Just to add another possible solution to the bucket. If you really need to search your whole DataFrame, you may consider using numpy.where, such as:

import numpy as np

value = 'Smith'
rows, cols = np.where(df.values == value)

where_are_you = [(df.index[row], df.columns[col]) for row, col in zip(rows, cols)]

So, if your DataFrame looks like

   ClientID First Name LastName
0        34         Mr    Smith
1        67      Keanu   Reeves
2        53     Master     Yoda
3        99      Smith    Smith
4       100      Harry   Potter

The code output will be:

[(0, 'LastName'), (3, 'First Name'), (3, 'LastName')]
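For anyone who wants to run this, here is a self-contained sketch that rebuilds the example frame shown above. It also includes a single-column lookup for the case where you do not need to search the whole DataFrame; that second part is an addition for contrast, not something the answer above spells out. df.to_numpy() is used in place of df.values, which should behave the same here.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "ClientID": [34, 67, 53, 99, 100],
    "First Name": ["Mr", "Keanu", "Master", "Smith", "Harry"],
    "LastName": ["Smith", "Reeves", "Yoda", "Smith", "Potter"],
})

# Whole-frame search: np.where returns positional row and column indices
rows, cols = np.where(df.to_numpy() == "Smith")
matches = [(df.index[r], df.columns[c]) for r, c in zip(rows, cols)]

# Single-column search: only the row labels are needed
last_name_rows = df.index[df["LastName"] == "Smith"].tolist()

print(matches)
print(last_name_rows)
```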

Edit: Just to satisfy everybody's curiosity, here is a benchmark of all answers:


The code is written below. I removed the print statements to be fair, because they would make the code really slow for bigger dataframes.

import numpy as np
import pandas as pd

val = 0  # value every candidate function searches for

def setup(n=10):
    return pd.DataFrame(np.random.randint(-100, 100, (n, 3)))


def nested_for(df):
    index = df.index  # Allows to get the row index
    columns = df.columns  # Allows to get the column name
    value_to_be_checked = val
    for i in index[df.isin([value_to_be_checked]).any(axis=1)].to_list():
        for j, e in enumerate(df.iloc[i]):
            if e == value_to_be_checked:
                _ = "(row {}, column {})".format(i, columns[j])


def df_twin_dropna(df):
    value = df[df == val].dropna(axis=0, how='all').dropna(axis=1, how='all')
    return value.index.values, value.columns.values


def numpy_where(df):
    rows, cols = np.where(df.values == val)
    return [(df.index[row], df.columns[col]) for row, col in zip(rows, cols)]


def one_line_generator(df):
    # Compare against val (not "Smith") so every candidate searches for the same value
    return [x[1:] for x in ((v, i, j) for i, row_tup in enumerate(df.itertuples(index=False))
                            for j, v in zip(df.columns, row_tup)) if x[0] == val]
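The harness that produced the benchmark is not shown, so the following is only a sketch of how such functions could be timed with the standard-library timeit module. The helper name time_candidates, the frame sizes and the repeat count are all assumptions. Two of the candidate functions are repeated so the block runs on its own.

```python
import timeit

import numpy as np
import pandas as pd

val = 0  # value searched for, as in the benchmark code above


def setup(n=10):
    return pd.DataFrame(np.random.randint(-100, 100, (n, 3)))


def numpy_where(df):
    rows, cols = np.where(df.values == val)
    return [(df.index[row], df.columns[col]) for row, col in zip(rows, cols)]


def df_twin_dropna(df):
    value = df[df == val].dropna(axis=0, how="all").dropna(axis=1, how="all")
    return value.index.values, value.columns.values


def time_candidates(candidates, sizes=(10, 100, 1000), repeats=20):
    """Return {function name: [average seconds per call, one entry per size]}."""
    timings = {func.__name__: [] for func in candidates}
    for n in sizes:
        df = setup(n)  # all candidates share the same frame at this size
        for func in candidates:
            total = timeit.timeit(lambda: func(df), number=repeats)
            timings[func.__name__].append(total / repeats)
    return timings


timings = time_candidates([numpy_where, df_twin_dropna])
for name, per_size in timings.items():
    print(name, per_size)
```

Absolute numbers will vary by machine and library version, so only the relative ordering at each size is meaningful.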

I tried to simplify the code and make it more readable. This is my attempt:

df = pd.DataFrame({'points': [25, 12, 15, 14, 19],
                   'assists': [5, 7, 7, 9, 12],
                   'rebounds': [11, 8, 10, 6, 6]})
index = df.index # Allows to get the row index
columns = df.columns # Allows to get the column name
value_to_be_checked = 6
for i in index[df.isin([value_to_be_checked]).any(axis=1)].to_list():
  for j, e in enumerate(df.iloc[i]):
    if e == value_to_be_checked:
      print("(row {}, column {})".format(i, columns[j]))
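Note that the loop above passes index labels to df.iloc, which only works when the frame has the default 0..n-1 index. A hedged variant for frames with other (unique) row labels could use df.loc instead. The sample frame, the string labels and the names target and found are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    {"points": [25, 12, 15], "rebounds": [11, 8, 6]},
    index=["a", "b", "c"],  # non-default row labels, assumed unique
)
target = 6
found = []

# Row labels of every row that contains the target at least once
matching_labels = df.index[df.isin([target]).any(axis=1)]

for label in matching_labels:
    # df.loc[label] is a Series whose index is the column names
    for col, value in df.loc[label].items():
        if value == target:
            found.append((label, col))

print(found)
```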

You can do this by looping through all the columns and finding the matching rows. This will give you a list of all the cells that match your criteria:

Method 1 (without comprehension):

import pandas as pd
# assume this df and that we are looking for 'Smith'
df = pd.DataFrame({
    'clientid': [34, 67, 53],
    'lastname': ['Johnson', 'Smith', 'Brows']
})

Searchval = 'Smith'
l1 = []
#loop through all the columns
for col in df.columns:
    #finding the matching rows
    for i in range(len(df[col][df[col].eq(Searchval)].index)):
        #appending the output to the list
        l1.append((df[col][df[col].eq(Searchval)].index[i], col))
print(l1)

Method 2 (with comprehension):

import pandas as pd

df = pd.DataFrame({
    'clientid': [34, 67, 53],
    'lastname': ['Johnson', 'Smith', 'Brows']
})
#Value to search
Searchval = 'Smith'
#using list comprehension to find the rows in each column which matches the criteria
#and saving it in a list in case we get multiple matches
l = [(df[col][df[col].eq(Searchval)].index[i], col) for col in df.columns
     for i in range(len(df[col][df[col].eq(Searchval)].index))]

print(l)
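Both methods above build the same boolean filter twice per column and then index into the result by position. A possible simplification, sketched here on the same sample data, iterates directly over the matching row labels. The variable name matches is illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'clientid': [34, 67, 53],
    'lastname': ['Johnson', 'Smith', 'Brows']
})
Searchval = 'Smith'

# For each column, walk the index labels of the rows equal to Searchval
matches = [(idx, col)
           for col in df.columns
           for idx in df.index[df[col].eq(Searchval)]]

print(matches)
```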

Thanks for submitting your request. This is something you can find with a Google search. Please make some attempt to find answers before asking a new question.

You can find simple and excellent dataframe examples that include column and row selection here: https://studymachinelearning.com/python-pandas-dataframe/

You can also see the official documentation here: https://pandas.pydata.org/pandas-docs/stable/

Select a column by column name:

df['col']

Select a row by index label:

df.loc['b']  
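As a runnable illustration of the two selections above, here is a small sketch on a made-up frame. The column name 'col' and the row label 'b' follow the snippets above; the data, the extra column and the df.at lookup for a single cell are assumptions added for completeness.

```python
import pandas as pd

df = pd.DataFrame(
    {"col": [10, 20, 30], "other": ["x", "y", "z"]},
    index=["a", "b", "c"],
)

column = df["col"]         # Series holding every value in column 'col'
row = df.loc["b"]          # Series holding row 'b', indexed by column name
cell = df.at["b", "col"]   # single value at row 'b', column 'col'

print(column.tolist(), row["other"], cell)
```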
