Indexing a Pandas DataFrame by a mix of row number and column names

Question

Coming from R, I find the indexing rules for pandas DataFrames hard to use. I have a DataFrame from which I want to get the i-th row and some columns by name. I understand how to use either iloc or loc on its own, as shown below.

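import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randn(8, 4), columns=['A', 'B', 'C', 'D'])
df.loc[:, ['A', 'B']]   # all rows, columns selected by name
df.iloc[0:, 0:2]        # all rows, first two columns selected by position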

Conceptually, what I want is something like:

df.loc[0:,['A', 'B']]

Meaning the first row with those columns. Of course that code fails. I can seemingly use:

df.loc[0:0,['A', 'B']]

This works, but it seems strange. How does one properly index using a combination of row number and column names? In R we would do something like:

df = data.frame(matrix(rnorm(32),8,4))
colnames(df) <- c("A", "B", "C", "D") 
df[1, c('A', 'B')]

*** UPDATE *** I was mistaken; the example code above does work on this toy DataFrame. But on my real data I see the following. Both objects are of the same type and the code is the same, so I don't understand the error here.

type(poly_set)
<class 'pandas.core.frame.DataFrame'>
poly_set.loc[:,['P1', 'P2', 'P3']]
                      P1            P2           P3
29   -2.0897226679999998  -1.237649556         None
361  -2.0789117340000001   0.144751427  1.572417454
642  -2.0681314259999999  -0.196563749  1.500834574

poly_set.loc[0,['P1', 'P2', 'P3']]
Traceback (most recent call last):
  File "C:\Users\AppData\Local\Programs\Python\Python38-32\lib\site-packages\pandas\core\indexes\base.py", line 2646, in get_loc
    return self._engine.get_loc(key)
  File "pandas\_libs\index.pyx", line 111, in pandas._libs.index.IndexEngine.get_loc
  File "pandas\_libs\index.pyx", line 138, in pandas._libs.index.IndexEngine.get_loc
  File "pandas\_libs\hashtable_class_helper.pxi", line 998, in pandas._libs.hashtable.Int64HashTable.get_item
  File "pandas\_libs\hashtable_class_helper.pxi", line 1005, in pandas._libs.hashtable.Int64HashTable.get_item
KeyError: 0

Answer 1

You are using a slice, which selects every row between two given index labels. If you only want the first row, try:

df = df.reset_index()    
df.loc[0,['A', 'B']]
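
To illustrate why resetting the index helps, here is a minimal sketch. The frame is made up: the index labels (29, 361, 642) are copied from the question's printout, but the values are invented. Note that plain reset_index() keeps the old labels as a new "index" column, while reset_index(drop=True), used here, discards them.

```python
import pandas as pd

# Hypothetical data: labels mimic the question's printout, values are invented.
poly_set = pd.DataFrame(
    {"P1": [-2.09, -2.08, -2.07], "P2": [-1.24, 0.14, -0.20]},
    index=[29, 361, 642],
)

# After the reset the labels are 0, 1, 2, so label 0 is also the first row.
renumbered = poly_set.reset_index(drop=True)
first = renumbered.loc[0, ["P1", "P2"]]
print(first)
```

The trade-off is that the original labels are lost (or moved into a column), which may not be acceptable if they carry meaning.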

Answer 2

You can use .iloc (to get the i-th row) and .loc (to get columns by name) together:

row_number = 0
df.iloc[row_number].loc[['A', 'B']]

You can even drop the .loc:

df.iloc[row_number][['A', 'B']]
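
A short sketch of what the chained lookup does, on a made-up frame with a non-default index (the data and labels are assumptions for illustration). df.iloc[row_number] returns the row as a Series indexed by column name, so the second lookup is an ordinary label selection on that Series.

```python
import numpy as np
import pandas as pd

# Hypothetical frame: values 0..11, index labels that do not start at 0.
df = pd.DataFrame(
    np.arange(12).reshape(3, 4),
    columns=["A", "B", "C", "D"],
    index=[29, 361, 642],
)

row_number = 0
row = df.iloc[row_number]   # Series whose index is the column names
picked = row[["A", "B"]]    # select entries of that Series by name
print(picked)
```

Chaining like this is fine for reading values. For assignment, a single .loc or .iloc call is the safer choice, since chained assignment may end up writing to a temporary copy.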

Answer 3

I agree that pandas slicing rules are not as easy to use as they should be. I believe the suggested approach these days is to use loc[] with a nested index lookup:

df.loc[df.index[row_numbers], ['A','B']]
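
Here is a minimal sketch of that lookup on an invented frame, next to a purely positional alternative that converts the column names to positions with columns.get_indexer() and passes everything to iloc. The data and the variable names by_label and by_position are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical frame: values 0..11, index labels that do not start at 0.
df = pd.DataFrame(
    np.arange(12).reshape(3, 4),
    columns=list("ABCD"),
    index=[29, 361, 642],
)

row_numbers = [0, 2]

# Translate row positions to labels, then do one label-based lookup.
by_label = df.loc[df.index[row_numbers], ["A", "B"]]

# Translate column names to positions, then do one position-based lookup.
by_position = df.iloc[row_numbers, df.columns.get_indexer(["A", "B"])]

print(by_label)
```

Both give the same result when the index labels are unique. If the index contains duplicate labels, the loc version can return more rows than requested, so the iloc version is probably the safer of the two in that case.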

I have no idea why pandas still does not have an xloc[] or something similar that accepts row numbers together with column names. See this answer to the same question.

In your question update you use loc[], which can only look up row and column labels, and you can see from the previous printout that there is no row with a label of 0. The row at position 0 has the label 29. If you use my approach or the others mentioned here, it will work.
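
The difference between label and position can be reproduced with a small made-up frame; the labels come from the question's printout, and the values are invented.

```python
import pandas as pd

# Hypothetical data: labels mimic the question's printout, values are invented.
poly_set = pd.DataFrame({"P1": [-2.09, -2.08, -2.07]}, index=[29, 361, 642])

# loc looks for the label 0, which does not exist in this index.
try:
    poly_set.loc[0, ["P1"]]
    raised = False
except KeyError:
    raised = True

# The row at position 0 carries the label 29.
first_label = poly_set.index[0]
value = poly_set.loc[first_label, "P1"]
print(raised, first_label, value)
```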
