Get the first column of a Pandas DataFrame

Question

This is an odd question, because I know how to do what I want; what I don't understand is why one particular approach doesn't work.

Take a simple DataFrame:

import pandas as pd
a = pd.DataFrame([[0,1], [2,3]])

I can easily slice this DataFrame: the first column is a[[0]] and the second is a[[1]]. Simple, isn't it?

Now let's take a more complex DataFrame. This is part of my code:

var_vec = [i for i in range(100)]
num_of_sites = 100
row_names = ["_".join(["loc", str(i)]) for i in 
             range(1,num_of_sites + 1)]
frame = pd.DataFrame(var_vec, columns = ["Variable"], index = row_names)
spec_ab = [i**3 for i in range(100)]
frame[1] = spec_ab

frame is also a pandas DataFrame, just like a. I can easily get the second column with frame[[1]]. But when I try frame[[0]], I get an error:

Traceback (most recent call last):

  File "<ipython-input-55-0c56ffb47d0d>", line 1, in <module>
    frame[[0]]

  File "C:\Users\Robert\Desktop\Záloha\WinPython-64bit-3.5.2.2\python-3.5.2.amd64\lib\site-packages\pandas\core\frame.py", line 1991, in __getitem__
    return self._getitem_array(key)

  File "C:\Users\Robert\Desktop\Záloha\WinPython-64bit-3.5.2.2\python-3.5.2.amd64\lib\site-packages\pandas\core\frame.py", line 2035, in _getitem_array
    indexer = self.ix._convert_to_indexer(key, axis=1)

  File "C:\Users\Robert\Desktop\Záloha\WinPython-64bit-3.5.2.2\python-3.5.2.amd64\lib\site-packages\pandas\core\indexing.py", line 1184, in _convert_to_indexer
    indexer = labels._convert_list_indexer(objarr, kind=self.name)

  File "C:\Users\Robert\Desktop\Záloha\WinPython-64bit-3.5.2.2\python-3.5.2.amd64\lib\site-packages\pandas\indexes\base.py", line 1112, in _convert_list_indexer
    return maybe_convert_indices(indexer, len(self))

  File "C:\Users\Robert\Desktop\Záloha\WinPython-64bit-3.5.2.2\python-3.5.2.amd64\lib\site-packages\pandas\core\indexing.py", line 1856, in maybe_convert_indices
    raise IndexError("indices are out-of-bounds")

IndexError: indices are out-of-bounds

I can still use frame.iloc[:,0], but I don't understand why simple [[]] slicing doesn't work here. In case it helps, I'm using WinPython with Spyder 3.

Answer 1

Using your code:

import pandas as pd

var_vec = [i for i in range(100)]
num_of_sites = 100
row_names = ["_".join(["loc", str(i)]) for i in 
             range(1,num_of_sites + 1)]
frame = pd.DataFrame(var_vec, columns = ["Variable"], index = row_names)
spec_ab = [i**3 for i in range(100)]
frame[1] = spec_ab

If you print frame, you get:

    Variable    1
loc_1   0       0
loc_2   1       1
loc_3   2       8
loc_4   3       27
loc_5   4       64
loc_6   5       125
......

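So the cause of the problem becomes apparent: frame has no column labelled 0. You first define a list called var_vec, then build a DataFrame from it while specifying both the index values and the column name (which is usually good practice). The numeric column labels 0, 1, ... seen in the first example are generated only when you don't specify column names; they are labels, not positional indexers.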
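
One way to check this is to compare the column labels of the two frames. The sketch below rebuilds both frames from the question and tests whether the label 0 exists in each. The variable names a_labels, frame_labels, and lookup_failed are only illustrative. The exception type may depend on the pandas version (the traceback above shows IndexError, while newer releases are expected to raise KeyError), so both are caught.

```python
import pandas as pd

# Frame from the first example: no column names are given,
# so pandas generates the integer labels 0 and 1.
a = pd.DataFrame([[0, 1], [2, 3]])

# Frame from the second example: one named column, then a column labelled 1.
row_names = ["loc_{}".format(i) for i in range(1, 101)]
frame = pd.DataFrame(list(range(100)), columns=["Variable"], index=row_names)
frame[1] = [i**3 for i in range(100)]

a_labels = list(a.columns)
frame_labels = list(frame.columns)
print(a_labels)
print(frame_labels)

# [] and [[]] look columns up by label, so what matters is
# whether 0 is one of the labels.
has_zero_in_a = 0 in a.columns
has_zero_in_frame = 0 in frame.columns

try:
    frame[[0]]
    lookup_failed = False
except (KeyError, IndexError):
    # No column is labelled 0 in frame, so the lookup fails.
    lookup_failed = True
```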

If you want to access a column by position, you can use:

df[df.columns[0]]

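What happens here is that df.columns gives you the df's column labels, you pick the one at position 0, and you pass that label back to df.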
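
Applied to a shortened version of the question's frame (5 rows instead of 100, an assumption made to keep the example small), the two steps look roughly like this:

```python
import pandas as pd

# Shortened stand-in for the frame in the question.
frame = pd.DataFrame({"Variable": list(range(5))})
frame[1] = [i**3 for i in range(5)]

# Step 1: look up the label stored at position 0.
first_label = frame.columns[0]

# Step 2: select by that label.
first_col = frame[first_label]          # single brackets give a Series
first_col_frame = frame[[first_label]]  # double brackets give a DataFrame

print(first_label)
print(first_col_frame.shape)
```

Single brackets return a Series, while the double-bracket form used in the question returns a one-column DataFrame.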

Hope that helps you understand.

Edit:

Another (better) way would be:

df.iloc[:,0]

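Here ":" stands for all rows, and 0 is the position of the column. (Rows are likewise addressed by position, from 0 up to the number of rows minus one.)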
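
As a small sketch of positional selection, again on a shortened 5-row stand-in for the question's frame: passing a plain integer to iloc returns a Series, while passing a list returns a one-column DataFrame, which is the closest equivalent to the [[...]] form from the question.

```python
import pandas as pd

# Shortened stand-in for the frame in the question, with the same kind of labels.
frame = pd.DataFrame(
    {"Variable": list(range(5))},
    index=["loc_{}".format(i) for i in range(1, 6)],
)
frame[1] = [i**3 for i in range(5)]

# iloc works on positions only, so the labels do not matter here.
as_series = frame.iloc[:, 0]    # first column as a Series
as_frame = frame.iloc[:, [0]]   # first column as a one-column DataFrame
second = frame.iloc[:, 1]       # second column, which happens to be labelled 1

print(as_series.name)
print(as_frame.shape)
```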

Answer 1: score 8, accepted, 2017-01-31 10:37:02