Pandas DataFrame to Numpy Array ValueError

I am trying to convert a single column of a dataframe to a numpy array. Converting the entire dataframe has no issues.

df

  viz  a1_count  a1_mean     a1_std
0   0         3        2   0.816497
1   1         0      NaN        NaN 
2   0         2       51  50.000000

Both of these calls work fine:

X = df.as_matrix()
X = df.as_matrix(columns=df.columns[1:])

However, when I try:

y = df.as_matrix(columns=df.columns[0])

I get:

TypeError: Index(...) must be called with a collection of some kind, 'viz' was passed

The problem is that you're passing a single element, which in this case is just the string label of that column. If you wrap it in a list with a single element, it works:

In [97]:
y = df.as_matrix(columns=[df.columns[0]])
y

Out[97]:
array([[0],
       [1],
       [0]], dtype=int64)

Here is what you're passing:

In [101]:
df.columns[0]

Out[101]:
'viz'

So it's equivalent to this:

y = df.as_matrix(columns='viz')

which results in the same error.
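DataFrame.as_matrix was deprecated in pandas 0.23 and removed in pandas 1.0, so the calls above will not run on a current install. The same string-versus-list distinction can still be seen with bracket selection and to_numpy. The following is a minimal sketch, assuming a frame rebuilt by hand from the values in the question; the names first, y_2d and y_1d are only illustrative.

```python
import numpy as np
import pandas as pd

# Frame rebuilt from the question's printed output (an assumption about its exact dtypes).
df = pd.DataFrame({
    "viz": [0, 1, 0],
    "a1_count": [3, 0, 2],
    "a1_mean": [2.0, np.nan, 51.0],
    "a1_std": [0.816497, np.nan, 50.0],
})

first = df.columns[0]            # a plain string label: 'viz'

# A list of labels selects a DataFrame, so the array is 2-D with one column.
y_2d = df[[first]].to_numpy()

# A single label selects a Series, so the array is 1-D.
y_1d = df[first].to_numpy()

print(y_2d.shape, y_1d.shape)
```

Which form is preferable depends on what the consumer of y expects: a column vector of shape (n, 1) or a flat vector of shape (n,).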

The docs show the expected params:

DataFrame.as_matrix(columns=None): Convert the frame to its Numpy-array representation.

Parameters: columns: list, optional, default: None. If None, return all columns; otherwise, returns specified columns.

as_matrix expects a list for the columns keyword, and df.columns[0] isn't a list. Try df.as_matrix(columns=[df.columns[0]]) instead.

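Using the Index tolist method works as well. Note that df.columns[0] is a plain string, which has no tolist method, so take a one-element slice of the Index instead:

df.as_matrix(columns=df.columns[:1].tolist())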
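A quick check of what each expression returns may help here. This is only a sketch, assuming a small frame with the question's first two column names; label, one_col and as_list are illustrative names. Indexing the columns with a scalar gives a Python string, while slicing gives an Index, and only the Index has a tolist method.

```python
import pandas as pd

# Small stand-in frame using the question's first two column names.
df = pd.DataFrame({"viz": [0, 1, 0], "a1_count": [3, 0, 2]})

label = df.columns[0]      # scalar indexing returns the label itself, a str
one_col = df.columns[:1]   # slicing returns an Index of length one

label_has_tolist = hasattr(label, "tolist")   # str has no tolist method
as_list = one_col.tolist()                    # Index.tolist gives a plain list

print(type(label).__name__, label_has_tolist, as_list)
```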

When passing multiple columns, for example the first ten, the command

df.as_matrix(columns=[df.columns[0:10]])

does not work, because df.columns[0:10] already returns an Index and the extra brackets wrap that Index in a list. However, using

df.as_matrix(columns=df.columns[0:10].tolist())

works well.
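On pandas 1.0 and later, where as_matrix no longer exists, the same multi-column selection can be sketched with to_numpy. The example below assumes a frame rebuilt from the question's values and takes the first two columns rather than ten, since the sample frame only has four; cols, X and X_pos are illustrative names.

```python
import numpy as np
import pandas as pd

# Frame rebuilt from the question's printed output (an assumption).
df = pd.DataFrame({
    "viz": [0, 1, 0],
    "a1_count": [3, 0, 2],
    "a1_mean": [2.0, np.nan, 51.0],
    "a1_std": [0.816497, np.nan, 50.0],
})

# Slice the Index, then turn it into a plain list of column names.
cols = df.columns[0:2].tolist()

# Select by the list of names and convert to an array.
X = df[cols].to_numpy()

# Purely positional alternative that never touches the column names.
X_pos = df.iloc[:, 0:2].to_numpy()

print(cols, X.shape)
```

The positional form is convenient when only the column order matters; selecting by name is safer if the column order might change.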
