
Difference between a numpy array and a numpy vector

I want to know the difference between these two lines of code:

 X_train = training_dataset.iloc[:, 1].values
 X_train = training_dataset.iloc[:, 1:2].values

My guess is that the latter is a 2-D numpy array and the former is a 1-D numpy array. For inputs to a neural network, the latter is the proper way to pass input data. Is there a specific reason for that?

Please help!

You can check the number of dimensions of each result directly:

X_train.ndim

The first line gives ndim=1, with shape (R,), and the second gives ndim=2, with shape (R, 1). In other words, the first array has no defined second dimension. If you want to see the difference between the two shapes, I suggest reading this: Difference between numpy.array shape (R, 1) and (R,)
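As a rough illustration of how the two shapes behave differently, here is a minimal NumPy-only sketch. The arrays `a` and `b` are made-up stand-ins for the two `X_train` results, not data from the question.

```python
import numpy as np

# Hypothetical stand-ins for the two X_train arrays
a = np.array([15, 25, 2, 20, 27])   # shape (5,): a single axis
b = a.reshape(-1, 1)                # shape (5, 1): five rows, one column

print(a.ndim, a.shape)
print(b.ndim, b.shape)

# Transposing a 1-D array changes nothing; a 2-D column becomes a row
a_t = a.T
b_t = b.T
print(a_t.shape, b_t.shape)

# Broadcasting also treats the shapes differently:
# (5,) combined with (5, 1) expands to a (5, 5) grid
grid = a + b
print(grid.shape)
```

The values are identical in both arrays; only the axis layout differs, which is what matters to code that expects rows and columns.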

The difference is that iloc returns a Series when a single row or column is selected, but a DataFrame when a range of rows or columns is selected (reference).

Although they both refer to column 1, 1 and 1:2 are different types: 1 is an int and 1:2 is a slice.
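To make the int-versus-slice point concrete, the sketch below spells out the indexers as Python objects. The small `df` is a hypothetical frame, and the list indexer `[1]` is an extra case added only for comparison.

```python
import pandas as pd

# Hypothetical frame used only for illustration
df = pd.DataFrame({'Height': [1, 14, 2], 'Width': [15, 25, 2]})

int_indexer = 1              # what iloc receives for [:, 1]
slice_indexer = slice(1, 2)  # what iloc receives for [:, 1:2]

by_int = df.iloc[:, int_indexer]      # int   -> Series
by_slice = df.iloc[:, slice_indexer]  # slice -> DataFrame
by_list = df.iloc[:, [1]]             # list  -> DataFrame (extra case)

for name, obj in [('int', by_int), ('slice', by_slice), ('list', by_list)]:
    print(name, type(obj).__name__, obj.shape)

# The explicit slice object selects the same data as the 1:2 syntax
same = by_slice.equals(df.iloc[:, 1:2])
print(same)
```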

With:

X_train = training_dataset.iloc[:, 1].values

you specify a single column, so training_dataset.iloc[:, 1] is a pandas Series and .values is a 1D NumPy array.

Versus:

X_train = training_dataset.iloc[:, 1:2].values

Although it selects only one column, 1:2 is a slice that represents a column range, so training_dataset.iloc[:, 1:2] is a pandas DataFrame. Thus, .values is a 2D NumPy array.
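As for why the 2D form suits neural network input, one common explanation (an assumption here, not something stated in this thread) is that most libraries expect input laid out as (n_samples, n_features), so a layer can compute a matrix product over the feature axis. A minimal NumPy sketch, with hypothetical weights `W` and bias `b` standing in for a dense layer:

```python
import numpy as np

X_1d = np.array([15.0, 25.0, 2.0, 20.0, 27.0])  # shape (5,)
X_2d = X_1d.reshape(-1, 1)                      # shape (5, 1): 5 samples, 1 feature

# Hypothetical dense layer: 1 input feature -> 3 units
rng = np.random.default_rng(0)
W = rng.normal(size=(1, 3))
b = np.zeros(3)

# With the 2-D input, each row is one sample and the product is well defined
out = X_2d @ W + b
print(out.shape)

# With the 1-D input, NumPy reads the 5 values as a single 5-feature vector,
# which does not match the 1-feature weight matrix
one_d_failed = False
try:
    X_1d @ W
except ValueError as err:
    one_d_failed = True
    print("1-D input fails:", err)
```

The 1D array is ambiguous (five samples of one feature, or one sample of five features?), while the 2D shape states the layout explicitly.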

Test as follows:

Create the training_dataset DataFrame:

import pandas as pd

data = {'Height': [1, 14, 2, 1, 5], 'Width': [15, 25, 2, 20, 27]}
training_dataset = pd.DataFrame(data)

Using .iloc[:, 1]:

print(type(training_dataset.iloc[:, 1]))
print(training_dataset.iloc[:, 1].values)

# Result is:
<class 'pandas.core.series.Series'>
# .values returns a 1D NumPy array
[15 25  2 20 27]

Using .iloc[:, 1:2]:

print(type(training_dataset.iloc[:, 1:2]))
print(training_dataset.iloc[:, 1:2].values)

# Result is:
<class 'pandas.core.frame.DataFrame'>
# .values is a 2D NumPy array (since these are the values of a pandas DataFrame)
[[15]
 [25]
 [ 2]
 [20]
 [27]]
# In both cases, type(X_train) is <class 'numpy.ndarray'>
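If you already have one form and need the other, you can convert between them without going back to the DataFrame. The sketch below reuses the test data from this answer; the names `col_1d` and `col_2d` are introduced here for illustration.

```python
import numpy as np
import pandas as pd

data = {'Height': [1, 14, 2, 1, 5], 'Width': [15, 25, 2, 20, 27]}
training_dataset = pd.DataFrame(data)

col_1d = training_dataset.iloc[:, 1].values    # from a Series
col_2d = training_dataset.iloc[:, 1:2].values  # from a DataFrame
print(col_1d.shape, col_2d.shape)

# 1-D -> 2-D column: -1 lets NumPy infer the number of rows
as_column = col_1d.reshape(-1, 1)

# 2-D column -> 1-D: flatten the single column back out
as_flat = col_2d.ravel()

print(np.array_equal(as_column, col_2d))
print(np.array_equal(as_flat, col_1d))
```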
