How to convert a Series of arrays into a single matrix in pandas/numpy?
I somehow ended up with a pandas.Series that contains a bunch of arrays, like s in the code below.
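import pandas as pd

data = [[1, 2, 3], [2, 3, 4], [3, 4, 5], [2, 3, 4], [3, 4, 5], [2, 3, 4],
        [3, 4, 5], [2, 3, 4], [3, 4, 5], [2, 3, 4], [3, 4, 5]]
s = pd.Series(data=data)
s.shape   # output ---> (11,)
# try to convert s to a matrix
# (originally s.as_matrix(), which was removed in pandas 1.0; s.values gives the same result)
sm = s.values
# but...
sm.shape  # output ---> (11,)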
How can I convert s into a matrix with shape (11, 3)? Thanks!
If, for some reason, you have found yourself with that abomination of a Series, getting it back into the sort of matrix or array you want is straightforward:
In [16]: s
Out[16]:
0 [1, 2, 3]
1 [2, 3, 4]
2 [3, 4, 5]
3 [2, 3, 4]
4 [3, 4, 5]
5 [2, 3, 4]
6 [3, 4, 5]
7 [2, 3, 4]
8 [3, 4, 5]
9 [2, 3, 4]
10 [3, 4, 5]
dtype: object
In [17]: sm = np.matrix(s.tolist())
In [18]: sm
Out[18]:
matrix([[1, 2, 3],
[2, 3, 4],
[3, 4, 5],
[2, 3, 4],
[3, 4, 5],
[2, 3, 4],
[3, 4, 5],
[2, 3, 4],
[3, 4, 5],
[2, 3, 4],
[3, 4, 5]])
In [19]: sm.shape
Out[19]: (11, 3)
But unless it's something you can't change, having that Series makes little sense to begin with.
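If a plain ndarray is preferred over np.matrix (NumPy's documentation no longer recommends the matrix class for new code), the same tolist() approach works with np.array. A minimal sketch, assuming every element of the Series has the same length; the shortened data list is only for illustration:

```python
import numpy as np
import pandas as pd

# Shortened, illustrative version of the data from the question.
data = [[1, 2, 3], [2, 3, 4], [3, 4, 5], [2, 3, 4]]
s = pd.Series(data)

# Each element is a Python list, so the Series has dtype object and
# its shape only counts the rows.
print(s.dtype, s.shape)

# Going through a list of lists lets NumPy infer a regular 2-D array.
arr = np.array(s.tolist())
print(arr.shape)
```

If the inner lists had different lengths, NumPy could not build a rectangular array this way, so this only applies to uniformly sized rows.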
Another way is to extract the values of your Series and use numpy.stack on them.
np.stack(s.values)
PS. I've run into similar situations often.
For pandas >= 0.24, you can also use np.stack(s.to_numpy()) or np.concatenate(s.to_numpy()), depending on your requirements.
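Which of the two to pick depends on the shape you need. A small sketch of the difference, assuming a Series of equal-length 1-D arrays (the three-row Series here is made up for illustration):

```python
import numpy as np
import pandas as pd

# Illustrative Series holding three 1-D arrays of length 3.
s = pd.Series([np.array([1, 2, 3]), np.array([2, 3, 4]), np.array([3, 4, 5])])

# np.stack adds a new leading axis: one row per Series element.
stacked = np.stack(s.to_numpy())
print(stacked.shape)

# np.concatenate joins the arrays along their existing axis,
# producing one long 1-D array instead of a matrix.
flat = np.concatenate(s.to_numpy())
print(flat.shape)
```

So np.stack is the one that yields the (rows, columns) matrix asked for in the question, while np.concatenate flattens everything into a single vector.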
I tested the above methods with 5793 100-dimensional vectors. The old method, converting to a list first, is the fastest.
%time print(np.stack(df.features.values).shape)
%time print(np.stack(df.features.to_numpy()).shape)
%time print(np.array(df.features.tolist()).shape)
%time print(np.array(list(df.features)).shape)
Result
(5793, 100)
CPU times: user 11.7 ms, sys: 3.42 ms, total: 15.1 ms
Wall time: 22.7 ms
(5793, 100)
CPU times: user 11.1 ms, sys: 137 µs, total: 11.3 ms
Wall time: 11.9 ms
(5793, 100)
CPU times: user 5.96 ms, sys: 0 ns, total: 5.96 ms
Wall time: 6.91 ms
(5793, 100)
CPU times: user 5.74 ms, sys: 0 ns, total: 5.74 ms
Wall time: 6.43 ms
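To repeat a comparison like this outside IPython, something along these lines could be used. It is only a sketch: the random data is a synthetic stand-in for the original df.features column, the candidates dictionary is a made-up helper, and the timings will differ by machine and library version.

```python
import timeit

import numpy as np
import pandas as pd

# Synthetic stand-in for the original data: 5793 vectors of length 100.
rng = np.random.default_rng(0)
df = pd.DataFrame({"features": list(rng.random((5793, 100)))})

# The four conversions from the timing test above.
candidates = {
    "np.stack(values)": lambda: np.stack(df.features.values),
    "np.stack(to_numpy())": lambda: np.stack(df.features.to_numpy()),
    "np.array(tolist())": lambda: np.array(df.features.tolist()),
    "np.array(list(...))": lambda: np.array(list(df.features)),
}

shapes = {}
for name, fn in candidates.items():
    shapes[name] = fn().shape
    # Best of two runs, three calls each, reported per call.
    seconds = min(timeit.repeat(fn, number=3, repeat=2)) / 3
    print(f"{name:22s} {seconds * 1e3:8.2f} ms  shape={shapes[name]}")
```

Taking the minimum over repeated runs reduces noise from other processes, which a single %time call does not account for.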