Restructuring a 2-D numpy array into a 3-D numpy array according to values in a column of a dataframe
I have a 2-D numpy array, let's say like this:
matrix([[1., 0., 0., ..., 1., 0., 0.],
[1., 0., 0., ..., 0., 1., 1.],
[1., 0., 0., ..., 1., 0., 0.],
[1., 1., 0., ..., 1., 0., 0.],
[1., 1., 0., ..., 1., 0., 0.],
[1., 1., 0., ..., 1., 0., 0.]])
I want to transform it into a 3-D numpy array based on the values of a column of a dataframe. Let's say the column is like this:
df = pd.DataFrame({"Case":[1,1,2,2,3,4]})
The final 3-D array should look like this:
matrix([
[
[1., 0., 0., ..., 1., 0., 0.], [1., 0., 0., ..., 0., 1., 1.]
],
[
[1., 0., 0., ..., 1., 0., 0.], [1., 1., 0., ..., 1., 0., 0.]
],
[
[1., 1., 0., ..., 1., 0., 0.]
],
[
[1., 1., 0., ..., 1., 0., 0.]
]
])
The first 2 rows of the initial 2-D array become one 2-D array of the final 3-D array, because the first and second rows of the dataframe column both have the value '1'. Similarly, the next 2 rows become another 2-D array of 2 rows, because the next two values of the column are '2', so they belong together. There is only one row each for the values '3' and '4', so the next two 2-D arrays of the 3-D array have only 1 row each.
So, basically, if two or more values in the dataframe column are the same, then the rows of the initial 2-D matrix at those indices belong together: they are combined into a 2-D matrix and pushed as an element of the final 3-D matrix.
How do I do this?
NumPy doesn't have very good support for arrays with rows of different lengths, but you can make it a list of 2-D arrays instead:
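import numpy as np
import pandas as pd

# np.array builds an array from nested lists (np.ndarray expects a shape instead).
# The `...` entries stand for the columns elided in the question.
M = np.array(
[[1., 0., 0., ..., 1., 0., 0.],
[1., 0., 0., ..., 0., 1., 1.],
[1., 0., 0., ..., 1., 0., 0.],
[1., 1., 0., ..., 1., 0., 0.],
[1., 1., 0., ..., 1., 0., 0.],
[1., 1., 0., ..., 1., 0., 0.]]
)
df = pd.DataFrame({"Case":[1,1,2,2,3,4]})
# One 2-D array per case value. unique() keeps the cases in order of first
# appearance, which iterating over a set does not guarantee.
# Assumes df has the default 0..n-1 index, so labels match row positions in M.
M_per_case = [
np.stack([M[index] for index in df.index[df['Case'] == case]])
for case in df['Case'].unique()
]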
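If rows with the same case value are always adjacent, as they are in the question, the grouping can also be sketched with np.split. This is only a minimal sketch under that contiguity assumption; the 6x4 matrix below is a made-up stand-in, because the question's actual columns are elided.

```python
import numpy as np
import pandas as pd

# Hypothetical 6x4 stand-in for the question's matrix (its real columns are elided).
M = np.array([
    [1., 0., 0., 0.],
    [1., 0., 1., 1.],
    [1., 0., 0., 0.],
    [1., 1., 0., 0.],
    [1., 1., 1., 0.],
    [1., 1., 0., 1.],
])
df = pd.DataFrame({"Case": [1, 1, 2, 2, 3, 4]})

cases = df["Case"].to_numpy()

# Positions where the case value differs from the previous row,
# i.e. the first row of every group except the first one.
boundaries = np.flatnonzero(cases[1:] != cases[:-1]) + 1

# Cut M along axis 0 at those positions: one 2-D array per run of equal cases.
M_per_case = np.split(M, boundaries)

for block in M_per_case:
    print(block.shape)
```

The result is still a plain Python list of 2-D arrays, because the groups have different numbers of rows. If equal case values can be scattered through the column, this sketch would split them into separate blocks, so selecting rows with a mask per case is the safer choice there.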