Sort parts of 2D numpy array
So I have a numpy array A of dimensions (8760,12): basically all the hours of 12 years. I need to sort each month (730 hours) in each year in the array.
I haven't found any way to do it inside the array, so my solution was to take out each month, sort it and then create the entire 2D array again.
I was thinking of doing something along the lines of what I have below, but it isn't working.
total=np.zeroes([8760,12])
for j in range(1,12):
    for i in range (1,12):
        #here I take out every month of every year
        month=A[730*(i-1):-730*(12-i),(j-1):-(12-j)]
        #here I sort the data
        month_sorted=np.sort(month,axis=0,kind='quicksort')
        #here I try to add the sorted months back into 1 big array
        np.concatenate(total,month_sorted,axis=0)
    np.concatenate(total,month_sorted,axis=1)
Concatenate doesn't work on arrays of different sizes, and I don't really have a way to place the month of year 2 in row 2 of my array. I guess it should be done with indexing, idx or iloc or something like that.
EDIT: My values are integers. The result should be the values in each row ordered from low to high within every block of 730 values (the hours in a month).
So imagine I had 3 years instead of 12 and 9 hours instead of 8760, to be sorted every 3 hours instead of every 730 hours. The array looks like this:
[[30,40,10,20,50,60,80,200,100]
[8,20,5,6,8,1,5,3,2]
[520,840,600,525,430,20,1,506,703]]
And should be converted into:
[[10,30,40,20,50,60,80,100,200]
[5,8,20,1,6,8,2,3,5]
[520,600,840,20,430,525,1,506,703]]
So my current code takes out the first part 30,40,10 and sorts it as 10,30,40. But the part that I can't solve is how to create the big array again from all the smaller ones in the 2 loops.
You can use Python indexing and assignment instead of concatenate if you create the empty array first.
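import numpy as np

A = np.random.randint(0, 99, (8760, 12))
total = np.zeros_like(A)  # same shape and integer dtype as A
for j in range(12):        # each year (column)
    for i in range(12):    # each month (block of 730 rows)
        total[730*i:730*(i+1), j] = np.sort(A[730*i:730*(i+1), j])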
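As a rough check, the same loop can be wrapped in a function and run on the small example from the question. This is only a sketch: `sort_blocks` and its `block` parameter are names made up here, and it assumes the number of rows is an exact multiple of the block size. The question's example has one year per row, so it is transposed to the (hours, years) layout first.

```python
import numpy as np

def sort_blocks(a, block):
    """Sort each run of `block` consecutive rows within every column of `a`.

    Hypothetical helper; assumes a.shape[0] is a multiple of `block`.
    """
    out = np.empty_like(a)
    for j in range(a.shape[1]):
        for i in range(a.shape[0] // block):
            rows = slice(block * i, block * (i + 1))
            out[rows, j] = np.sort(a[rows, j])
    return out

# Example from the question: 3 years (rows) of 9 hours, sorted every 3 hours.
example = np.array([[30, 40, 10, 20, 50, 60, 80, 200, 100],
                    [8, 20, 5, 6, 8, 1, 5, 3, 2],
                    [520, 840, 600, 525, 430, 20, 1, 506, 703]])

# Transpose to (hours, years), sort the blocks, transpose back for display.
result = sort_blocks(example.T, block=3).T
print(result)
```

For the real data the call would be `sort_blocks(A, block=730)`.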
If you want the same thing starting from no array and using a concatenate-like function, I would do it like this:
total2 = None
for j in range(12):
    app1 = None
    for i in range(12):
        # sorted month i of year j
        app = np.sort(A[730*i:730*(i+1), j])
        if app1 is None:
            app1 = app
        else:
            app1 = np.hstack((app1, app))
    # stack the finished year as a new row
    if total2 is None:
        total2 = app1
    else:
        total2 = np.vstack((total2, app1))
# rows are years at this point; transpose back to (8760, 12)
total2 = np.transpose(total2)
EDIT to answer a comment (how to apply the same sorting to a different array):
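bs = 3  # block size (730 for the real data)
B2 = np.empty_like(B)  # B is the other array, assumed to have the same shape as A
for j in range(A.shape[1]):
    for i in range(A.shape[0] // bs):
        # order that sorts this block of A
        A2_order = np.argsort(A[bs * i : bs * (i + 1), j])
        # apply the same order to the matching block of B
        B2[bs * i : bs * (i + 1), j] = B[A2_order + i * bs, j]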
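The same idea can also be written without explicit loops, using `np.argsort` on a reshaped view together with `np.take_along_axis`. This is a sketch rather than part of the original answer: the arrays `A` and `B` below are small made-up data, `B` is assumed to have the same shape as `A`, and the row count is assumed to be a multiple of `bs`.

```python
import numpy as np

bs = 3  # block size (730 for the real data)
rng = np.random.default_rng(0)
A = rng.integers(0, 99, size=(9, 3))  # hypothetical data: 9 hours, 3 years
B = rng.integers(0, 99, size=(9, 3))  # hypothetical companion array

# View each column as blocks: shape (n_blocks, bs, n_years).
A_blocks = A.reshape(-1, bs, A.shape[1])
B_blocks = B.reshape(-1, bs, B.shape[1])

# Sorting order inside every block, computed from A only.
order = np.argsort(A_blocks, axis=1, kind="stable")

# Apply that order to both arrays, then restore the original shape.
A2 = np.take_along_axis(A_blocks, order, axis=1).reshape(A.shape)
B2 = np.take_along_axis(B_blocks, order, axis=1).reshape(B.shape)
```

A stable sort is requested so that equal values in `A` keep their original relative order, which makes the reordering of `B` deterministic.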
You can avoid looping altogether.
First transpose and reshape the array so that the array indices go from coarse to fine (year -> month -> hour).
A = np.transpose(A)
A = np.reshape(A, [12, 12, 730])
Now you can select all hours of a month as A[year, month].
Conveniently, the np.sort function sorts along the last axis of the array by default, so you can just call
A = np.sort(A)
and now each list of A[year, month] entries will be sorted.
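After the sort the array still has shape (12, 12, 730), so getting back to the original (8760, 12) layout means undoing the reshape and the transpose. A minimal sketch of the full round trip, using made-up random data and variable names chosen here for illustration:

```python
import numpy as np

hours_per_month, months, years = 730, 12, 12
rng = np.random.default_rng(0)
# Hypothetical data in the question's layout: (hours, years).
A = rng.integers(0, 99, size=(hours_per_month * months, years))

# (hours, years) -> (years, hours) -> (years, months, hours_per_month)
blocks = A.T.reshape(years, months, hours_per_month)

# np.sort works along the last axis by default: the hours within each month.
blocks = np.sort(blocks)

# Undo the reshape, then the transpose, to recover the (8760, 12) layout.
A_sorted = blocks.reshape(years, months * hours_per_month).T
```

`A_sorted[730*i:730*(i+1), j]` should then hold the sorted values of month `i` in year `j`, the same result as the loop-based versions.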