I have to set up an input tensor for machine learning purposes, which looks like:
tensor=array[array[object_1],array[object_2],...,array[object_n]]
np.shape(tensor)=(a,n,6)
Every object array is 1-dimensional with, let's say, 6 entries: the variables that describe the object. I want to extend these 6 entries with 4 more entries. The variables for this extended information are saved in an array:
np.shape(extra_information)=(a,m,4) #m<n
m is less than n because each array in extra_information applies to an interval of objects. I could use for loops to do this, but it has to scale well for large values of a. I thought about bringing extra_information into the shape np.shape(extra_information)=(a,m=n,4) and then using something like np.dstack([tensor,extra_information]), but I am not sure if this is the most elegant solution, or how to get extra_information into the shape I want while scaling well. The intervals are saved in an array, too:
np.shape(extra_information_intervall)=(a,m,2) #m<n
Edit:
I now have the following solution, which works but is surely inefficient:
import numpy as np

def extend(track,jet,padding_size):
    event,jets,var=np.shape(jet)
    jet_to_track_data=[]
    for i in jet:
        event_jet_to_track=[]
        lasttrack=0
        for k in i:
            if k[4]!=0:
                # real jet: repeat its features once per track in its interval
                x=np.array([k[l] for l in range(var-2)])
                shape=(int(k[var-1]-k[var-2]),var-2)
                value=np.broadcast_to(x,shape)
                event_jet_to_track.append(value)
                lasttrack=k[var-1]
            else:
                # zero-padded jet: fill the remaining tracks with zeros and stop
                x=np.array((var-2)*[0])
                shape=(int(padding_size-lasttrack),var-2)
                value=np.broadcast_to(x,shape)
                event_jet_to_track.append(value)
                break
        jet_to_track_data.append(event_jet_to_track)
    jet_to_track_data=[np.vstack(x) for x in jet_to_track_data]
    jet_to_track_data=np.stack(jet_to_track_data)
    extended=np.concatenate([track,jet_to_track_data],axis=2)
    return extended
To clarify the problem further, here is an example in 2-D:
tensor=[[1,2,3],[3,4,5],[6,7,8],....,[...]]
extra_information=[[a],[b],[c],....[...]]
extra_information_intervall=[[0,2],[3,4],...,[0,0]]
Due to zero padding, the shape of extra_information is (a,n,4), but it only contains information up to the m-th entry and is filled with zeros between m and n. The same is true for extra_information_intervall. The goal is to merge this information into tensor like:
tensor=[[1,2,3,a],[3,4,5,a],[6,7,8,b],....,[...,0]]
IIUC, you have an (a,n,6) tensor consisting of n objects, some of which are interval objects. Each of these interval objects has 4 more features in addition to the 6 already available. The tensor that holds these 4 features has shape (a,m,4), where m<n and m is the number of interval objects among the n objects.
Assuming that these intervals start from the 0th object AND that each one repeats for a given number of repetitions (the interval length), the structure looks like this:
ORIGINAL ADDITIONAL
Obj0 [......] --> Obj0 [....]
Obj1 [......]
Obj2 [......]
Obj3 [......] --> Obj3 [....]
Obj4 [......]
Obj5 [......]
Obj6 [......] --> Obj6 [....]
Obj7 [......]
Obj8 [......]
Assuming that you simply want to copy the additional information to the following objects until the next interval is reached, you will basically fill the gaps between the m intervals so that all n objects have additional info.
You can do this with np.repeat. You can calculate how many times to repeat each of the m objects from your list of intervals and store it in s. A detailed explanation of this is in the last paragraph.
a = np.random.random((2,10,6))
b = np.random.random((2,5,4))
s = np.array([2,3,2,2,1]) #Number of elements = number of objects m
#Sum of elements = number of objects n
new_b = np.repeat(b, s, axis=1)
#new_b shape = (2,10,4)
out = np.dstack((a, new_b))
out.shape
(2, 10, 10)
This would do the following -
ORIGINAL ADDITIONAL
Obj0 [......] --> Obj0 [....]
Obj1 [......] --> Obj0 [....]
Obj2 [......] --> Obj0 [....]
Obj3 [......] --> Obj3 [....]
Obj4 [......] --> Obj3 [....]
Obj5 [......] --> Obj3 [....]
Obj6 [......] --> Obj6 [....]
Obj7 [......] --> Obj6 [....]
Obj8 [......] --> Obj6 [....]
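As a small sketch of how the repeat counts could be derived from interval boundaries, the snippet below assumes half-open [start, end) intervals that are contiguous, start at object 0, and are shared by every batch entry. The `intervals` array and its values are hypothetical and only chosen to reproduce s = [2,3,2,2,1].

```python
import numpy as np

# Hypothetical half-open [start, end) intervals, shared by every batch entry.
intervals = np.array([[0, 2], [2, 5], [5, 7], [7, 9], [9, 10]])

# Repeat count per interval = interval length (assumes end is exclusive).
s = intervals[:, 1] - intervals[:, 0]

# Stand-in for the (a, m, 4) extras: one event, one feature, values 0..4.
b = np.arange(5).reshape(1, 5, 1)
new_b = np.repeat(b, s, axis=1)

print(s.tolist())               # [2, 3, 2, 2, 1]
print(new_b[0, :, 0].tolist())  # [0, 0, 1, 1, 1, 2, 2, 3, 3, 4]
```

If the interval ends are inclusive instead, the counts would need a +1.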
EDIT: I have updated my answer based on your inputs. Since the intervals can be of variable length, you can still simply use np.repeat, but pass another parameter s that tells it how many times to repeat each element. This can be calculated from your list of intervals. The length of the s array needs to equal m, so that it specifies how many times each of the m objects is repeated. AND the sum of s needs to equal n, since the array after repeating must have the same number of objects as the original array.
s = np.array([2,3,2,2,1]) #Number of elements = number of objects m
#Sum of elements = number of objects n
#First object from m objects will repeat 2 times,
#second will repeat 3 times, etc...
#Total number of objects created after
#repetitions = 10 = objects in original tensor.
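Note that np.repeat with axis=1 applies the same counts s to every entry along the first axis. If the intervals differ per event, as the (a,m,2) interval array in the question suggests, one possible workaround is to flatten the first two axes before repeating. The sketch below is only that, under these assumptions: intervals are half-open [start, end), contiguous, start at object 0, and zero-padded rows have length 0. The function name `expand_extra` and the toy data are hypothetical.

```python
import numpy as np

def expand_extra(extra, intervals, n):
    """Repeat each row of extra (a, m, f) by its interval length, zero-filling up to n."""
    a, m, f = extra.shape
    # Interval length per row; zero-padded intervals like [0, 0] give length 0.
    lengths = (intervals[..., 1] - intervals[..., 0]).astype(np.intp)  # (a, m)
    pad = n - lengths.sum(axis=1)                                      # (a,)
    if (pad < 0).any():
        raise ValueError("interval lengths exceed n for at least one event")
    # Append one all-zero row per event that absorbs the remaining padding.
    zeros = np.zeros((a, 1, f), dtype=extra.dtype)
    extra_padded = np.concatenate([extra, zeros], axis=1)              # (a, m+1, f)
    counts = np.concatenate([lengths, pad[:, None]], axis=1)           # (a, m+1)
    # Every event now expands to exactly n rows, so one flat repeat suffices.
    flat = np.repeat(extra_padded.reshape(-1, f), counts.ravel(), axis=0)
    return flat.reshape(a, n, f)

# Toy data: 2 events, up to 3 intervals each, 2 extra features, n = 5 objects.
extra = np.array([[[1., 1.], [2., 2.], [0., 0.]],
                  [[3., 3.], [0., 0.], [0., 0.]]])
intervals = np.array([[[0, 2], [2, 3], [0, 0]],
                      [[0, 4], [0, 0], [0, 0]]])
tensor = np.zeros((2, 5, 6))

expanded = expand_extra(extra, intervals, n=5)
out = np.concatenate([tensor, expanded], axis=2)

print(expanded[0, :, 0].tolist())  # [1.0, 1.0, 2.0, 0.0, 0.0]
print(expanded[1, :, 0].tolist())  # [3.0, 3.0, 3.0, 3.0, 0.0]
print(out.shape)                   # (2, 5, 8)
```

The extra zero row per event makes every event's counts sum to n, which is what allows the flattened result to be reshaped back to (a, n, f) without a Python loop.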