I have some datasets (let's stick with 2 here) which depend on a common variable t, like X1(t) and X2(t). However, X1(t) and X2(t) don't have to share the same t values or even have the same number of data points.
For example they could look like:
t1 = [2,6,7,8,10,13,14,16,17]
X1 = [10,10,10,20,20,20,30,30,30]
t2 = [3,4,5,6,8,10,11,14,15,16]
X2 = [95,100,100,105,158,150,142,196,200,204]
I am trying to create a new dataset YNew(XNew) (=X2(X1)) such that both datasets are linked without the shared variable t. In this case it should look like:
XNew = [10,20,30]
YNew = [100,150,200]
where every occurring X1-value is assigned a corresponding X2-value (a mean value).
Is there an easy, already known way to achieve this (maybe with pandas)? My first guess would be to find all t-values for a certain X1-value (in the example, the X1-value 10 lies in the range t = 2,...,7), then look up all X2-values whose t falls in that range and take their mean. That should make it possible to assign YNew(XNew). Thanks for any advice!
Update: I added a graph, so maybe my intentions are a bit clearer. I want to assign the mean X2-value to the corresponding X1-value in the marked regions (where the same X1-value occurs).
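The idea described above can be sketched compactly with a pandas groupby. This is only a minimal sketch, not an established recipe: the function name link_by_t_range is made up for illustration, and it assumes that each X1-value occupies a single contiguous t-range and that the range boundaries are inclusive.

```python
import pandas as pd

def link_by_t_range(t1, X1, t2, X2):
    """Hypothetical helper: mean of X2 over the t-range covered by each X1 value."""
    df1 = pd.DataFrame({"t": t1, "X1": X1})
    x2 = pd.Series(X2, index=t2)

    # Smallest and largest t at which each distinct X1 value occurs
    # (sort=False keeps the order of first appearance).
    ranges = df1.groupby("X1", sort=False)["t"].agg(["min", "max"])

    means = {}
    for value, row in ranges.iterrows():
        # Assumption: both ends of the t-range are inclusive.
        in_range = (x2.index >= row["min"]) & (x2.index <= row["max"])
        means[value] = x2[in_range].mean()  # NaN if no X2 sample falls in the range
    return pd.Series(means, name="YNew")

t1 = [2, 6, 7, 8, 10, 13, 14, 16, 17]
X1 = [10, 10, 10, 20, 20, 20, 30, 30, 30]
t2 = [3, 4, 5, 6, 8, 10, 11, 14, 15, 16]
X2 = [95, 100, 100, 105, 158, 150, 142, 196, 200, 204]

linked = link_by_t_range(t1, X1, t2, X2)
print(linked)
```

For the example data this gives the means 100, 150 and 200 for the X1-values 10, 20 and 30. If an X1-value came back later at a different t, its min/max range would span everything in between, so that case would need separate handling.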
Alright, I just tried to implement what I described, and it works the way I wanted, although I think some parts are still a little clumsy:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
# datasets to treat
t1 = [2,6,7,8,10,13,14,16,17]
X1 = [10,10,10,20,20,20,30,30,30]
t2 = [3,4,5,6,8,10,11,14,15,16]
X2 = [95,100,100,105,158,150,142,196,200,204]
X1Series = pd.Series(X1, index = t1)
X2Series = pd.Series(X2, index = t2)
X1Values = X1Series.drop_duplicates().values # all occurring values of X1, without duplicates, as an array
# lists for results
XNew = []
YNew = []
# for every occurring value of X1, find the mean value of X2 in the t-range of that X1 value
for value in X1Values:
    indexpos = X1Series[X1Series == value].index.values
    max_t = indexpos.max() # largest and smallest t at which this X1 value occurs
    min_t = indexpos.min()
    print("X1 = "+str(value)+" occurs in range from "+str(min_t)+" to "+str(max_t))
    slicedX2 = X2Series[(X2Series.index >= min_t) & (X2Series.index <= max_t)] # select X2 in that t-range (inclusive)
    print("in this range there are following values of X2:")
    print(slicedX2)
    mean = slicedX2.mean() # mean value of the selection
    print("with the mean value of: " + str(mean))
    # store the pair (X1 value, mean of X2)
    XNew.append(value)
    YNew.append(mean)
fig = plt.figure()
ax1 = fig.add_subplot(211)
ax2 = fig.add_subplot(212)
ax1.plot(t1, X1,'ro-',label='X1(t)')
ax1.plot(t2, X2,'bo',label='X2(t)')
ax1.legend(loc=2)
ax1.set_xlabel('t')
ax1.set_ylabel('X1/X2')
ax2.plot(XNew,YNew,'ro-',label='YNew(XNew)')
ax2.legend(loc=2)
ax2.set_xlabel('XNew')
ax2.set_ylabel('YNew')
plt.show()
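A possibly less clumsy variant is to swap the explicit range search for pandas.merge_asof. This is a different technique from the code above: each X2 sample is labelled with the X1-value at the most recent t1 at or before its own t, and the means are then taken per label. It is only a sketch under the assumption that X1 is piecewise constant and holds its value until the next t1 sample. Unlike the range approach, X2 samples that fall in the gap between two X1 ranges are attributed to the earlier X1-value.

```python
import pandas as pd

t1 = [2, 6, 7, 8, 10, 13, 14, 16, 17]
X1 = [10, 10, 10, 20, 20, 20, 30, 30, 30]
t2 = [3, 4, 5, 6, 8, 10, 11, 14, 15, 16]
X2 = [95, 100, 100, 105, 158, 150, 142, 196, 200, 204]

# merge_asof requires both frames to be sorted by the key column.
samples = pd.DataFrame({"t": t2, "X2": X2}).sort_values("t")
levels = pd.DataFrame({"t": t1, "X1": X1}).sort_values("t")

# For every X2 sample, take the X1 value at the last t1 <= t2.
# X2 samples before the first t1 get NaN and drop out of the groupby.
labelled = pd.merge_asof(samples, levels, on="t", direction="backward")

# Mean X2 per X1 value: the index plays the role of XNew, the values of YNew.
linked_asof = labelled.groupby("X1")["X2"].mean()
print(linked_asof)
```

For the example data both approaches agree (100, 150 and 200), because no X2 sample lies between two X1 ranges.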