Is there a simpler way to write the following code? It currently takes forever to run, which seems strange. Something is wrong with it, and it seems far too complicated for the objective. I also get a SettingWithCopyWarning that says:
A value is trying to be set on a copy of a slice from a DataFrame
The objective is: I have four time series/columns - a, b, c and d, where d is the output column that needs to be populated.
If a is larger than b, then return 1 in column d.
If a is below c, then return 0 in column d.
If a is between b and c, then return the previous item in column d.
Notice that this last if statement references the previous item in column d.
data['d'] = 1
data['previous_d'] = 1
for i in range(len(data.a)):
    data.previous_d.iloc[i] = data.d.iloc[i-1]
    data.d.iloc[i] = np.where(data.a.iloc[i] > data.b.iloc[i], 1, np.where(data.a.iloc[i] < data.c.iloc[i], 0, data.previous_d.iloc[i]))
Boolean indexing is enough here:
data.loc[data.a> data.b,'d'] = 1
data.loc[data.a < data.c,'d'] = 0
data.loc[(data.a < data.b) & (data.a > data.c),'d'] = data.d.shift(1)
Or, if the third condition means that every unfilled value should take the last populated value, leave d as NaN initially (rather than 1) and use
data['d'] = data['d'].ffill()
instead of the last line.
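Putting the forward-fill variant together, here is a minimal sketch rather than a definitive implementation. The helper name `fill_d`, its `initial` parameter, and the sample values are made up for illustration. It assumes the first rule takes priority when both conditions hold (as in the nested np.where in the question), and that rows with no earlier decided value should fall back to 1, matching the initial `data['d'] = 1` in the question.

```python
import numpy as np
import pandas as pd


def fill_d(data: pd.DataFrame, initial: float = 1.0) -> pd.Series:
    """Return column d without an explicit Python loop (hypothetical helper)."""
    # np.select takes the first matching condition, so a > b wins over a < c.
    decided = np.select(
        [data["a"] > data["b"], data["a"] < data["c"]],
        [1.0, 0.0],
        default=np.nan,  # rows where neither rule applies stay undecided
    )
    d = pd.Series(decided, index=data.index)
    # Carry the last decided value forward; leading undecided rows
    # fall back to the assumed initial value.
    return d.ffill().fillna(initial)


# Made-up sample data for illustration only.
data = pd.DataFrame({
    "a": [5, 3, 1, 3, 6, 3],
    "b": [4, 4, 4, 4, 4, 4],
    "c": [2, 2, 2, 2, 2, 2],
})
data["d"] = fill_d(data)
print(data["d"].tolist())  # [1.0, 1.0, 0.0, 0.0, 1.0, 1.0]
```

Assigning the whole column with `data["d"] = ...` avoids chained indexing such as `data.d.iloc[i] = ...`, which is the pattern that usually triggers the SettingWithCopyWarning.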