I need to perform the following steps on a data-frame:
I have tried the following steps:
Created the DataFrame:
import numpy as np
import pandas as pd
df = pd.DataFrame(pd.date_range(start='2019-01-01', end='2019-12-31'), columns=['dt_id'])
Created a column called 'balance':
df["balance"] = 0
Tried to conditionally update the DataFrame:
df["balance"] = np.where(df.index == 0, 100, df["balance"].shift(1) + 1)
From what I can observe, the shifted value is read from the original column before the update is written back, so every row after the first sees the old value (0) instead of the newly computed one.
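A minimal sketch of what the attempt above actually computes, assuming the same setup as in the question. The intermediate name `shifted` is introduced here only for illustration:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(pd.date_range(start="2019-01-01", end="2019-12-31"), columns=["dt_id"])
df["balance"] = 0

# shift(1) is evaluated once, on the column as it is right now (all zeros),
# so it yields NaN for row 0 and 0 for every other row; adding 1 gives NaN, 1, 1, ...
shifted = df["balance"].shift(1) + 1

# np.where only picks between two already-computed arrays; it does not
# feed the result of row i-1 into the calculation for row i.
result = np.where(df.index == 0, 100, shifted)

print(result[:4])  # first value is 100, the following values are all 1
```

So the column ends up as 100 followed by ones, not as a running balance.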
The desired output for the "balance" column:
Row 0: 100
Row 1: 101
Row 2: 102
And so on
If I understand correctly, adding this line after your code gives you the result you want:
df["balance"].cumsum()
0 100.0
1 101.0
2 102.0
3 103.0
4 104.0
...
360 460.0
361 461.0
362 462.0
363 463.0
364 464.0
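This is a cumulative sum: each element becomes the sum of itself and all the elements before it. Since your column holds the starting value 100 followed by ones, the running total is 100, 101, 102, and so on. Note that cumsum() returns a new Series, so assign it back (df["balance"] = df["balance"].cumsum()) if you want to keep the result in the DataFrame.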
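Put together, the approach could look like the sketch below. The cast to `int64` is an optional addition, not part of the original answer; it is there only because `shift(1)` introduces a NaN, which turns the column into floats (hence the `100.0` in the output above):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(pd.date_range(start="2019-01-01", end="2019-12-31"), columns=["dt_id"])
df["balance"] = 0

# The original attempt: yields 100 in row 0 and 1 in every other row
df["balance"] = np.where(df.index == 0, 100, df["balance"].shift(1) + 1)

# Running total of 100, 1, 1, ... gives 100, 101, 102, ...
# Assign the result back, and optionally cast the floats to integers.
df["balance"] = df["balance"].cumsum().astype("int64")

print(df.head(3))
```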
The problem is that you want to calculate an array whose elements depend on each other: element 2 depends on element 1, element 3 depends on element 2, and so on.
Whether there is a simple solution depends on the formula you use, i.e., on whether you can vectorize it. Here is a good explanation of that topic: Is it possible to vectorize recursive calculation of a NumPy array where each element depends on the previous one?
In your case a simple loop should do it:
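balance = np.empty(len(df.index))  # uninitialized float array, one slot per row
balance[0] = 100  # starting value
for i in range(1, len(df.index)):
    balance[i] = balance[i-1] + 1  # or whatever formula you want to use
df["balance"] = balance  # write the result back to the DataFrame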
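If the formula is likely to change, the loop can be wrapped in a small function. This is only a sketch: `recursive_balance` and its parameters `n`, `start` and `step` are hypothetical names made up for this example, not part of pandas or NumPy.

```python
import numpy as np

def recursive_balance(n, start, step):
    """Hypothetical helper: out[0] = start, out[i] = step(out[i-1]). Assumes n >= 1."""
    out = np.empty(n)
    out[0] = start
    for i in range(1, n):
        out[i] = step(out[i - 1])  # each element is derived from the previous one
    return out

# The formula from the question: previous balance plus 1, for 365 rows
balance = recursive_balance(365, 100, lambda prev: prev + 1)
print(balance[:3], balance[-1])
```

Swapping the lambda for another function changes the recurrence without touching the loop.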
Note that the loop is the general solution. Your particular formula can be vectorized, so the same values can also be generated with:
balance = 100 + np.arange(0, len(df.index))
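As a quick sanity check, the following sketch compares the loop result with the vectorized one and writes it back to the DataFrame. The names `loop_balance` and `vector_balance` are used here only to keep the two variants apart:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(pd.date_range(start="2019-01-01", end="2019-12-31"), columns=["dt_id"])
n = len(df.index)

# Loop version: each element is computed from the previous one
loop_balance = np.empty(n)
loop_balance[0] = 100
for i in range(1, n):
    loop_balance[i] = loop_balance[i - 1] + 1

# Vectorized version: 100 + 0, 100 + 1, 100 + 2, ...
vector_balance = 100 + np.arange(n)

# Both approaches produce the same values
assert np.array_equal(loop_balance, vector_balance)

df["balance"] = vector_balance
print(df.tail(3))
```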