Suppose I have two farms, A and B. Each week there are different animals at each farm. How can I get the cumulative number of the animal that is currently at each farm?
+---+-----+--------+-----+--------+
|   | A   | Farm_A | B   | Farm_B |
+---+-----+--------+-----+--------+
| 0 | dog | 1      | cat | 1      |
| 1 | cat | 0      | dog | 1      |
| 2 | cat | 0      | dog | 1      |
| 3 | cat | 1      | dog | 0      |
| 4 | dog | 1      | dog | 1      |
| 5 | dog | 0      | dog | 0      |
| 6 | dog | 1      | cat | 1      |
+---+-----+--------+-----+--------+
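For anyone who wants to reproduce this, the table above can be built roughly as follows. This is only a minimal sketch: the values are copied from the table, and I am assuming a default integer index and integer counts.

```python
import pandas as pd

# Sample data copied from the table above (default RangeIndex 0..6 assumed).
df = pd.DataFrame({
    'A': ['dog', 'cat', 'cat', 'cat', 'dog', 'dog', 'dog'],
    'Farm_A': [1, 0, 0, 1, 1, 0, 1],
    'B': ['cat', 'dog', 'dog', 'dog', 'dog', 'dog', 'cat'],
    'Farm_B': [1, 1, 1, 0, 1, 0, 1],
})

print(df)
```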
With groupby I can get the cumsum for each farm:
df['A cumsum Farm_A'] = df.groupby(['A'])['Farm_A'].cumsum()
df['B cumsum Farm_B'] = df.groupby(['B'])['Farm_B'].cumsum()
+---+-----+--------+-----+--------+-----------------+-----------------+
|   | A   | Farm_A | B   | Farm_B | A cumsum Farm_A | B cumsum Farm_B |
+---+-----+--------+-----+--------+-----------------+-----------------+
| 0 | dog | 1      | cat | 1      | 1               | 1               |
| 1 | cat | 0      | dog | 1      | 0               | 1               |
| 2 | cat | 0      | dog | 1      | 0               | 2               |
| 3 | cat | 1      | dog | 0      | 1               | 2               |
| 4 | dog | 1      | dog | 1      | 2               | 3               |
| 5 | dog | 0      | dog | 0      | 2               | 3               |
| 6 | dog | 1      | cat | 1      | 3               | 2               |
+---+-----+--------+-----+--------+-----------------+-----------------+
My problem is: how can I get the cumulative sum of animals across both farm A and farm B for each row?
For example, in row 3 the animal at farm A is cat, so I want the sum of cats from both farms over rows 0, 1, 2, 3 = 2 cats.
Also in row 3, the animal at farm B is dog, so I want the total number of dogs from both farms over rows 0, 1, 2, 3 = 3.
This is what I want to achieve:
+---+-----+--------+-----+--------+-----------------+-----------------+-----------------+-----------------+
|   | A   | Farm_A | B   | Farm_B | A cumsum Farm_A | B cumsum Farm_B | A at both farms | B at both farms |
+---+-----+--------+-----+--------+-----------------+-----------------+-----------------+-----------------+
| 0 | dog | 1      | cat | 1      | 1               | 1               | 1               | 1               |
| 1 | cat | 0      | dog | 1      | 0               | 1               | 1               | 2               |
| 2 | cat | 0      | dog | 1      | 0               | 2               | 1               | 3               |
| 3 | cat | 1      | dog | 0      | 1               | 2               | 2               | 3               |
| 4 | dog | 1      | dog | 1      | 2               | 3               | 4               | 5               |
| 5 | dog | 0      | dog | 0      | 2               | 3               | 5               | 5               |
| 6 | dog | 1      | cat | 1      | 3               | 2               | 6               | 3               |
+---+-----+--------+-----+--------+-----------------+-----------------+-----------------+-----------------+
The last two columns can be created by working with dummies. This lets you build a cumulative sum per animal type across both farms, which you can then look up to get the appropriate value for each row.
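import numpy as np
import pandas as pd

res = pd.get_dummies(df, columns=['A', 'B'])

# An animal only counts where its dummy is set and the farm count is non-zero, so multiply.
res = pd.concat([res.filter(like='A_').multiply(res['Farm_A'], axis=0),
                 res.filter(like='B_').multiply(res['Farm_B'], axis=0)],
                axis=1)

# Cumsum per animal across both farms. Grouping the transposed frame avoids
# groupby(axis=1), which is deprecated in recent pandas versions.
res = res.T.groupby(res.columns.str.split('_').str[1]).sum().T.cumsum()
#   cat  dog
#0    1    1
#1    1    2
#2    1    3
#3    2    3
#4    2    5
#5    2    5
#6    3    6

# Lookup. DataFrame.lookup was removed in pandas 2.0, so index the array by position instead.
rows = np.arange(len(df))
df['A at both'] = res.to_numpy()[rows, res.columns.get_indexer(df['A'])]
df['B at both'] = res.to_numpy()[rows, res.columns.get_indexer(df['B'])]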
     A  Farm_A    B  Farm_B  A at both  B at both
0  dog       1  cat       1          1          1
1  cat       0  dog       1          1          2
2  cat       0  dog       1          1          3
3  cat       1  dog       0          2          3
4  dog       1  dog       1          5          5
5  dog       0  dog       0          5          5
6  dog       1  cat       1          6          3
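If the dummy columns feel indirect, the same running totals can probably also be reached by stacking both farms into one long frame first. The following is only a sketch of that alternative, not the method above: the names `stacked`, `per_row`, `running` and `result` are made up for illustration, and the sample frame is rebuilt from the question's table so the snippet runs on its own.

```python
import pandas as pd

# Sample data copied from the question's table.
df = pd.DataFrame({
    'A': ['dog', 'cat', 'cat', 'cat', 'dog', 'dog', 'dog'],
    'Farm_A': [1, 0, 0, 1, 1, 0, 1],
    'B': ['cat', 'dog', 'dog', 'dog', 'dog', 'dog', 'cat'],
    'Farm_B': [1, 1, 1, 0, 1, 0, 1],
})

# Stack both farms into one long frame: one (animal, count) record per farm and row.
stacked = pd.concat([
    df[['A', 'Farm_A']].rename(columns={'A': 'animal', 'Farm_A': 'count'}),
    df[['B', 'Farm_B']].rename(columns={'B': 'animal', 'Farm_B': 'count'}),
])

# Total per original row and animal (0 where an animal is absent from a row),
# then a running total down the rows.
per_row = stacked.groupby([stacked.index, 'animal'])['count'].sum().unstack(fill_value=0)
running = per_row.cumsum()

# For each row, pick the running total of the animal currently at that farm.
row_pos = running.index.get_indexer(df.index)
values = running.to_numpy()
result = df.assign(**{
    'A at both': values[row_pos, running.columns.get_indexer(df['A'])],
    'B at both': values[row_pos, running.columns.get_indexer(df['B'])],
})

print(result)
```

The long format avoids parsing animal names back out of column labels, so it should also cope with animal names that contain an underscore.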