
How to vectorize pandas code where it depends on previous row?

I am trying to vectorize a code snippet in pandas.

I have a pandas dataframe that looks like this:

     ids  ftest  vals
0  Q52EG      0     0
1  Q52EG      0     1
2  Q52EG      1     2
3  Q52EG      1     3
4  Q52EG      1     4
5  QQ8Q4      0     5
6  QQ8Q4      0     6
7  QQ8Q4      1     7
8  QQ8Q4      1     8
9  QVIPW      1     9
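
For anyone who wants to experiment, the frame above can be rebuilt roughly as follows. This is only a sketch: the values are copied from the table, and the name id_df is taken from the loop further down, not from any original generation code.

```python
import pandas as pd

# Sample data copied from the table above.
# `id_df` is the name used by the iterative loop in the question.
id_df = pd.DataFrame(
    {
        "ids": ["Q52EG"] * 5 + ["QQ8Q4"] * 4 + ["QVIPW"],
        "ftest": [0, 0, 1, 1, 1, 0, 0, 1, 1, 1],
        "vals": list(range(10)),
    }
)
print(id_df)
```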

If an id in the ids column has the value 1 in the ftest column, then all subsequent rows with the same id should be marked 1 in the has_hist column, regardless of their own ftest value. The row where the 1 first appears stays 0, as shown in the dataframe below:

     ids  ftest  vals  has_hist
0  Q52EG      0     0         0
1  Q52EG      0     1         0
2  Q52EG      1     2         0
3  Q52EG      1     3         1
4  Q52EG      1     4         1
5  QQ8Q4      0     5         0
6  QQ8Q4      0     6         0
7  QQ8Q4      1     7         0
8  QQ8Q4      1     8         1
9  QVIPW      1     9         0

I am doing this with an iterative approach like this:

previous_present = {}
has_prv_history = []
for index, value in id_df.iterrows():
    my_id = value["ids"]
    ftest_mentioned = value["ftest"]
    previous_flag = 0
    if my_id in previous_present.keys():
        previous_flag = 1
    elif (ftest_mentioned == 1):
        previous_present[my_id] = 1
    has_prv_history.append(previous_flag)
id_df["has_hist"] = has_prv_history

Can this code be vectorized without using apply?

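Two key functions for this kind of task are shift and ffill, applied per group: shift exposes the previous row's ftest within each id, where keeps only the 1s, and ffill carries them down the rest of the group. For this specific question: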

df2 = df.copy()  # df is the input dataframe (id_df in the question)
df2["has_hist"] = df.groupby("ids").ftest.shift().where(lambda s: s.eq(1))
df2["has_hist"] = df2.groupby("ids").has_hist.ffill().fillna(0).astype("int32")
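
To make the three steps easier to follow, here is a self-contained sketch of the same shift / where / ffill idea, run on the sample data from the question. The intermediate names prev and marks are only illustrative, and the frame is rebuilt by hand from the table.

```python
import pandas as pd

# Sample data rebuilt from the question's table.
df = pd.DataFrame(
    {
        "ids": ["Q52EG"] * 5 + ["QQ8Q4"] * 4 + ["QVIPW"],
        "ftest": [0, 0, 1, 1, 1, 0, 0, 1, 1, 1],
        "vals": list(range(10)),
    }
)

# Step 1: previous row's ftest within each id (NaN on the first row of a group).
prev = df.groupby("ids")["ftest"].shift()

# Step 2: keep only the 1s; everything else becomes NaN so it can be filled over.
marks = prev.where(prev.eq(1))

# Step 3: forward-fill within each id, then turn the remaining NaN into 0.
has_hist = marks.groupby(df["ids"]).ffill().fillna(0).astype("int32")

df2 = df.assign(has_hist=has_hist)
print(df2["has_hist"].tolist())
# [0, 0, 0, 1, 1, 0, 0, 0, 1, 0]
```

The printed list matches the has_hist column in the expected output of the question.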

Here is a variant with transform, which, in my experience, is often slower than "pure" pandas operations:

df2 = df.assign(
    has_hist=df.groupby("ids").ftest.transform(
        # per group: previous row's flag, keep only the 1s, carry them forward
        lambda s: (
            s
            .shift()
            .where(lambda t: t.eq(1))
            .ffill()
            .fillna(0)
            .astype("int32")
        )
    )
)
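
A different technique that also avoids apply, not part of the approach above, is a per-group cumulative sum: subtracting the current row's ftest from the running total counts the 1s strictly before that row. The sketch below assumes ftest only ever holds 0 or 1, and it checks the result against a plain loop equivalent to the one in the question (the helper name has_hist_loop is hypothetical).

```python
import pandas as pd

# Sample data rebuilt from the question's table.
df = pd.DataFrame(
    {
        "ids": ["Q52EG"] * 5 + ["QQ8Q4"] * 4 + ["QVIPW"],
        "ftest": [0, 0, 1, 1, 1, 0, 0, 1, 1, 1],
        "vals": list(range(10)),
    }
)

# Running count of 1s per id, minus the current row's own value,
# gives the number of 1s seen strictly before this row.
# Assumption: ftest contains only 0 and 1.
ones_before = df.groupby("ids")["ftest"].cumsum() - df["ftest"]
has_hist_cumsum = ones_before.gt(0).astype("int32")


def has_hist_loop(frame):
    """Reference implementation mirroring the question's iterative logic."""
    seen = set()
    flags = []
    for my_id, ftest in zip(frame["ids"], frame["ftest"]):
        if my_id in seen:
            flags.append(1)
        else:
            flags.append(0)
            if ftest == 1:
                seen.add(my_id)
    return flags


print(has_hist_cumsum.tolist() == has_hist_loop(df))
```

Comparing any vectorized version against the loop on real data is a cheap way to confirm the two agree before dropping the loop.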
