Python Previous Row Value

I have a data set with a header row and multiple sub lines that are associated like this.

Step status
0   010000000409139
1   00001
2   00002
3   00003
4   00004
5   00007
6   00005
7   00006
8   00008
9   010000000473498
10  00001
11  00002

What I want is just the header line repeated for all of its lines:

Step status
0   010000000409139
1   010000000409139
2   010000000409139
3   010000000409139
4   010000000409139
5   010000000409139
6   010000000409139
7   010000000409139
8   010000000409139
9   010000000473498
10  010000000473498
11  010000000473498

I tried to create a lambda function like this:

def logic(step):
    if len(step) == 15:
        return step
    else:
        return step.shift()
pm2['StepLogic'] = pm2.apply(lambda x: logic(x['Step status']),axis=1)

I'm getting this error: AttributeError: ("'str' object has no attribute 'shift'", 'occurred at index 1')

Is there a smarter way to get what I'm after?

You can build a boolean Series by checking whether the length of status is 15 (a header row), use cumsum to turn it into a group number, then groupby on that number and transform with "first" so every row takes its group's header value:

df["status"] = df.groupby(df["status"].str.len().eq(15).cumsum())["status"].transform("first")

print(df)

    Step           status
0      0  010000000409139
1      1  010000000409139
2      2  010000000409139
3      3  010000000409139
4      4  010000000409139
5      5  010000000409139
6      6  010000000409139
7      7  010000000409139
8      8  010000000409139
9      9  010000000473498
10    10  010000000473498
11    11  010000000473498
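
To see why the one-liner works, it can help to split it into its intermediate steps. This is only a sketch on a shortened copy of the question's data: the names `is_header` and `group_id` and the new `header` column are assumptions made for illustration, not part of the original answer.

```python
import pandas as pd

# Shortened stand-in for the question's data (assumed column name: "status").
df = pd.DataFrame({
    "status": [
        "010000000409139", "00001", "00002", "00003",
        "010000000473498", "00001", "00002",
    ]
})

# True on header rows (15-character strings), False on sub lines.
is_header = df["status"].str.len().eq(15)

# Running count of headers seen so far, so each header and the
# sub lines that follow it share one group number.
group_id = is_header.cumsum()

# Broadcast the first value of each group (the header) to all its rows.
df["header"] = df.groupby(group_id)["status"].transform("first")
```

Writing the result to a separate column keeps the original sub-line values around for checking; assigning back to `df["status"]` reproduces the answer above.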

Try this:

df['status'] = df['status'].where(df['status'].str.len().gt(5)).ffill()
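
The mask-and-forward-fill idea can be sketched step by step as follows. The sample data, the column name `status`, and the names `is_header`, `masked`, and `header` are assumptions for illustration; the sketch uses `Series.ffill()` for the forward fill and relies on sub lines never being longer than 5 characters.

```python
import pandas as pd

# Shortened stand-in for the question's data (assumed column name: "status").
df = pd.DataFrame({
    "status": ["010000000409139", "00001", "00002", "010000000473498", "00001"]
})

# Header rows are the only values longer than 5 characters.
is_header = df["status"].str.len().gt(5)

# Keep header values and blank out the sub lines (they become missing values).
masked = df["status"].where(is_header)

# Carry each header forward over the missing values that follow it.
df["header"] = masked.ffill()
```

This variant avoids a groupby, but it assumes the first row is a header; otherwise the leading rows would stay missing.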
