
Conditional Row shift in Pandas

I'm trying to move the values in one column up to the row where another column is not null. The gaps in the Description column vary in length, so I can't use a plain .shift().

Here's the original data

Permit Number    A      Description
1234            NaN    NaN
NaN             NaN    NaN
NaN             NaN    foo
3456            NaN    NaN
NaN             NaN    bar

And this is what I want my result to be

Permit Number    A      Description
1234            NaN    foo
NaN             NaN    NaN
NaN             NaN    NaN
3456            NaN    bar
NaN             NaN    NaN

Here's the code I used, taken from the question "Align data in one column with another row, based on the last time some condition was true":

mask = df['Description'].notnull()
fmask = (df['Permit Number'].notnull() & df['Description'].isnull())
df.assign(Description=df.groupby(mask[::-1].cumsum())['Description'].transform(lambda x: x.iloc[-1]).where(fmask))

However, when I run it I get no errors and the dataframe is unchanged.

If you do not need the NaN rows, you can do the following. I can't test it, so let me know if there's a typo.

This is the original:

Permit Number    A      Description
1234            NaN    NaN
NaN             NaN    NaN
NaN             NaN    foo
3456            NaN    NaN
NaN             NaN    bar

First, forward-fill Permit Number so that every row carries its permit:

df['Permit Number'] = df['Permit Number'].ffill()
Permit Number    A      Description
1234            NaN    NaN
1234            NaN    NaN
1234            NaN    foo
3456            NaN    NaN
3456            NaN    bar

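Then group by Permit Number and take the first non-null value in each column, which gives the result you asked for. Column A is all NaN, so it should not be a group key: groupby drops rows with NaN keys by default, which would leave nothing.

df.groupby('Permit Number', as_index=False).first()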
Permit Number    A      Description
1234            NaN    foo
3456            NaN    bar
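
For anyone who wants to try the two steps above end to end, here is a minimal sketch. It assumes the sample data from the question; the names filled and collapsed are only for illustration, and it groups on Permit Number alone with first(), which returns the first non-null value per column.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Permit Number": [1234, np.nan, np.nan, 3456, np.nan],
    "A": [np.nan] * 5,
    "Description": [np.nan, np.nan, "foo", np.nan, "bar"],
})

# Step 1: carry each permit number down over the empty rows below it.
filled = df.assign(**{"Permit Number": df["Permit Number"].ffill()})

# Step 2: collapse to one row per permit. first() takes the first
# non-null value in each column and leaves NaN where a column has none.
collapsed = filled.groupby("Permit Number", as_index=False).first()

print(collapsed)
```

Note that this drops the empty rows, so it only fits if you do not need to keep the original row count.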

FYI for anyone who sees this and has strings in their column: this solution works too.

mask = df['Description'].notnull()
fmask = (df['Permit Number'].notnull() & df['Description'].isnull())
df = df.assign(Description=df.groupby(mask[::-1].cumsum())['Description'].transform(lambda x: x.iloc[-1]).where(fmask))

My mistake was simply not assigning the result back to df: df.assign returns a new DataFrame rather than modifying df in place.
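
Here is a small, self-contained sketch of the same approach on the sample data from the question, in case it helps to see the intermediate pieces. The names groups, last_in_group and result are only for illustration. It also shows that df itself stays unchanged until you assign the result back.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Permit Number": [1234, np.nan, np.nan, 3456, np.nan],
    "A": [np.nan] * 5,
    "Description": [np.nan, np.nan, "foo", np.nan, "bar"],
})

mask = df["Description"].notnull()
# Rows that should receive a description: permit present, description missing.
fmask = df["Permit Number"].notnull() & df["Description"].isnull()

# A cumulative sum over the reversed mask gives every run of rows that
# ends at a non-null Description the same label.
groups = mask[::-1].cumsum()

# Broadcast the last Description of each run to all rows of that run.
last_in_group = df.groupby(groups)["Description"].transform(lambda x: x.iloc[-1])

# Keep the value only on the permit rows; every other row becomes NaN.
result = df.assign(Description=last_in_group.where(fmask))

print(result)  # descriptions moved up to the permit rows
print(df)      # original frame, unchanged
```

Writing df = df.assign(...) instead of result = df.assign(...) replaces the original frame, which is what the snippet above does.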
