Add new column for next consecutive row in pandas dataframe

I have a dataframe:

    id  value
0    1      0
1    1    100
2    1    200
3    1    300
4    1      0
5    2      0
6    2    500
7    2    600
8    2      0
9    3      0
10   3    700
11   3      0

For every entry in the value column, I want to add a new column holding the value from the next row of the same id. For example:

    id  value  value2
0    1      0     100
1    1    100     200
2    1    200     300
3    1    300       0
4    2      0     500
5    2    500     600
6    2    600       0
7    3      0     700
8    3    700       0

Option 1
By using numpy instead of pd.Series.shift, I avoid inserting np.nan and the resulting cast to float.

df.groupby(
    'id', group_keys=False
).apply(lambda d: d.iloc[:-1].assign(value2=d.value.values[1:]))

    id  value  value2
0    1      0     100
1    1    100     200
2    1    200     300
3    1    300       0
5    2      0     500
6    2    500     600
7    2    600       0
9    3      0     700
10   3    700       0
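
To illustrate the dtype point behind Option 1, here is a small sketch that rebuilds the question's frame and compares the two approaches. The variable names `shifted` and `sliced` are only for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "id":    [1, 1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3],
    "value": [0, 100, 200, 300, 0, 0, 500, 600, 0, 0, 700, 0],
})

# shift(-1) leaves NaN in the last row of each group,
# which forces the result to a float dtype.
shifted = df.groupby("id")["value"].shift(-1)
print(shifted.dtype)

# Slicing the underlying NumPy array introduces no NaN,
# so the integer dtype of the original column is kept.
sliced = df.loc[df["id"] == 1, "value"].to_numpy()[1:]
print(sliced.dtype)
```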

Option 2
If all rows with the same 'id' are lumped together (contiguous), I can do an np.roll on the entire column and do a simple groupby afterwards.

df.assign(
    value2=np.roll(df.value.values, -1)
).groupby('id', group_keys=False).apply(pd.DataFrame.head, n=-1)

    id  value  value2
0    1      0     100
1    1    100     200
2    1    200     300
3    1    300       0
5    2      0     500
6    2    500     600
7    2    600       0
9    3      0     700
10   3    700       0
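
Option 2 depends on equal ids sitting next to each other. A minimal sketch of how one might check that first, and sort if needed, is below. `ids_are_contiguous` is a hypothetical helper, not part of pandas, and it assumes the id column has no missing values.

```python
import pandas as pd

def ids_are_contiguous(ids: pd.Series) -> bool:
    """Return True if every id value occupies one unbroken run of rows."""
    # A new run starts wherever the id differs from the row above it
    # (the first row always counts, because shift() puts NaN there).
    run_starts = (ids != ids.shift()).sum()
    return bool(run_starts == ids.nunique())

# Example frame with interleaved ids (made up for this sketch).
df = pd.DataFrame({"id": [1, 2, 1, 2], "value": [0, 500, 100, 600]})

if not ids_are_contiguous(df["id"]):
    # mergesort is stable, so rows keep their original order within each id.
    df = df.sort_values("id", kind="mergesort")
```

After the sort, the np.roll approach above can be applied as written.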

You can use groupby and shift, followed by dropna.

df['value2'] = df.groupby('id').value.shift(-1)
df = df.dropna().astype(int)
df

    id  value  value2
0    1      0     100
1    1    100     200
2    1    200     300
3    1    300       0
5    2      0     500
6    2    500     600
7    2    600       0
9    3      0     700
10   3    700       0
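
For anyone who wants to run this end to end, here is a self-contained sketch of the same approach. The `reset_index(drop=True)` at the end is my addition, on the assumption that you want the 0..8 index shown in the question rather than the original row labels.

```python
import pandas as pd

df = pd.DataFrame({
    "id":    [1, 1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3],
    "value": [0, 100, 200, 300, 0, 0, 500, 600, 0, 0, 700, 0],
})

# Next value within each id; the last row of every group becomes NaN.
df["value2"] = df.groupby("id")["value"].shift(-1)

# Drop those last rows, cast back to int, and renumber the index.
result = df.dropna().astype(int).reset_index(drop=True)
print(result)
```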

I tried looking at the problem another way: build the output from the original df in the previous question, using numpy.append:

print (df)
   value  id
0    100   1
1    200   1
2    300   1
3    500   2
4    600   2
5    700   3

def f(x):
   return pd.DataFrame({'value1': np.append(x, 0), 'value2': np.append(0, x)})

df = df.groupby('id')['value'].apply(f).reset_index(level=1, drop=True).reset_index()
print (df)
   id  value1  value2
0   1     100       0
1   1     200     100
2   1     300     200
3   1       0     300
4   2     500       0
5   2     600     500
6   2       0     600
7   3     700       0
8   3       0     700
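
To make it clearer what f does, here is a small sketch of the two numpy.append calls on a single group's values (the array `x` below stands in for the values of id 1):

```python
import numpy as np

x = np.array([100, 200, 300])

# Pad a 0 at the end: each value lines up with the row it starts in.
value1 = np.append(x, 0)   # [100, 200, 300, 0]

# Pad a 0 at the front: each value lands one row later.
value2 = np.append(0, x)   # [0, 100, 200, 300]
```

Both arrays are one element longer than the group, which is why the result has an extra row per id.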

Something like this:

n = 1
df["value2"] = df["value1"].tolist()[n:] + df["value1"].tolist()[:n]

Here n is the number of rows to shift by.
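
One caveat worth spelling out: the list rotation treats the column as a single block, so the first n values wrap around to the end and id boundaries are ignored. The sketch below, on the six-row frame from the previous answer, contrasts that with a per-group shift. Using `fill_value=0` to pad the end of each group is my assumption about the desired result.

```python
import pandas as pd

df = pd.DataFrame({"id": [1, 1, 1, 2, 2, 3],
                   "value": [100, 200, 300, 500, 600, 700]})
n = 1

# Whole-column rotation: the first n values wrap around to the end,
# and the last row of id 1 picks up the first value of id 2.
values = df["value"].tolist()
rotated = values[n:] + values[:n]
# rotated == [200, 300, 500, 600, 700, 100]

# Per-group shift: stays inside each id and pads the end with 0.
per_group = df.groupby("id")["value"].shift(-n, fill_value=0)
# per_group.tolist() == [200, 300, 0, 600, 0, 0]
```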

